Scientific report: "Unsupervised Language Model Adaptation Incorporating Named Entity Information"

Feifan Liu and Yang Liu
Department of Computer Science, The University of Texas at Dallas, Richardson, TX, USA
ffliu yangl @

Abstract

Language model (LM) adaptation is important for both speech and language processing. It is often achieved by combining a generic LM with a topic-specific model that is more relevant to the target document. Unlike previous work on unsupervised LM adaptation, this paper investigates how effectively using named entity (NE) information, instead of considering all the words, helps LM adaptation. We evaluate two latent topic analysis approaches in this paper, namely, clustering and Latent Dirichlet Allocation (LDA). In addition, a new dynamically adapted weighting scheme for topic mixture models is proposed based on LDA topic analysis. Our experimental results show that the NE-driven LM adaptation framework outperforms the baseline generic LM. The best result is obtained using the LDA-based approach by expanding the named entities with syntactically filtered words, together with using a large number of topics, which yields a perplexity reduction compared to the baseline generic LM.

1 Introduction

Language model (LM) adaptation plays an important role in speech recognition and many natural language processing tasks, such as machine translation and information retrieval. Statistical N-gram LMs have been widely used; however, they capture only local contextual information.
In addition, even with the increasing amount of LM training data, there is often a mismatch problem because of differences in domain, topics, or styles. Adaptation of LM, therefore, is very important in order to better deal with a variety of topics and styles. Many studies have been conducted for LM adaptation. One method is supervised LM adaptation, where topic information is typically available and a topic-specific LM is interpolated with the generic LM (Kneser and Steinbiss, 1993; Suzuki and Gao, 2005).
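
As an illustration of the interpolation idea mentioned above, the following minimal sketch mixes a topic-specific probability with a generic one and compares perplexities. It is not the authors' implementation: the function names, the fixed weight `lam`, and the per-word probabilities are hypothetical values chosen only to show the arithmetic of linear interpolation.

```python
import math


def interpolate(p_generic, p_topic, lam):
    """Linear interpolation: lam * P_topic(w|h) + (1 - lam) * P_generic(w|h)."""
    return lam * p_topic + (1.0 - lam) * p_generic


def perplexity(probs):
    """Perplexity of a word sequence, given the probability assigned to each word."""
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


# Hypothetical per-word probabilities for a four-word test document
# (assumed values, not taken from the paper).
generic = [0.01, 0.02, 0.005, 0.01]
topic = [0.04, 0.02, 0.02, 0.01]

# Hypothetical fixed interpolation weight; in practice it would be tuned or adapted.
adapted = [interpolate(g, t, lam=0.5) for g, t in zip(generic, topic)]

ppl_generic = perplexity(generic)
ppl_adapted = perplexity(adapted)
print(f"generic LM perplexity: {ppl_generic:.1f}")
print(f"adapted LM perplexity: {ppl_adapted:.1f}")
```

In this toy setting the topic model never assigns a lower probability than the generic model, so the interpolated model has lower perplexity; on real data the gain depends on how well the topic model matches the target document and on the choice of weight.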
