Scientific report: "Instance Weighting for Domain Adaptation in NLP"

Jing Jiang and ChengXiang Zhai
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
jiang4 czhai @

Abstract

Domain adaptation is an important problem in natural language processing (NLP) due to the lack of labeled data in novel domains. In this paper, we study the domain adaptation problem from the instance weighting perspective. We formally analyze and characterize the domain adaptation problem from a distributional view, and show that there are two distinct needs for adaptation, corresponding to the different distributions of instances and classification functions in the source and the target domains. We then propose a general instance weighting framework for domain adaptation. Our empirical results on three NLP tasks show that incorporating and exploiting more information from the target domain through instance weighting is effective.

1 Introduction

Many natural language processing (NLP) problems, such as part-of-speech (POS) tagging, named entity (NE) recognition, relation extraction, and semantic role labeling, are currently solved by supervised learning from manually labeled data. A bottleneck problem with this supervised learning approach is the lack of annotated data. As a special case, we often face the situation where we have a sufficient amount of labeled data in one domain, but have little or no labeled data in another related domain which we are interested in. We thus face the domain adaptation problem. Following Blitzer et al. (2006), we call the first the source domain, and the second the target domain.

The domain adaptation problem is commonly encountered in NLP. For example, in POS tagging, the source domain may be tagged WSJ articles, and the target domain may be scientific literature that contains scientific terminology. In NE recognition, the source domain may be annotated news articles, and the target domain may be personal blogs. Another example is personalized spam .
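
To make the instance weighting idea from the abstract concrete, the following is a minimal sketch of a classifier trained on a weighted log-loss, where each labeled instance carries its own weight. It is a generic illustration and not the framework proposed in the paper: the function names (`train_weighted_logreg`, `predict_proba`), the toy data, and the weight values are all assumptions chosen for this example, and the excerpt above does not say how the authors set their weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of the positive class under a logistic model."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_weighted_logreg(xs, ys, weights, lr=0.1, steps=2000):
    """Minimise sum_i weights[i] * log_loss(x_i, y_i) by gradient descent.

    `weights` is a hypothetical per-instance weight vector: a larger value
    makes the corresponding labeled instance count more during training.
    """
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    total = sum(weights)
    for _ in range(steps):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y, a in zip(xs, ys, weights):
            err = a * (predict_proba(w, b, x) - y)  # weighted residual
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Normalise by the total weight so the step size is scale-free.
        w = [wj - lr * g / total for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b


# Toy data (assumed): two instances with the same feature but opposite
# labels, e.g. a source-domain label that disagrees with a target-domain one.
xs = [[1.0], [1.0]]
ys = [1, 0]

# Trust the first instance three times as much as the second.
w, b = train_weighted_logreg(xs, ys, weights=[3.0, 1.0])
print(predict_proba(w, b, [1.0]))  # close to 0.75, the weighted label mean

# Reverse the trust and the prediction moves toward the other label.
w, b = train_weighted_logreg(xs, ys, weights=[1.0, 3.0])
print(predict_proba(w, b, [1.0]))  # close to 0.25
```

In a domain adaptation setting, one would presumably give higher weights to instances believed to reflect the target domain and lower weights to source instances that seem misleading; how to estimate such weights is exactly what a weighting framework has to specify, and is left open in this sketch.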
