Scientific report: "Effective Measures of Domain Similarity for Parsing"

Barbara Plank, University of Groningen, The Netherlands
Gertjan van Noord, University of Groningen, The Netherlands

Abstract

It is well known that parsing accuracy suffers when a model is applied to out-of-domain data. It is also known that the most beneficial data to parse a given domain is data that matches the domain (Sekine, 1997; Gildea, 2001). Hence, an important task is to select appropriate domains. However, most previous work on domain adaptation relied on the implicit assumption that domains are somehow given. As more and more data becomes available, automatic ways to select data that is beneficial for a new, unknown target domain are becoming attractive. This paper evaluates various ways to automatically acquire related training data for a given test set. The results show that an unsupervised technique based on topic models is effective: it outperforms random data selection on both languages examined, English and Dutch. Moreover, the technique works better than manually assigned labels gathered from meta-data that is available for English.

1 Introduction and Motivation

Previous research on domain adaptation has focused on the task of adapting a system trained on one domain, say newspaper text, to a particular new domain, say biomedical data. Usually, some amount of labeled or unlabeled data from the new domain was given, which has been determined by a human. However, with the growth of the web, more and more data is becoming available, where each document is potentially its own domain (McClosky et al., 2010).
It is not straightforward to determine which data or model (in case we have several source domain models) will perform best on a new, unknown target domain. Therefore, an important issue that arises is how to measure domain similarity, i.e. whether we can find a simple yet effective method to determine which model or data is most beneficial for an arbitrary piece of new text.
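
To make the idea of similarity-based data selection concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes that each text is represented by its relative word frequencies and that similarity is scored with the Jensen-Shannon divergence; both choices are assumptions made here for illustration, since this excerpt does not name a specific measure. The function names and the toy article pool are hypothetical.

```python
import math
from collections import Counter


def word_distribution(text):
    """Relative frequency of each lowercased whitespace token."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions."""
    divergence = 0.0
    for word in set(p) | set(q):
        pw, qw = p.get(word, 0.0), q.get(word, 0.0)
        mw = 0.5 * (pw + qw)  # mixture distribution at this word
        if pw > 0:
            divergence += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            divergence += 0.5 * qw * math.log2(qw / mw)
    return divergence


def select_training_data(target_text, candidates, k):
    """Return the names of the k candidates closest to the target text."""
    target = word_distribution(target_text)
    ranked = sorted(
        candidates,
        key=lambda name: js_divergence(target, word_distribution(candidates[name])),
    )
    return ranked[:k]


# Hypothetical toy pool of candidate training articles (assumption, not paper data).
pool = {
    "news": "the minister said the government will vote on the budget",
    "biomed": "the protein binds the receptor and inhibits the gene expression",
    "sports": "the team won the match after the striker scored a late goal",
}
target = "the gene encodes a protein that activates the receptor"
print(select_training_data(target, pool, k=1))
```

In this toy pool the biomedical article ranks first, because it shares the most vocabulary with the target text. A topic-model variant of the same idea would compare per-document topic distributions in place of raw word frequencies, leaving the ranking step unchanged.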
