Scientific report: "Data point selection for cross-language adaptation of dependency parsers"

Anders Søgaard
Center for Language Technology, University of Copenhagen
Njalsgade 142, DK-2300 Copenhagen S
soegaard@

Abstract

We consider a very simple, yet effective, approach to cross-language adaptation of dependency parsers. We first remove lexical items from the treebanks and map part-of-speech tags into a common tagset. We then train a language model on tag sequences in otherwise unlabeled target data and rank labeled source data by the per-word perplexity of their tag sequences, from least similar to most similar to the target. Finally, we train our target-language parser on the most similar data points in the labeled source data. The strategy achieves much better results than a non-adapted baseline and than state-of-the-art unsupervised dependency parsing, and the results are comparable to those of more complex projection-based cross-language adaptation algorithms.

1 Introduction

While unsupervised dependency parsing has seen rapid progress in recent years, results are still far from those that can be achieved with supervised parsers, and are not yet good enough to solve real-world problems. In this paper we are interested in an alternative strategy, namely cross-language adaptation of dependency parsers. The idea is, briefly put, to learn how to parse Arabic, for example, from, say, a Danish treebank, comparing unlabeled data from both languages. This is similar to, but more difficult than, most domain adaptation or transfer learning scenarios, where differences between source and target distributions are smaller.

Most previous work in cross-language adaptation has used parallel corpora to project dependency structures across translations using word alignments (Smith and Eisner, 2009; Spreyer and Kuhn, 2009; Ganchev et al., 2009), but in this paper we show that similar results can be achieved by much simpler means. Specifically, we build on the cross-language adaptation algorithm for closely related languages developed by .
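The selection step described in the abstract (train a language model on target tag sequences, score each source sentence by per-word perplexity, keep the most similar ones) can be illustrated with a minimal sketch. The excerpt does not say which language model or selection threshold the author used, so everything specific here is an assumption: the bigram model with add-one smoothing, the function names `train_bigram_lm`, `perplexity_per_word` and `select_most_similar`, and the `keep_fraction` parameter are all hypothetical. Sentences are represented only by their tag sequences, already mapped to a common tagset; in practice each would carry its dependency annotation along.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sentence boundary symbols (an assumption of this sketch)


def train_bigram_lm(tag_sequences):
    """Collect bigram statistics over POS tag sequences from the target language."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {EOS}
    for tags in tag_sequences:
        padded = [BOS] + list(tags) + [EOS]
        vocab.update(tags)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def perplexity_per_word(tags, model):
    """Per-symbol perplexity of one tag sequence under the add-one-smoothed model."""
    context_counts, bigram_counts, vocab = model
    v = len(vocab) + 1  # one extra slot reserves probability mass for unseen tags
    padded = [BOS] + list(tags) + [EOS]
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
        log_prob += math.log(p)
    n = len(padded) - 1  # number of predicted symbols, including the end marker
    return math.exp(-log_prob / n)


def select_most_similar(source_sentences, model, keep_fraction=0.5):
    """Rank source tag sequences by perplexity and keep the lowest-perplexity share.

    keep_fraction is an arbitrary illustrative value, not a setting from the paper.
    """
    ranked = sorted(source_sentences, key=lambda tags: perplexity_per_word(tags, model))
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: tag sequences only, with lexical items already removed.
    target = [["DET", "NOUN", "VERB"], ["DET", "ADJ", "NOUN", "VERB"]]
    source = [["VERB", "VERB", "DET", "DET"], ["DET", "NOUN", "VERB"]]
    model = train_bigram_lm(target)
    for tags in select_most_similar(source, model):
        print(tags, round(perplexity_per_word(tags, model), 2))
```

Lower perplexity means the source sentence's tag sequence looks more like the target language, so the retained sentences would form the training set for the target-language parser. A smoothed n-gram model of higher order could be swapped in without changing the ranking logic.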

