Scientific report: "Is the End of Supervised Parsing in Sight?"

Rens Bod
School of Computer Science, University of St Andrews
ILLC, University of Amsterdam
rb@

Abstract

How far can we get with unsupervised parsing if we make our training corpus several orders of magnitude larger than has hitherto been attempted? We present a new algorithm for unsupervised parsing using an all-subtrees model, termed U-DOP*, which parses directly with packed forests of all binary trees. We train both on Penn's WSJ data and on the (much larger) NANC corpus, showing that U-DOP* outperforms a treebank-PCFG on the standard WSJ test set. While U-DOP* performs worse than state-of-the-art supervised parsers on hand-annotated sentences, we show that the model outperforms supervised parsers when evaluated as a language model in syntax-based machine translation on Europarl. We argue that supervised parsers miss the fluidity between constituents and non-constituents and that, in the field of syntax-based language modeling, the end of supervised parsing has come in sight.

1 Introduction

A major challenge in natural language parsing is the unsupervised induction of syntactic structure. While most parsing methods are currently supervised or semi-supervised (McClosky et al. 2006; Henderson 2004; Steedman et al. 2003), they depend on hand-annotated data, which are difficult to come by and which exist only for a few languages. Unsupervised parsing methods are becoming increasingly important since they operate with raw, unlabeled data, of which unlimited quantities are available.

There has been a resurgence of interest in unsupervised parsing during the last few years. Where van Zaanen (2000) and Clark (2001) induced unlabeled phrase structure for small domains like the ATIS, obtaining around 40% unlabeled f-score, Klein and Manning (2002) report an f-score on Penn WSJ part-of-speech strings of up to 10 words (WSJ10) using a constituent-context model called CCM. Klein and Manning (2004) further show that a hybrid approach which combines constituency ...
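The unlabeled f-score figures cited above compare the bracketings a parser proposes with those in a gold-standard tree, ignoring category labels. The following is a minimal sketch of that computation, not the evaluation code used in the paper. The function name `unlabeled_f_score`, the span representation and the example sentence are all assumptions made for illustration, and published evaluations differ in details such as whether the root bracket or single-word spans are counted.

```python
from typing import Set, Tuple

# A bracket is a (start, end) pair of word offsets, end exclusive.
# This representation is an illustrative assumption.
Span = Tuple[int, int]


def unlabeled_f_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Harmonic mean of unlabeled bracket precision and recall."""
    if not predicted or not gold:
        return 0.0
    matched = len(predicted & gold)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)  # share of proposed brackets that are correct
    recall = matched / len(gold)          # share of gold brackets that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical 5-word sentence: the parser gets three of four brackets right.
gold = {(0, 5), (0, 2), (2, 5), (3, 5)}
predicted = {(0, 5), (0, 2), (2, 5), (2, 4)}
print(round(unlabeled_f_score(predicted, gold), 2))  # prints 0.75
```

Because bracket labels are ignored, the measure can be applied to the output of an unsupervised parser, which has no access to treebank category names.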
