Detecting Errors in Automatically-Parsed Dependency Relations

Markus Dickinson, Indiana University (md7@)

Abstract

We outline different methods to detect errors in automatically-parsed dependency corpora by comparing so-called dependency rules to their representation in the training data and flagging anomalous ones. By comparing each new rule to every relevant rule from training, we can identify parts of parse trees which are likely erroneous. Even the relatively simple methods of comparison we propose show promise for speeding up the annotation process.

1 Introduction and Motivation

Given the need for high-quality dependency parses in applications such as statistical machine translation (Xu et al., 2009), natural language generation (Wan et al., 2009), and text summarization evaluation (Owczarzak, 2009), there is a corresponding need for high-quality dependency annotation for the training and evaluation of dependency parsers (Buchholz and Marsi, 2006). Furthermore, parsing accuracy degrades unless sufficient amounts of labeled training data from the same domain are available (Gildea, 2001; Sekine, 1997), and thus we need larger and more varied annotated treebanks, covering a wide range of domains. However, there is a bottleneck in obtaining annotation, due to the need for manual intervention in annotating a treebank. One approach is to develop automatically-parsed corpora (van Noord and Bouma, 2009), but a natural disadvantage with such data is that it contains parsing errors.
Identifying the most problematic parses for human post-processing could combine the benefits of automatic and manual annotation, by allowing a human annotator to efficiently correct automatic errors. We thus set out in this paper to detect errors in automatically-parsed data. If annotated corpora are to grow in scale and retain high quality, annotation errors which arise from automatic processing must be minimized, as errors have a negative impact on the training and evaluation of NLP technology (see discussion below).
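The core idea of comparing each new dependency rule against its representation in the training data can be sketched as follows. This is a minimal illustration, not the paper's actual method: here a "dependency rule" is simplified to a (head POS, dependent POS, label) triple, and anomaly detection is reduced to a raw frequency threshold; the function names and the `threshold` parameter are illustrative assumptions.

```python
from collections import Counter

def extract_rules(tree):
    # Simplified assumption: a tree is a list of (head_pos, dep_pos, label)
    # triples, each of which we treat as one dependency "rule".
    return list(tree)

def flag_anomalous(training_trees, new_tree, threshold=1):
    # Count how often each rule occurs across the training treebank.
    counts = Counter(rule
                     for tree in training_trees
                     for rule in extract_rules(tree))
    # Flag rules in the new parse that are rare or unseen in training,
    # i.e. the parts of the tree most likely to be erroneous.
    return [rule for rule in extract_rules(new_tree)
            if counts[rule] <= threshold]

training = [[("VB", "NN", "obj"), ("VB", "RB", "adv")],
            [("VB", "NN", "obj")]]
new_tree = [("VB", "NN", "obj"), ("NN", "DT", "det")]
print(flag_anomalous(training, new_tree))
```

A rule well-attested in training, such as `("VB", "NN", "obj")` above, passes silently, while the unseen `("NN", "DT", "det")` is flagged for annotator review. The paper's actual comparison methods are more nuanced than a frequency cutoff, but the flag-for-review workflow is the same.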
