Scientific report: "High-precision Identification of Discourse New and Unique Noun Phrases"

Olga Uryupina
Computational Linguistics, Saarland University
Building 17, Postfach 15 11 50, 66041 Saarbrücken, Germany
ourioupi@

Abstract

Coreference resolution systems usually attempt to find a suitable antecedent for (almost) every noun phrase. Recent studies, however, show that many definite NPs are not anaphoric. The same claim, obviously, holds for the indefinites as well. In this study we try to learn automatically two classifications, discourse-new and unique, relevant for this problem. We use a small training corpus (MUC-7), but also acquire some data from the Internet. Combining our classifiers sequentially, we achieve precision and recall for discourse new entities. We expect our classifiers to provide a good prefiltering for coreference resolution systems, improving both their speed and performance.

1 Introduction

Most coreference resolution systems proceed in the following way: they first identify all the possible markables (for example, noun phrases) and then check, one by one, candidate pairs (markable_i, markable_j), trying to find out whether the members of those pairs can be coreferent. As the final step, the pairs are ranked using a scoring algorithm in order to find an appropriate partition of all the markables into coreference classes.
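The pairwise step described above, and the effect of a prefilter on it, can be illustrated with a minimal sketch. It is not the author's implementation: the function `candidate_pairs`, the toy markables and the rule-based `toy_filter` are hypothetical stand-ins for a real markable extractor and a learned discourse-new classifier. The sketch assumes that a markable judged discourse new needs no antecedent search, although it may still serve as an antecedent for later markables.

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence, Tuple


def candidate_pairs(
    markables: Sequence[str],
    is_discourse_new: Optional[Callable[[str], bool]] = None,
) -> List[Tuple[int, int]]:
    """Return (antecedent_index, anaphor_index) pairs in document order.

    Without a filter, every unordered pair is produced: n * (n - 1) / 2
    pairs for n markables. With a filter, pairs whose later member is
    judged discourse new are skipped, since no antecedent is sought for it.
    """
    pairs = []
    for i, j in combinations(range(len(markables)), 2):
        # combinations() guarantees i < j, so markables[j] is the
        # potential anaphor and markables[i] the potential antecedent.
        if is_discourse_new is not None and is_discourse_new(markables[j]):
            continue
        pairs.append((i, j))
    return pairs


# Toy data (hypothetical): four markables in document order.
markables = ["a man", "the man", "the sun", "he"]


def toy_filter(np: str) -> bool:
    # Hypothetical rule standing in for a learned classifier:
    # indefinites and "the sun" are treated as discourse new.
    return np.startswith("a ") or np == "the sun"


all_pairs = candidate_pairs(markables)
filtered_pairs = candidate_pairs(markables, toy_filter)
print(len(all_pairs), len(filtered_pairs))
```

In this toy run the filter removes only the pairs in which "the sun" would be the anaphor. "a man" is also marked discourse new, but it is the first markable and so never appears in the anaphor position.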
Those approaches require substantial processing: in the worst case one has to check n(n-1)/2 candidate pairs, where n is the total number of markables found by the system. However, R. Vieira and M. Poesio have recently shown (Vieira and Poesio, 2000) that such an exhaustive search is not needed, because many noun phrases are not anaphoric at all: about 50% of definite NPs in their corpus have no prior referents. Obviously, this number is even higher if one takes into account all the other types of NPs; for example, indefinites are almost always non-anaphoric. We can conclude that a coreference resolution engine might benefit a lot from a pre-filtering algorithm for
