Scientific report: "The Same-head Heuristic for Coreference"

Micha Elsner and Eugene Charniak
Brown Laboratory for Linguistic Information Processing (BLLIP)
Brown University, Providence, RI 02912
melsner ec @

Abstract

We investigate coreference relationships between NPs with the same head noun. It is relatively common in unsupervised work to assume that such pairs are coreferent, but this is not always true, especially if realistic mention detection is used. We describe the distribution of non-coreferent same-head pairs in news text, and present an unsupervised generative model which learns not to link some same-head NPs using syntactic features, improving precision.

1 Introduction

Full NP coreference, the task of discovering which non-pronominal NPs in a discourse refer to the same entity, is widely known to be challenging. In practice, however, most work focuses on the subtask of linking NPs with different head words. Decisions involving NPs with the same head word have not attracted nearly as much attention, and many systems, especially unsupervised ones, operate under the assumption that all same-head pairs corefer. This is by no means always the case: there are several systematic exceptions to the rule. In this paper, we show that these exceptions are fairly common, and describe an unsupervised system which learns to distinguish them from coreferent same-head pairs.

There are several reasons why relatively little attention has been paid to same-head pairs. Primarily, this is because they are a comparatively easy subtask in a notoriously difficult area; Stoyanov et al. (2009) shows that, among NPs headed by common nouns, those which have an exact match earlier in the document are the easiest to resolve (variant MUC score .82 on MUC-6), and while those with partial matches are quite a bit harder (.53), by far the worst performance is on those without any match at all (.27). This effect is magnified by most popular metrics for coreference, which reward finding links within large clusters more than they ...
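
The same-head assumption described above can be illustrated with a minimal sketch. This is not the authors' system; it only shows the baseline heuristic that links every pair of mentions sharing a head word. The function names, the example mentions, and the choice to treat the last token of a mention as its head are all assumptions made for illustration; a real system would read the head noun off a syntactic parse.

```python
from collections import defaultdict


def head_word(mention_tokens):
    """Hypothetical head finder: take the last token as the head noun.

    This is a simplifying assumption for illustration only.
    """
    return mention_tokens[-1].lower()


def same_head_clusters(mentions):
    """Group mention indices by head word, so every same-head pair is linked."""
    clusters = defaultdict(list)
    for index, tokens in enumerate(mentions):
        clusters[head_word(tokens)].append(index)
    return dict(clusters)


# Invented example mentions, each given as a list of tokens.
mentions = [
    ["the", "president"],
    ["a", "new", "company"],
    ["the", "former", "president"],
    ["the", "company"],
]

clusters = same_head_clusters(mentions)
```

Because the grouping looks only at the head word, this baseline links any two mentions that share a head, including pairs that do not in fact corefer, which is the failure case the introduction discusses.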
