Scientific report: "Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information"

Marta Recasens
CLiC - University of Barcelona
Gran Via 585, Barcelona, Spain
mrecasens@ub.edu

Eduard Hovy
USC Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA, USA
hovy@isi.edu

Abstract

This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B³, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also expose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.

1 Introduction

The task of coreference resolution, which aims to automatically identify the expressions in a text that refer to the same discourse entity, has been an increasing research topic in NLP ever since MUC-6 made available the first coreferentially annotated corpus in 1995. Most research has centered around the rules by which mentions are allowed to corefer, the features characterizing mention pairs, the algorithms for building coreference chains, and coreference evaluation methods. The surprisingly important role played by different aspects of the corpus, however, is an issue to which little attention has been paid.

We demonstrate the extent to which a system will be evaluated as performing differently depending on parameters such as the corpus language, the way coreference relations are defined in the corresponding coding scheme, and the nature and source of preprocessing information. This paper unpacks these issues by running the same system, a prototype entity-based architecture called CISTELL, on different corpus ...
