Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information

Marta Recasens
CLiC - University of Barcelona
Gran Via 585, Barcelona, Spain
mrecasens@ub.edu

Eduard Hovy
USC Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA, USA
hovy@isi.edu

Abstract

This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also expose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.

1 Introduction

The task of coreference resolution, which aims to automatically identify the expressions in a text that refer to the same discourse entity, has been a growing research topic in NLP ever since MUC-6 made available the first coreferentially annotated corpus in 1995. Most research has centered around the rules by which mentions are allowed to corefer, the features characterizing mention pairs, the algorithms for building coreference chains, and coreference evaluation methods. The surprisingly important role played by different aspects of the corpus, however, is an issue to which little attention has been paid. We demonstrate the extent to which a system will be evaluated as performing differently depending on parameters such as the corpus language, the way coreference relations are defined in the corresponding coding scheme, and the nature and source of preprocessing information. This paper unpacks these issues by running the same system, a prototype entity-based architecture called CISTELL, on different corpus configurations.
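To make the evaluation setting concrete, the following is a minimal sketch of the link-based MUC metric mentioned above (Vilain et al.'s formulation): recall counts, for each gold chain, the links recovered after partitioning the chain by the system's output; precision is the same computation with the roles of gold and system chains swapped. The function names and the representation of chains as sets of mention identifiers are illustrative choices, not part of the paper's system.

```python
def muc_recall(key_chains, response_chains):
    """MUC recall: sum(|K| - |p(K)|) / sum(|K| - 1) over key chains K,
    where p(K) partitions K by the response chains (mentions missing
    from the response each form a singleton partition)."""
    # Map each mention to the id of the response chain containing it.
    resp_of = {}
    for i, chain in enumerate(response_chains):
        for m in chain:
            resp_of[m] = i
    num = den = 0
    for chain in key_chains:
        parts = set()   # distinct response chains overlapping this key chain
        missing = 0     # mentions absent from the response (singleton parts)
        for m in chain:
            if m in resp_of:
                parts.add(resp_of[m])
            else:
                missing += 1
        num += len(chain) - (len(parts) + missing)
        den += len(chain) - 1
    return num / den if den else 0.0

def muc_f1(key_chains, response_chains):
    """MUC F1; precision is recall with key and response roles swapped."""
    r = muc_recall(key_chains, response_chains)
    p = muc_recall(response_chains, key_chains)
    return 2 * p * r / (p + r) if p + r else 0.0
```

For example, with one gold chain {A, B, C, D} and a system output of {A, B} and {C, D}, recall is (4 - 2) / (4 - 1) = 2/3 while precision is 1, illustrating MUC's well-known leniency toward systems that under-merge: no incorrect link is ever counted against the two correct ones.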