Feasibility Study for Ellipsis Resolution in Dialogues by Machine-Learning Technique

YAMAMOTO Kazuhide and SUMITA Eiichiro
ATR Interpreting Telecommunications Research Laboratories
E-mail: yamamoto@itl.atr.co.jp

Abstract

A method for resolving the ellipses that appear in Japanese dialogues is proposed. This method resolves not only the subject ellipsis, but also those in object and other grammatical cases. In this approach, a machine-learning algorithm is used to select the attributes necessary for a resolution. A decision tree is built, and used as the actual ellipsis resolver. The results of blind tests have shown that the proposed method was able to provide a resolution accuracy of 91.7% for indirect objects, and 78.7% for subjects with a verb predicate. By investigating the decision tree, we found that topic-dependent attributes are necessary to obtain high performance resolution, and that indispensable attributes vary according to the grammatical case. The problem of data size relative to decision-tree training is also discussed.

1 Introduction

In machine translation systems, it is necessary to resolve ellipses when the source language doesn't express the subject or other grammatical cases and the target must express it. The problem of ellipsis resolution is also troublesome in information extraction and other natural language processing fields.
Several approaches have been proposed to resolve ellipses, which consist of endophoric (intrasentential or anaphoric) ellipses and exophoric (or extrasentential) ellipses. One of the major theoretically based approaches to endophoric ellipsis utilizes centering theory. However, its application to complex sentences has not been established, because most studies have only investigated its effectiveness with successive simple sentences. Several studies of this problem have been made using the empirical approach. Among them, Murata and Nagao (1997) proposed a scoring approach where each constraint is manually scored with an estimation