ACQUIRING DISAMBIGUATION RULES FROM TEXT

Donald Hindle
AT&T Bell Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974-2070

Abstract

An effective procedure for automatically acquiring a new set of disambiguation rules for an existing deterministic parser on the basis of tagged text is presented. Performance of the automatically acquired rules is much better than that of the existing handwritten disambiguation rules. The success of the acquired rules depends on using the linguistic information encoded in the parser; enhancements to various components of the parser improve the acquired rule set. This work suggests a path toward more robust and comprehensive syntactic analyzers.

1 Introduction

One of the most serious obstacles to developing parsers to effectively analyze unrestricted English is the difficulty of creating sufficiently comprehensive grammars. While it is possible to develop toy grammars for particular theoretically interesting problems, the sheer variety of forms in English, together with the complexity of interaction that arises in a typical syntactic analyzer, makes each enhancement of parser coverage increasingly difficult. There is no question that we are still quite far from syntactic analyzers that even begin to adequately model the grammatical variety of English.
To go beyond the current generation of hand-built grammars for syntactic analysis, it will be necessary to develop means of acquiring some of the needed grammatical information from the regularities that appear in large corpora of naturally occurring text. This paper describes an implemented training procedure for automatically acquiring symbolic rules for a deterministic parser on the basis of unrestricted textual input. In particular, I describe experiments in automatically acquiring a set of rules for disambiguation of lexical category (part of speech). Performance of the acquired rule set is much better than the set of rules for lexical disambiguation written for the parser by hand over a period of