An Improved Parser for Data-Oriented Lexical-Functional Analysis

Rens Bod
Informatics Research Institute, University of Leeds, Leeds LS2 9JT, UK
Institute for Logic, Language and Computation, University of Amsterdam
rens@scs.leeds.ac.uk

Abstract

We present an LFG-DOP parser which uses fragments from LFG-annotated sentences to parse new sentences. Experiments with the Verbmobil and Homecentre corpora show that (1) Viterbi n-best search performs about 100 times faster than Monte Carlo search while both achieve the same accuracy; (2) the DOP hypothesis, which states that parse accuracy increases with increasing fragment size, is confirmed for LFG-DOP; (3) LFG-DOP's relative frequency estimator performs worse than a discounted frequency estimator; and (4) LFG-DOP significantly outperforms Tree-DOP if evaluated on tree structures only.

1 Introduction

Data-Oriented Parsing (DOP) models learn how to provide linguistic representations for an unlimited set of utterances by generalizing from a given corpus of properly annotated exemplars. They operate by decomposing the given representations into arbitrarily large fragments and recomposing those pieces to analyze new utterances.
A probability model is used to choose, from the collection of different fragments of different sizes, those that make up the most appropriate analysis of an utterance. DOP models have been shown to achieve state-of-the-art parsing performance on benchmarks such as the Wall Street Journal corpus (see Bod 2000a). The original DOP model in Bod (1993) was based on utterance analyses represented as surface trees and is equivalent to a Stochastic Tree-Substitution Grammar. But the model has also been applied to several other grammatical frameworks, e.g. Tree-Insertion Grammar (Hoogweg 2000), Tree-Adjoining Grammar (Neumann 1998), Lexical-Functional Grammar (Bod & Kaplan 1998; Cormons 1999), Head-driven Phrase Structure Grammar (Neumann & Flickinger 1999), and Montague Grammar (Bonnema et al. 1997; Bod 1999). Most probability models for DOP use the …
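To make the decompose-and-recompose idea concrete, the following is a minimal Tree-DOP sketch, not the paper's implementation: it extracts all subtree fragments from a toy one-tree corpus (at each node a fragment either stops, leaving a bare frontier nonterminal, or expands with one choice per child) and estimates a fragment's probability by relative frequency among fragments with the same root label. The tree encoding and the toy corpus are illustrative assumptions.

```python
from collections import Counter
from itertools import product

# A tree is a nested tuple (label, child, ...); a bare string is a word.
# Hypothetical one-tree "corpus" for illustration only.
CORPUS = [
    ("S", ("NP", "John"), ("VP", ("V", "sees"), ("NP", "Mary"))),
]

def node_fragments(tree):
    """All DOP fragments rooted at this node: either cut here, leaving a
    bare frontier nonterminal, or expand with one choice per child."""
    if isinstance(tree, str):          # a word cannot be cut
        return [tree]
    options = [node_fragments(c) for c in tree[1:]]
    frags = [(tree[0],)]               # frontier node (depth 0)
    for combo in product(*options):
        frags.append((tree[0],) + combo)
    return frags

def all_nodes(tree):
    """Yield every internal node of the tree, top-down."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from all_nodes(child)

# Count every fragment of depth >= 1 rooted at every node.
counts = Counter()
for t in CORPUS:
    for node in all_nodes(t):
        for f in node_fragments(node):
            if len(f) > 1:
                counts[f] += 1

# Relative frequency estimator: a fragment's count divided by the
# total count of fragments sharing its root label.
root_totals = Counter()
for f, c in counts.items():
    root_totals[f[0]] += c

def rel_freq(fragment):
    return counts[fragment] / root_totals[fragment[0]]

print(sum(counts.values()))            # 17 fragments in this toy corpus
print(rel_freq(("NP", "John")))        # 0.5
```

A derivation then recombines such fragments by substitution at frontier nonterminals, with the derivation's probability the product of its fragments' probabilities; the abstract's point (3) is precisely that this naive relative frequency estimator is outperformed by a discounted one.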