Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ
Tải xuống
be used for efficiency by providing a best-first search heuristic to order the parsing agenda. This paper proposes an agenda-based probabilistic chart parsing algorithm which is both robust and efficient. The algorithm, 7)icky 1, is considered robust because it will potentially generate all constituents produced by a pure bottom-up parser and rank these constituents by likelihood. The efficiency of the algorithm is achieved through a technique called probabilistic prediction, which helps the algorithm avoid worst-case behavior. . | Efficiency Robustness and Accuracy in Picky Chart Parsing David M. Magerman Stanford University Stanford CA 94305 magerm an@cs. st anfor d. edu Carl Weir Paramax Systems Paoli PA 19301 weir@prc.unisys. com ABSTRACT This paper describes Picky a probabilistic agenda-based chart parsing algorithm which uses a technique called probabilistic prediction to predict which grammar rules are likely to lead to an acceptable parse of the input. Using a subopti-mal search method Picky significantly reduces the number of edges produced by CKY-like chart parsing algorithms while maintaining the robustness of pure bottom-up parsers and the accuracy of existing probabilistic parsers. Experiments using Picky demonstrate how probabilistic modelling can impact upon the efficiency robustness and accuracy of a parser. 1. Introduction This paper addresses the question Why should we use probabilistic models in natural language understanding There are many answers to this question only a few of which are regularly addressed in the literature. The first and most common answer concerns ambiguity resolution. A probabilistic model provides a clearly defined preference rule for selecting among grammatical alternatives i.e. the highest probability interpretation is selected . However this use of probabilistic models assumes that we already have efficient methods for generating the alternatives in the first place. While we have O n3 algorithms for determining the grammaticality of a sentence parsing as a component of a natural language understanding tool involves more than simply determining all of the grammatical interpretations of an input. In order for a natural language system to process input efficiently and robustly it must process all intelligible sentences grammatical or not while not significantly reducing the system s efficiency. This observation suggests two other answers to the central question of this paper. Probabilistic models offer a convenient scoring method for partial .