Coarse-to-fine n-best parsing and MaxEnt discriminative reranking

Eugene Charniak and Mark Johnson
Brown Laboratory for Linguistic Information Processing (BLLIP)
Brown University, Providence, RI 02912
{mj,ec}@cs.brown.edu

Abstract

Discriminative reranking is one method for constructing high-performance statistical parsers (Collins, 2000). A discriminative reranker requires a source of candidate parses for each sentence. This paper describes a simple yet novel method for constructing sets of 50-best parses based on a coarse-to-fine generative parser (Charniak, 2000). This method generates 50-best lists that are of substantially higher quality than previously obtainable. We used these parses as the input to a MaxEnt reranker (Johnson et al., 1999; Riezler et al., 2002) that selects the best parse from the set of parses for each sentence, obtaining an f-score of 91.0% on sentences of length 100 or less.

1 Introduction

We describe a reranking parser which uses a regularized MaxEnt reranker to select the best parse from the 50-best parses returned by a generative parsing model.
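The selection step can be illustrated with a minimal sketch. It assumes a log-linear (MaxEnt-style) scoring function in which each candidate parse is scored by a weighted sum of feature values, and the highest-scoring candidate is returned. The feature function, the feature names, the weights, and the two toy candidates are all hypothetical and chosen only for illustration; they are not the feature set or the trained weights of the reranker described here, and the regularized estimation of the weights is omitted.

```python
from typing import Callable, Dict, List, Tuple

# A parse is represented as a bracketed string (an illustrative choice).
Parse = str
# A candidate pairs a parse with the generative parser's log probability.
Candidate = Tuple[Parse, float]
FeatureFn = Callable[[Parse], Dict[str, float]]


def score(parse: Parse, log_prob: float,
          weights: Dict[str, float], feature_fn: FeatureFn) -> float:
    """Weighted sum of feature values; the log probability is one feature."""
    feats = dict(feature_fn(parse))
    feats["log_prob"] = log_prob
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def rerank(nbest: List[Candidate],
           weights: Dict[str, float], feature_fn: FeatureFn) -> Parse:
    """Return the highest-scoring parse from an n-best list.

    The list is short, so every candidate is scored directly and
    no dynamic programming is needed.
    """
    best_parse, _ = max(
        nbest, key=lambda c: score(c[0], c[1], weights, feature_fn))
    return best_parse


def toy_features(parse: Parse) -> Dict[str, float]:
    """Hypothetical features: arbitrary functions of the whole tree."""
    return {
        "num_NP": float(parse.count("(NP")),
        "num_brackets": float(parse.count("(")),
    }


# Two made-up candidates with made-up log probabilities.
nbest = [
    ("(S (NP I) (VP saw (NP her duck)))", -10.2),
    ("(S (NP I) (VP saw (NP her) (VP duck)))", -10.0),
]
# Hypothetical weights; in practice these would be estimated from data.
weights = {"log_prob": 1.0, "num_NP": 0.0, "num_brackets": -0.5}

print(rerank(nbest, weights, toy_features))
```

With these made-up weights the reranker prefers the first candidate even though the generative model assigns the second a higher log probability, which is the situation reranking is meant to exploit.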
The 50-best parser is a probabilistic parser that on its own produces high quality parses: the maximum probability parse trees (according to the parser's model) have an f-score of 0.897 on section 23 of the Penn Treebank (Charniak, 2000), which is still state-of-the-art. However, the 50 best (i.e., the 50 highest probability) parses of a sentence often contain considerably better parses (in terms of f-score); this paper describes a 50-best parsing algorithm with an oracle f-score of 96.8% on the same data. The reranker attempts to select the best parse for a sentence from the 50-best list of possible parses for the sentence. Because the reranker only has to consider a relatively small number of parses per sentence, it is not necessary to use dynamic programming, which permits the features to be essentially arbitrary functions of the parse trees. While our reranker does not achieve anything like the oracle f-score, the parses