Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ
Tải xuống
We revisit the idea of history-based parsing, and present a history-based parsing framework that strives to be simple, general, and flexible. We also provide a decoder for this probability model that is linear-space, optimal, and anytime. A parser based on this framework, when evaluated on Section 23 of the Penn Treebank, compares favorably with other stateof-the-art approaches, in terms of both accuracy and speed. NP DT that NN. | Exploring the Potential of Intractable Parsers Mark Hopkins Dept. of Computational Linguistics Saarland University Saarbrucken Germany mhopkins@coli.uni-sb.de Jonas Kuhn Dept. of Computational Linguistics Saarland University Saarbrucken Germany jonask@coli.uni-sb.de Abstract We revisit the idea of history-based parsing and present a history-based parsing framework that strives to be simple general and flexible. We also provide a decoder for this probability model that is linear-space optimal and anytime. A parser based on this framework when evaluated on Section 23 of the Penn Treebank compares favorably with other state-of-the-art approaches in terms of both accuracy and speed. 1 Introduction Much of the current research into probabilistic parsing is founded on probabilistic context-free grammars PCFGs Collins 1996 Charniak 1997 Collins 1999 Charniak 2000 Charniak 2001 Klein and Manning 2003 . For instance consider the parse tree in Figure 1. One way to decompose this parse tree is to view it as a sequence of applications of CFG rules. For this particular tree we could view it as the application of rule NP NP PP followed by rule NP DT NN followed by rule DT that and so forth. Hence instead of analyzing P tree we deal with the more modular P NP NP PP NP DT NN DT that NN money PP IN NP IN in NP DT NN DT the NN market Obviously this joint distribution is just as difficult to assess and compute with as P tree . However there exist cubic-time dynamic programming algorithms to find the most likely parse if we assume that all CFG rule applications are marginally NP NP PP DT NN IN NP that money in DT W the market Figure 1 Example parse tree. independent of one another. The problem of course with this simplification is that although it is computationally attractive it is usually too strong of an independence assumption. To mitigate this loss of context without sacrificing algorithmic tractability typically researchers annotate the nodes of the parse tree with contextual .