Scientific report: "Theoretical Evaluation of Estimation Methods for Data-Oriented Parsing"

Willem Zuidema
Institute for Logic, Language and Computation, University of Amsterdam
Plantage Muidergracht 24, 1018 TV Amsterdam, the Netherlands
jzuidema@

Abstract

We analyze estimation methods for Data-Oriented Parsing, as well as the theoretical criteria used to evaluate them. We show that all current estimation methods are inconsistent in the "weight-distribution test", and argue that these results force us to rethink both the methods proposed and the criteria used.

1 Introduction

Stochastic Tree Substitution Grammars (henceforth, STSGs) are a simple generalization of Probabilistic Context-Free Grammars, where the productive elements are not rewrite rules but elementary trees of arbitrary size. The increased flexibility allows STSGs to model a variety of syntactic and statistical dependencies, using relatively complex primitives but just a single and extremely simple global rule: substitution. STSGs can be seen as Stochastic Tree Adjoining Grammars without the adjunction operation.

STSGs are the underlying formalism of most instantiations of an approach to statistical parsing known as Data-Oriented Parsing (Scha, 1990; Bod, 1998). In this approach, the subtrees of the trees in a tree bank are used as elementary trees of the grammar. In most DOP models, the grammar used is an STSG with, in principle, all subtrees [1] of the trees in the tree bank as elementary trees. For disambiguation, the best parse tree is taken to be the most probable parse according to the weights of the grammar.
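To make the notion of "all subtrees of a tree bank tree" concrete, the following is a minimal sketch in Python, not code from the paper. It assumes the usual DOP reading of a subtree: every node either keeps all of its daughters or none of them. The tuple encoding of trees, the function names, the toy sentence, and the restriction to fragments of depth at least one are all illustrative assumptions.

```python
from itertools import product

# Assumed encoding (not from the paper): a tree is a (label, children) pair;
# a word or frontier node has children == ().
def node(label, *children):
    return (label, tuple(children))

def fragments_rooted_at(tree):
    """All subtrees of depth >= 1 whose root is the root of `tree`."""
    label, children = tree
    if not children:
        return []  # a bare node yields no fragment of depth >= 1
    options = []
    for child in children:
        # Each daughter is either cut off (kept as a bare frontier node)
        # or expanded with one of its own fragments.
        options.append([(child[0], ())] + fragments_rooted_at(child))
    # All daughters are always present; only their expansion varies.
    return [(label, combo) for combo in product(*options)]

def all_fragments(tree):
    """Fragments rooted at every node of `tree`, visited in pre-order."""
    result = fragments_rooted_at(tree)
    for child in tree[1]:
        result.extend(all_fragments(child))
    return result

# Hypothetical toy parse tree for "Mary saw John".
toy = node("S",
           node("NP", node("Mary")),
           node("VP", node("V", node("saw")), node("NP", node("John"))))

fragments = all_fragments(toy)
print(len(fragments))  # 17 fragments for this toy tree
```

Even this five-rule tree yields 17 fragments, 10 of them rooted at S. The count multiplies at every branching node, which is why the text says "in principle" all subtrees are used.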
Several methods have been proposed to decide on the weights based on observed tree frequencies in a tree bank. The first such method is now known as DOP1 (Bod, 1993). In combination with some heuristic constraints on the allowed subtrees, it has been remarkably successful on

[1] A subtree t' of a parse tree t is a tree such that every node i' in t' equals a node i in t, and i' either has no daughters or the same daughter nodes as i.
