Scientific report: "A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing"

Michael Auli
School of Informatics, University of Edinburgh

Adam Lopez
HLTCOE, Johns Hopkins University
alopez@

Abstract

Via an oracle experiment, we show that the upper bound on accuracy of a CCG parser is significantly lowered when its search space is pruned using a supertagger, though the supertagger also prunes many bad parses. Inspired by this analysis, we design a single model with both supertagging and parsing features, rather than separating them into distinct models chained together in a pipeline. To overcome the resulting increase in complexity, we experiment with both belief propagation and dual decomposition approaches to inference, the first empirical comparison of these algorithms that we are aware of on a structured natural language processing problem. On CCGbank we achieve a labelled dependency F-measure of on gold POS tags and on automatic part-of-speech tags, the best reported results for this task.

1 Introduction

Accurate and efficient parsing of Combinatory Categorial Grammar (CCG; Steedman, 2000) is a longstanding problem in computational linguistics, due to the complexities associated with its mild context sensitivity. Even for practical CCGs that are strongly context-free (Fowler and Penn, 2010), parsing is much harder than with Penn Treebank-style context-free grammars, because the vast number of nonterminal categories leads to larger grammar constants. Where a typical Penn Treebank grammar may have fewer than 100 nonterminals (Hockenmaier and Steedman, 2002), we found that a CCG grammar derived from CCGbank contained over 1500. The same grammar assigns an average of 22 lexical categories per word (Clark and Curran, 2004a), resulting in an enormous space of possible derivations.

The most successful approach to CCG parsing is based on a pipeline strategy (§2). First, we tag or multitag each word of the sentence with a lexical
