Scientific report: "Robust Conversion of CCG Derivations to Phrase Structure Trees"

Jonathan K. Kummerfeld, Dan Klein, James R. Curran
Computer Science Division, University of California, Berkeley, Berkeley, CA 94720, USA (jkk, klein @)
ə-lab, School of IT, University of Sydney, Sydney, NSW 2006, Australia (james@)

Abstract

We propose an improved, bottom-up method for converting CCG derivations into PTB-style phrase structure trees. In contrast with past work (Clark and Curran, 2009), which used simple transductions on category pairs, our approach uses richer transductions attached to single categories. Our conversion preserves more sentences under round-trip conversion (… vs. …) and is more robust. In particular, unlike past methods, ours does not require ad-hoc rules over non-local features, and so can be easily integrated into a parser.

1 Introduction

Converting the Penn Treebank (PTB, Marcus et al., 1993) to other formalisms, such as HPSG (Miyao et al., 2004), LFG (Cahill et al., 2008), LTAG (Xia, 1999), and CCG (Hockenmaier, 2003), is a complex process that renders linguistic phenomena in formalism-specific ways. Tools for reversing these conversions are desirable for downstream parser use and parser comparison. However, reversing conversions is difficult, as corpus conversions may lose information or smooth over PTB inconsistencies.

Clark and Curran (2009) developed a CCG to PTB conversion that treats the CCG derivation as a phrase structure tree and applies hand-crafted rules to every pair of categories that combine in the derivation. Because their approach does not exploit the generalisations inherent in the CCG formalism, they must resort to ad-hoc rules over non-local features of the CCG constituents being combined (when a fixed pair of CCG categories correspond to multiple PTB structures). Even with such rules, they correctly convert only … of gold CCGbank derivations.

Our conversion assigns a set of bracket instructions to each word based on its CCG category, then follows the CCG ...
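To make the contrast concrete, the pair-based style of conversion attributed to Clark and Curran (2009) can be sketched roughly as follows. This is a toy illustration, not their implementation: the `Node` class, the `PAIR_RULES` table and its two entries, and the `convert` function are all hypothetical names invented here, and the output is simplified (for instance, no NP node is built over a single noun).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of a CCG derivation, read as a binary phrase structure tree."""
    category: str                      # CCG category, e.g. "S[dcl]\\NP"
    word: Optional[str] = None         # set only on leaves
    pos: Optional[str] = None          # PTB part-of-speech tag, leaves only
    children: List["Node"] = field(default_factory=list)


# Hypothetical rule table: (left category, right category) -> PTB label.
# The entries are invented for this example, not taken from any system.
PAIR_RULES: Dict[Tuple[str, str], str] = {
    ("(S[dcl]\\NP)/NP", "NP"): "VP",
    ("NP", "S[dcl]\\NP"): "S",
}


def convert(node: Node) -> str:
    """Return a PTB-style bracketing for the derivation rooted at node."""
    if node.word is not None:
        return f"({node.pos} {node.word})"
    left, right = node.children
    parts = f"{convert(left)} {convert(right)}"
    label = PAIR_RULES.get((left.category, right.category))
    # With no rule for this pair, add no bracket and pass the children up.
    return parts if label is None else f"({label} {parts})"


# Derivation for "John saw Mary".
tree = Node("S[dcl]", children=[
    Node("NP", word="John", pos="NNP"),
    Node("S[dcl]\\NP", children=[
        Node("(S[dcl]\\NP)/NP", word="saw", pos="VBD"),
        Node("NP", word="Mary", pos="NNP"),
    ]),
])

result = convert(tree)
print(result)  # (S (NNP John) (VP (VBD saw) (NNP Mary)))
```

Because the lookup is keyed only on the two categories being combined, each pair can map to just one structure. That is the limitation described above: when the same pair corresponds to several PTB structures, a table like this has to be supplemented with rules that inspect context beyond the pair.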
