Scientific report: "Optimal Constituent Alignment with Edge Covers for Semantic Projection"

Sebastian Padó, Computational Linguistics, Saarland University, Saarbrücken, Germany, pado@
Mirella Lapata, School of Informatics, University of Edinburgh, Edinburgh, UK, mlap@

Abstract

Given a parallel corpus, semantic projection attempts to transfer semantic role annotations from one language to another, typically by exploiting word alignments. In this paper, we present an improved method for obtaining constituent alignments between parallel sentences to guide the role projection task. Our extensions are twofold: (a) we model constituent alignment as minimum weight edge covers in a bipartite graph, which allows us to find a globally optimal solution efficiently; (b) we propose tree pruning as a promising strategy for reducing alignment noise. Experimental results on an English-German parallel corpus demonstrate improvements over state-of-the-art models.

1 Introduction

Recent years have witnessed increased interest in data-driven methods for many natural language processing (NLP) tasks, ranging from part-of-speech tagging to parsing and semantic role labelling. The success of these methods is due partly to the availability of large amounts of training data annotated with rich linguistic information. Unfortunately, such resources are largely absent for almost all languages except English.

Given the data requirements for supervised learning and the current paucity of suitable data for many languages, methods for generating annotations (semi-)automatically are becoming increasingly popular. Annotation projection tackles this problem by leveraging parallel corpora and the high-accuracy tools (e.g., parsers, taggers) available for a few languages. Specifically, through the use of word alignments, annotations are transferred from resource-rich languages onto low-density ones. The projection process can be decomposed into three steps: (a) determining the units of projection; these are typically words but can ...
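
The word-level transfer described above can be sketched as follows. This is a minimal illustration of projecting labels through word alignments, not the constituent-based method the paper proposes. The function name `project_roles`, the toy sentence pair, the role labels, and the alignment links are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def project_roles(source_roles, alignment):
    """Project word-level role labels from source tokens onto target tokens.

    source_roles: dict mapping a source token index to its role label.
    alignment: iterable of (source_index, target_index) word-alignment links.
    Returns a dict mapping each target token index that received at least
    one label to the sorted list of labels projected onto it.
    """
    projected = defaultdict(set)
    for src, tgt in alignment:
        # Only source tokens that carry a role contribute a label.
        if src in source_roles:
            projected[tgt].add(source_roles[src])
    return {tgt: sorted(labels) for tgt, labels in projected.items()}


# Hypothetical toy data: an invented English-German sentence pair.
source_tokens = ["Kim", "promised", "to", "come"]
target_tokens = ["Kim", "versprach", "zu", "kommen"]
source_roles = {0: "Speaker", 2: "Message", 3: "Message"}
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]

target_roles = project_roles(source_roles, alignment)
for idx, labels in sorted(target_roles.items()):
    print(target_tokens[idx], labels)
```

In this sketch a target word linked to several role-bearing source words receives all of their labels, and an unaligned or wrongly aligned word receives none or the wrong ones, so the projected annotation depends directly on the quality of the word alignment.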
