Scientific report: "Empirical Lower Bounds on the Complexity of Translational Equivalence ∗"

Benjamin Wellington, Computer Science Dept., New York University, New York, NY 10003, lastname @
Sonjia Waxmonsky, Computer Science Dept., University of Chicago, Chicago, IL 60637, wax@
I. Dan Melamed, Computer Science Dept., New York University, New York, NY 10003, lastname @

Abstract

This paper describes a study of the patterns of translational equivalence exhibited by a variety of bitexts. The study found that the complexity of these patterns in every bitext was higher than suggested in the literature. These findings shed new light on why “syntactic” constraints have not helped to improve statistical translation models, including finite-state phrase-based models, tree-to-string models, and tree-to-tree models. The paper also presents evidence that inversion transduction grammars cannot generate some translational equivalence relations, even in relatively simple real bitexts in syntactically similar languages with rigid word order. Instructions for replicating our experiments are at http GenPar ACL06

1 Introduction

Translational equivalence is a mathematical relation that holds between linguistic expressions with the same meaning. The most common explicit representations of this relation are word alignments between sentences that are translations of each other. The complexity of a given word alignment can be measured by the difficulty of decomposing it into its atomic units under certain constraints detailed in Section 2.
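To make the idea of decomposing a word alignment concrete, here is a minimal Python sketch. It is not the measure defined in Section 2 of the paper, which is not reproduced here. It assumes the simplest case, a one-to-one alignment given as (source index, target index) links, and asks whether the alignment can be built up by repeatedly merging two adjacent spans, which is the kind of binary decomposition an inversion transduction grammar allows. The function names `alignment_to_permutation` and `is_binarizable` are hypothetical and chosen only for this illustration.

```python
def alignment_to_permutation(links):
    """Turn one-to-one links (src, tgt) into target ranks ordered by source position."""
    srcs = [s for s, _ in links]
    tgts = [t for _, t in links]
    if len(set(srcs)) != len(links) or len(set(tgts)) != len(links):
        # Assumption of this sketch: every word takes part in at most one link.
        raise ValueError("this sketch handles one-to-one alignments only")
    rank = {t: r for r, t in enumerate(sorted(tgts))}
    return [rank[t] for _, t in sorted(links)]


def is_binarizable(perm):
    """Return True if perm can be built by merging adjacent spans two at a time.

    Shift-reduce check: each stack entry is a (lo, hi) block of target
    positions covered by a contiguous run of source words. Two neighbouring
    blocks are merged when their target ranges are also adjacent, in either
    straight or inverted order.
    """
    stack = []
    for p in perm:
        stack.append((p, p))
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1


# A monotone alignment and a pair of local swaps both decompose into binary steps.
monotone = [(0, 0), (1, 1), (2, 2), (3, 3)]
swaps = [(0, 1), (1, 0), (2, 3), (3, 2)]
# An "inside-out" pattern does not: no two adjacent source words map to adjacent targets.
inside_out = [(0, 1), (1, 3), (2, 0), (3, 2)]

for name, links in [("monotone", monotone), ("swaps", swaps), ("inside_out", inside_out)]:
    print(name, is_binarizable(alignment_to_permutation(links)))
```

In this toy setting, an alignment that fails the check needs at least one atomic unit covering more than two spans at once, which is one informal way to read "harder to decompose". Real alignments with one-to-many links or unaligned words would need a richer representation than the permutation assumed here.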
This paper describes a study of the distribution of alignment complexity in a variety of bitexts. The study considered word alignments both in isolation and in combination with independently generated parse trees for one or both sentences in each pair. Thus the study

∗ Thanks to David Chiang, Liang Huang, the anonymous reviewers, and members of the NYU Proteus Project for helpful feedback. This research was supported by NSF grants 0238406 and .
