Scientific report: "Soft Syntactic Constraints for Hierarchical Phrased-Based Translation"

Yuval Marton and Philip Resnik
Department of Linguistics and the Laboratory for Computational Linguistics and Information Processing (CLIP) at the Institute for Advanced Computer Studies (UMIACS)
University of Maryland, College Park, MD 20742-7505, USA
ymarton resnik @t

Abstract

In adding syntax to statistical MT, there is a tradeoff between taking advantage of linguistic analysis and allowing the model to exploit linguistically unmotivated mappings learned from parallel training data. A number of previous efforts have tackled this tradeoff by starting with a commitment to linguistically motivated analyses and then finding appropriate ways to soften that commitment. We present an approach that explores the tradeoff from the other direction, starting with a context-free translation model learned directly from aligned parallel text, and then adding soft constituent-level constraints based on parses of the source language. We obtain substantial improvements in performance for translation from Chinese and Arabic to English.

1 Introduction

The statistical revolution in machine translation, beginning with Brown et al. (1993) in the early 1990s, replaced an earlier era of detailed language analysis with automatic learning of shallow source-target mappings from large parallel corpora. Over the last several years, however, the pendulum has begun to swing back in the other direction, with researchers exploring a variety of statistical models that take advantage of source- and particularly target-language syntactic analysis (e.g., Cowan et al., 2006; Zollmann and Venugopal, 2006; Marcu et al., 2006; Galley et al., 2006; and numerous others).

Chiang (2005) distinguishes statistical MT approaches that are syntactic in a formal sense, going beyond the finite-state underpinnings of phrase-based models, from approaches that are syntactic in a linguistic sense, taking advantage of a priori language knowledge in the form
