TAILIEUCHUNG - Scientific report: "Scalable Inference and Training of Context-Rich Syntactic Translation Models"

Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer

Columbia University, Dept. of Computer Science, New York, NY 10027 (galley@)
University of Southern California, Information Sciences Institute, Marina del Rey, CA 90292 (graehl, knight, marcu, sdeneefe @)
Language Weaver Inc., 4640 Admiralty Way, Marina del Rey, CA 90292 (wwang@, thayer@)

Abstract

Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley et al., 2004) from aligned tree-string pairs, and present two main extensions of their approach: first, instead of merely computing a single derivation that minimally explains a sentence pair, we construct a large number of derivations that include contextually richer rules, and account for multiple interpretations of unaligned words. Second, we propose probability estimates and a training procedure for weighting these rules. We contrast different approaches on real examples, show that our estimates based on multiple derivations favor phrasal re-orderings that are linguistically better motivated, and establish that our larger rules provide a BLEU point increase over minimal rules.

1 Introduction

While syntactic approaches seek to remedy word-ordering problems common to statistical machine translation (SMT) systems, many of the earlier models, particularly child re-ordering models, fail to account for human translation behavior. Galley et al. (2004) alleviate this modeling problem and present a method for acquiring millions of syntactic transfer rules from bilingual corpora, which we review below. Here, we make the following new contributions: (1) we show how to acquire larger rules that ...
