TAILIEUCHUNG - Scientific report: "A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation"

Chi-Ho Li, Dongdong Zhang, Mu Li, Ming Zhou
Microsoft Research Asia, Beijing, China
chl dozhang@ muli mingzhou@

Minghui Li, Yi Guan
Harbin Institute of Technology, Harbin, China
mhli@ guanyi@

Abstract

Inspired by previous preprocessing approaches to SMT, this paper proposes a novel, probabilistic approach to reordering which combines the merits of syntax and phrase-based SMT. Given a source sentence and its parse tree, our method generates, by tree operations, an n-best list of reordered inputs, which are then fed to a standard phrase-based decoder to produce the optimal translation. Experiments show that, for the NIST MT-05 task of Chinese-to-English translation, the proposal leads to a BLEU improvement of [...].

1 Introduction

The phrase-based approach has been considered the default strategy for Statistical Machine Translation (SMT) in recent years. It is widely known that the phrase-based approach is powerful in local lexical choice and in word reordering within a short distance. However, long-distance reordering is problematic in phrase-based SMT. For example, the distance-based reordering model (Koehn et al., 2003) allows a decoder to translate in non-monotonous order, under the constraint that the distance between two phrases translated consecutively does not exceed a limit known as the distortion limit.
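The distortion limit constraint described above can be illustrated with a minimal sketch. It assumes one common definition of the jump distance between consecutively translated phrases, |start of next phrase - end of previous phrase - 1|, measured in source word positions. The names `Span`, `distortion`, and `within_limit` are hypothetical and chosen for illustration only; they are not taken from the paper or from any decoder.

```python
from typing import List, Tuple

# Hypothetical representation: a phrase is an inclusive (start, end) pair
# of source word positions, counted from 0.
Span = Tuple[int, int]


def distortion(prev: Span, nxt: Span) -> int:
    """Jump distance between two consecutively translated source phrases.

    Assumed definition: |start(next) - end(prev) - 1|, which is 0 when
    the next phrase starts right after the previous one (monotone order).
    """
    return abs(nxt[0] - prev[1] - 1)


def within_limit(order: List[Span], limit: int) -> bool:
    """Check whether a translation order respects the distortion limit."""
    prev: Span = (-1, -1)  # virtual phrase ending just before the first word
    for span in order:
        if distortion(prev, span) > limit:
            return False
        prev = span
    return True


if __name__ == "__main__":
    monotone = [(0, 1), (2, 4), (5, 5)]
    long_jump = [(0, 1), (8, 9), (2, 7)]
    print(within_limit(monotone, 4))
    print(within_limit(long_jump, 4))
```

Under this assumed definition, a monotone order always has distortion 0, while an order that translates a phrase six words ahead and then jumps back is rejected when the limit is 4. This is the kind of long-distance reordering a phrase-based decoder with a small distortion limit cannot produce on its own.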
In theory, the distortion limit can be assigned a very large value so that all possible reorderings are allowed, yet in practice it is observed that too high a distortion limit harms not only efficiency but also translation performance (Koehn et al., 2005). In our own experiment setting, the best distortion limit for Chinese-English translation is 4. However, some ideal translations exhibit reorderings longer than this distortion limit. Consider the sentence pair in the NIST MT-2005 test set shown in Figure 1(a): after translating the word V mend the decoder ...
