Scientific report: "Forest Rescoring: Faster Decoding with Integrated Language Models ∗"

Liang Huang
University of Pennsylvania, Philadelphia, PA 19104
lhuang3@

David Chiang
USC Information Sciences Institute, Marina del Rey, CA 90292
chiang@

Abstract

Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model, which is essential for achieving good translation quality. We develop faster approaches for this problem based on k-best parsing algorithms and demonstrate their effectiveness on both phrase-based and syntax-based MT systems. In both cases, our methods achieve significant speed improvements, often by more than a factor of ten, over the conventional beam-search method at the same levels of search error and translation accuracy.

1 Introduction

Recent efforts in statistical machine translation (MT) have seen promising improvements in output quality, especially the phrase-based models (Och and Ney, 2004) and syntax-based models (Chiang, 2005; Galley et al., 2006). However, efficient decoding under these paradigms, especially with integrated language models (LMs), remains a difficult problem. Part of the complexity arises from the expressive power of the translation model: for example, a phrase- or word-based model with full reordering has exponential complexity (Knight, 1999). The language model also, if fully integrated into the decoder, introduces an expensive overhead for maintaining target-language boundary words for dynamic programming (Wu, 1996; Och and Ney, 2004). In practice, one must prune the search space aggressively to reduce it to a reasonable size. A much simpler alternative method to incorporate the LM is .

∗ The authors would like to thank Dan Gildea, Jonathan Graehl, Mark Johnson, Kevin Knight, Daniel Marcu, Bob Moore, and Hao Zhang. L. H. was partially supported by NSF ITR grants IIS-0428020 while visiting USC ISI and EIA-0205456 at UPenn. D. C. was partially supported under the GALE DARPA program, contract HR0011-06-C-0022.
