Scientific report: "Phrase-Based Statistical Machine Translation as a Traveling Salesman Problem"

Mikhail Zaslavskiy, Mines ParisTech, Institut Curie, 77305 Fontainebleau, France
Marc Dymetman and Nicola Cancedda, Xerox Research Centre Europe, 38240 Meylan, France

Abstract

An efficient decoding algorithm is a crucial element of any statistical machine translation system. Some researchers have noted certain similarities between SMT decoding and the famous Traveling Salesman Problem; in particular, (Knight, 1999) has shown that any TSP instance can be mapped to a sub-case of a word-based SMT model, demonstrating NP-hardness of the decoding task. In this paper, we focus on the reverse mapping, showing that any phrase-based SMT decoding problem can be directly reformulated as a TSP. The transformation is very natural, deepens our understanding of the decoding problem, and allows direct use of any of the powerful existing TSP solvers for SMT decoding. We test our approach on three datasets, and compare a TSP-based decoder to the popular beam-search algorithm. In all cases, our method provides competitive or better performance.

1 Introduction

Phrase-based systems (Koehn et al., 2003) are probably the most widespread class of Statistical Machine Translation systems, and arguably one of the most successful.
They use aligned sequences of words, called biphrases, as building blocks for translations, and score alternative candidate translations for the same source sentence based on a log-linear model of the conditional probability of target sentences given the source sentence:

$$p(T, a \mid S) = \frac{1}{Z_S} \exp \sum_k \lambda_k h_k(S, a, T) \qquad (1)$$

where the $h_k$ are features, that is, functions of the source string $S$, of the target string $T$, and of the alignment $a$, where the alignment is a representation of the sequence of biphrases that were used in order to build $T$ from $S$. The $\lambda_k$'s are weights and $Z_S$ is a normalization factor that guarantees that $p$ is a proper conditional probability distribution.

This work was conducted during an internship at XRCE.
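The log-linear model above can be illustrated with a small numerical sketch. It is not the authors' decoder: it assumes the set of candidate (T, a) pairs is a short explicit list, so that the normalization factor can be computed by direct summation, and the function name `log_linear_probs`, the three feature values per candidate, and the weights are all hypothetical.

```python
import math


def log_linear_probs(feature_vectors, weights):
    """Return p(T, a | S) for each candidate under a log-linear model.

    feature_vectors: one list of feature values h_k(S, a, T) per candidate.
    weights: the lambda_k, one per feature.
    Assumption: the candidate list stands in for all (T, a) pairs, so the
    normalization factor Z_S is simply the sum over this list.
    """
    # Weighted feature sum for each candidate: sum_k lambda_k * h_k.
    scores = [
        sum(w * h for w, h in zip(weights, feats))
        for feats in feature_vectors
    ]
    # Subtract the maximum before exponentiating for numerical stability;
    # the shift cancels out after normalization.
    top = max(scores)
    unnormalized = [math.exp(s - top) for s in scores]
    z_s = sum(unnormalized)
    return [u / z_s for u in unnormalized]


# Hypothetical feature values for three candidate translations
# (for example two log-probability features and a distortion penalty).
candidates = [
    [-2.0, -1.5, 0.0],
    [-2.5, -1.0, -1.0],
    [-4.0, -3.0, -2.0],
]
weights = [1.0, 0.5, 0.2]  # hypothetical lambda_k

probs = log_linear_probs(candidates, weights)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, probs)
```

Because the normalization factor is the same for every candidate of a given source sentence, picking the highest-probability candidate only requires comparing the weighted feature sums; the normalization matters only when actual probabilities are needed.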
