Scientific report: "Machine Translation System Combination using ITG-based Alignments∗"

Damianos Karakos, Jason Eisner, Sanjeev Khudanpur, Markus Dreyer
Center for Language and Speech Processing
Johns Hopkins University, Baltimore, MD 21218
damianos eisner khudanpur dreyer @

Abstract

Given several systems' automatic translations of the same sentence, we show how to combine them into a confusion network, whose various paths represent composite translations that could be considered in a subsequent rescoring step. We build our confusion networks using the method of Rosti et al. (2007), but, instead of forming alignments using the tercom script (Snover et al., 2006), we create alignments that minimize invWER (Leusch et al., 2003), a form of edit distance that permits properly nested block movements of substrings. Oracle experiments with Chinese newswire and weblog translations show that our confusion networks contain paths which are significantly better (in terms of BLEU and TER) than those in tercom-based confusion networks.

1 Introduction

Large improvements in machine translation (MT) may result from combining different approaches to MT with mutually complementary strengths. System-level combination of translation outputs is a promising path towards such improvements. Yet there are some significant hurdles in this path. One must somehow align the multiple outputs to identify where different hypotheses reinforce each other and where they offer alternatives. One must then use this alignment to hypothesize a set of new composite translations, and select the best composite hypothesis from this set. The alignment step is difficult because different MT approaches usually reorder the translated words differently. Training the selection step is difficult because identifying the best hypothesis relative to a known reference ...

∗ This work was partially supported by the DARPA GALE program (Contract No. HR0011-06-2-0001). Also, we would like to thank the IBM Rosetta team for the availability of several MT system outputs.
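To make the notion of an edit distance with properly nested block movements concrete, the following is a minimal sketch, not the authors' implementation. It assumes that insertions, deletions and substitutions each cost 1 and that every block inversion (swapping two adjacent blocks) costs `inversion_cost`, which defaults to 1 here as an assumption. The function name `inv_wer` and the exhaustive memoized recursion over span pairs are illustrative choices; the recursion is only practical for short sentences.

```python
from functools import lru_cache


def inv_wer(hyp: str, ref: str, inversion_cost: int = 1) -> int:
    """Word-level edit distance that also allows nested block inversions.

    Illustrative sketch: unit costs and the default inversion cost are
    assumptions, not values taken from the paper.
    """
    x, y = hyp.split(), ref.split()

    @lru_cache(maxsize=None)
    def cost(i: int, j: int, k: int, l: int) -> int:
        # Cost of aligning x[i:j] with y[k:l].
        if i == j:                      # nothing left in x: insert the rest of y
            return l - k
        if k == l:                      # nothing left in y: delete the rest of x
            return j - i
        if j - i == 1 and l - k == 1:   # single word pair: match or substitute
            return 0 if x[i] == y[k] else 1
        best = float("inf")
        for p in range(i, j + 1):
            for q in range(k, l + 1):
                # Straight split: x[i:p]~y[k:q] followed by x[p:j]~y[q:l].
                # Skip splits that would reproduce the whole span pair.
                if (p, q) not in ((i, k), (j, l)):
                    best = min(best, cost(i, p, k, q) + cost(p, j, q, l))
                # Inverted split: x[i:p]~y[q:l] and x[p:j]~y[k:q],
                # i.e. the two blocks change places, at an extra cost.
                if (p, q) not in ((i, l), (j, k)):
                    best = min(
                        best,
                        cost(i, p, q, l) + cost(p, j, k, q) + inversion_cost,
                    )
        return best

    return cost(0, len(x), 0, len(y))


# Swapping the two halves needs one inversion; plain edit distance needs 4 edits.
print(inv_wer("a b c d", "c d a b"))                     # 1
print(inv_wer("a b c d", "c d a b", inversion_cost=10))  # 4
```

With a very high inversion cost the recursion falls back to ordinary word-level edit distance, which shows how permitting block movements lets reordered but otherwise identical hypotheses align cheaply.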
