Optimizing Word Alignment Combination For Phrase Table Training

Yonggang Deng and Bowen Zhou
IBM T.J. Watson Research Center
Yorktown Heights, NY 10598, USA
ydeng zhou @

Abstract

Combining word alignments trained in two translation directions has mostly relied on heuristics that are not directly motivated by intended applications. We propose a novel method that performs combination as an optimization process. Our algorithm explicitly maximizes the effectiveness function with greedy search for phrase table training or synchronized grammar extraction. Experimental results show that the proposed method leads to significantly better translation quality than existing methods. Analysis suggests that this simple approach is able to maintain accuracy while maximizing coverage.

1 Introduction

Word alignment is the process of identifying word-to-word links between parallel sentences. It is a fundamental and often necessary step before linguistic knowledge acquisition, such as training a phrase translation table in a phrasal machine translation (MT) system (Koehn et al., 2003) or extracting hierarchical phrase rules or synchronized grammars in a syntax-based translation framework.

Most word alignment models distinguish translation direction in deriving the word alignment matrix. Given a parallel sentence, word alignments in two directions are established first, and then they are combined as a knowledge source for phrase training or rule extraction. This process is also called symmetrization.
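To make the idea of symmetrization concrete, the following is a minimal sketch of the two simplest heuristic combinations, intersection and union, of two directional alignments. It is not the optimization method proposed in this paper. The function name `symmetrize`, the link representation as index pairs, and the toy alignments are assumptions made purely for illustration.

```python
from typing import Set, Tuple

# A link is a pair of word positions. Illustrative representation only.
Link = Tuple[int, int]


def symmetrize(src2tgt: Set[Link], tgt2src: Set[Link]) -> Tuple[Set[Link], Set[Link]]:
    """Combine two directional alignments (hypothetical helper).

    src2tgt holds (source index, target index) links.
    tgt2src holds (target index, source index) links, so it is flipped
    first to put both sets in the same (source, target) orientation.
    Returns (intersection, union).
    """
    flipped = {(s, t) for (t, s) in tgt2src}
    return src2tgt & flipped, src2tgt | flipped


# Toy example: the two directions agree on two links and disagree on the rest.
src2tgt = {(0, 0), (1, 1), (2, 1)}
tgt2src = {(0, 0), (1, 1), (2, 3)}  # stored as (target, source)
intersection, union = symmetrize(src2tgt, tgt2src)
```

The intersection keeps only links that both directions agree on, so it tends to favor precision; the union keeps every link proposed by either direction, so it tends to favor recall. Commonly used heuristics typically pick a link set somewhere between these two extremes.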
It is a common practice in most state-of-the-art MT systems. Widely used alignment models, such as the IBM Model series (Brown et al., 1993) and HMM, all assume one-to-many alignments. Since many-to-many links are commonly observed in natural language, symmetrization is able to make up for this modeling limitation. On the other hand, combining two directional alignments can in practice lead to improved performance. Symmetrization can also be realized during alignment model training (Liang et al., 2006; Zens et
