Scientific report: "Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages"

Bing Xiang, Yonggang Deng, and Bowen Zhou
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
bxiang ydeng zhou @

Abstract

We present a novel method to improve word alignment quality and eventually the translation performance by producing and combining complementary word alignments for low-resource languages. Instead of focusing on the improvement of a single set of word alignments, we generate multiple sets of diversified alignments based on different motivations, such as linguistic knowledge, morphology, and heuristics. We demonstrate this approach on an English-to-Pashto translation task by combining the alignments obtained from syntactic reordering, stemming, and partial words. The combined alignment outperforms the baseline alignment, with significantly higher F-scores and better translation performance.

1 Introduction

Word alignment usually serves as the starting point and foundation for a statistical machine translation (SMT) system. It has received a significant amount of research over the years, notably in (Brown et al., 1993; Ittycheriah and Roukos, 2005; Fraser and Marcu, 2007; Hermjakob, 2009). They all focused on the improvement of word alignment models. In this work, we leverage existing aligners and generate multiple sets of word alignments based on complementary information, then combine them to get the final alignment for phrase training.
This approach requires few resources compared with what is needed to build, for example, a reasonable discriminative alignment model. This makes the approach especially appealing for SMT on low-resource languages.

Most of the research on alignment combination in the past has focused on how to combine the alignments from two different directions: source-to-target and target-to-source. Typically, one starts from the intersection of the two sets of alignments and gradually adds links from the union based on certain heuristics
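
The intersection-then-grow idea for combining two directional alignments can be illustrated with a minimal sketch. The text does not say which heuristics are used, so the growing rule below is an assumption: a union link is added when it neighbors an already accepted link and at least one of its two words is still unaligned. The function name `symmetrize` and the `(source index, target index)` link representation are hypothetical choices made for illustration only.

```python
from typing import Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)

# Assumed neighborhood: the eight cells around a link in the alignment matrix.
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]


def symmetrize(src2tgt: Set[Link], tgt2src: Set[Link]) -> Set[Link]:
    """Start from the intersection and grow toward the union.

    Assumed heuristic (not taken from the text): accept a union link
    if it is adjacent to an accepted link and its source word or its
    target word is not aligned yet.
    """
    union = src2tgt | tgt2src
    alignment = set(src2tgt & tgt2src)

    added = True
    while added:  # repeat until a full pass adds no new link
        added = False
        for i, j in sorted(alignment):  # snapshot, so the set can grow safely
            for di, dj in NEIGHBORS:
                cand = (i + di, j + dj)
                if cand not in union or cand in alignment:
                    continue
                src_aligned = any(s == cand[0] for s, _ in alignment)
                tgt_aligned = any(t == cand[1] for _, t in alignment)
                if not src_aligned or not tgt_aligned:
                    alignment.add(cand)
                    added = True
    return alignment


if __name__ == "__main__":
    # Toy alignments for a three-word sentence pair (made-up data).
    s2t = {(0, 0), (1, 1), (2, 1), (5, 0)}
    t2s = {(0, 0), (1, 1), (2, 2)}
    print(sorted(symmetrize(s2t, t2s)))
```

In this toy input, the isolated link `(5, 0)` appears in only one direction and has no accepted neighbor, so the assumed rule never adds it. Real systems differ in the neighborhood they use and in whether a final pass adds the remaining union links.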
