Scientific report: "Pivot Language Approach for Phrase-Based Statistical Machine Translation"

Hua Wu and Haifeng Wang
Toshiba (China) Research and Development Center
5 F., Tower W2, Oriental Plaza, East Chang An Ave., Dong Cheng District, Beijing 100738, China
wuhua wanghaifeng @

Abstract

This paper proposes a novel method for phrase-based statistical machine translation using a pivot language. To translate between languages Lf and Le when only a small bilingual corpus is available, we bring in a third language Lp, called the pivot language, for which large Lf-Lp and Lp-Le bilingual corpora exist. Using only the Lf-Lp and Lp-Le corpora, we can build a translation model for Lf-Le. The advantage of this method is that we can translate between Lf and Le even if no bilingual corpus is available for this language pair. Using BLEU as a metric, our pivot language method achieves an absolute improvement of … (… relative) over the model trained directly on 5,000 Lf-Le sentence pairs for French-Spanish translation. Moreover, when a small Lf-Le bilingual corpus is available, our method can further improve translation quality by using the additional Lf-Lp and Lp-Le bilingual corpora.

1 Introduction

For statistical machine translation (SMT), phrase-based methods (Koehn et al., 2003; Och and Ney, 2004) and syntax-based methods (Wu, 1997; Alshawi et al., 2000; Yamada and Knight, 2001; Melamed, 2004; Chiang, 2005; Quirk et al., 2005; Mellebeek et al., 2006) outperform word-based methods (Brown et al., 1993). These methods need large bilingual corpora. However, for some language pairs only a small bilingual corpus is available, which degrades the performance of statistical translation systems.

To solve this problem, this paper proposes a novel method for phrase-based SMT using a pivot language. To translate between languages Lf and Le, we bring in a pivot language Lp for which large bilingual corpora exist for the language pairs Lf-Lp and Lp-Le. With the
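
The idea of building an Lf-Le model from only Lf-Lp and Lp-Le resources can be illustrated with a small sketch. This excerpt does not spell out how the two models are combined, so the code below assumes one plausible reading: the Lf-Le phrase translation probability is obtained by summing over shared pivot phrases, p(e | f) = sum over p of p(p | f) * p(e | p). The function name `triangulate`, the nested-dictionary table layout, the choice of English as the pivot, and the toy probabilities are all hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Combine p(pivot | src) and p(tgt | pivot) into p(tgt | src).

    Assumption (not stated in the excerpt): the source and target phrases
    are treated as conditionally independent given the pivot phrase, so
    the combined probability is a sum of products over shared pivots.
    """
    src_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_pivot.items():
        for pivot, p_pivot in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt in pivot_tgt.get(pivot, {}).items():
                src_tgt[src][tgt] += p_pivot * p_tgt
    return {src: dict(tgts) for src, tgts in src_tgt.items()}


# Toy phrase tables with made-up probabilities (French -> English -> Spanish).
fr_en = {"maison": {"house": 0.8, "home": 0.2}}
en_es = {
    "house": {"casa": 0.9, "vivienda": 0.1},
    "home": {"casa": 0.6, "hogar": 0.4},
}

fr_es = triangulate(fr_en, en_es)
for tgt, prob in sorted(fr_es["maison"].items(), key=lambda kv: -kv[1]):
    print(f"maison -> {tgt}: {prob:.2f}")
```

In this toy case "casa" receives probability mass through both pivot phrases (0.8 * 0.9 + 0.2 * 0.6), which shows how a pivot can connect phrase pairs that never co-occur in any Lf-Le corpus.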
