Scientific report: "Bridging Morpho-Syntactic Gap between Source and Target Sentences for English-Korean Statistical Machine Translation"

Gumwon Hong, Seung-Wook Lee and Hae-Chang Rim
Department of Computer Science & Engineering, Korea University, Seoul 136-713, Korea
gwhong swlee rim @

Abstract

Often, Statistical Machine Translation (SMT) between English and Korean suffers from null alignment. Previous studies have attempted to resolve this problem by removing unnecessary function words, or by reordering source sentences. However, the removal of function words can cause a serious loss in information. In this paper, we present a possible method of bridging the morpho-syntactic gap for English-Korean SMT. In particular, the proposed method tries to transform a source sentence by inserting pseudo words, and by reordering the sentence in such a way that both sentences have a similar length and word order. The proposed method achieves an increase in BLEU score over a baseline phrase-based system.

1 Introduction

Phrase-based SMT models have performed reasonably well on languages where the syntactic structures are very similar, including languages such as French and English. However, Collins et al. (2005) demonstrated that phrase-based models have limited potential when applied to languages that have a relatively different word order; such is the case between German and English. They proposed a clause restructuring method for reordering German sentences in order to resemble the order of English sentences. By modifying the source sentence structure into the target sentence structure, they argued that they could solve the decoding problem by use of completely monotonic translation.

The translation from English to Korean can be more difficult than the translation of other language pairs for the following reasons. First, Korean is a language isolate: that is, it has little genealogical relation with other natural languages. Second, the word order in Korean is relatively free because the functional ...
