Scientific report: "A Class-Based Agreement Model for Generating Accurately Inflected Translations"

Spence Green, Computer Science Department, Stanford University, spenceg@
John DeNero, Google, denero@

Abstract

When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. To address this issue, we present a target-side, class-based agreement model. Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes that are predicted during decoding for each translation hypothesis. For English-to-Arabic translation, our model yields an average BLEU improvement over a state-of-the-art baseline. The model does not require bitext or phrase table annotations and can be easily implemented as a feature in many phrase-based decoders.

1 Introduction

Languages vary in the degree to which surface forms reflect grammatical relations. English is a weakly inflected language: it has a narrow verbal paradigm, restricted nominal inflection (plurals), and only the vestiges of a case system. Consequently, translation into English, which accounts for much of the machine translation (MT) literature (Lopez, 2008), often involves some amount of morpho-syntactic dimensionality reduction. Less attention has been paid to what happens during translation from English: richer grammatical features such as gender, dual number, and overt case are effectively latent variables that must be inferred during decoding. Consider the output of Google Translate for the simple English sentence in Fig. 1.
The correct translation is a monotone mapping of the input. However, in Arabic, SVO word order requires both gender and number agreement between the subject ("the car") and the verb ("go"). The MT system selects the correct verb stem, but with masculine inflection. Although the translation has
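
The idea of scoring a sequence of morpho-syntactic classes can be illustrated with a small sketch. This is not the model from the paper: it swaps in an add-one smoothed bigram model over class labels, and the label inventory (`NOUN+F+SG`, `VERB+M+SG`, and so on), the toy training sequences, and the function names `train_bigram` and `score` are all assumptions made for illustration. It only shows why a class sequence in which a feminine subject is followed by a masculine verb would receive a lower score than one in which the two agree.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical start-of-sequence symbol

# Toy training data (an assumption): class sequences in which the
# verb always agrees with the preceding noun in gender and number.
TRAIN = [
    ["NOUN+F+SG", "VERB+F+SG", "ADV"],
    ["NOUN+F+SG", "VERB+F+SG"],
    ["NOUN+M+SG", "VERB+M+SG", "ADV"],
    ["NOUN+M+SG", "VERB+M+SG"],
]


def train_bigram(sequences):
    """Count class bigrams and their left contexts."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        prev = START
        for cls in seq:
            bigrams[(prev, cls)] += 1
            contexts[prev] += 1
            vocab.add(cls)
            prev = cls
    return bigrams, contexts, vocab


def score(seq, model):
    """Log-probability of a class sequence under add-one smoothing."""
    bigrams, contexts, vocab = model
    logp, prev = 0.0, START
    for cls in seq:
        numerator = bigrams[(prev, cls)] + 1
        denominator = contexts[prev] + len(vocab)
        logp += math.log(numerator / denominator)
        prev = cls
    return logp


MODEL = train_bigram(TRAIN)

# Feminine subject with feminine verb versus feminine subject with
# masculine verb (the kind of error described for Fig. 1).
agree = score(["NOUN+F+SG", "VERB+F+SG", "ADV"], MODEL)
clash = score(["NOUN+F+SG", "VERB+M+SG", "ADV"], MODEL)
print(agree > clash)
```

In a phrase-based decoder, a score of this kind could in principle be added as one more feature on each translation hypothesis, which is the role the abstract describes for the agreement model.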
