Enriching Morphologically Poor Languages for Statistical Machine Translation

Eleftherios Avramidis (e.avramidis@sms.ed.ac.uk)
Philipp Koehn (pkoehn@inf.ed.ac.uk)
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, UK

Abstract

We address the problem of translating from morphologically poor to morphologically rich languages by adding per-word linguistic information to the source language. We use the syntax of the source sentence to extract information for noun cases and verb persons and annotate the corresponding words accordingly. In experiments, we show improved performance for translating from English into Greek and Czech. For English-Greek, we reduce the error on the verb conjugation from 19% to 5.4% and noun case agreement from 9% to 6%.

1 Introduction

Traditional statistical machine translation methods are based on mapping on the lexical level, which takes place in a local window of a few words. Hence, they fail to produce adequate output in many cases where more complex linguistic phenomena play a role. Take the example of morphology. Predicting the correct morphological variant for a target word may not depend solely on the source words, but may require additional information about its role in the sentence.

Recent research on handling rich morphology has largely focused on translating from rich morphology languages, such as Arabic, into English (Habash and Sadat, 2006).
There has been less work on the opposite case, translating from English into morphologically richer languages. In a study of translation quality for languages in the Europarl corpus, Koehn (2005) reports that translating into morphologically richer languages is more difficult than translating from them. There are intuitive reasons why generating richer morphology from morphologically poor languages is harder. Take the example of translating noun phrases from English to Greek (or German, Czech, etc.). In English, a noun phrase is rendered the same if it is the subject or the object. However .