Feature-Rich Statistical Translation of Noun Phrases

Philipp Koehn and Kevin Knight
Information Sciences Institute
Department of Computer Science
University of Southern California
koehn@isi.edu, knight@isi.edu

Abstract

We define noun phrase translation as a subtask of machine translation. This enables us to build a dedicated noun phrase translation subsystem that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features. We achieved 65.5% translation accuracy in a German-English translation task vs. 53.2% with IBM Model 4.

1 Introduction

Recent research in machine translation challenges us with the exciting problem of combining statistical methods with prior linguistic knowledge. The power of statistical methods lies in the quick acquisition of knowledge from vast amounts of data, while linguistic analysis both provides a fitting framework for these methods and contributes additional knowledge sources useful for finding correct translations.

We present work that successfully defines a subtask of machine translation: the translation of noun phrases. We demonstrate through analysis and experiments that it is feasible and beneficial to treat noun phrase translation as a subtask. This opens the path to dedicated modeling of other types of syntactic constructs, e.g., verb clauses, where issues of subcategorization of the verb play a big role.

Focusing on a narrower problem allows not only more dedicated modeling but also the use of computationally more expensive methods.
We go on to tackle the task of noun phrase translation in a maximum entropy reranking framework. Treating translation as a reranking problem instead of as a search problem enables us to use features over the full translation pair. We integrate both empirical and symbolic knowledge sources as features into our system, which outperforms the best known methods in statistical machine translation. Previous work on defining subtasks within .
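
To make the reranking idea concrete, the following is a minimal sketch of how a maximum entropy (log-linear) model could score an n-best list of candidate translations using features computed over the full translation pair. It is an illustration under stated assumptions, not the system described in this paper: the feature names (`base_logprob`, `agreement`), the weights, and the candidate strings are all hypothetical, and weight training is omitted.

```python
import math
from typing import Dict, List, Tuple

Features = Dict[str, float]  # feature name -> value for one (source, candidate) pair


def maxent_probs(candidates: List[Features], weights: Dict[str, float]) -> List[float]:
    """Return p(candidate | source) for each entry of an n-best list.

    Each candidate is scored with a weighted sum of its features, and the
    scores are normalized with a softmax over the n-best list only.
    """
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for feats in candidates
    ]
    top = max(scores)  # subtract the maximum before exponentiating, for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rerank(nbest: List[Tuple[str, Features]], weights: Dict[str, float]) -> str:
    """Pick the candidate translation with the highest model probability."""
    probs = maxent_probs([feats for _, feats in nbest], weights)
    best = max(range(len(nbest)), key=lambda i: probs[i])
    return nbest[best][0]


# Hypothetical n-best list for one source noun phrase. The feature values
# are made up for illustration only.
nbest = [
    ("candidate A", {"base_logprob": -1.0, "agreement": 0.0}),
    ("candidate B", {"base_logprob": -2.0, "agreement": 1.0}),
]

# Hypothetical weights; in practice these would be learned from training data.
weights = {"base_logprob": 1.0, "agreement": 2.0}

print(rerank(nbest, weights))
```

Because the candidates are already fully formed when they are scored, a feature such as the hypothetical `agreement` above can inspect the complete source and target phrase, which is harder to do inside a search procedure that builds translations incrementally.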