Learning Lexicalized Reordering Models from Reordering Graphs

Jinsong Su, Yang Liu, Yajuan Lu, Haitao Mi, Qun Liu
Key Laboratory of Intelligent Information Processing
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190, China
{sujinsong, yliu, lvyajuan, htmi, liuqun}@ict.ac.cn

Abstract

Lexicalized reordering models play a crucial role in phrase-based translation systems. They are usually learned from a word-aligned bilingual corpus by examining the reordering relations of adjacent phrases. Instead of just checking whether there is one phrase adjacent to a given phrase, we argue that it is important to take the number of adjacent phrases into account for better estimation of reordering models. We propose to use a structure named the reordering graph, which represents all phrase segmentations of a sentence pair, to learn lexicalized reordering models efficiently. Experimental results on the NIST Chinese-English test sets show that our approach significantly outperforms the baseline method.

1 Introduction

Phrase-based translation systems (Koehn et al., 2003; Och and Ney, 2004) have proven to be state of the art, as shown by their translation performance in recent machine translation evaluations. While excelling at memorizing local translation and reordering, phrase-based systems have difficulties in modeling permutations among phrases.
As a result, it is important to develop effective reordering models to capture such non-local reordering. The early phrase-based paradigm (Koehn et al., 2003) applies a simple distance-based distortion penalty to model phrase movements. More recently, many researchers have presented lexicalized reordering models that take advantage of lexical information to predict reordering (Tillmann, 2004; Xiong et al., 2006; Zens and Ney, 2006; Koehn et

Figure 1: Occurrence of a swap with different numbers of adjacent bilingual phrases: only one phrase in (a) and three phrases in (b). Black squares denote word alignments and gray rectangles