Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries

Chang Liu and Hwee Tou Ng
Department of Computer Science
National University of Singapore
13 Computing Drive, Singapore 117417
{liuchan1,nght}@comp.nus.edu.sg

Abstract

In this work, we introduce the TESLA-CELAB metric (Translation Evaluation of Sentences with Linear-programming-based Analysis - Character-level Evaluation for Languages with Ambiguous word Boundaries) for automatic machine translation evaluation. For languages such as Chinese, where words usually have meaningful internal structure and word boundaries are often fuzzy, TESLA-CELAB acknowledges the advantage of character-level evaluation over word-level evaluation. By reformulating the problem in the linear programming framework, TESLA-CELAB addresses several drawbacks of the character-level metrics, in particular the modeling of synonyms spanning multiple characters. We show empirically that TESLA-CELAB significantly outperforms character-level BLEU in the English-Chinese translation evaluation tasks.

1 Introduction

Since the introduction of BLEU (Papineni et al., 2002), automatic machine translation (MT) evaluation has received a lot of research interest. The Workshop on Statistical Machine Translation (WMT) hosts regular campaigns comparing different machine translation evaluation metrics (Callison-Burch et al., 2009; Callison-Burch et al., 2010; Callison-Burch et al., 2011).
In the WMT shared tasks, many new generation metrics, such as METEOR (Banerjee and Lavie, 2005), TER (Snover et al., 2006), and TESLA (Liu et al., 2010), have consistently outperformed BLEU as judged by the correlations with human judgments.

The research on automatic machine translation evaluation is important for a number of reasons. Automatic translation evaluation gives machine translation researchers a cheap and reproducible way to guide their research and makes it possible to compare machine translation methods across different studies. In addition, machine translation