Active learning for interactive machine translation

Jesús González-Rubio and Daniel Ortiz-Martínez and Francisco Casacuberta
D. de Sistemas Informáticos y Computación
U. Politècnica de València
C. de Vera s/n, 46022 Valencia, Spain
{jegonzalez,dortiz,fcn}@dsic.upv.es

Abstract

Translation needs have greatly increased in recent years. In many situations, the text to be translated constitutes an unbounded stream of data that grows continually with time. An effective approach to translating text documents is to follow an interactive-predictive paradigm, in which the system is guided by the user and the user is assisted by the system in generating error-free translations. Unfortunately, when processing such unbounded data streams, even this approach requires an overwhelming amount of manpower. It is in this scenario that the use of active learning techniques is compelling. In this work, we propose different active learning techniques for interactive machine translation. Results show that, for a given translation quality, the use of active learning allows us to greatly reduce the human effort required to translate the sentences in the stream.

1 Introduction

Translation needs have greatly increased in recent years due to phenomena such as globalization and technological development. For example, the European Parliament¹ translates its proceedings into 22 languages on a regular basis, and Project Syndicate² translates editorials into different languages.
In these and many other examples, data can be viewed as an incoming unbounded stream, since it grows continually with time (Levenberg et al., 2010). Manual translation of such streams of data is extremely expensive given the huge volume of translation required; therefore, various automatic machine translation methods have been proposed. However, automatic statistical machine translation (SMT) systems are far from generating error-free translations, and their outputs usually ...

¹ http://www.europarl.europa.eu
² http://project-syndicate.org