Scientific report: "Extending the BLEU MT Evaluation Method with Frequency Weightings"

Bogdan Babych
Centre for Translation Studies
University of Leeds
Leeds LS2 9JT, UK
bogdan@

Anthony Hartley
Centre for Translation Studies
University of Leeds
Leeds LS2 9JT, UK

Abstract

We present the results of an experiment on extending BLEU, an automatic method of Machine Translation evaluation, with statistical weights for lexical items, such as frequency-based scores. We show that this extension gives additional information about the evaluated texts; in particular, it allows us to measure translation Adequacy, which, for statistical MT systems, is often overestimated by the baseline BLEU method. The proposed model uses a single human reference translation, which increases the usability of the method for practical purposes. The model suggests a linguistic interpretation which relates frequency weights to human intuition about translation Adequacy and Fluency.

1. Introduction

Automatic methods for evaluating different aspects of MT quality, such as Adequacy, Fluency and Informativeness, provide an alternative to the expensive and time-consuming process of human MT evaluation. They are intended to yield scores that correlate with human judgments of translation quality and enable systems (machine or human) to be ranked on this basis. Several such automatic methods have been proposed in recent years. Some of them use human reference translations, for example the BLEU method (Papineni et al., 2002), which is based on a comparison of N-gram models in the MT output and in a set of human reference translations.

However, a serious problem for the BLEU method is the lack of a model for the relative importance of matched and mismatched items. Words in a text usually carry an unequal informational load and, as a result, are of differing importance for translation. It is reasonable to expect that the choices of right translation equivalents for certain key items, such as expressions denoting principal events, event ...
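
The idea of N-gram matching with unequal item importance, as discussed above, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_ngram_precision`, the `weights` dictionary and the example weight of 3.0 are hypothetical, chosen only to show how a per-item weight changes a clipped N-gram precision computed against a single reference. With no weights supplied, every item counts equally, which corresponds to the clipped precision used inside BLEU (without the brevity penalty or the combination across N-gram orders).

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def weighted_ngram_precision(candidate, reference, n=1, weights=None):
    """Clipped n-gram precision of `candidate` against one `reference`.

    `weights` (hypothetical, for illustration) maps an n-gram tuple to an
    importance weight; n-grams without an entry get weight 1.0.
    """
    weights = weights or {}
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))

    matched = 0.0
    total = 0.0
    for gram, count in cand_counts.items():
        w = weights.get(gram, 1.0)
        total += w * count
        # Clip: a candidate n-gram is credited at most as often as it
        # occurs in the reference.
        matched += w * min(count, ref_counts[gram])
    return matched / total if total else 0.0


candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()

# Unweighted: 5 of 6 candidate unigrams are matched.
print(weighted_ngram_precision(candidate, reference))

# Treating the mismatched word "sat" as three times as important
# lowers the score to 5 / 8 = 0.625.
print(weighted_ngram_precision(candidate, reference, weights={("sat",): 3.0}))
```

In this sketch a mismatch on a heavily weighted item costs more than a mismatch on an ordinary one, which is the effect a model of relative importance is meant to capture.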
