Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics

Chin-Yew Lin and Franz Josef Och
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Marina del Rey, CA 90292, USA
cyl och @

Abstract

In this paper we describe two new objective automatic evaluation methods for machine translation. The first method is based on the longest common subsequence between a candidate translation and a set of reference translations. Longest common subsequence naturally takes sentence-level structure similarity into account and automatically identifies the longest co-occurring in-sequence n-grams. The second method relaxes strict n-gram matching to skip-bigram matching. A skip-bigram is any pair of words in their sentence order. Skip-bigram co-occurrence statistics measure the overlap of skip-bigrams between a candidate translation and a set of reference translations. The empirical results show that both methods correlate very well with human judgments of both adequacy and fluency.

1 Introduction

Using objective functions to automatically evaluate machine translation quality is not new. Su et al. (1992) proposed a method based on measuring edit distance (Levenshtein 1966) between candidate and reference translations. Akiba et al. (2001) extended the idea to accommodate multiple references. Nießen et al. (2000) calculated the length-normalized edit distance, called word error rate (WER), between a candidate and multiple reference translations. Leusch et al. (2003) proposed a related measure called position-independent word error rate (PER) that did not consider word position, i.e., it used a bag of words instead.

Instead of error measures, we can also use accuracy measures that compute similarity between candidate and reference translations in proportion to the number of common words between them, as suggested by Melamed (1995). An n-gram co-occurrence measure, BLEU, proposed by Papineni et al. (2001) that calculates co-occurrence ...
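The two statistics described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, tokens are assumed to be whitespace-separated words, only a single reference is handled (the paper works with a set of references), and the example sentences are chosen for illustration rather than taken from the text above. The sketch computes only the raw counts (LCS length and number of shared skip-bigrams), not any normalized score built on top of them.

```python
from collections import Counter
from itertools import combinations


def lcs_length(candidate, reference):
    """Length of the longest common subsequence of two token lists."""
    # prev[j] is the LCS length of the candidate prefix processed so far
    # and reference[:j]; the table is filled one candidate token at a time.
    prev = [0] * (len(reference) + 1)
    for c_tok in candidate:
        curr = [0]
        for j, r_tok in enumerate(reference, start=1):
            if c_tok == r_tok:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def skip_bigrams(tokens):
    """Multiset of all word pairs in sentence order, with any gap allowed."""
    # combinations() yields (tokens[i], tokens[j]) for every i < j.
    return Counter(combinations(tokens, 2))


def skip_bigram_overlap(candidate, reference):
    """Number of skip-bigrams shared by both sentences (counts are clipped)."""
    shared = skip_bigrams(candidate) & skip_bigrams(reference)
    return sum(shared.values())


if __name__ == "__main__":
    reference = "police killed the gunman".split()
    for text in ("police kill the gunman", "the gunman kill police"):
        candidate = text.split()
        print(text, lcs_length(candidate, reference),
              skip_bigram_overlap(candidate, reference))
```

In this toy example the first candidate shares a longer in-sequence match and more skip-bigrams with the reference than the second, even though both contain the same words. That word-order sensitivity is what separates these statistics from a pure bag-of-words comparison.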
