Scientific report: "Improved Discriminative Bilingual Word Alignment"

Improved Discriminative Bilingual Word Alignment

Robert C. Moore, Wen-tau Yih, Andreas Bode
Microsoft Research, Redmond, WA 98052, USA
bobmoore scottyhi abode @

Abstract

For many years, statistical machine translation relied on generative models to provide bilingual word alignments. In 2005, several independent efforts showed that discriminative models could be used to enhance or replace the standard generative approach. Building on this work, we demonstrate substantial improvement in word-alignment accuracy, partly through improved training methods, but predominantly through selection of more and better features. Our best model produces the lowest alignment error rate yet reported on Canadian Hansards bilingual data.

1 Introduction

Until recently, almost all work in statistical machine translation was based on word alignments obtained from combinations of generative probabilistic models developed at IBM by Brown et al. (1993), sometimes augmented by an HMM-based model or Och and Ney's "Model 6" (Och and Ney, 2003). In 2005, however, several independent efforts (Liu et al., 2005; Fraser and Marcu, 2005; Ayan et al., 2005; Taskar et al., 2005; Moore, 2005; Ittycheriah and Roukos, 2005) demonstrated that discriminatively trained models can equal or surpass the alignment accuracy of the standard models, if the usual unlabeled bilingual training corpus is supplemented with human-annotated word alignments for only a small subset of the training data.
The work cited above makes use of various training procedures and a wide variety of features. Indeed, whereas it can be difficult to design a factorization of a generative model that incorporates all the desired information, it is relatively easy to add arbitrary features to a discriminative model. We take advantage of this, building on our existing framework (Moore, 2005), to substantially reduce the alignment error rate (AER) we previously reported, given the same training and test data. Through a careful choice of features and modest improvements in ...
