Scientific report: "Optimizing Word Alignment Combination For Phrase Table Training"

Yonggang Deng and Bowen Zhou
IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
ydeng zhou @

Abstract

Combining word alignments trained in two translation directions has mostly relied on heuristics that are not directly motivated by intended applications. We propose a novel method that performs combination as an optimization process. Our algorithm explicitly maximizes the effectiveness function with greedy search for phrase table training or synchronized grammar extraction. Experimental results show that the proposed method leads to significantly better translation quality than existing methods. Analysis suggests that this simple approach is able to maintain accuracy while maximizing coverage.

1 Introduction

Word alignment is the process of identifying word-to-word links between parallel sentences. It is a fundamental and often necessary step before linguistic knowledge acquisition, such as training a phrase translation table in a phrasal machine translation (MT) system (Koehn et al., 2003) or extracting hierarchical phrase rules or synchronized grammars in a syntax-based translation framework. Most word alignment models distinguish translation direction in deriving the word alignment matrix. Given a parallel sentence, word alignments in two directions are established first, and then they are combined as a knowledge source for phrase training or rule extraction. This process is also called symmetrization.
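To make the idea of symmetrization concrete, the following is a minimal sketch. It is not the optimization method proposed in this paper; it only shows the two simplest baseline combinations, intersection and union, of two directional alignments. The function names and the toy link sets are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: baseline symmetrization by intersection and union.
# This is NOT the paper's optimization-based combination; all names are hypothetical.
from typing import Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def invert(links: Set[Link]) -> Set[Link]:
    """Flip (target, source) pairs from the reverse-direction aligner
    into (source, target) order so both link sets are comparable."""
    return {(s, t) for (t, s) in links}


def symmetrize(src2tgt: Set[Link], tgt2src: Set[Link]) -> Tuple[Set[Link], Set[Link]]:
    """Return (intersection, union) of the two directional alignments."""
    reverse = invert(tgt2src)
    return src2tgt & reverse, src2tgt | reverse


# Made-up alignments for a three-word source and three-word target sentence.
src2tgt = {(0, 0), (1, 1), (1, 2)}  # stored as (source, target)
tgt2src = {(0, 0), (1, 1), (2, 2)}  # stored as (target, source)

intersection, union = symmetrize(src2tgt, tgt2src)
print(sorted(intersection))  # links both directions agree on
print(sorted(union))         # links proposed by either direction
```

The intersection keeps only links on which both directions agree, so it tends to be accurate but sparse; the union keeps every proposed link, so it covers more but admits more errors. Combination heuristics generally choose a link set somewhere between these two extremes.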
Symmetrization is common practice in most state-of-the-art MT systems. Widely used alignment models, such as the IBM Model series (Brown et al., 1993) and HMM, all assume one-to-many alignments. Since many-to-many links are commonly observed in natural language, symmetrization is able to make up for this modeling limitation. Moreover, combining the two directional alignments can in practice lead to improved performance. Symmetrization can also be realized during alignment model training (Liang et al., 2006; Zens et
