TAILIEUCHUNG - Scientific report: "Alignment Model Adaptation for Domain-Specific Word Alignment"

WU Hua, WANG Haifeng, LIU Zhanyi
Toshiba China Research and Development Center
5/F., Tower W2, Oriental Plaza, East Chang An Ave., Dong Cheng District, Beijing, 100738, China
wuhua, wanghaifeng, liuzhanyi @

Abstract

This paper proposes an alignment adaptation approach to improve domain-specific (in-domain) word alignment. The basic idea of alignment adaptation is to use an out-of-domain corpus to improve in-domain word alignment results. We first train two statistical word alignment models, one on the large-scale out-of-domain corpus and one on the small-scale in-domain corpus, and then interpolate these two models to improve the domain-specific word alignment. Experimental results show that our approach improves domain-specific word alignment in terms of both precision and recall, achieving a relative error rate reduction as compared with the state-of-the-art technologies.

1 Introduction

Word alignment was first proposed as an intermediate result of statistical machine translation (Brown et al., 1993). In recent years, many researchers have employed statistical models (Wu, 1997; Och and Ney, 2003; Cherry and Lin, 2003) or association measures (Smadja et al., 1996; Ahrenberg et al., 1998; Tufis and Barbu, 2002) to build alignment links. In order to achieve satisfactory results, all of these methods require a large-scale bilingual corpus for training. When a large-scale bilingual corpus is not available, some researchers use existing dictionaries to improve word alignment (Ker and Chang, 1997). However, only a few studies (Wu and Wang, 2004) directly address the problem of domain-specific word alignment when neither a large-scale domain-specific bilingual corpus nor a domain-specific translation dictionary is available.

In this paper, we address the problem of word alignment in a specific domain in which only a small-scale corpus is available. In the domain-specific (in-domain) corpus there are ...
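The abstract's central step, interpolating an out-of-domain model with an in-domain model, can be illustrated with a minimal sketch. This excerpt does not give the paper's actual interpolation formula or how the weight is chosen, so the code below assumes a plain linear interpolation of word translation probabilities with a fixed weight. The function name `interpolate_translation_probs`, the weight `lam`, and the toy probability tables are all hypothetical and used only for illustration.

```python
def interpolate_translation_probs(p_out, p_in, lam):
    """Linearly interpolate two word translation probability tables.

    p_out: out-of-domain table, dict mapping (source, target) -> probability
    p_in:  in-domain table with the same layout
    lam:   weight given to the in-domain model, 0 <= lam <= 1 (assumed fixed)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # A pair missing from one table is treated as probability 0 in that table.
    pairs = set(p_out) | set(p_in)
    return {
        pair: lam * p_in.get(pair, 0.0) + (1.0 - lam) * p_out.get(pair, 0.0)
        for pair in pairs
    }


# Toy tables (made-up numbers): the in-domain data only ever saw one sense.
p_out = {("bank", "financial_sense"): 0.75, ("bank", "river_sense"): 0.25}
p_in = {("bank", "financial_sense"): 1.0}

p_mix = interpolate_translation_probs(p_out, p_in, lam=0.5)
for pair in sorted(p_mix):
    print(pair, p_mix[pair])
```

Because both input tables are normalized over the targets of each source word, the interpolated table stays normalized as well. Under this assumption, a larger `lam` trusts the small in-domain corpus more, and a smaller one leans on the large out-of-domain corpus.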
