Scientific report: "Refined Lexicon Models for Statistical Machine Translation using a Maximum Entropy Approach"

Ismael García Varea, Dpto. de Informática, Univ. de Castilla-La Mancha, Campus Universitario s/n, 02071 Albacete, Spain, ivarea@
Franz J. Och and Hermann Ney, Lehrstuhl für Inf. VI, RWTH Aachen, Ahornstr. 55, D-52056 Aachen, Germany, och ney @
Francisco Casacuberta, Dpto. de Sist. Inf. y Comp., Inst. Tecn. de Inf., UPV, Avda. de Los Naranjos s/n, 46071 Valencia, Spain, fcn@

Abstract

Typically, the lexicon models used in statistical machine translation systems do not include any kind of linguistic or contextual information, which often leads to problems in performing a correct word sense disambiguation. One way to deal with this problem within the statistical framework is to use maximum entropy methods. In this paper, we present how to use this type of information within a statistical machine translation system. We show that it is possible to significantly decrease training and test corpus perplexity of the translation models. In addition, we perform a rescoring of N-best lists using our maximum entropy model and thereby obtain an improvement in translation quality. Experimental results are presented on the so-called "Verbmobil Task".

1 Introduction

Typically, the lexicon models used in statistical machine translation systems are only single-word based, that is, one word in the source language corresponds to only one word in the target language.

These lexicon models lack context information that can be extracted from the same parallel corpus. This additional information could be: simple context information (information on the words surrounding the word pair); syntactic information (part-of-speech information, syntactic constituent, sentence mood); and semantic information (disambiguation information, e.g. from WordNet, and the current or previous speech or dialog act). To include this additional information within the statistical framework, we use the maximum entropy approach.
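As a rough illustration of how surrounding words can enter a maximum entropy lexicon model, the sketch below computes p(target | source word, context) as a normalized exponential of weighted binary features. It is not the authors' implementation: the feature templates, the German-English word pairs, and the weight values in WEIGHTS are all assumptions chosen only for the example, and in a real system the weights would be estimated from the parallel corpus.

```python
import math

# Hypothetical feature weights (the lambdas of the model). These values are
# invented for illustration; a real system would train them on parallel data.
WEIGHTS = {
    ("lex", "Bank", "bank"): 1.0,
    ("lex", "Bank", "bench"): 0.8,
    ("ctx", "Geld", "bank"): 1.5,
    ("ctx", "Park", "bench"): 1.5,
}


def active_features(source_word, context_words, target_word):
    """Binary features: one lexical pair plus one per surrounding source word."""
    feats = [("lex", source_word, target_word)]
    feats += [("ctx", c, target_word) for c in context_words]
    return feats


def maxent_lexicon_prob(source_word, context_words, candidates):
    """Return p(target | source word, context) for each candidate target."""
    scores = {}
    for target in candidates:
        feats = active_features(source_word, context_words, target)
        # Unnormalized score: exp of the summed weights of the active features.
        scores[target] = math.exp(sum(WEIGHTS.get(f, 0.0) for f in feats))
    z = sum(scores.values())  # normalization over all candidate targets
    return {target: s / z for target, s in scores.items()}


if __name__ == "__main__":
    print(maxent_lexicon_prob("Bank", ["Geld"], ["bank", "bench"]))
    print(maxent_lexicon_prob("Bank", ["Park"], ["bank", "bench"]))
```

With these invented weights, the same source word receives a different preferred translation depending on its neighbouring word, which is the kind of disambiguation a single-word lexicon model cannot express.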
