TAILIEUCHUNG - Scientific report: "Improving the IBM Alignment Models Using Variational Bayes"

Improving the IBM Alignment Models Using Variational Bayes

Darcey Riley and Daniel Gildea
Computer Science Dept., University of Rochester, Rochester, NY 14627

Abstract

Bayesian approaches have been shown to reduce the amount of overfitting that occurs when running the EM algorithm, by placing prior probabilities on the model parameters. We apply one such Bayesian technique, variational Bayes, to the IBM models of word alignment for statistical machine translation. We show that using variational Bayes improves the performance of the widely used GIZA++ software, as well as improving the overall performance of the Moses machine translation system in terms of BLEU score.

1 Introduction

The IBM Models of word alignment (Brown et al., 1993), along with the Hidden Markov Model (HMM) (Vogel et al., 1996), serve as the starting point for most current state-of-the-art machine translation systems, both phrase-based and syntax-based (Koehn et al., 2007; Chiang, 2005; Galley et al., 2004). Both the IBM Models and the HMM are trained using the EM algorithm (Dempster et al., 1977).

Recently, Bayesian techniques have become widespread in applications of EM to natural language processing tasks, as a very general method of controlling overfitting. For instance, Johnson (2007) showed the benefits of such techniques when applied to HMMs for unsupervised part-of-speech tagging. In machine translation, Blunsom et al. (2008) and DeNero et al. (2008) use Bayesian techniques to learn bilingual phrase pairs. In this setting, which involves finding a segmentation of the input sentences into phrasal units, it is particularly important to control the tendency of EM to choose longer phrases, which explain the training data well but are unlikely to generalize.

However, most state-of-the-art machine translation systems today are built on the basis of word-level alignments of the type generated by GIZA++ from the IBM Models and the HMM. Overfitting is also a problem in this context, and improving these word alignment systems could be of ...
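To make the idea of placing a prior on the model parameters concrete, here is a minimal sketch of how a variational Bayes update can differ from the usual EM normalization for one word-translation table. It is not taken from the excerpt: it assumes the standard mean-field update for a multinomial under a symmetric Dirichlet prior, and the names `em_m_step`, `vb_m_step`, `alpha`, and `vocab_size`, as well as the toy counts, are hypothetical.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0, via recurrence plus an asymptotic series."""
    if x <= 0:
        raise ValueError("this sketch only supports x > 0")
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def em_m_step(counts):
    """Plain EM: normalize the expected counts for each source word."""
    probs = {}
    for e, row in counts.items():
        total = sum(row.values())
        probs[e] = {f: c / total for f, c in row.items()}
    return probs


def vb_m_step(counts, alpha, vocab_size):
    """Assumed VB update with a symmetric Dirichlet(alpha) prior.

    Each expected count c is replaced by exp(psi(c + alpha)), and the
    normalizer by exp(psi(total + alpha * vocab_size)). The resulting
    weights are sub-normalized (they sum to less than 1).
    """
    probs = {}
    for e, row in counts.items():
        total = sum(row.values())
        denom = math.exp(digamma(total + alpha * vocab_size))
        probs[e] = {f: math.exp(digamma(c + alpha)) / denom for f, c in row.items()}
    return probs


# Hypothetical expected counts c(f, e) collected in an E-step.
counts = {"house": {"maison": 9.0, "la": 0.7, "bleue": 0.3}}
em = em_m_step(counts)
vb = vb_m_step(counts, alpha=0.01, vocab_size=3)
```

With a small `alpha`, word pairs supported by only a fraction of an expected count are discounted far more heavily than well-attested pairs, which is one way a prior can discourage the overfitting the text describes.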

