Scientific report: "A Discriminative Latent Variable Model for Statistical Machine Translation"

Phil Blunsom, Trevor Cohn and Miles Osborne
School of Informatics, University of Edinburgh
2 Buccleuch Place, Edinburgh, EH8 9LW, UK
pblunsom tcohn miles @

Abstract

Large-scale discriminative machine translation promises to further the state of the art, but has failed to deliver convincing gains over current heuristic frequency count systems. We argue that a principal reason for this failure is not dealing with multiple, equivalent translations. We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised. Results show that accounting for multiple derivations does indeed improve performance. Additionally, we show that regularisation is essential for maximum conditional likelihood models in order to avoid degenerate solutions.

1 Introduction

Statistical machine translation (SMT) has seen a resurgence in popularity in recent years, with progress being driven by a move to phrase-based and syntax-inspired approaches. Progress within these approaches, however, has been less dramatic.
We believe this is because these frequency count based¹ models cannot easily incorporate non-independent and overlapping features, which are extremely useful in describing the translation process. Discriminative models of translation can include such features without making assumptions of independence or explicitly modelling their interdependence. However, while discriminative models promise much, they have not been shown to deliver significant gains over their simpler cousins. We argue that this is due to a number of inherent problems that discriminative models for SMT must address in ...

¹ We class approaches using minimum error rate training (Och, 2003) as frequency count based, as these systems re-scale a handful of generative features estimated from frequency counts and do not support large sets of non-independent features.
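To make the abstract's point about multiple, equivalent translations concrete, the following is a minimal sketch of treating the derivation as a latent variable. It assumes a log-linear (softmax) model over derivations, which this excerpt does not spell out. The candidate list, the feature names (`rule_a` to `rule_d`) and the weights are hypothetical toy values, not data or parameters from the paper.

```python
import math
from collections import defaultdict

# Hypothetical toy candidates for one source sentence: (translation, features).
# Two distinct derivations produce the same string "the house".
DERIVATIONS = [
    ("the house", {"rule_a": 1.0, "rule_b": 1.0}),
    ("the house", {"rule_c": 1.0}),
    ("a house", {"rule_d": 1.0}),
]
# Hypothetical feature weights, chosen only to make the effect visible.
WEIGHTS = {"rule_a": 0.2, "rule_b": 0.1, "rule_c": 0.3, "rule_d": 0.5}


def score(features, weights):
    """Linear score of one derivation: sum of weight * feature value."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def derivation_probs(derivations, weights):
    """Softmax over derivations (assumed log-linear model), computed stably."""
    scores = [score(features, weights) for _, features in derivations]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def translation_probs(derivations, weights):
    """Marginalise the latent derivation: sum probabilities per translation."""
    totals = defaultdict(float)
    probs = derivation_probs(derivations, weights)
    for (translation, _), p in zip(derivations, probs):
        totals[translation] += p
    return dict(totals)


def decode_max_derivation(derivations, weights):
    """Return the translation of the single most probable derivation."""
    probs = derivation_probs(derivations, weights)
    best = max(range(len(derivations)), key=lambda i: probs[i])
    return derivations[best][0]


def decode_max_translation(derivations, weights):
    """Return the translation with the largest summed (marginal) probability."""
    totals = translation_probs(derivations, weights)
    return max(totals, key=totals.get)


if __name__ == "__main__":
    print(decode_max_derivation(DERIVATIONS, WEIGHTS))
    print(decode_max_translation(DERIVATIONS, WEIGHTS))
```

With these toy weights the single best derivation yields "a house", while summing over the two derivations of "the house" makes that translation the more probable one. This is the kind of disagreement that accounting for multiple derivations is meant to resolve.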
