TAILIEUCHUNG - Scientific report: "Supertagged Phrase-Based Statistical Machine Translation"

Supertagged Phrase-Based Statistical Machine Translation

Hany Hassan, School of Computing, Dublin City University, Dublin 9, Ireland, hhasan@
Khalil Sima'an, Language and Computation, University of Amsterdam, Amsterdam, The Netherlands, simaan@
Andy Way, School of Computing, Dublin City University, Dublin 9, Ireland, away@

Abstract

Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic-to-English NIST 2005 test set, addressing issues such as sparseness, scalability, and the utility of system subcomponents. Our best result ([...] BLEU) improves by [...] relative to a state-of-the-art PBSMT model, which compares very favourably with the leading systems on the NIST 2005 task.

1 Introduction

Within the field of Machine Translation, by far the most dominant paradigm is Phrase-based Statistical Machine Translation (PBSMT) (Koehn et al., 2003; Tillmann and Xia, 2003). However, unlike in rule- and example-based MT, it has proven difficult to date to incorporate linguistic, syntactic knowledge in order to improve translation quality. Only quite recently have Chiang (2005) and Marcu et al. (2006) shown that incorporating some form of syntactic structure could show improvements over a baseline PBSMT system. While ...
