Scientific report: "Extending MARIE: an N-gram-based SMT decoder"

Josep M. Crego
TALP Research Center, Universitat Politècnica de Catalunya, Barcelona 08034
jmcrego@

José B. Mariño
TALP Research Center, Universitat Politècnica de Catalunya, Barcelona 08034
canton@

Abstract

In this paper we present several extensions of MARIE¹, a freely available N-gram-based statistical machine translation (SMT) decoder. The extensions mainly consist of the ability to accept and generate word graphs and the introduction of two new N-gram models in the log-linear combination of feature functions that the decoder implements. Additionally, the decoder is enhanced with a caching strategy that reduces the number of N-gram calls, improving the overall search efficiency. Experiments are carried out on the European Parliament Spanish-English translation task.

1 Introduction

Research on SMT has been strongly boosted in the last few years, partly because it has become relatively easy to develop systems that achieve competitive results. In parallel, tools and techniques have grown in complexity, which makes it difficult to carry out state-of-the-art research without sharing some of these toolkits. Without aiming at being exhaustive, GIZA++², SRILM³ and PHARAOH⁴ are probably the best-known examples.

We introduce the recent extensions made to an N-gram-based SMT decoder (Crego et al., 2005), which allowed us to tackle several translation issues (such as reordering, rescoring and modeling), successfully improving both accuracy and efficiency.

Since SMT can be seen as a double-sided problem (modeling and search), the decoder emerges as a key component, the core module of any SMT system. Mainly, any technique aiming at dealing with a translation problem needs a decoder extension to be implemented. Particularly, the ...

¹ http soft soft marie
² http GIZA .html
³ http projects srilm
⁴ http publications licensed-sw pharaoh
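
The abstract mentions a caching strategy that reduces the number of N-gram calls. This excerpt does not describe how MARIE implements it, so the following is only a minimal Python sketch of the general idea, assuming a plain dictionary cache placed in front of an expensive lookup. The names `CachedNGramScorer` and `toy_lookup`, as well as the made-up scores, are hypothetical and are not part of MARIE.

```python
import math
from typing import Callable, Dict, Tuple

NGram = Tuple[str, ...]


class CachedNGramScorer:
    """Wraps an expensive N-gram lookup with a dictionary cache (illustrative only)."""

    def __init__(self, lookup: Callable[[NGram], float]) -> None:
        self._lookup = lookup               # hypothetical backend, e.g. a language model query
        self._cache: Dict[NGram, float] = {}
        self.backend_calls = 0              # number of times the backend was actually queried

    def score(self, ngram: NGram) -> float:
        # Query the backend only the first time an N-gram is seen.
        if ngram not in self._cache:
            self.backend_calls += 1
            self._cache[ngram] = self._lookup(ngram)
        return self._cache[ngram]


def toy_lookup(ngram: NGram) -> float:
    """Stand-in for a real model: a made-up log-score derived from string length."""
    return -math.log(1 + len(" ".join(ngram)))


scorer = CachedNGramScorer(toy_lookup)

# During search the same N-gram is typically requested by many hypotheses.
requests = [
    ("the", "european", "parliament"),
    ("the", "european", "parliament"),      # repeated request, served from the cache
    ("european", "parliament", "is"),
]
scores = [scorer.score(ngram) for ngram in requests]
```

In this toy run the repeated trigram is answered from the cache, so three requests cost only two backend queries. A real decoder would also need to bound the cache size, which this sketch leaves out.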
