Scientific report: "Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers"

Daniel Emilio Beck
Computer Science Department
Federal University of São Carlos
daniel_beck@

Abstract

In this paper I present a Master's thesis proposal in syntax-based Statistical Machine Translation. I propose to build discriminative SMT models using both tree-to-string and tree-to-tree approaches. Translation and language models will be represented mainly through the use of Tree Automata and Tree Transducers. These formalisms have important representational properties that make them well-suited for syntax modeling. I also present an experiment plan to evaluate these models through the use of a parallel corpus written in English and Brazilian Portuguese.

1 Introduction

Statistical Machine Translation (SMT) has dominated Machine Translation (MT) research in the last two decades. One of its variants, Phrase-based SMT (PB-SMT), is currently considered the state of the art in the area. However, since the advent of PB-SMT by Koehn et al. (2003) and Och and Ney (2004), purely statistical MT systems have not achieved considerable improvements. New research directions therefore point toward the use of linguistic resources integrated into SMT systems.

According to Lopez (2008), there are four steps when building an SMT system: translational equivalence modeling¹, parameterization, parameter estimation and decoding. This Master's thesis proposal aims to improve SMT systems by including syntactic information in the first and second steps. Therefore, I plan to investigate two approaches: the Tree-to-String (TTS) and the Tree-to-Tree (TTT) models. In the former, syntactic information is provided only for the source language, while in the latter it is provided for both source and target languages.

¹ For the remainder of this proposal, I will refer to this step simply as the translation model.

There are many formal theories to represent syntax in a language, like Context-free Grammars (CFGs), Tree Substitution Grammars
