Scientific report: "A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation"

Jun Sun (1, 2), Min Zhang (1), Chew Lim Tan (2)
(1) Institute for Infocomm Research
(2) School of Computing, National University of Singapore
sunjun@ mzhang@ tancl@

Abstract

The tree sequence-based translation model allows a rule to violate syntactic boundaries in order to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of sub-trees. This paper goes further and presents a translation model based on non-contiguous tree sequence alignment, where a non-contiguous tree sequence is a sequence of sub-trees and gaps. Compared with the contiguous tree sequence-based model, the proposed model can handle non-contiguous phrases with arbitrarily large gaps by means of non-contiguous tree sequence alignment. An algorithm targeting non-contiguous constituent decoding is also proposed. Experimental results on the NIST MT-05 Chinese-English translation task show that the proposed model statistically significantly outperforms the baseline systems.

1 Introduction

Current research in statistical machine translation (SMT) mostly falls into either the phrase-based or the syntax-based paradigm. Of the two, the phrase-based approach (Marcu and Wong, 2002; Koehn et al., 2003; Och and Ney, 2004) allows local reordering and contiguous phrase translation.
However, it is hard for phrase-based models to learn global reorderings and to deal with non-contiguous phrases. To address this issue, many syntax-based approaches (Yamada and Knight, 2001; Eisner, 2003; Gildea, 2003; Ding and Palmer, 2005; Quirk et al., 2005; Zhang et al., 2007, 2008a; Bod, 2007; Liu et al., 2006, 2007; Hearne and Way, 2003) integrate more syntactic information to improve the modeling of non-contiguous phrases. In general, most of them achieve this goal by introducing syntactic non-terminals as translational equivalent placeholders on both the source and target sides. Nevertheless, the generated rules are strictly ...
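
The abstract's distinction between a contiguous tree sequence and a non-contiguous one (sub-trees plus gaps) can be illustrated with a small data-structure sketch. This is not the authors' implementation; the `SubTree` class, the `gaps` and `is_contiguous` helpers, and the span convention (word indices, end-exclusive) are all assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SubTree:
    """Hypothetical sub-tree summary: root label plus the word span it covers."""
    label: str  # root syntactic label, e.g. "NP"
    start: int  # index of the first covered word (inclusive)
    end: int    # index one past the last covered word (exclusive)


def gaps(seq: List[SubTree]) -> List[Tuple[int, int]]:
    """Return the uncovered word spans between consecutive sub-trees."""
    ordered = sorted(seq, key=lambda t: t.start)
    found = []
    for left, right in zip(ordered, ordered[1:]):
        if right.start < left.end:
            raise ValueError("sub-trees overlap")
        if right.start > left.end:
            found.append((left.end, right.start))
    return found


def is_contiguous(seq: List[SubTree]) -> bool:
    """A tree sequence is contiguous when no gap separates its sub-trees."""
    return not gaps(seq)


# Two adjacent sub-trees: a contiguous tree sequence.
contiguous = [SubTree("VV", 0, 1), SubTree("NP", 1, 3)]
# Same roots, but words 1..3 are left uncovered: a non-contiguous tree sequence.
non_contiguous = [SubTree("VV", 0, 1), SubTree("NP", 4, 6)]

print(gaps(contiguous))
print(gaps(non_contiguous))
```

In this sketch a gap is just an uncovered span of any width, which mirrors the claim that the proposed model places no limit on gap size.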
