Scientific report: "Gappy Phrasal Alignment by Agreement"

Mohit Bansal, UC Berkeley, CS Division, mbansal@
Chris Quirk, Microsoft Research, chrisq@
Robert C. Moore, Google Research

Footnote: Author was a summer intern at Microsoft Research during this project.

Abstract

We propose a principled and efficient phrase-to-phrase alignment model, useful in machine translation as well as other related natural language processing problems. In a hidden semi-Markov model, word-to-phrase and phrase-to-word translations are modeled directly by the system. Agreement between two directional models encourages the selection of parsimonious phrasal alignments, avoiding the overfitting commonly encountered in unsupervised training with multi-word units. Expanding the state space to include "gappy phrases" (such as French "ne ... pas") makes the alignment space more symmetric; thus, it allows agreement between discontinuous alignments. The resulting system shows substantial improvements in both alignment quality and translation quality over word-based Hidden Markov Models, while maintaining asymptotically equivalent runtime.

1 Introduction

Word alignment is an important part of statistical machine translation (MT) pipelines. Phrase tables containing pairs of source and target language phrases are extracted from word alignments, forming the core of phrase-based statistical machine translation systems (Koehn et al., 2003). Most syntactic machine translation systems extract synchronous context-free grammars (SCFGs) from aligned syntactic fragments (Galley et al., 2004; Zollmann et al., 2006), which in turn are derived from bilingual word alignments and syntactic parses. Alignment is also used in various other NLP problems such as entailment, paraphrasing, question answering, summarization and spelling correction.

[Figure 1: French-English pair with complex word alignment. French: "ne voudrais pas voyager par chemin de fer". English: "would not like traveling by railroad".]

A limitation to word-based alignment
