Scientific report: "Soft Syntactic Constraints for Word Alignment through Discriminative Training"

Colin Cherry
Department of Computing Science, University of Alberta
Edmonton, AB, Canada, T6G2E8
colinc@

Dekang Lin
Google Inc.
1600 Amphitheatre Parkway, Mountain View, CA, USA, 94043
lindek@

Abstract

Word alignment methods can gain valuable guidance by ensuring that their alignments maintain cohesion with respect to the phrases specified by a monolingual dependency tree. However, this hard constraint can also rule out correct alignments, and its utility decreases as alignment models become more complex. We use a publicly available structured output SVM to create a max-margin syntactic aligner with a soft cohesion constraint. The resulting aligner is the first, to our knowledge, to use a discriminative learning method to train an ITG bitext parser.

1 Introduction

Given a parallel sentence pair, or bitext, bilingual word alignment finds word-to-word connections across languages. Originally introduced as a byproduct of training statistical translation models in (Brown et al., 1993), word alignment has become the first step in training most statistical translation systems, and alignments are useful to a host of other tasks. The dominant IBM alignment models (Och and Ney, 2003) use minimal linguistic intuitions: sentences are treated as flat strings. These carefully designed generative models are difficult to extend, and have resisted the incorporation of intuitively useful features, such as morphology.

There have been many attempts to incorporate syntax into alignment; we will not present a complete list here. Some methods parse two flat strings at once using a bitext grammar (Wu, 1997). Others parse one of the two strings before alignment begins, and align the resulting tree to the remaining string (Yamada and Knight, 2001). The statistical models associated with syntactic aligners tend to be very different from their IBM counterparts. They model operations that are meaningful at a syntax level.
