Scientific report: "A Probability Model to Improve Word Alignment"

Colin Cherry and Dekang Lin
Department of Computing Science, University of Alberta
Edmonton, Alberta, Canada, T6G 2E8
colinc lindek @

Abstract

Word alignment plays a crucial role in statistical machine translation. Word-aligned corpora have been found to be an excellent source of translation-related knowledge. We present a statistical model for computing the probability of an alignment given a sentence pair. This model allows easy integration of context-specific features. Our experiments show that this model can be an effective tool for improving an existing word alignment.

1 Introduction

Word alignments were first introduced as an intermediate result of statistical machine translation systems (Brown et al., 1993). Since their introduction, many researchers have become interested in word alignments as a knowledge source. For example, alignments can be used to learn translation lexicons (Melamed, 1996), transfer rules (Carbonell et al., 2002; Menezes and Richardson, 2001), and classifiers to find safe sentence segmentation points (Berger et al., 1996).

In addition to the IBM models, researchers have proposed a number of alternative alignment methods. These methods often involve using a statistic such as φ² (Gale and Church, 1991) or the log likelihood ratio (Dunning, 1993) to create a score that measures the strength of correlation between source and target words. Such measures can then be used to guide a constrained search to produce word alignments (Melamed, 2000).
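To make the idea of a correlation score guiding a constrained search concrete, the following is a minimal sketch, not the method of any of the cited papers. It computes φ² from a 2x2 contingency table of sentence-level co-occurrence counts and then links words greedily under a one-to-one constraint, which is only one simple choice of constrained search. The function names (phi_squared, cooccurrence_scores, greedy_align) are hypothetical, and the input is assumed to be a list of tokenized sentence pairs.

```python
from collections import Counter
from itertools import product


def phi_squared(a, b, c, d):
    """phi^2 for a 2x2 contingency table.

    a: sentence pairs containing both words
    b: pairs containing only the source word
    c: pairs containing only the target word
    d: pairs containing neither word
    """
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: treat as no evidence of correlation
    return (a * d - b * c) ** 2 / denom


def cooccurrence_scores(bitext):
    """Score every co-occurring (source, target) word pair with phi^2.

    bitext: list of (source_tokens, target_tokens) sentence pairs.
    Counts are per sentence pair, so repeated tokens count once.
    """
    n = len(bitext)
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        s_types, t_types = set(src), set(tgt)
        src_count.update(s_types)
        tgt_count.update(t_types)
        pair_count.update(product(s_types, t_types))

    scores = {}
    for (s, t), a in pair_count.items():
        b = src_count[s] - a
        c = tgt_count[t] - a
        d = n - a - b - c
        scores[(s, t)] = phi_squared(a, b, c, d)
    return scores


def greedy_align(src, tgt, scores):
    """Greedy one-to-one search: take the best-scoring pair first.

    Returns a sorted list of (source_index, target_index) links.
    Pairs with a score of zero are never linked.
    """
    candidates = sorted(
        ((scores.get((s, t), 0.0), i, j)
         for i, s in enumerate(src)
         for j, t in enumerate(tgt)),
        key=lambda item: (-item[0], item[1], item[2]),
    )
    used_src, used_tgt, links = set(), set(), []
    for score, i, j in candidates:
        if score <= 0.0:
            break  # remaining candidates carry no correlation signal
        if i in used_src or j in used_tgt:
            continue  # one-to-one constraint
        links.append((i, j))
        used_src.add(i)
        used_tgt.add(j)
    return sorted(links)
```

Calling cooccurrence_scores on a tokenized parallel corpus and passing the result to greedy_align for each sentence pair yields a rough baseline alignment. Real corpora would need smoothing or count thresholds, which this sketch omits.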
It has been shown that once a baseline alignment has been created, one can improve results by using a refined scoring metric that is based on the alignment. For example, Melamed (2000) uses competitive linking along with an explicit noise model to produce a new scoring metric, which in turn creates better alignments. In this paper we present a simple, flexible statistical model that is designed to capture the information present in a baseline alignment. This model ...
