Scientific report: "Topic Models for Dynamic Translation Model Adaptation"

Vladimir Eidelman, Computer Science and UMIACS, University of Maryland, College Park, MD, vlad@
Jordan Boyd-Graber, iSchool and UMIACS, University of Maryland, College Park, MD, jbg@
Philip Resnik, Linguistics and UMIACS, University of Maryland, College Park, MD, resnik@

Abstract

We propose an approach that biases machine translation systems toward relevant translations based on topic-specific contexts, where topics are induced in an unsupervised way using topic models; this can be thought of as inducing subcorpora for adaptation without any human annotation. We use these topic distributions to compute topic-dependent lexical weighting probabilities and directly incorporate them into our translation model as features. Conditioning lexical probabilities on the topic biases translations toward topic-relevant output, resulting in significant improvements of up to 1 BLEU and 3 TER on Chinese to English translation over a strong baseline.

1 Introduction

The performance of a statistical machine translation (SMT) system on a translation task depends largely on the suitability of the available parallel training data. Domains (e.g., newswire vs. blogs) may vary widely in their lexical choices and stylistic preferences, and what may be preferable in a general setting, or in one domain, is not necessarily preferable in another domain. Indeed, sometimes the domain can change the meaning of a phrase entirely. In a food-related context, the Chinese sentence "粉丝很多" ("fensi henduo") would mean "They have a lot of vermicelli"; however, in an informal Internet conversation, this sentence would mean "They have a lot of fans". Without the broader context, it is impossible to determine the correct translation in otherwise identical sentences.

This problem has led to a substantial amount of recent work in trying to bias, or adapt, the translation model (TM) toward particular domains of interest (Axelrod et al., 2011; Foster et al., 2010).
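
To make the idea of topic-conditioned lexical probabilities concrete, the following is a minimal sketch built around the "fensi" example. It is not the paper's exact formulation: the toy aligned pairs, the two topics ("food" and "internet"), the topic weights, and the function names `topic_lexical_probs` and `adapted_prob` are all assumptions made for illustration. The sketch estimates p(e | f, k) from counts weighted by each training sentence's topic distribution, then mixes those estimates using a test document's topic distribution.

```python
from collections import defaultdict

# Hypothetical toy data: (source word, target word, topic distribution of the
# training sentence). Topic 0 ~ "food", topic 1 ~ "internet". All numbers are
# invented for illustration and do not come from the paper.
ALIGNED_PAIRS = [
    ("fensi", "vermicelli", [0.9, 0.1]),
    ("fensi", "vermicelli", [0.8, 0.2]),
    ("fensi", "fans", [0.1, 0.9]),
    ("fensi", "fans", [0.2, 0.8]),
    ("fensi", "fans", [0.3, 0.7]),
]


def topic_lexical_probs(pairs, num_topics):
    """Estimate p(e | f, k) from counts fractionally weighted by p(k | sentence)."""
    counts = [defaultdict(float) for _ in range(num_topics)]
    totals = [defaultdict(float) for _ in range(num_topics)]
    for f, e, topic_dist in pairs:
        for k, weight in enumerate(topic_dist):
            counts[k][(f, e)] += weight
            totals[k][f] += weight
    return [
        {(f, e): c / totals[k][f] for (f, e), c in counts[k].items()}
        for k in range(num_topics)
    ]


def adapted_prob(probs, f, e, doc_topic_dist):
    """Mix the per-topic probabilities by the test document's topic distribution."""
    return sum(w * probs[k].get((f, e), 0.0) for k, w in enumerate(doc_topic_dist))


PROBS = topic_lexical_probs(ALIGNED_PAIRS, num_topics=2)
FOOD_DOC = [0.95, 0.05]  # assumed topic distribution of a food-related document
CHAT_DOC = [0.05, 0.95]  # assumed topic distribution of an informal chat

for name, doc in [("food", FOOD_DOC), ("chat", CHAT_DOC)]:
    for target in ("vermicelli", "fans"):
        print(name, target, round(adapted_prob(PROBS, "fensi", target, doc), 3))
```

With these invented numbers, "vermicelli" scores higher than "fans" for the food-like document and lower for the chat-like one, which is the kind of bias the abstract describes. A real system would induce the topic distributions with a topic model over the training data and would expose the resulting scores to the decoder as features rather than as a single mixed probability.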
