Hierarchical Bayesian Language Modelling for the Linguistically Informed

Jan A. Botha
Department of Computer Science, University of Oxford, UK

Abstract

In this work I address the challenge of augmenting n-gram language models according to prior linguistic intuitions. I argue that the family of hierarchical Pitman-Yor language models is an attractive vehicle through which to address the problem, and demonstrate the approach by proposing a model for German compounds. In an empirical evaluation, the model outperforms the Kneser-Ney model in terms of perplexity, and achieves preliminary improvements in English-German translation.

1 Introduction

The importance of effective language models in machine translation (MT) and automatic speech recognition (ASR) is widely recognised. n-gram models, in particular ones using Kneser-Ney (KN) smoothing, have become the standard workhorse for these tasks. These models are not ideal for languages that have relatively free word order and/or complex morphology. The ability to encode additional linguistic intuitions into models that already have certain attractive properties is an important piece of the puzzle of improving machine translation quality for those languages. But despite their widespread use, KN n-gram models are not easily extensible with additional model components that target particular linguistic phenomena.

I argue in this paper that the family of hierarchical Pitman-Yor language models (HPYLM) (Teh, 2006; Goldwater et al., 2006) is suitable for investigations into more linguistically informed n-gram language models. Firstly, the flexibility to specify arbitrary back-off distributions makes it easy to incorporate multiple models into a larger n-gram model. Secondly, the Pitman-Yor process prior (Pitman and Yor, 1997) generates distributions that are well-suited to a variety of power-law behaviours, as is often observed in language. Catering for a variety of those is important, since the frequency distributions of, say, suffixes …
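To make the first point concrete: under the Chinese restaurant process representation of Teh (2006), the HPYLM predictive distribution takes roughly the following form (notation follows Teh; c_{uw} is the count of word w in context u, t_{uw} the associated table count, d the discount, and θ the concentration parameter):

    P(w \mid u) = \frac{c_{uw} - d\, t_{uw}}{\theta + c_{u\cdot}} + \frac{\theta + d\, t_{u\cdot}}{\theta + c_{u\cdot}}\, P(w \mid \pi(u))

where π(u) is the back-off context. The second term is the hook alluded to above: P(w | π(u)) can in principle be any distribution, not just a shorter n-gram context, which is what makes it possible to plug in a component targeting a specific linguistic phenomenon such as German compounding.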

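The second point, about power laws, is easy to verify empirically. Below is a minimal Python simulation sketch (my own illustration, not code from the paper; the function name and parameter values are arbitrary) of the two-parameter Chinese restaurant process underlying the Pitman-Yor prior. The discount parameter d controls how heavy-tailed the resulting frequency distribution is, which is what allows an HPYLM to accommodate the differing power-law behaviours of, for example, words versus suffixes.

    import random

    def pitman_yor_crp(n_customers, d, theta, seed=0):
        # Two-parameter Chinese restaurant process CRP(d, theta).
        # Returns the list of table sizes after seating n_customers.
        rng = random.Random(seed)
        tables = []                    # tables[k] = customers at table k
        for i in range(n_customers):   # i customers already seated
            k = len(tables)
            # A new table opens with probability (theta + d*k) / (theta + i).
            if rng.random() < (theta + d * k) / (theta + i):
                tables.append(1)
            else:
                # Otherwise join table j with probability prop. to (size_j - d).
                j = rng.choices(range(k), weights=[c - d for c in tables])[0]
                tables[j] += 1
        return tables

    # Larger discount d -> more tables and a heavier power-law tail.
    for d in (0.0, 0.5, 0.9):
        sizes = pitman_yor_crp(50000, d=d, theta=1.0)
        print(f"d={d}: {len(sizes)} tables, largest has {max(sizes)} customers")

Interpreting tables as word types, sweeping d reproduces the range of Zipf-like type-frequency curves the paper refers to.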