Scientific report: "Unsupervised Multilingual Learning for Morphological Segmentation"

Unsupervised Multilingual Learning for Morphological Segmentation

Benjamin Snyder and Regina Barzilay
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
bsnyder regina @

Abstract

For centuries, the deep connection between languages has brought about major discoveries about human communication. In this paper we investigate how this powerful source of information can be exploited for unsupervised language learning. In particular, we study the task of morphological segmentation of multiple languages. We present a nonparametric Bayesian model that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes. We apply our model to three Semitic languages (Arabic, Hebrew, and Aramaic) as well as to English. Our results demonstrate that learning morphological models in tandem reduces error by up to 24% relative to monolingual models. Furthermore, we provide evidence that our joint model achieves better performance when applied to languages from the same family.

1 Introduction

For centuries, the deep connection between human languages has fascinated linguists, anthropologists, and historians (Eco, 1995). The study of this connection has made possible major discoveries about human communication: it has revealed the evolution of languages, facilitated the reconstruction of proto-languages, and led to understanding language universals. The connection between languages should be a powerful source of information for automatic linguistic analysis as well. In this paper we investigate two questions: (i) Can we exploit cross-lingual correspondences to improve unsupervised language learning? (ii) Will this joint analysis provide more or less benefit when the languages belong to the same family?

We study these two questions in the context of unsupervised morphological segmentation, the automatic division of a word into morphemes (the basic units of meaning).
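
To make the task definition concrete, the following minimal sketch shows one way a segmentation could be represented: as a list of character offsets at which a word is cut into morphemes. The function name `apply_boundaries`, the offset representation, and the English example word are illustrative assumptions and do not come from the paper; the sketch shows only the form of a segmenter's output, not the nonparametric Bayesian model that induces it.

```python
from typing import List


def apply_boundaries(word: str, boundaries: List[int]) -> List[str]:
    """Split `word` into morphemes at the given character offsets.

    `boundaries` is a hypothetical representation of a segmentation:
    each offset marks where one morpheme ends and the next begins.
    """
    # Add the start and end of the word so every morpheme is a slice
    # between two consecutive cut points.
    cuts = [0] + sorted(boundaries) + [len(word)]
    return [word[i:j] for i, j in zip(cuts, cuts[1:])]


# Illustrative example: "unbreakable" = un + break + able
print(apply_boundaries("unbreakable", [2, 7]))
```

Under this representation, an unsupervised segmenter has to choose the boundary offsets for each word without access to annotated examples, and an empty list of offsets leaves the word as a single morpheme.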
