Scientific report: "Memory-Based Learning of Morphology with Stochastic Transducers"

Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 513-520.

Alexander Clark, ISSCO / TIM, University of Geneva, UNI-MAIL, Boulevard du Pont-d'Arve, CH-1211 Genève 4, Switzerland

Abstract

This paper discusses the supervised learning of morphology using stochastic transducers, trained using the Expectation-Maximization (EM) algorithm. Two approaches are presented: first, using the transducers directly to model the process, and secondly, using them to define a similarity measure, related to the Fisher kernel method (Jaakkola and Haussler, 1998), and then using a Memory-Based Learning (MBL) technique. These are evaluated and compared on data sets from English, German, Slovene and Arabic.

1 Introduction

Finite-state methods are in large part adequate to model morphological processes in many languages. A standard methodology is that of two-level morphology (Koskenniemi, 1983), which is capable of handling the complexity of Finnish, though it needs substantial extensions to handle non-concatenative languages such as Arabic (Kiraz, 1994). These models are primarily concerned with the mapping from deep lexical strings to surface strings, and within this framework learning is in general difficult (Itai, 1994). In this paper I present algorithms for learning the finite-state transduction between pairs of uninflected and inflected words, that is, supervised learning of morphology.
The techniques presented here are, however, applicable to learning other types of string transductions.

Memory-based techniques, based on principles of non-parametric density estimation, are a powerful form of machine learning well-suited to natural language tasks. A particular strength is their ability to model both general rules and specific exceptions in a single framework (van den Bosch and Daelemans, 1999). However, they have generally only been used in supervised learning techniques.
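To make the idea of handling rules and exceptions in one framework concrete, here is a minimal sketch of memory-based prediction for inflection. It is an illustration only and not the method of this paper: plain edit distance stands in for the transducer-derived similarity measure, and the toy English past-tense pairs in `MEMORY`, along with the function names `edit_distance`, `suffix_rule` and `predict`, are assumptions made up for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here only as a stand-in similarity."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def suffix_rule(lemma: str, inflected: str) -> tuple[str, str]:
    """Return (strip, add): the suffix change after the longest common prefix."""
    k = 0
    while k < min(len(lemma), len(inflected)) and lemma[k] == inflected[k]:
        k += 1
    return lemma[k:], inflected[k:]


def predict(lemma: str, memory: dict[str, str]) -> str:
    """Inflect `lemma` by copying the suffix change of its nearest stored neighbour."""
    nearest = min(memory, key=lambda stored: edit_distance(lemma, stored))
    strip, add = suffix_rule(nearest, memory[nearest])
    if lemma.endswith(strip):
        return lemma[: len(lemma) - len(strip)] + add
    return lemma + add  # crude fallback when the neighbour's rule does not fit


# Hypothetical toy memory of (uninflected, inflected) pairs.
MEMORY = {
    "walk": "walked",
    "talk": "talked",
    "bake": "baked",
    "sing": "sang",
    "ring": "rang",
}

print(predict("stalk", MEMORY))   # regular pattern copied from "talk"
print(predict("spring", MEMORY))  # irregular pattern copied from "sing" / "ring"
print(predict("sing", MEMORY))    # stored exception retrieved directly
```

Because every training pair is kept in memory, a stored exception such as "sing" is returned exactly, while unseen words inherit the behaviour of whichever stored word they most resemble. The quality of the predictions therefore depends entirely on the similarity measure, which is the part the paper derives from stochastic transducers.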
