Scientific report: "Enhanced word decomposition by calibrating the decision threshold of probabilistic models and using a model ensemble"

Sebastian Spiegler, Intelligent Systems Laboratory, University of Bristol, spiegler@
Peter A. Flach, Intelligent Systems Laboratory, University of Bristol

Abstract

This paper demonstrates that the use of ensemble methods and careful calibration of the decision threshold can significantly improve the performance of machine learning methods for morphological word decomposition. We employ two algorithms which come from a family of generative probabilistic models. The models treat segment boundaries as hidden variables and include probabilities for letter transitions within segments. The advantage of this model family is that it can learn from small datasets and generalises easily to larger datasets. The first algorithm, Promodes, which participated in the Morpho Challenge 2009 (an international competition for unsupervised morphological analysis), employs a lower-order model, whereas the second algorithm, Promodes-H, is a novel development of the first using a higher-order model. We present the mathematical description of both algorithms, conduct experiments on the morphologically rich language Zulu, and compare characteristics of both algorithms based on the experimental results.
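The two ideas named in the abstract, calibrating a decision threshold and combining models in an ensemble, can be illustrated with a minimal sketch. This is not the Promodes or Promodes-H algorithm: it assumes each model already supplies a probability for a boundary after every letter position, it assumes simple averaging as the ensemble rule, and it assumes boundary F1 on labelled data as the calibration criterion. All function names are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

# One labelled example: per-position boundary probabilities and the gold
# boundary positions (i means "cut between word[i] and word[i + 1]").
Example = Tuple[Sequence[float], Set[int]]


def average_probs(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Assumed ensemble rule: average the models' boundary probabilities."""
    n_models = len(model_probs)
    return [sum(column) / n_models for column in zip(*model_probs)]


def predict_boundaries(probs: Sequence[float], threshold: float) -> Set[int]:
    """Place a boundary wherever the probability exceeds the threshold."""
    return {i for i, p in enumerate(probs) if p > threshold}


def segment(word: str, probs: Sequence[float], threshold: float) -> List[str]:
    """Split a word at the predicted boundary positions."""
    if len(probs) != len(word) - 1:
        raise ValueError("need one probability per inner letter position")
    pieces, start = [], 0
    for i in sorted(predict_boundaries(probs, threshold)):
        pieces.append(word[start:i + 1])
        start = i + 1
    pieces.append(word[start:])
    return pieces


def boundary_f1(data: Sequence[Example], threshold: float) -> float:
    """Micro-averaged F1 of predicted boundaries against gold boundaries."""
    tp = fp = fn = 0
    for probs, gold in data:
        predicted = predict_boundaries(probs, threshold)
        tp += len(predicted & gold)
        fp += len(predicted - gold)
        fn += len(gold - predicted)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def calibrate_threshold(data: Sequence[Example],
                        candidates: Sequence[float]) -> float:
    """Pick the candidate threshold with the highest boundary F1."""
    return max(candidates, key=lambda t: boundary_f1(data, t))
```

In this sketch, a threshold chosen by `calibrate_threshold` on a small labelled set would then be passed to `segment` for unseen words, with `average_probs` applied first when several models are available.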
1 Introduction

Words are often considered the smallest unit of a language when examining the grammatical structure or the meaning of sentences, referred to as syntax and semantics respectively. However, words themselves possess an internal structure, known as word morphology. This internal structure is worth studying, since a description of a language in terms of its morphological formation is more compact and complete than a list of all possible words. This study is called morphological analysis. According to Goldsmith (2009), four tasks are assigned to morphological analysis, among them word decomposition into morphemes and building morpheme dictionaries.
