Japanese Dependency Structure Analysis Based on Maximum Entropy Models

Proceedings of EACL 99

Kiyotaka Uchimoto, Satoshi Sekine, Hitoshi Isahara

Communications Research Laboratory, Ministry of Posts and Telecommunications, 588-2 Iwaoka, Iwaoka-cho, Nishi-ku, Kobe, Hyogo 651-2401, Japan
New York University, 715 Broadway, 7th floor, New York, NY 10003, USA

Abstract

This paper describes a dependency structure analysis of Japanese sentences based on maximum entropy models. Our model is created by learning the weights of a set of features from a training corpus to predict the dependency between bunsetsus, or phrasal units. The dependency accuracy of our system is evaluated using the Kyoto University corpus. We discuss the contribution of each feature set and the relationship between the amount of training data and the accuracy.

1 Introduction

Dependency structure analysis is one of the basic techniques in Japanese sentence analysis. The Japanese dependency structure is usually represented by the relationship between phrasal units called bunsetsu. The analysis has two conceptual steps. In the first step, a dependency matrix is prepared. Each element of the matrix represents how likely one bunsetsu is to depend on the other. In the second step, an optimal set of dependencies for the entire sentence is found. In this paper, we will mainly discuss the first step, a model for estimating dependency likelihood.

So far there have been two different approaches to estimating the dependency likelihood.
One is the rule-based approach, in which the rules are created by experts and likelihoods are calculated by some means, including semiautomatic corpus-based methods but also manual assignment of scores for rules. However, hand-crafted rules have the following problems. They have a problem with their coverage: because there are many features needed to find correct dependencies, it is difficult to find them all manually. They also have a problem with their consistency.
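
The two conceptual steps described above (preparing a dependency matrix, then finding an optimal set of dependencies) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the single distance feature, the hand-picked weights, the constraints that each bunsetsu except the last depends on exactly one later bunsetsu and that dependencies do not cross, and the exhaustive search used for the second step. The likelihood uses the general maximum entropy form, in which a label's score is the exponential of its summed feature weights, normalized over both labels.

```python
import math
from itertools import product

# Hypothetical weights for a binary maximum entropy model.
# Keys are (feature, label) pairs; values are illustrative, not learned.
WEIGHTS = {
    ("distance=1", 1): 0.8,
    ("distance=1", 0): -0.2,
    ("distance>1", 1): -0.3,
    ("distance>1", 0): 0.4,
}


def features(i, j):
    """Toy feature extractor: only the distance between bunsetsu i and j."""
    return ["distance=1" if j - i == 1 else "distance>1"]


def p_depends(i, j):
    """P(label=1 | i, j): exp(sum of weights) for label 1, normalized over labels."""
    scores = {
        y: math.exp(sum(WEIGHTS.get((f, y), 0.0) for f in features(i, j)))
        for y in (0, 1)
    }
    return scores[1] / (scores[0] + scores[1])


def dependency_matrix(n):
    """Step 1: matrix[i][j] is the likelihood that bunsetsu i depends on j (j > i)."""
    return [[p_depends(i, j) if j > i else 0.0 for j in range(n)] for i in range(n)]


def crosses(heads):
    """True if any two dependencies i -> heads[i] and k -> heads[k] cross."""
    for i, head_i in enumerate(heads):
        for k in range(i + 1, len(heads)):
            if k < head_i < heads[k]:
                return True
    return False


def best_dependencies(matrix):
    """Step 2: exhaustive search for the most likely non-crossing set of heads."""
    n = len(matrix)
    best, best_score = None, -1.0
    # Every bunsetsu except the last chooses one head to its right (assumption).
    for heads in product(*[range(i + 1, n) for i in range(n - 1)]):
        if crosses(heads):
            continue
        score = math.prod(matrix[i][h] for i, h in enumerate(heads))
        if score > best_score:
            best, best_score = list(heads), score
    return best, best_score
```

Calling `best_dependencies(dependency_matrix(4))` returns a list giving the chosen head index for each of the first three bunsetsus, together with the product of the corresponding matrix entries. The exhaustive search is only practical for short sentences; it is used here because it makes the objective of the second step explicit.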
