Scientific report: "Construction of Domain Dictionary for Fundamental Vocabulary"

Chikara Hashimoto
Faculty of Engineering, Yamagata University
4-3-16 Jonan, Yonezawa-shi, Yamagata 992-8510, Japan

Sadao Kurohashi
Graduate School of Informatics, Kyoto University
36-1 Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan

Abstract

For natural language understanding, it is essential to reveal semantic relations between words. To date, only the IS-A relation has been publicly available. Toward deeper natural language understanding, we semi-automatically constructed the domain dictionary, which represents the domain relation between Japanese fundamental words. This is the first Japanese domain resource that is fully available. Moreover, our method does not require a document collection, which is indispensable for keyword extraction techniques but is hard to obtain. As a task-based evaluation, we performed blog categorization. We also developed a technique for estimating the domain of unknown words.

1 Introduction

We constructed a lexical resource that represents the domain relation among Japanese fundamental words (JFWs), and we call it the domain dictionary. It associates JFWs with the domains in which they are typically used. For example, the Japanese word for "home run" is associated with the domain SPORTS [2]. That is, we aim to make explicit the horizontal relation between words, the domain relation, while thesauri indicate the vertical relation called IS-A [3].

[1] In fact, there have been a few domain resources in Japanese, like Yoshimoto et al. (1997), but they are not publicly available.
[2] Domains are CAPITALIZED in this paper.
[3] The lack of the horizontal relationship is also known as the tennis problem (Fellbaum, 1998).

2 Two Issues

You have to address two issues. One is what domains to assume, and the other is how to associate words with domains without document collections. The former can be paraphrased as how people categorize the real world, which is a really hard problem. In this study we avoid being too involved in the problem and ...
