Scientific report: "An Equivalent Pseudoword Solution to Chinese Word Sense Disambiguation"

Zhimao Lu, Haifeng Wang, Jianmin Yao, Ting Liu, Sheng Li

Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China. lzm tliu lisheng @
Toshiba (China) Research and Development Center, 5/F., Tower W2, Oriental Plaza, No. 1 East Chang An Ave., Beijing 100738, China. wanghaifeng@
School of Computer Science and Technology, Soochow University, Suzhou 215006, China. jyao@

Abstract

This paper presents a new approach based on Equivalent Pseudowords (EPs) to tackle Word Sense Disambiguation (WSD) in Chinese. EPs are a particular kind of artificial ambiguous word that can be used to realize unsupervised WSD. A Bayesian classifier is implemented to test the efficacy of the EP solution on the Senseval-3 Chinese test set. The performance is better than state-of-the-art results with an average F-measure of . The experiment verifies the value of EPs for unsupervised WSD.

1 Introduction

Word sense disambiguation (WSD), the task of determining the sense of an ambiguous word in a specific context, has been a hot topic in natural language processing. It is an important technique for applications such as information retrieval, text mining, machine translation, text classification, and automatic text summarization. Statistical solutions to WSD acquire linguistic knowledge from a training corpus using machine learning technologies and apply that knowledge to disambiguation.
The first statistical model of WSD was built by Brown et al. (1991). Since then, most machine learning methods have been applied to WSD, including decision trees, Bayesian models, neural networks, SVMs, maximum entropy, genetic algorithms, and so on. Among the different learning methods, supervised methods usually achieve good performance at the cost of human tagging of the training corpus. Precision improves as the training corpus grows. Compared with supervised methods .
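
The abstract's idea of training a Bayesian classifier without hand-tagged data can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each sense of an ambiguous word is represented by an unambiguous stand-in word, and that the contexts of those stand-ins in untagged text serve as training examples for that sense. The toy English corpus, the `STAND_INS` mapping, and the function names `pseudo_labelled`, `train`, and `classify` are all hypothetical and chosen only for readability.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: two senses of an ambiguous word, each represented
# by an unambiguous stand-in word (an assumption made for illustration).
STAND_INS = {"lender": "finance", "shore": "river"}
CORPUS = [
    ["the", "lender", "approved", "the", "loan"],
    ["a", "lender", "raised", "interest", "rates"],
    ["we", "walked", "along", "the", "shore", "near", "the", "water"],
    ["boats", "rested", "on", "the", "shore", "of", "the", "river"],
]

def pseudo_labelled(corpus, stand_ins):
    """Yield (sense, context) pairs from untagged sentences.

    A sentence containing a stand-in word is treated as an example of the
    sense that the stand-in represents; the stand-in itself is removed.
    """
    for sentence in corpus:
        for word, sense in stand_ins.items():
            if word in sentence:
                yield sense, [w for w in sentence if w != word]

def train(samples):
    """Collect sense counts, per-sense word counts, and the vocabulary."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in samples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def classify(model, context):
    """Return the sense with the highest naive Bayes log score."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            if w in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train(pseudo_labelled(CORPUS, STAND_INS))
print(classify(model, ["loan", "interest"]))
```

In this sketch no sentence was labelled by hand: the labels come entirely from which stand-in word appears in the sentence, which is the sense in which such an approach is unsupervised.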
