Scientific report: "A Supervised Learning Approach to Automatic Synonym Identification based on Distributional Features"

Masato Hagiwara
Graduate School of Information Science, Nagoya University
Furo-cho, Chikusa-ku, Nagoya 464-8603, JAPAN
hagiwara@

Abstract

Distributional similarity has been widely used to capture the semantic relatedness of words in many NLP tasks. However, various parameters such as similarity measures must be hand-tuned to make it work effectively. Instead, we propose a novel approach to synonym identification based on supervised learning and distributional features, which correspond to the commonality of individual context types shared by word pairs. Considering the integration with pattern-based features, we have built and compared five synonym classifiers. The evaluation experiment has shown a dramatic performance increase of over 120% on the F-1 measure basis, compared to the conventional similarity-based classification. On the other hand, the pattern-based features turned out to be almost redundant.

1 Introduction

Semantic similarity of words is one of the most important kinds of lexical knowledge for NLP tasks, including word sense disambiguation and automatic thesaurus construction. To measure the semantic relatedness of words, a concept called distributional similarity has been widely used.
Distributional similarity represents the relatedness of two words by the commonality of the contexts the words share, based on the distributional hypothesis (Harris, 1985), which states that semantically similar words share similar contexts. A number of studies have utilized distributional similarity, including Hindle (1990), Lin (1998), Geffet and Dagan (2004), and many others. Although they have been successful in acquiring related words, various parameters such as similarity measures and weighting are involved. As Weeds et al. (2004) pointed out, it is not at all obvious that one universally best measure exists for all applications; thus they must be tuned by hand in an ...
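
The contrast between a single similarity score and per-context distributional features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `OBSERVATIONS` data, the context labels, the choice of Jaccard overlap as the similarity measure, and the binary "shared context" feature encoding in `pair_features`. The paper's own feature definition and weighting are not given in this excerpt.

```python
from collections import defaultdict

# Hypothetical toy data: (word, context type) pairs, e.g. from dependency relations.
OBSERVATIONS = [
    ("car", "obj-of:drive"), ("car", "mod:fast"), ("car", "obj-of:park"),
    ("automobile", "obj-of:drive"), ("automobile", "obj-of:park"),
    ("banana", "obj-of:eat"), ("banana", "mod:ripe"),
]


def context_sets(observations):
    """Map each word to the set of context types it occurs with."""
    contexts = defaultdict(set)
    for word, context in observations:
        contexts[word].add(context)
    return dict(contexts)


def jaccard_similarity(contexts, w1, w2):
    """Collapse the shared contexts into one score (one possible measure)."""
    c1 = contexts.get(w1, set())
    c2 = contexts.get(w2, set())
    union = c1 | c2
    return len(c1 & c2) / len(union) if union else 0.0


def pair_features(contexts, w1, w2, vocabulary):
    """One binary feature per context type: 1 if both words occur with it.

    Assumed encoding for illustration; a classifier could be trained on
    such vectors instead of thresholding a single similarity score.
    """
    shared = contexts.get(w1, set()) & contexts.get(w2, set())
    return [1 if context in shared else 0 for context in vocabulary]


contexts = context_sets(OBSERVATIONS)
vocabulary = sorted({context for _, context in OBSERVATIONS})

if __name__ == "__main__":
    print(jaccard_similarity(contexts, "car", "automobile"))
    print(pair_features(contexts, "car", "automobile", vocabulary))
```

The similarity function reduces a word pair to one number, so the choice of measure and weighting has to be fixed in advance. The feature vector keeps each context type separate, which lets a supervised learner decide how much each one should count.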
