Scientific report: "Incorporating topic information into sentiment analysis models"

Tony Mullen
National Institute of Informatics (NII), Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo 101-8430, Japan
mullen@

Nigel Collier
National Institute of Informatics (NII), Hitotsubashi 2-1-2, Chiyoda-ku, Tokyo 101-8430, Japan
collier@

Abstract

This paper reports experiments in classifying texts based upon their favorability towards the subject of the text, using a feature set enriched with topic information, on a small dataset of music reviews hand-annotated for topic. The results of these experiments suggest ways in which incorporating topic information into such models may yield improvement over models which do not use topic information.

1 Introduction

There are a number of challenging aspects in recognizing the favorability of opinion-based texts, the task known as sentiment analysis. Opinions in natural language are very often expressed in subtle and complex ways, presenting challenges which may not be easily addressed by simple text categorization approaches such as n-gram or keyword identification approaches. Although such approaches have been employed effectively (Pang et al., 2002), there appears to remain considerable room for improvement. Moving beyond these approaches can involve addressing the task at several levels. Negative reviews may contain many apparently positive phrases even while maintaining a strongly negative tone, and the opposite is also common.
This paper attempts to address this issue using Support Vector Machines (SVMs), a well-known and powerful tool for classification of vectors of real-valued features (Vapnik, 1998). The present approach emphasizes the use of a variety of diverse information sources. In particular, several classes of features based upon the proximity of the topic with phrases which have been assigned favorability values are described, in order to take advantage of situations in which the topic of the text may be explicitly identified.

2 Motivation

In the past, work has
