Scientific report: "The Sentimental Factor: Improving Review Classification via Human-Provided Information"

Philip Beineke and Trevor Hastie
Dept. of Statistics, Stanford University, Stanford, CA 94305

Shivakumar Vaithyanathan
IBM Almaden Research Center, 650 Harry Rd., San Jose, CA 95120-6099

Abstract

Sentiment classification is the task of labeling a review document according to the polarity of its prevailing opinion (favorable or unfavorable). In approaching this problem, a model builder often has three sources of information available: a small collection of labeled documents, a large collection of unlabeled documents, and human understanding of language. Ideally, a learning method will utilize all three sources. To accomplish this goal, we generalize an existing procedure that uses the latter two. We extend this procedure by re-interpreting it as a Naive Bayes model for document sentiment. Viewed as such, it can also be seen to extract a pair of derived features that are linearly combined to predict sentiment. This perspective allows us to improve upon previous methods, primarily through two strategies: incorporating additional derived features into the model and, where possible, using labeled data to estimate their relative influence.

1 Introduction

Text documents are available in ever-increasing numbers, making automated techniques for information extraction increasingly useful. Traditionally, most research effort has been directed towards objective information, such as classification according to topic; however, interest is growing in producing information about the opinions that a document contains; for instance, Morinaga et al. (2002). In March 2004, the American Association for Artificial Intelligence held a symposium in this area, entitled "Exploring Affect and Attitude in Text."

One task in opinion extraction is to label a review document $d$ according to its prevailing sentiment $s \in \{-1, 1\}$ (unfavorable or favorable). Several previous papers have addressed this problem by building models that rely exclusively
