TAILIEUCHUNG - Scientific report: "Cross-Lingual Mixture Model for Sentiment Classification"

Cross-Lingual Mixture Model for Sentiment Classification

Xinfan Meng, Furu Wei, Xiaohua Liu, Ming Zhou, Ge Xu, Houfeng Wang
MOE Key Lab of Computational Linguistics, Peking University
Microsoft Research Asia
{mxf, xuge, wanghf}@ {fuwei, xiaoliu, mingzhou}@
Contribution during internship at Microsoft Research Asia.

Abstract

The amount of labeled sentiment data in English is much larger than that in other languages. Such a disproportion arouses interest in cross-lingual sentiment classification, which aims to conduct sentiment classification in the target language (e.g., Chinese) using labeled data in the source language (e.g., English). Most existing work relies on machine translation engines to directly adapt labeled data from the source language to the target language. This approach suffers from the limited vocabulary coverage of the machine translation results. In this paper, we propose a generative cross-lingual mixture model (CLMM) to leverage unlabeled bilingual parallel data. By fitting parameters to maximize the likelihood of the bilingual parallel data, the proposed model learns previously unseen sentiment words from the large bilingual parallel data and improves vocabulary coverage significantly. Experiments on multiple data sets show that CLMM is consistently effective in two settings: (1) labeled data in the target language are unavailable, and (2) labeled data in the target language are also available.

1 Introduction

Sentiment analysis (also known as opinion mining), which aims to extract sentiment information from text, has attracted extensive attention in recent years. Sentiment classification, the task of determining the sentiment orientation (positive, negative, or neutral) of text, has been the most extensively studied task in sentiment analysis. There is already a large amount of work on sentiment classification of text in various genres and in many languages. For example, Pang et al. (2002) focus on sentiment classification of movie reviews in English and
