Scientific report: "When Specialists and Generalists Work Together: Overcoming Domain Dependence in Sentiment Tagging"

Alina Andreevskaia, Concordia University, Montreal, Quebec, andreev@
Sabine Bergler, Concordia University, Montreal, Canada, bergler@

Abstract

This study presents a novel approach to the problem of system portability across different domains: a sentiment annotation system that integrates a corpus-based classifier trained on a small set of annotated in-domain data and a lexicon-based system trained on WordNet. The paper explores the challenges of system portability across domains and text genres (movie reviews, news, blogs, and product reviews), highlights the factors affecting system performance on out-of-domain and small-set in-domain data, and presents a new system consisting of an ensemble of two classifiers with precision-based vote weighting, which provides significant gains in accuracy and recall over the corpus-based classifier and the lexicon-based system taken individually.

1 Introduction

One of the emerging directions in NLP is the development of machine learning methods that perform well not only on the domain on which they were trained, but also on other domains for which training data is not available or is not sufficient to ensure adequate machine learning. Many applications require reliable processing of heterogeneous corpora, such as the World Wide Web, where the diversity of genres and domains found on the Internet limits the feasibility of in-domain training.

In this paper, sentiment annotation is defined as the assignment of positive, negative, or neutral sentiment values to texts, sentences, and other linguistic units. Recent experiments assessing system portability across different domains, conducted by Aue and Gamon (2005), demonstrated that sentiment annotation classifiers trained in one domain do not perform well on other domains. A number of methods have been proposed to overcome this system portability limitation by using ...
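The ensemble with precision-based vote weighting mentioned in the abstract can be illustrated with a minimal sketch. This excerpt does not give the actual weighting formula, so the code below assumes one plausible reading: each classifier's vote counts in proportion to its precision on the label it predicts, as measured on held-out data. The function name `weighted_vote`, the classifier keys, and all precision values are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical per-class precision of each classifier, as it might be
# measured on held-out data. These numbers are illustrative only and
# are NOT results from the paper.
PRECISION = {
    "corpus_based": {"positive": 0.62, "negative": 0.58},
    "lexicon_based": {"positive": 0.70, "negative": 0.75},
}


def weighted_vote(predictions, precision=PRECISION):
    """Combine one label per classifier into a single ensemble label.

    predictions: dict mapping classifier name -> predicted label.
    precision:   dict mapping classifier name -> {label: precision}.

    Assumption: a vote for a label is weighted by the voting
    classifier's precision on that label. On an exact tie, the label
    that received a vote first is returned.
    """
    scores = defaultdict(float)
    for classifier, label in predictions.items():
        scores[label] += precision[classifier][label]
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # The two classifiers disagree; the lexicon-based vote (0.75)
    # outweighs the corpus-based vote (0.62).
    votes = {"corpus_based": "positive", "lexicon_based": "negative"}
    print(weighted_vote(votes))  # prints: negative
```

Under this assumption, the classifier that is more reliable for the label it proposes wins a disagreement, which is one way a domain-specific "specialist" and a general-purpose "generalist" could complement each other.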
