Scientific report: "Experimenting with Distant Supervision for Emotion Classification"

Matthew Purver and Stuart Battersby
Interaction, Media and Communication Group
Chatterbox Analytics
School of Electronic Engineering and Computer Science
Queen Mary University of London
Mile End Road, London E1 4NS, UK
stuart@

Abstract

We describe a set of experiments using automatically labelled data to train supervised classifiers for multi-class emotion detection in Twitter messages with no manual intervention. By cross-validating between models trained on different labellings for the same six basic emotion classes, and testing on manually labelled data, we conclude that the method is suitable for some emotions (happiness, sadness and anger) but less able to distinguish others; and that different labelling conventions are more suitable for some emotions than others.

1 Introduction

We present a set of experiments into classifying Twitter messages into the six basic emotion classes of Ekman (1972). The motivation behind this work is twofold: firstly, to investigate the possibility of detecting emotions of multiple classes (rather than purely positive or negative sentiment) in such short texts; and secondly, to investigate the use of distant supervision to quickly bootstrap large datasets and classifiers without the need for manual annotation.

Text classification according to emotion and sentiment is a well-established research area. In this and other areas of text analysis and classification, recent years have seen a rise in the use of data from online sources and social media, as these provide very large, often freely available datasets (see Eisenstein et al., 2010; Go et al., 2009; Pak and Paroubek, 2010, amongst many others). However, one of the challenges this poses is that of data annotation: given very large amounts of data, often consisting of very short texts, written in unconventional style and without accompanying metadata, audio/video signals or access to the author for ...
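
As a rough illustration of the distant-supervision idea described in the abstract, the sketch below labels messages automatically from markers the authors themselves typed, with no manual annotation. The excerpt does not say which labelling conventions were used, so the marker table (emoticons and hashtags), the function names `distant_label` and `build_training_set`, and the choice to strip the marker from the training text are all assumptions made for illustration, not the authors' method.

```python
# Hypothetical marker table: these emoticons and hashtags are illustrative
# assumptions, not the labelling conventions used in the paper.
MARKER_TO_EMOTION = {
    ":)": "happiness", ":-)": "happiness", "#happy": "happiness",
    ":(": "sadness", ":-(": "sadness", "#sad": "sadness",
    "#angry": "anger",
}


def distant_label(message):
    """Return (text_without_markers, emotion), or None if no usable label.

    A message is usable only if its markers point to exactly one emotion
    and some text remains once the markers are removed.
    """
    tokens = message.split()
    labels = {
        MARKER_TO_EMOTION[t.lower()]
        for t in tokens
        if t.lower() in MARKER_TO_EMOTION
    }
    if len(labels) != 1:
        return None  # unlabelled or conflicting markers
    # Assumption: drop the markers so a classifier trained on this data
    # cannot simply memorise the marker itself.
    remaining = [t for t in tokens if t.lower() not in MARKER_TO_EMOTION]
    if not remaining:
        return None
    return " ".join(remaining), labels.pop()


def build_training_set(messages):
    """Keep only the messages that receive an automatic label."""
    labelled = (distant_label(m) for m in messages)
    return [pair for pair in labelled if pair is not None]


if __name__ == "__main__":
    sample = ["Got the job today :)", "missed my train #sad", "no marker here"]
    for text, emotion in build_training_set(sample):
        print(emotion, "<-", text)
```

The resulting pairs could then be passed to any supervised text classifier. Swapping in a different marker table is one way to compare labelling conventions against each other, in the spirit of the cross-validation between labellings that the abstract mentions.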
