Scientific report: "Mining Co-Occurrence Matrices for SO-PMI Paradigm Word Candidates"

Aleksander Wawer
Institute of Computer Science, Polish Academy of Science
ul. Jana Kazimierza 5, 01-248 Warszawa, Poland
axw@

Abstract

This paper focuses on one aspect of SO-PMI, an unsupervised approach to sentiment vocabulary acquisition proposed by Turney (Turney and Littman, 2003). The method, originally applied and evaluated for English, is often used to bootstrap sentiment lexicons for European languages where no such resources typically exist. In general, SO-PMI values are computed from word co-occurrence frequencies in the neighbourhoods of two small sets of paradigm words. The goal of this work is to investigate how lexeme selection affects the quality of the obtained sentiment estimations. This was achieved by comparing ad hoc random lexeme selection with two alternative heuristics, based on clustering and on SVD decomposition of a word co-occurrence matrix, demonstrating the superiority of the latter methods. The work can also be interpreted as a sensitivity analysis of SO-PMI with regard to paradigm word selection. The experiments were carried out for Polish.

1 Introduction

This paper seeks to improve one of the main methods of unsupervised lexeme sentiment polarity assignment. The method, introduced by Turney and Littman (2003), is described in more detail in Section 2. It relies on two sets of paradigm words, positive and negative, which determine the polarity of unseen words. The method is resource-lean and is therefore often used in languages other than English. Recent examples include Japanese (Wang and Araki, 2007) and German (Remus et al., 2006). Unfortunately, the selection of paradigm words rarely receives sufficient attention and is typically done in an ad hoc manner. One notable example of a manual paradigm word selection method was presented in Read and Carroll (2009). In this context, an interesting variation of the semantic orientation-pointwise mutual information (SO-PMI) algorithm for ...
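To make the role of the two paradigm sets concrete, the following is a minimal sketch of the general SO-PMI idea: the score of a word is the sum of its PMI values with the positive paradigm words minus the sum of its PMI values with the negative ones. It is an illustration only, not the implementation used in the paper. The toy corpus, the window size, the smoothing constant and all function names (cooccurrence_counts, pmi, so_pmi) are assumptions introduced here.

```python
import math
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count word frequencies and unordered co-occurrences within `window` tokens."""
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            pair_counts[tuple(sorted((word, tokens[j])))] += 1
    return word_counts, pair_counts


def pmi(w1, w2, word_counts, pair_counts, total, smoothing=0.01):
    """Pointwise mutual information (base 2) estimated from raw counts.

    `smoothing` (an assumed value) keeps the logarithm defined when the
    two words never co-occur.
    """
    joint = pair_counts[tuple(sorted((w1, w2)))] + smoothing
    return math.log2(joint * total / (word_counts[w1] * word_counts[w2]))


def so_pmi(word, positive, negative, word_counts, pair_counts, total):
    """Sum of PMI with positive paradigm words minus sum with negative ones."""
    if word_counts[word] == 0:
        raise ValueError(f"{word!r} does not occur in the corpus")
    # Paradigm words absent from the corpus carry no evidence and are skipped.
    pos = sum(pmi(word, p, word_counts, pair_counts, total)
              for p in positive if word_counts[p] > 0)
    neg = sum(pmi(word, n, word_counts, pair_counts, total)
              for n in negative if word_counts[n] > 0)
    return pos - neg


# Hypothetical toy corpus and paradigm sets, for illustration only.
tokens = "superb good nice superb good dreadful bad poor dreadful bad".split()
positive_paradigm = ["good", "nice"]
negative_paradigm = ["bad", "poor"]

word_counts, pair_counts = cooccurrence_counts(tokens, window=2)
total = len(tokens)

score_superb = so_pmi("superb", positive_paradigm, negative_paradigm,
                      word_counts, pair_counts, total)
score_dreadful = so_pmi("dreadful", positive_paradigm, negative_paradigm,
                        word_counts, pair_counts, total)
print(f"superb: {score_superb:.2f}, dreadful: {score_dreadful:.2f}")
```

In this toy setting "superb" receives a positive score and "dreadful" a negative one. Replacing the paradigm lists changes every score, which is the sensitivity to paradigm word selection that the paper sets out to examine.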
