TAILIEUCHUNG - Scientific report: "An Error Analysis of Relation Extraction in Social Media Documents"

The annotated mentions in the Corpus are single or multi-word expressions which refer to a particular real-world or abstract entity. The mentions are annotated to indicate sets of mentions which constitute co-reference groups referring to the same entity. Five relationships are annotated between these entities: PartOf, FeatureOf, Produces, InstanceOf, and MemberOf. One significant difference between these relation annotations and those in the ACE Corpus is that the former are relations between sets of mentions (the co-reference groups) rather than between individual mentions.

An Error Analysis of Relation Extraction in Social Media Documents

Gregory Ichneumon Brown
University of Colorado at Boulder
Boulder, Colorado
browngp@

Abstract

Relation extraction in documents allows the detection of how entities being discussed in a document are related to one another (e.g. part-of). This paper presents an analysis of a relation extraction system based on prior work but applied to the J.D. Power and Associates Sentiment Corpus to examine how the system works on documents from a range of social media. The results are examined on three different subsets of the JDPA Corpus, showing that the system performs much worse on documents from certain sources. The proposed explanation is that the features used are more appropriate to text with strong editorial standards than to the informal writing style of blogs.

1 Introduction

To summarize a document, accurately determine its sentiment, or answer questions about it, it is often necessary to be able to determine the relationships between the entities being discussed in the document (such as part-of or member-of). Consider the simple sentiment example:

Example: I bought a new car yesterday. I love the powerful engine.

Determining the sentiment the author is expressing about the car requires knowing that the engine is a part of the car, so that the positive sentiment being expressed about the engine can also be attributed to the car.
In this paper we examine our preliminary results from applying a relation extraction system to the J.D. Power and Associates (JDPA) Sentiment Corpus (Kessler et al., 2010). Our system uses lexical features from prior work to classify relations, and we examine how the system works on different subsets of the JDPA Sentiment Corpus, breaking the source documents down into professionally written reviews, blog reviews, and social networking reviews. These three document types represent quite different writing styles, and we see significant differences in how the relation extraction system performs across them.
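
To make the annotation scheme and the car and engine example concrete, the following Python sketch models mentions, co-reference groups, and relations between groups, and then attributes sentiment from a part to its whole. It is only an illustration and not the system analyzed in the paper, which classifies relations from lexical features. The class names, the `propagate_sentiment` function, and the rule that sentiment flows along PartOf relations only are assumptions made for this example.

```python
from dataclasses import dataclass, field

# The five relation types annotated in the corpus, as listed in the text.
RELATION_TYPES = {"PartOf", "FeatureOf", "Produces", "InstanceOf", "MemberOf"}


@dataclass(frozen=True)
class Mention:
    text: str
    sentence: int  # index of the sentence containing the mention


@dataclass
class Entity:
    """A co-reference group: all mentions that refer to the same entity."""
    name: str
    mentions: list = field(default_factory=list)


@dataclass(frozen=True)
class Relation:
    """A relation between two co-reference groups, not individual mentions."""
    kind: str
    source: str  # name of the source entity, e.g. the part
    target: str  # name of the target entity, e.g. the whole

    def __post_init__(self):
        if self.kind not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.kind}")


def propagate_sentiment(sentiment, relations):
    """Attribute each part's sentiment to its whole (assumed rule: PartOf only)."""
    result = dict(sentiment)
    changed = True
    while changed:  # repeat so chains such as piston -> engine -> car resolve
        changed = False
        for rel in relations:
            if (rel.kind == "PartOf" and rel.source in result
                    and rel.target not in result):
                result[rel.target] = result[rel.source]
                changed = True
    return result


# "I bought a new car yesterday. I love the powerful engine."
car = Entity("car", [Mention("a new car", 0)])
engine = Entity("engine", [Mention("the powerful engine", 1)])
relations = [Relation("PartOf", "engine", "car")]

attributed = propagate_sentiment({"engine": "positive"}, relations)
print(attributed)
```

Sentiment is copied only from part to whole here, so a positive statement about the car would not be attributed to the engine. That direction is a simplification chosen for the sketch rather than a claim made in the paper.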
