Scientific report: "Automatic Single-Document Key Fact Extraction from Newswire Articles"

Itamar Kastner, Department of Computer Science, Queen Mary University of London, UK, itk1@
Christof Monz, ISLA, University of Amsterdam, Amsterdam, The Netherlands, christof@

Abstract

This paper addresses the problem of extracting the most important facts from a news article. Our approach uses syntactic, semantic, and general statistical features to identify the most important sentences in a document. The importance of the individual features is estimated using generalized iterative scaling methods trained on an annotated newswire corpus. The performance of our approach is evaluated against 300 unseen news articles and shows that use of these features results in statistically significant improvements over a provenly robust baseline, as measured using metrics such as precision, recall, and ROUGE.

1 Introduction

The increasing amount of information that is available to both professional users (such as journalists, financial analysts, and intelligence analysts) and lay users has called for methods that condense information in order to make the most important content stand out. Several methods have been proposed over the last two decades, among which keyword extraction and summarization are the most prominent. Keyword extraction aims to identify the most relevant words or phrases in a document (Witten et al., 1999), while summarization aims to provide a short (commonly 100 words), coherent full-text summary of the document (McKeown et al., 1999).

Key fact extraction falls in between keyword extraction and summarization. Here the challenge is to identify the most relevant facts in a document, but not necessarily in a coherent full-text form as is done in summarization. Evidence of the usefulness of key fact extraction is CNN's web site, which since 2006 has most of its news articles preceded by a list of story highlights (see Figure 1). The advantage of the news highlights as opposed to ...
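To make the evaluation metrics named in the abstract more concrete, the following is a minimal sketch of sentence-level precision, recall, and F1, together with a simplified unigram-recall score in the spirit of ROUGE-1. It is not the authors' evaluation code: the function names `precision_recall_f1` and `rouge1_recall` are hypothetical, tokenization is assumed to be plain lowercased whitespace splitting, and the official ROUGE toolkit offers options (such as stemming and stopword removal) that are omitted here.

```python
from collections import Counter


def precision_recall_f1(selected, gold):
    """Sentence-level precision, recall and F1 for an extractive system.

    `selected` and `gold` are collections of sentence indices
    (hypothetical inputs: system picks vs. annotated key sentences).
    """
    selected, gold = set(selected), set(gold)
    overlap = len(selected & gold)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(gold) if gold else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def rouge1_recall(candidate, reference):
    """Simplified unigram recall: clipped overlap / reference length.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each reference token is matched at most as often as it occurs
    # in the candidate (clipped counts).
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0
```

For example, if a system selects sentences 0, 2, and 5 and the annotators marked sentences 0, 1, and 2, two of the three selections are correct and two of the three gold sentences are found, so precision and recall are both 2/3.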
