Scientific report: "Automatic Generation of Story Highlights"

Kristian Woodsend and Mirella Lapata
School of Informatics, University of Edinburgh
Edinburgh EH8 9AB, United Kingdom
mlap@

Abstract

In this paper we present a joint content selection and compression model for single-document summarization. The model operates over a phrase-based representation of the source document, which we obtain by merging information from PCFG parse trees and dependency graphs. Using an integer linear programming formulation, the model learns to select and combine phrases subject to length, coverage and grammar constraints. We evaluate the approach on the task of generating story highlights, a small number of brief, self-contained sentences that allow readers to quickly gather information on news stories. Experimental results show that the model's output is comparable to human-written highlights in terms of both grammaticality and content.

1 Introduction

Summarization is the process of condensing a source text into a shorter version while preserving its information content. Humans summarize on a daily basis and effortlessly, but producing high-quality summaries automatically remains a challenge. The difficulty lies primarily in the nature of the task, which is complex, must satisfy many constraints (e.g., summary length, informativeness, coherence, grammaticality) and ultimately requires wide-coverage text understanding.
Since the latter is beyond the capabilities of current NLP technology, most work today focuses on extractive summarization, where a summary is created simply by identifying and subsequently concatenating the most important sentences in a document. Without a great deal of linguistic analysis, it is possible to create summaries for a wide range of documents. Unfortunately, extracts are often documents of low readability and text quality and contain much redundant information. This is in marked contrast with hand-written summaries, which often combine several pieces of ...
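
The extractive approach described above, identifying the most important sentences and concatenating them, can be sketched in a few lines of Python. This is only an illustration of the generic baseline and not the model presented in this paper: the word-frequency scoring, the naive sentence splitter and the names `split_sentences` and `extractive_summary` are all assumptions made for the example.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, max_sentences=2):
    """Pick the highest-scoring sentences and concatenate them in document order."""
    sentences = split_sentences(text)
    # Document-wide word counts serve as a crude importance signal (assumption).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average word frequency, so that long sentences are not favoured by length alone.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score; ties keep their original document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    # Keep the top sentences, then restore document order before concatenating.
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Calling `extractive_summary("The cat sat. The cat ran. Dogs bark.")` returns `"The cat sat. The cat ran."`. Because whole sentences are copied verbatim, a sketch like this can neither shorten a sentence nor remove overlap between the sentences it selects, which is the redundancy problem the paragraph above attributes to extracts.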
