TAILIEUCHUNG - Scientific report: "Large-Margin Learning of Submodular Summarization Models"

Large-Margin Learning of Submodular Summarization Models

Ruben Sipos, Pannaga Shivaswamy, Thorsten Joachims
Dept. of Computer Science, Cornell University, Ithaca, NY 14853, USA
rs@ pannaga@ tj@

Abstract

In this paper, we present a supervised learning approach to training submodular scoring functions for extractive multi-document summarization. By taking a structured prediction approach, we provide a large-margin method that directly optimizes a convex relaxation of the desired performance measure. The learning method applies to all submodular summarization methods, and we demonstrate its effectiveness for both pairwise and coverage-based scoring functions on multiple datasets. Compared to state-of-the-art functions that were tuned manually, our method significantly improves performance and enables high-fidelity models with a number of parameters well beyond what could reasonably be tuned by hand.

1 Introduction

Automatic document summarization is the problem of constructing a short text describing the main points in a set of document(s). Example applications range from generating short summaries of news articles to presenting snippets for URLs in web search. In this paper, we focus on extractive multi-document summarization, where the final summary is a subset of the sentences from multiple input documents. In this way, extractive summarization avoids the hard problem of generating well-formed natural-language sentences, since only existing sentences from the input documents are presented as part of the summary.

A current state-of-the-art method for document summarization was recently proposed by Lin and Bilmes (2010), using a submodular scoring function based on inter-sentence similarity. On the one hand, this scoring function rewards summaries that are similar to many sentences in the original documents.
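
As a rough illustration of a score that rewards similarity to many source sentences, the sketch below pairs a facility-location style coverage function with greedy selection. It is a stand-in chosen for brevity, not the scoring function of Lin and Bilmes (2010) and not the learned model of this paper. The bag-of-words cosine similarity, the sentence-count budget, and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed similarity)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coverage_score(summary, sims):
    """Facility-location score: each source sentence contributes its best
    similarity to any selected sentence. Monotone and submodular."""
    if not summary:
        return 0.0
    return sum(max(row[j] for j in summary) for row in sims)


def greedy_summary(sentences, budget):
    """Greedily add the sentence with the largest marginal gain in score,
    up to `budget` sentences (hypothetical length constraint)."""
    bags = [Counter(s.lower().split()) for s in sentences]
    sims = [[cosine(a, b) for b in bags] for a in bags]
    chosen = []
    while len(chosen) < budget:
        current = coverage_score(chosen, sims)
        best, best_gain = None, 0.0
        for j in range(len(sentences)):
            if j in chosen:
                continue
            gain = coverage_score(chosen + [j], sims) - current
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # no remaining sentence improves the score
            break
        chosen.append(best)
    return [sentences[j] for j in chosen]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "stocks fell sharply today",
        "stocks fell today",
    ]
    for sentence in greedy_summary(docs, budget=2):
        print(sentence)
```

Because the marginal gain of a near-duplicate sentence is small once a similar sentence has been selected, the greedy step tends to pick sentences from different topics, which is the diminishing-returns behavior that makes submodular functions a natural fit for summarization.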
