A Bottom-up Approach to Sentence Ordering for Multi-document Summarization

Danushka Bollegala, Naoaki Okazaki, Mitsuru Ishizuka
Graduate School of Information Science and Technology, The University of Tokyo
7-3-1, Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
{danushka, okazaki}@mi.ci.i.u-tokyo.ac.jp, ishizuka@i.u-tokyo.ac.jp

Abstract

Ordering information is a difficult but important task for applications generating natural-language text. We present a bottom-up approach to arranging sentences extracted for multi-document summarization. To capture the association and order of two textual segments (e.g., sentences), we define four criteria: chronology, topical-closeness, precedence, and succession. These criteria are integrated into a criterion by a supervised learning approach. We repeatedly concatenate two textual segments into one segment based on the criterion until we obtain the overall segment with all sentences arranged. Our experimental results show a significant improvement over existing sentence ordering strategies.

1 Introduction

Multi-document summarization (MDS) (Radev and McKeown, 1999) tackles the information overload problem by providing a condensed version of a set of documents. Among a number of sub-tasks involved in MDS (e.g., sentence extraction, topic detection, sentence ordering, information extraction, sentence generation, etc.),
most MDS systems have been based on an extraction method, which identifies important textual segments (e.g., sentences or paragraphs) in source documents. It is important for such MDS systems to determine a coherent arrangement of the textual segments extracted from multiple documents in order to reconstruct the text structure for summarization. Ordering information is also essential for other text-generation applications such as Question Answering.
[Author footnote: Research Fellow of the Japan Society for the Promotion of Science (JSPS).]
A summary with improperly ordered sentences confuses the reader and degrades the quality/reliability of the summary itself. Barzilay (2002) has provided