Towards a Framework for Abstractive Summarization of Multimodal Documents

Charles F. Greenbacker
Dept. of Computer Information Sciences
University of Delaware
Newark, Delaware, USA
charlieg@cis.udel.edu

Abstract

We propose a framework for generating an abstractive summary from a semantic model of a multimodal document. We discuss the type of model required, the means by which it can be constructed, how the content of the model is rated and selected, and the method of realizing novel sentences for the summary. To this end, we introduce a metric called information density, used for gauging the importance of content obtained from text and graphical sources.

1 Introduction

The automatic summarization of text is a prominent task in the field of natural language processing (NLP). While significant achievements have been made using statistical analysis and sentence extraction, true abstractive summarization remains a researcher's dream (Radev et al., 2002). Although existing systems produce high-quality summaries of relatively simple articles, there are limitations as to the types of documents these systems can handle. One such limitation is the summarization of multimodal documents: no existing system is able to incorporate the non-text portions of a document (e.g., information graphics, images) into the overall summary. Carberry et al. (2006) showed that the content of information graphics is often not repeated in the article's text, meaning important information may be overlooked if the graphical content is not included in the summary.
Systems that perform statistical analysis of text and extract sentences from the original article to assemble a summary cannot access the information contained in non-text components, let alone seamlessly combine that information with the extracted text. The problem is that information from the text and graphical components can only be integrated at the conceptual level, necessitating a semantic understanding of the underlying concepts. Our proposed …
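To make the contrast concrete, the extractive systems described above can be sketched as a simple frequency-based sentence scorer in the style of Luhn's classic method. This is an illustrative baseline only, not code from the paper; the function name and scoring scheme are our own assumptions.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=2):
    """Luhn-style extractive baseline (illustrative, not the paper's method):
    score each sentence by the summed corpus frequency of its words and
    return the top-scoring sentences in their original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'[a-z]+', text.lower()))
    # Rank sentence indices by descending total word frequency.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w] for w in re.findall(r'[a-z]+', sentences[i].lower())),
    )
    # Re-sort the chosen indices so the summary preserves document order.
    chosen = sorted(ranked[:num_sentences])
    return ' '.join(sentences[i] for i in chosen)
```

Note that such a scorer only ever copies sentences verbatim from the input text; any content carried by an information graphic is invisible to it, which is precisely the limitation the proposed framework targets.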