Scientific report: "Towards a Framework for Abstractive Summarization of Multimodal Documents"

Charles F. Greenbacker
Dept. of Computer and Information Sciences
University of Delaware
Newark, Delaware, USA
charlieg@

Abstract

We propose a framework for generating an abstractive summary from a semantic model of a multimodal document. We discuss the type of model required, the means by which it can be constructed, how the content of the model is rated and selected, and the method of realizing novel sentences for the summary. To this end, we introduce a metric called information density, used for gauging the importance of content obtained from text and graphical sources.

1 Introduction

The automatic summarization of text is a prominent task in the field of natural language processing (NLP). While significant achievements have been made using statistical analysis and sentence extraction, "true abstractive summarization remains a researcher's dream" (Radev et al., 2002). Although existing systems produce high-quality summaries of relatively simple articles, there are limitations as to the types of documents these systems can handle. One such limitation is the summarization of multimodal documents: no existing system is able to incorporate the non-text portions of a document (e.g., information graphics, images) into the overall summary. Carberry et al. (2006) showed that the content of information graphics is often not repeated in the article's text, meaning important information may be overlooked if the graphical content is not included in the summary.

Systems that perform statistical analysis of text and extract sentences from the original article to assemble a summary cannot access the information contained in non-text components, let alone seamlessly combine that information with the extracted text. The problem is that information from the text and graphical components can only be integrated at the conceptual level, necessitating a semantic understanding of the underlying concepts. Our proposed ...
