Generating Impact-Based Summaries for Scientific Literature

Qiaozhu Mei
University of Illinois at Urbana-Champaign
qmei2@uiuc.edu

ChengXiang Zhai
University of Illinois at Urbana-Champaign
czhai@cs.uiuc.edu

Abstract

In this paper, we present a study of a novel summarization problem, i.e., summarizing the impact of a scientific publication. Given a paper and its citation context, we study how to extract sentences that can represent the most influential content of the paper. We propose language modeling methods for solving this problem, and study how to incorporate features such as authority and proximity to accurately estimate the impact language model. Experimental results on a SIGIR publication collection show that the proposed methods are effective for generating impact-based summaries.

1 Introduction

The volume of scientific literature has been growing rapidly. According to recent statistics, 400,000 new citations are added each year to MEDLINE, the major biomedical literature database (see http://www.nlm.nih.gov/bsd/history/tsld024.htm). This fast growth makes it difficult for researchers, especially beginning researchers, to keep track of research trends and to find high-impact papers on unfamiliar topics. Impact factors (Kaplan and Nelson, 2000) are useful, but they are just numerical values, so they cannot tell researchers which aspects of a paper are influential. On the other hand, a regular content-based summary (e.g., the abstract or conclusion section of a paper, or an automatically generated topical summary (Giles et al., 1998)) can help a user learn about the main content of a paper, but not necessarily its most influential content. Indeed, the abstract of a paper mostly reflects the expected impact of the paper as perceived by the author(s), which could deviate significantly from the actual impact of the paper in the research community. Moreover, the impact of a paper changes over time due to the evolution and progress of research in a field. For example, an algorithm