Inferring Activity Time in News through Event Modeling

Vladimir Eidelman
Department of Computer Science, Columbia University, New York, NY 10027
vae2101@columbia.edu

Abstract

Many applications in NLP, such as question-answering and summarization, either require or would greatly benefit from the knowledge of when an event occurred. Creating an effective algorithm for identifying the activity time of an event in news is difficult, in part because of the sparsity of explicit temporal expressions. This paper describes a domain-independent, machine-learning-based approach to assign activity times to events in news. We demonstrate that by applying topic models to text, we are able to cluster sentences that describe the same event and utilize the temporal information within these event clusters to infer activity times for all sentences. Experimental evidence suggests that this is a promising approach, given evaluations performed on three distinct news article sets against the baseline of assigning the publication date. Our approach achieves 90%, 88.7%, and 68.7% accuracy, respectively, outperforming the baseline twice.

1 Introduction

Many practical applications in NLP either require or would greatly benefit from the use of temporal information. For instance, question-answering and summarization systems demand accurate processing of temporal information in order to be useful for answering "when" questions and creating coherent summaries by temporally ordering information. Proper processing is especially relevant in news, where multiple disparate events may be described within one news article and it is necessary to identify the separate timepoints of each event.
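To make the idea from the abstract concrete, the following is a minimal sketch of how temporal information inside an event cluster might be shared among its sentences. It assumes that a topic model has already assigned each sentence a cluster id and that explicit temporal expressions have already been resolved to dates. The function name `infer_activity_times`, its parameters, and the majority-vote rule are hypothetical illustrations, not the procedure used in this paper. Sentences in clusters with no explicit date fall back to the publication date, which mirrors the baseline.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Optional


def infer_activity_times(
    cluster_ids: list[int],
    explicit_dates: list[Optional[date]],
    publication_date: date,
) -> list[date]:
    """Assign one date per sentence (hypothetical helper, not the paper's algorithm).

    cluster_ids[i] is the event cluster of sentence i, assumed to come from
    a topic model run beforehand. explicit_dates[i] is the date resolved from
    an explicit temporal expression in sentence i, or None if it has none.
    """
    # Count the explicit dates observed within each event cluster.
    dates_by_cluster: dict[int, Counter] = defaultdict(Counter)
    for cluster, explicit in zip(cluster_ids, explicit_dates):
        if explicit is not None:
            dates_by_cluster[cluster][explicit] += 1

    inferred = []
    for cluster, explicit in zip(cluster_ids, explicit_dates):
        if explicit is not None:
            # A sentence with its own temporal expression keeps that date.
            inferred.append(explicit)
        elif dates_by_cluster[cluster]:
            # Assumed rule: borrow the most frequent date in the same cluster.
            inferred.append(dates_by_cluster[cluster].most_common(1)[0][0])
        else:
            # No temporal evidence in the cluster: use the publication date.
            inferred.append(publication_date)
    return inferred
```

For example, with made-up inputs, calling `infer_activity_times([0, 0, 1, 0], [date(2009, 3, 8), None, None, date(2009, 3, 8)], date(2009, 3, 10))` gives the second sentence the date shared by its cluster, while the third sentence, alone in a cluster without any explicit date, receives the publication date.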
Event descriptions may be confined to one sentence, which we establish as our text unit, or be spread over many, thus forcing us to assign all sentences an activity time. However, only 20-30% of sentences contain an explicit temporal expression, thus leaving the vast majority of sentences without temporal information. A similar proportion .