Scientific report: "Labeling Documents with Timestamps: Learning from their Time Expressions"

Nathanael Chambers
Department of Computer Science
United States Naval Academy
nchamber@

Abstract

Temporal reasoners for document understanding typically assume that a document's creation date is known. Algorithms to ground relative time expressions and order events often rely on this timestamp to assist the learner. Unfortunately, the timestamp is not always known, particularly on the Web. This paper addresses the task of automatic document timestamping, presenting two new models that incorporate rich linguistic features about time. The first is a discriminative classifier with new features extracted from the text's time expressions (e.g., "since 1999"). This model alone improves on previous generative models by 77%. The second model learns probabilistic constraints between time expressions and the unknown document time. Imposing these learned constraints on the discriminative model further improves its accuracy. Finally, we present a new experiment design that facilitates easier comparison by future work.

1 Introduction

This paper addresses a relatively new task in the NLP community: automatic document dating. Given a document with unknown origins, what characteristics of its text indicate the year in which the document was written? This paper proposes a learning approach that builds constraints from a document's use of time expressions, and combines them with a new discriminative classifier that greatly improves previous work.

The temporal reasoning community has long depended on document timestamps to ground relative time expressions and events (Mani and Wilson, 2000; Llido et al., 2001). For instance, consider the following passage from the TimeBank corpus (Pustejovsky et al., 2003):

    And while there was no profit this year from discontinued operations, last year they contributed $34 million, before tax.

Reconstructing the timeline of events from this document requires extensive temporal knowledge most
