Scientific report: "French TimeBank: An ISO-TimeML Annotated Reference Corpus"

André Bittar (Alpage, Univ. Paris Diderot), Pascal Amsili (LLF, Univ. Paris Diderot, amsili@), Pascal Denis (Alpage, INRIA), Laurence Danlos (Alpage, Univ. Paris Diderot, danlos@)

Abstract

This article presents the main points in the creation of the French TimeBank (Bittar, 2010), a reference corpus annotated according to the ISO-TimeML standard for temporal annotation. A number of improvements were made to the markup language to deal with linguistic phenomena not yet covered by ISO-TimeML, including cross-language modifications and others specific to French. An automatic preannotation system was used to speed up the annotation process. A preliminary evaluation of the methodology adopted for this project yields positive results in terms of data quality and annotation time.

1 Introduction

The processing of temporal information (events, time expressions and the relations between these entities) is essential for the overall comprehension of natural language discourse. Determining the temporal structure of a text can bring added value to numerous NLP applications, such as information extraction, Q&A systems and summarization. Progress has been made in recent years in the processing of temporal data, notably through the ISO-TimeML standard (ISO, 2008) and the creation of the TimeBank corpus (Pustejovsky et al., 2006) for English. Here we present the French TimeBank (FTiB), a corpus for French annotated in ISO-TimeML. We also present the methodology adopted for the creation of this resource, which may be generalized to other annotation tasks. We evaluate the effects of our methodology on the quality of the corpus and on the time taken to complete the task.

2 ISO-TimeML

ISO-TimeML (ISO, 2008) is a surface-based language for the marking of events (EVENT tag) and temporal expressions (TIMEX3), as well as the realization of the temporal (TLINK), aspectual (ALINK) and modal subordination (SLINK) relations.
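
To make the tag inventory concrete, here is a minimal sketch in Python of how an annotated sentence might be read back into events, time expressions and temporal links. The French sentence, the `TimeML` wrapper element, the function name `read_annotations` and all attribute names and values (`eid`, `tid`, `lid`, `eventID`, `relatedToTime`, `relType`, `class`, `type`, `value`) are assumptions made for illustration, modelled loosely on TimeML conventions. They are not taken from the French TimeBank and should be checked against the ISO-TimeML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated sentence ("Marie arrived on 15 March 2010").
# Element and attribute names are illustrative assumptions, not corpus data.
SAMPLE = """<TimeML>
Marie est <EVENT eid="e1" class="OCCURRENCE">arrivée</EVENT>
<TIMEX3 tid="t1" type="DATE" value="2010-03-15">le 15 mars 2010</TIMEX3>.
<TLINK lid="l1" eventID="e1" relatedToTime="t1" relType="IS_INCLUDED"/>
</TimeML>"""


def read_annotations(xml_text):
    """Collect events, time expressions and temporal links from one document."""
    root = ET.fromstring(xml_text)
    # Map each event identifier to the text span it marks.
    events = {e.get("eid"): e.text for e in root.iter("EVENT")}
    # Map each time expression identifier to (surface text, normalized value).
    timexes = {t.get("tid"): (t.text, t.get("value")) for t in root.iter("TIMEX3")}
    # Each temporal link relates an event to a time expression.
    tlinks = [
        (l.get("eventID"), l.get("relType"), l.get("relatedToTime"))
        for l in root.iter("TLINK")
    ]
    return events, timexes, tlinks


if __name__ == "__main__":
    events, timexes, tlinks = read_annotations(SAMPLE)
    for event_id, relation, time_id in tlinks:
        print(events[event_id], relation, timexes[time_id][0])
```

Under these assumptions, ALINK and SLINK elements could be collected in the same way as TLINK, since all three are relation elements that point at previously marked entities by identifier.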
