Scientific report: "A Rote Extractor with Edit Distance-based Generalisation and Multi-corpora Precision Calculation"

Enrique Alfonseca (1,2), Pablo Castells (1), Manabu Okumura (2), Maria Ruiz-Casado (1,2)
(1) Computer Science Department, Univ. Autonoma de Madrid
(2) Precision and Intelligence Laboratory, Tokyo Institute of Technology
enrique@ oku@ maria@

Abstract

In this paper, we describe a rote extractor that learns patterns for finding semantic relationships in unrestricted text, with new procedures for pattern generalization and scoring. These include the use of part-of-speech tags to guide the generalization, Named Entity categories inside the patterns, an edit-distance-based pattern generalization algorithm, and a pattern accuracy calculation procedure based on evaluating the patterns on several test corpora. In an evaluation with 14 entities, the system attains a precision higher than 50% for half of the relationships considered.

1 Introduction

Recently, there has been increasing interest in automatically extracting structured information from large corpora, and in particular from the Web (Craven et al., 1999). Because of the difficulty of collecting annotated data, several procedures have been described that can be trained on unannotated textual corpora (Riloff and Schmelzenbach, 1998; Soderland, 1999; Mann and Yarowsky, 2005). An interesting approach is that of rote extractors (Brin, 1998; Agichtein and Gravano, 2000; Ravichandran and Hovy, 2002), which look for textual contexts that happen to convey a certain relationship between two concepts.

In this paper, we describe some contributions to the training of rote extractors, including a procedure for generalizing the patterns and a more complex way of calculating their accuracy. We first introduce the general structure of a rote extractor and its limitations. Next, we describe the proposed modifications (Sections 2, 3 and 4) and the evaluation performed (Section 5).
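
To make the idea of edit-distance-based generalization concrete, the following is a minimal sketch, not the authors' algorithm: it computes a token-level Levenshtein distance between two patterns and merges them by keeping the tokens they share and replacing every edited position with a wildcard. The slot markers `X` and `Y`, the `WILDCARD` symbol, and the function names are assumptions made for illustration; the sketch also ignores the part-of-speech tags and Named Entity categories that the paper uses to guide generalization.

```python
from typing import List, Tuple

WILDCARD = "*"  # hypothetical symbol standing for "any token"


def edit_distance_table(a: List[str], b: List[str]) -> List[List[int]]:
    """Fill the standard Levenshtein table over tokens, with unit costs."""
    rows, cols = len(a) + 1, len(b) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        dist[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = dist[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            dist[i][j] = min(dist[i - 1][j] + 1, dist[i][j - 1] + 1, substitution)
    return dist


def generalise(a: List[str], b: List[str]) -> Tuple[int, List[str]]:
    """Return the edit distance and a merged pattern with wildcards at edits."""
    dist = edit_distance_table(a, b)
    i, j = len(a), len(b)
    merged: List[str] = []
    # Walk back from the bottom-right cell, following one optimal alignment.
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            merged.append(a[i - 1])  # shared token is kept as-is
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            merged.append(WILDCARD)  # substitution
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            merged.append(WILDCARD)  # token present only in a
            i -= 1
        else:
            merged.append(WILDCARD)  # token present only in b
            j -= 1
    merged.reverse()
    return dist[len(a)][len(b)], merged


if __name__ == "__main__":
    p1 = "X was born in Y".split()
    p2 = "X was born on Y".split()
    print(generalise(p1, p2))
```

In a setting like the one the paper describes, the distance could be used to decide which pair of patterns to merge first, and the merged pattern would then stand in for both originals. How the wildcard is constrained is exactly where part-of-speech information would come in, which this sketch leaves out.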
