Scientific report: "A Rote Extractor with Edit Distance-based Generalisation and Multi-corpora Precision Calculation"



Enrique Alfonseca (1, 2), Pablo Castells (1), Manabu Okumura (2), Maria Ruiz-Casado (1, 2)

(1) Computer Science Department, Univ. Autonoma de Madrid
Enrique.Alfonseca@uam.es, Pablo.Castells@uam.es, Maria.Ruiz@uam.es

(2) Precision and Intelligence Laboratory, Tokyo Institute of Technology
enrique@lr.pi.titech.ac.jp, oku@pi.titech.ac.jp, maria@lr.pi.titech.ac.jp

Abstract

In this paper, we describe a rote extractor that learns patterns for finding semantic relationships in unrestricted text, with new procedures for pattern generalization and scoring. These include the use of part-of-speech tags to guide the generalization, Named Entity categories inside the patterns, an edit-distance-based pattern generalization algorithm, and a pattern accuracy calculation procedure based on evaluating the patterns on several test corpora. In an evaluation with 14 entities, the system attains a precision higher than 50% for half of the relationships considered.

1 Introduction

Recently, there has been increasing interest in automatically extracting structured information from large corpora and, in particular, from the Web (Craven et al., 1999). Because of the difficulty of collecting annotated data, several procedures have been described that can be trained on unannotated textual corpora (Riloff and Schmelzenbach, 1998; Soderland, 1999; Mann and Yarowsky, 2005).

An interesting approach is that of rote extractors (Brin, 1998; Agichtein and Gravano, 2000; Ravichandran and Hovy, 2002), which look for textual contexts that happen to convey a certain relationship between two concepts. In this paper, we describe some contributions to the training of rote extractors, including a procedure for generalizing the patterns and a more complex way of calculating their accuracy. We first introduce the general structure of a rote extractor and its limitations. Next, we describe the proposed modifications (Sections 2, 3 and 4) and the evaluation performed (Section 5).
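To make the idea of edit-distance-based pattern generalization more concrete, the following is a minimal sketch, not the authors' algorithm. It assumes patterns are plain token lists, aligns two of them with standard Levenshtein edit distance, and replaces every edited position with a wildcard. The names `generalise` and `WILDCARD` are hypothetical, and the part-of-speech and Named Entity guidance mentioned in the abstract is deliberately left out.

```python
from typing import List

WILDCARD = "*"  # hypothetical placeholder for "any token(s)"


def generalise(a: List[str], b: List[str]) -> List[str]:
    """Align two token sequences with Levenshtein edit distance and return
    a pattern that keeps the shared tokens and puts a single wildcard
    wherever the alignment needs an insertion, deletion or substitution."""
    n, m = len(a), len(b)
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a token of a
                dist[i][j - 1] + 1,         # insert a token of b
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Trace the alignment back from the bottom-right corner.
    pattern: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and a[i - 1] == b[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            pattern.append(a[i - 1])  # shared token is kept verbatim
            i, j = i - 1, j - 1
            continue
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            i, j = i - 1, j - 1       # substitution
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            i -= 1                    # deletion
        else:
            j -= 1                    # insertion
        # Collapse runs of adjacent edits into one wildcard.
        if not pattern or pattern[-1] != WILDCARD:
            pattern.append(WILDCARD)
    pattern.reverse()
    return pattern


if __name__ == "__main__":
    p1 = "X was born in Y".split()
    p2 = "X was born near Y".split()
    print(" ".join(generalise(p1, p2)))
```

Here `X` and `Y` stand in for the two related concepts. A plain wildcard is the crudest possible generalization; a system along the lines described in the abstract would presumably constrain what may fill each gap, for instance by part-of-speech tag.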

