Scientific report: "Using Corpus Statistics on Entities to Improve Semi-supervised Relation Extraction from the Web"


Benjamin Rosenfeld
Information Systems, HU School of Business, Hebrew University, Jerusalem, Israel
grurgrur@gmail.com

Ronen Feldman
Information Systems, HU School of Business, Hebrew University, Jerusalem, Israel
ronen.feldman@huji.ac.il

Abstract

Many errors produced by unsupervised and semi-supervised relation extraction (RE) systems occur because the entities that participate in the relations are recognized incorrectly. This is especially true for systems that do not use separate named-entity recognition (NER) components and instead rely on general-purpose shallow parsing. Such systems have greater applicability, because they are able to extract relations that contain attributes of unknown types. However, this generality comes at a cost in accuracy. In this paper we show how to use corpus statistics to validate and correct the arguments of extracted relation instances, improving the overall RE performance. We test the methods on SRES, a self-supervised Web relation extraction system. We also compare the performance of corpus-based methods to the performance of validation and correction methods based on supervised NER components.

1 Introduction

Information Extraction (IE) is the task of extracting factual assertions from text.
Most IE systems rely on knowledge engineering or on machine learning to generate the task model that is subsequently used for extracting instances of entities and relations from new text. In the knowledge engineering approach, the model (usually in the form of extraction rules) is created manually, and in the machine learning approach, the model is learned automatically from a manually labeled training set of documents. Both approaches require substantial human effort, particularly when applied to the broad range of documents, entities, and relations on the Web. In order to minimize the manual effort necessary to build Web IE systems, semi-supervised and completely unsupervised ...
