TAILIEUCHUNG - Scientific report: "Instance Splitting Strategies for Dependency Relation-based Information Extraction"

Mstislav Maslennikov, Hai-Kiat Goh, Tat-Seng Chua
Department of Computer Science, School of Computing, National University of Singapore
{maslenni, gohhaiki, chuats}@

Abstract

Information Extraction (IE) is a fundamental technology for NLP. Previous methods for IE relied on co-occurrence relations, soft patterns and properties of the target (for example, syntactic role), which leads to problems in handling paraphrasing and aligning instances. Our system ARE (Anchor and Relation) is based on the dependency relation model and tackles these problems by unifying entities according to their dependency relations, which we found to provide more invariant relations between entities in many cases. To exploit the complexity and characteristics of relation paths, we further classify the relation paths into the categories easy, average and hard, and apply different extraction strategies based on the characteristics of each category. Our extraction method improves performance by 3% and 6% on MUC4 and MUC6 respectively, compared to state-of-the-art IE systems.

1 Introduction

Information Extraction (IE) is one of the fundamental problems of natural language processing. Progress in IE is important for improving results in tasks such as Question Answering, Information Retrieval and Text Summarization.

Multiple efforts in the MUC series have allowed IE systems to achieve near-human performance in domains such as biological (Humphreys et al., 2000), terrorism (Kaufmann, 1992; Kaufmann, 1993) and management succession (Kaufmann, 1995). For the MUC series, the IE task is formulated as filling several predefined slots in a template. The terrorism template consists of the slots Perpetrator, Victim and Target; the slots in the management succession template are Org, PersonIn, PersonOut and Post. We chose both the terrorism and management succession domains, from MUC4 and MUC6 respectively, in ...
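
The slot-filling formulation described above can be illustrated with a small data structure. The following is only a minimal sketch, not the ARE system's actual representation: the slot names come from the text, while the class names, the `fill_slot` helper, and the choice to let each slot hold a list of distinct fillers are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class TerrorismTemplate:
    """Terrorism template (MUC4 domain); slot names follow the text."""
    perpetrator: List[str] = field(default_factory=list)
    victim: List[str] = field(default_factory=list)
    target: List[str] = field(default_factory=list)


@dataclass
class ManagementSuccessionTemplate:
    """Management succession template (MUC6 domain); slot names follow the text."""
    org: List[str] = field(default_factory=list)
    person_in: List[str] = field(default_factory=list)
    person_out: List[str] = field(default_factory=list)
    post: List[str] = field(default_factory=list)


def fill_slot(template, slot: str, filler: str) -> None:
    """Hypothetical helper: add a filler to a named slot.

    Raises KeyError if the template has no such slot. Duplicate fillers
    are ignored (an assumption made here, not something the text states).
    """
    slot_names = {f.name for f in fields(template)}
    if slot not in slot_names:
        raise KeyError(f"{type(template).__name__} has no slot {slot!r}")
    values = getattr(template, slot)
    if filler not in values:
        values.append(filler)


# Usage with invented fillers, purely for illustration.
succession = ManagementSuccessionTemplate()
fill_slot(succession, "org", "Example Corp")
fill_slot(succession, "person_in", "Jane Doe")
fill_slot(succession, "post", "chief executive")
print(succession)
```

An extraction system in this setting would be judged on how well the fillers it assigns to each slot match the reference templates; the sketch covers only the template container, not the extraction itself.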
