TAILIEUCHUNG - Scientific report: "Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations"

Patrick Pantel
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Marina del Rey, CA 90292
pantel@

Marco Pennacchiotti
ART Group - DISP, University of Rome Tor Vergata
Viale del Politecnico 1, Rome, Italy
pennacchiotti@

Abstract

In this paper, we present Espresso, a weakly-supervised, general-purpose, and accurate algorithm for harvesting semantic relations. The main contributions are: i) a method for exploiting generic patterns by filtering incorrect instances using the Web; and ii) a principled measure of pattern and instance reliability enabling the filtering algorithm. We present an empirical comparison of Espresso with various state of the art systems, on different size and genre corpora, on extracting various general and specific relations. Experimental results show that our exploitation of generic patterns substantially increases system recall with small effect on overall precision.

1 Introduction

Recent attention to knowledge-rich problems such as question answering (Pasca and Harabagiu 2001) and textual entailment (Geffet and Dagan 2005) has encouraged natural language processing researchers to develop algorithms for automatically harvesting shallow semantic resources. With seemingly endless amounts of textual data at our disposal, we have a tremendous opportunity to automatically grow semantic term banks and ontological resources.

To date, researchers have harvested, with varying success, several resources, including concept lists (Lin and Pantel 2002), topic signatures (Lin and Hovy 2000), facts (Etzioni et al. 2005), and word similarity lists (Hindle 1990). Many recent efforts have also focused on extracting semantic relations between entities, such as entailments (Szpektor et al. 2004), is-a (Ravichandran and Hovy 2002), part-of (Girju et al. 2006), and other relations.

The following desiderata outline the properties of an ideal relation harvesting ...
