Scientific report: "Automatic Set Instance Extraction using the Web"

Richard C. Wang
Language Technologies Institute, Carnegie Mellon University
rcwang@

William W. Cohen
Machine Learning Department, Carnegie Mellon University
wcohen@

Abstract

An important and well-studied problem is the production of semantic lexicons from a large corpus. In this paper, we present a system named ASIA (Automatic Set Instance Acquirer), which takes in the name of a semantic class as input (e.g., “car makers”) and automatically outputs its instances (e.g., “ford”, “nissan”, “toyota”). ASIA is based on recent advances in web-based set expansion: the problem of finding all instances of a set given a small number of “seed” instances. This approach effectively exploits web resources and can be easily adapted to different languages. In brief, we use language-dependent hyponym patterns to find a noisy set of initial seeds, and then use a state-of-the-art language-independent set expansion system to expand these seeds. The proposed approach matches or outperforms prior systems on several English-language benchmarks. It also shows excellent performance on three dozen additional benchmark problems from English, Chinese, and Japanese, thus demonstrating language-independence.

1 Introduction

An important and well-studied problem is the production of semantic lexicons for classes of interest; that is, the generation of all instances of a set (e.g., “apple”, “orange”, “banana”) given a name of that set (e.g., “fruits”). This task is often addressed by linguistically analyzing very large collections of text (Hearst, 1992; Kozareva et al., 2008; Etzioni et al., 2005; Pantel and Ravichandran, 2004; Pasca, 2004), often using hand-constructed or machine-learned shallow linguistic patterns to detect hyponym instances. A hyponym is a word or phrase whose semantic range ...

[Figure 1: Examples of SEAL's ... Columns: English, Chinese, Japanese. English entries: Amazing Race, Survivor, Big Brother, The Mole, The Apprentice, Project Runway, The Bachelor.]
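
To make the idea of a shallow hyponym pattern concrete, the following is a minimal sketch in Python. It assumes a single English pattern, "<class name> such as A, B, and C", of the kind introduced by Hearst (1992). The function name `extract_candidates`, the regular expression, and the example snippets are all hypothetical illustrations. They are not the patterns or code used by ASIA, which this excerpt does not specify.

```python
import re
from collections import Counter


def extract_candidates(class_name, snippets):
    """Count candidate instances that follow '<class_name> such as' in text snippets.

    Illustrative only: one hand-written English pattern, no parsing, no ranking.
    """
    # Capture everything after "such as" up to the next sentence-level punctuation mark.
    pattern = re.compile(
        re.escape(class_name) + r"\s+such as\s+([^.;:!?]+)",
        re.IGNORECASE,
    )
    counts = Counter()
    for text in snippets:
        for match in pattern.finditer(text):
            # Split the enumerated list on commas and on the words "and" / "or".
            for part in re.split(r",|\band\b|\bor\b", match.group(1)):
                candidate = part.strip().lower()
                if candidate:
                    counts[candidate] += 1
    return counts


# Hypothetical snippets standing in for text retrieved from the web.
snippets = [
    "Sales rose at car makers such as Ford, Nissan, and Toyota.",
    "He has worked for car makers such as Toyota and Honda.",
]
seeds = extract_candidates("car makers", snippets)
print(seeds.most_common())
```

On real web text a pattern like this also captures trailing words and unrelated phrases, so the candidates it returns are noisy. That matches the role the abstract gives to hyponym patterns: they supply an initial, imperfect seed set, which a separate set expansion step then refines.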
