TAILIEUCHUNG - Scientific report: "N Semantic Classes are Harder than Two"

N Semantic Classes are Harder than Two

Wiley Greiner, Los Angeles Software Inc., 1329 Pine Street, Santa Monica, CA 90405
Ben Carterette, CIIR, University of Massachusetts, Amherst, MA 01003, carteret@
Rosie Jones, Yahoo Research, 3333 Empire Ave., Burbank, CA 91504, jonesr@
Cory Barr, Yahoo Research, 3333 Empire Ave., Burbank, CA 91504, barrc@

Abstract

We show that we can automatically classify semantically related phrases into 10 classes. Classification robustness is improved by training with multiple sources of evidence, including within-document cooccurrence, HTML markup, syntactic relationships in sentences, substitutability in query logs, and string similarity. Our work provides a benchmark for automatic n-way classification into WordNet's semantic classes, both on a TREC news corpus and on a corpus of substitutable search query phrases.

1 Introduction

Identifying semantically related phrases has been demonstrated to be useful in information retrieval (Anick, 2003; Terra and Clarke, 2004) and sponsored search (Jones et al., 2006). Work on semantic entailment often includes lexical entailment as a subtask (Dagan et al., 2005). We draw a distinction between the task of identifying terms which are topically related and identifying the specific semantic class.
For example, the terms dog, puppy, canine, schnauzer, cat and pet are highly related terms, which can be identified using techniques that include distributional similarity (Lee, 1999) and within-document cooccurrence measures such as pointwise mutual information (Turney et al., 2003). These techniques, however, do not allow us to distinguish the more specific relationships: hypernym(dog, puppy), hyponym(dog, canine), and coordinate(dog, cat).

Lexical resources such as WordNet (Miller, 1995) are extremely useful, but are limited by being manually constructed. They do not contain semantic class relationships for the many new terms we encounter

(Footnote: This work was carried out while these authors were at Yahoo Research.)
