Scientific report: "Wikipedia as Sense Inventory to Improve Diversity in Web Search Results"

Celina Santamaria, Julio Gonzalo and Javier Artiles
UNED, c/ Juan del Rosal 16, 28040 Madrid, Spain
julio@ javart@

Abstract

Is it possible to use sense inventories to improve Web search results diversity for one word queries? To answer this question, we focus on two broad-coverage lexical resources of a different nature: WordNet, as a de-facto standard used in Word Sense Disambiguation experiments; and Wikipedia, as a large coverage, updated encyclopaedic resource which may have a better coverage of relevant senses in Web pages. Our results indicate that (i) Wikipedia has a much better coverage of search results, (ii) the distribution of senses in search results can be estimated using the internal graph structure of the Wikipedia and the relative number of visits received by each sense in Wikipedia, and (iii) associating Web pages to Wikipedia senses with simple and efficient algorithms, we can produce modified rankings that cover 70% more Wikipedia senses than the original search engine rankings.

1 Motivation

The application of Word Sense Disambiguation (WSD) to Information Retrieval (IR) has been the subject of a significant research effort in the recent past. The essential idea is that, by indexing and matching word senses (or even meanings), the retrieval process could better handle polysemy and synonymy problems (Sanderson, 2000). In practice, however, there are two main difficulties: (i) for long queries, IR models implicitly perform disambiguation, and thus there is little room for improvement. This is the case with most standard IR benchmarks, such as TREC or CLEF ad-hoc collections; (ii) for very short queries, disambiguation may not be possible or even desirable. This is often the case with one word and even two word queries in Web search engines.

In Web search, there are at least three ways of coping with ambiguity: Promoting diversity
