TAILIEUCHUNG - Scientific report: "Summarization-based Query Expansion in Information Retrieval"

Tomek Strzalkowski, Jin Wang, and Bowden Wise
GE Corporate Research and Development, 1 Research Circle, Niskayuna, NY 12309
strzalkowski@

Abstract

We discuss a semi-interactive approach to information retrieval which consists of two tasks performed in sequence. First, the system assists the searcher in building a comprehensive statement of information need, using automatically generated topical summaries of sample documents. Second, the detailed statement of information need is automatically processed by a series of natural language processing routines in order to derive an optimal search query for a statistical information retrieval system. In this paper, we investigate the role of automated document summarization in building effective search statements. We also discuss the results of the latest evaluation of our system at the annual Text Retrieval Conference (TREC).

Information Retrieval

Information retrieval (IR) is the task of selecting documents from a database in response to a user's query and ranking them according to relevance. This has usually been accomplished using statistical methods (often coupled with manual encoding) that (a) select terms (words, phrases, and other units) from documents that are deemed to best represent their content, and (b) create an inverted index file (or files) that provides easy access to documents containing these terms. A subsequent search process attempts to match preprocessed user queries against term-based representations of documents, in each case determining a degree of relevance between the two, which depends upon the number and types of matching terms.
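
The indexing and matching steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system discussed in this paper: the function names (`tokenize`, `build_inverted_index`, `search`), the sample documents, and the scoring rule (a plain count of distinct query terms found in each document) are all hypothetical. A statistical retrieval system would typically weight terms rather than simply count them.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        # .get avoids creating empty entries for unseen terms.
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample collection, used only for this example.
docs = {
    "d1": "Automatic summarization of news documents",
    "d2": "Query expansion for document retrieval",
    "d3": "Cooking with seasonal vegetables",
}
index = build_inverted_index(docs)
results = search(index, "query expansion and summarization")
print(results)
```

In this sketch, a document that shares more terms with the query is ranked higher, which mirrors the idea that the degree of relevance depends on the number of matching terms. The types of matching terms (for example, phrases versus single words) are not modeled here.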
A search is successful if it returns as many documents relevant to the query as possible, with as few non-relevant documents as possible. In addition, the relevant documents should be ranked ahead of non-relevant ones. The quantitative text representation methods predominant in today's leading information retrieval ...
