Scientific report: "A Graph Approach to Spelling Correction in Domain-Centric Search"

Zhuowei Bao, University of Pennsylvania, Philadelphia, PA 19104, USA, zhuowei@
Benny Kimelfeld, IBM Research-Almaden, San Jose, CA 95120, USA, kimelfeld@
Yunyao Li, IBM Research-Almaden, San Jose, CA 95120, USA, yunyaoli@

Abstract

Spelling correction for keyword-search queries is challenging in restricted domains such as personal email (or desktop) search, due to the scarcity of query logs and the specialized nature of the domain. For that task, this paper presents an algorithm that is based on statistics from the corpus data (rather than the query log). This algorithm, which employs a simple graph-based approach, can incorporate different types of data sources with different levels of reliability (e.g., email subject vs. email body), and can handle complex spelling errors like splitting and merging of words. An experimental study shows the superiority of the algorithm over existing alternatives in the email domain.

1 Introduction

An abundance of applications require spelling correction, which (at the high level) is the following task. The user intends to type a chunk q of text, but instead types the chunk s, which contains spelling errors (discussed in detail later) due to careless typing or lack of knowledge of the exact spelling of q. The goal is to restore q when given s.
Spelling correction has been extensively studied in the literature, and we refer the reader to comprehensive summaries of prior work (Peterson, 1980; Kukich, 1992; Jurafsky and Martin, 2000; Mitton, 2010). The focus of this paper is on the special case where q is a search query, and where s, instead of q, is submitted to a search engine (with the goal of retrieving documents that match the search query q). Spelling correction for search queries is important because a significant portion of posed queries may be misspelled (Cucerzan and Brill, 2004). Effective spelling correction has a major effect on the experience and effort.
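
To make the task concrete (restore the intended query q from the typed query s, using corpus statistics rather than a query log), the following is a minimal sketch in Python. It is not the graph-based algorithm of the paper, whose details are not given in this excerpt; it is a toy greedy heuristic that only illustrates the ingredients named in the abstract: word statistics drawn from the corpus, per-field reliability weights, and candidates that merge or split tokens. All names (FIELD_WEIGHTS, build_stats, best_word, correct), the weight values, and the scoring formula are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical reliability weights per field; the values are illustrative only.
FIELD_WEIGHTS = {"subject": 2.0, "body": 1.0}


def build_stats(docs):
    """Accumulate weighted word counts from a list of {field: text} documents."""
    stats = Counter()
    for doc in docs:
        for field, text in doc.items():
            weight = FIELD_WEIGHTS.get(field, 1.0)
            for word in text.lower().split():
                stats[word] += weight
    return stats


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def best_word(token, stats, max_dist=2):
    """Return (word, score) for the best corpus word near `token`, or None."""
    best = None
    for word, count in stats.items():
        dist = edit_distance(token, word)
        if dist <= max_dist:
            score = count / (1 + dist)  # toy score: weighted count discounted by distance
            if best is None or score > best[1]:
                best = (word, score)
    return best


def correct(query, stats):
    """Greedy left-to-right correction of the typed query s, with merges and splits."""
    tokens = query.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        options = []  # each option: (score, replacement words, typed tokens consumed)
        single = best_word(tokens[i], stats)
        if single:
            options.append((single[1], [single[0]], 1))
        # Merge: two typed tokens were intended as a single word.
        if i + 1 < len(tokens):
            merged = best_word(tokens[i] + tokens[i + 1], stats, max_dist=1)
            if merged:
                options.append((merged[1], [merged[0]], 2))
        # Split: one typed token was intended as two corpus words.
        for k in range(1, len(tokens[i])):
            left, right = tokens[i][:k], tokens[i][k:]
            if left in stats and right in stats:
                options.append((min(stats[left], stats[right]), [left, right], 1))
        if options:
            _, words, used = max(options, key=lambda option: option[0])
        else:
            words, used = [tokens[i]], 1  # nothing plausible found: keep the token
        out.extend(words)
        i += used
    return " ".join(out)
```

With a small corpus of email-like documents passed to build_stats, correct can repair an in-word typo, merge "key board" into "keyboard" when the merged word is well attested in the corpus, and split a run-together token into two known words. A greedy pass like this commits to one choice per position; scoring whole-query candidates jointly would be a natural refinement, which this sketch does not attempt.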
