Scientific report: "Optimizing Language Model Information Retrieval System with Expectation Maximization Algorithm"


Justin Liang-Te Chiu
Department of Computer Science and Information Engineering, National Taiwan University
1 Roosevelt Rd., Sec. 4, Taipei, Taiwan 106, ROC
b94902009@ntu.edu.tw

Jyun-Wei Huang
Department of Computer Science and Engineering, Yuan Ze University
135 Yuan-Tung Road, Chungli, Taoyuan, Taiwan, ROC
s976017@mail.yzu.edu.tw

Abstract

Statistical language modeling (SLM) has been used in many different domains for decades and has recently also been applied to information retrieval (IR). Documents retrieved using this approach are ranked according to their probability of generating the given query. In this paper, we present a novel approach that employs the generalized Expectation Maximization (EM) algorithm to improve language models by representing their parameters as observation probabilities of Hidden Markov Models (HMM). In the experiments, we demonstrate that our method outperforms standard SLM-based and tf.idf-based methods on TREC 2005 HARD Track data.

1 Introduction

In 1945, soon after the computer was invented, Vannevar Bush wrote a famous article, "As We May Think" (V. Bush, 1996), which formed the basis of research into Information Retrieval (IR). The pioneers in IR developed two models for ranking: the vector space model (G. Salton and M. J. McGill, 1986) and the probabilistic model (S. E. Robertson and S. Jones, 1976).
Since then, classical probabilistic models of relevance have been widely studied. For example, Robertson (S. E. Robertson and S. Walker, 1994; S. E. Robertson, 1977) modeled word occurrences into relevant or non-relevant classes and ranked documents according to the probability that they belong to the relevant class. In 1998, Ponte and Croft (1998) proposed a language modeling framework, which opened a new point of view in IR. In this approach, they gave up the model of relevance; instead, they treated query generation as random sampling from every document model. The retrieval
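
The query-generation view described above can be illustrated with a small sketch. It scores each document by the log-probability that a unigram model of that document generates the query, then ranks by that score. The linear interpolation with collection statistics, the weight `lam`, the function names, and the toy documents are all assumptions made for illustration; they are not taken from the paper, and this is not the EM-based method the paper proposes.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """Return log P(query | doc) under a unigram document model.

    Assumption: document and collection estimates are mixed linearly
    with weight `lam` so that unseen query terms do not zero the score.
    """
    doc_counts = Counter(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / len(doc) if doc else 0.0
        p_coll = collection_counts[term] / collection_len
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            # Term occurs nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Return document indices sorted by descending query likelihood."""
    collection_counts = Counter(term for doc in docs for term in doc)
    collection_len = sum(collection_counts.values())
    scored = [
        (query_log_likelihood(query, doc, collection_counts, collection_len, lam), i)
        for i, doc in enumerate(docs)
    ]
    return [i for _, i in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical toy collection, tokenized by whitespace.
docs = [
    "language model retrieval".split(),
    "vector space model".split(),
    "hidden markov model training".split(),
]
ranking = rank("language retrieval".split(), docs)
```

Here the first document is the only one containing both query terms, so it receives the highest likelihood and appears first in `ranking`.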
