Scientific report: "MARS: Multilingual Access and Retrieval System with Enhanced Query Translation and Document Retrieval"

Lianhau Lee, Aiti Aw, Thuy Vu, Sharifah Aljunied Mahani, Min Zhang, Haizhou Li
Institute for Infocomm Research
1 Fusionopolis Way, 21-01 Connexis, Singapore 138632
{lhlee, aaiti, tvu, smaljunied, mzhang, hli}@

Abstract

In this paper we introduce a multilingual access and retrieval system with enhanced query translation and multilingual document retrieval. The system mines bilingual terminologies and aligned documents directly from the set of comparable corpora that users will search. Extracting bilingual terminologies and aligning bilingual documents with similar content prior to the search process provides more accurate translated terms for in-domain data and supports multilingual retrieval even without the use of a translation tool at retrieval time. The system includes a user-friendly graphical user interface designed to provide navigation and retrieval of information in browse mode and search mode, respectively.

1 Introduction

Query translation is an important step in cross-language information retrieval (CLIR). Currently, most CLIR systems rely on various kinds of dictionaries, for example WordNets (Luca and Nurnberger, 2006; Ranieri et al., 2004), for query translation. Although dictionaries can provide effective translations of common words or even phrases, they are always limited in coverage.

Hence there is a need to expand existing collections of bilingual terminologies through various means. Recently, more and more research has focused on bilingual terminology extraction from comparable corpora. Some promising results have been reported, making use of statistics, linguistics (Sadat et al., 2003), transliteration (Udupa et al., 2008), date information (Tao and Zhai, 2005) and the document alignment approach (Talvensaari et al., 2007). In this paper we introduce our Multilingual Access and Retrieval System, MARS, which addresses the query
