Scientific report: "Should we Translate the Documents or the Queries in Cross-language Information Retrieval?"

J. Scott McCarley
IBM . Watson Research Center . Box 218 Yorktown Heights, NY 10598
jsmc@

Abstract

Previous comparisons of document and query translation suffered difficulty due to the differing quality of machine translation in these two opposite directions. We avoid this difficulty by training identical statistical translation models for both translation directions using the same training data. We investigate information retrieval between English and French, incorporating both translation directions into both document translation-based and query translation-based information retrieval, as well as into hybrid systems. We find that hybrids of document and query translation-based systems outperform query translation systems, even human-quality query translation systems.

1 Introduction

Should we translate the documents or the queries in cross-language information retrieval? The question is more subtle than the implied two alternatives. The need for translation has itself been questioned: although non-translation based methods of cross-language information retrieval (CLIR), such as cognate-matching (Buckley et al., 1998) and cross-language Latent Semantic Indexing (Dumais et al., 1997), have been developed, the most common approaches have involved coupling information retrieval (IR) with machine translation (MT).
For convenience, we refer to dictionary-lookup techniques and interlingua (Diekema et al., 1999) as "translation" even if these techniques make no attempt to produce coherent or sensibly-ordered language; this distinction is important in other areas, but a stream of words is adequate for IR. Translating the documents into the query's language(s) and translating the queries into the document's language(s) represent two extreme approaches to coupling MT and IR. These two approaches are neither equivalent nor mutually exclusive. They are not equivalent because machine translation is not an ...
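
As a rough illustration of the two couplings of MT and IR described above, and of one way they might be combined, consider the following minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy lexicons `EN_TO_FR` and `FR_TO_EN` stand in for trained statistical translation models, `overlap_score` stands in for a real retrieval score, and averaging the two scores in `hybrid_score` is just one simple way to build a hybrid. The two lexicons are deliberately not inverses of each other, so the two directions can disagree on the same query-document pair.

```python
from collections import Counter

# Hypothetical toy lexicons standing in for trained translation models.
# They are deliberately NOT inverses of each other.
EN_TO_FR = {"house": "maison", "black": "noir"}
FR_TO_EN = {"maison": "house", "domicile": "house", "noir": "black"}


def translate(words, lexicon):
    """Word-by-word lookup; unknown words pass through unchanged.

    The output is an unordered 'stream of words', not fluent text.
    """
    return [lexicon.get(word, word) for word in words]


def overlap_score(query_words, doc_words):
    """Toy retrieval score: total occurrences of query terms in the document."""
    counts = Counter(doc_words)
    return sum(counts[word] for word in query_words)


def query_translation_score(query_en, doc_fr):
    """Translate the English query into French, then match in French."""
    return overlap_score(translate(query_en, EN_TO_FR), doc_fr)


def document_translation_score(query_en, doc_fr):
    """Translate the French document into English, then match in English."""
    return overlap_score(query_en, translate(doc_fr, FR_TO_EN))


def hybrid_score(query_en, doc_fr):
    """Assumed hybrid: the plain average of the two directional scores."""
    return 0.5 * (
        query_translation_score(query_en, doc_fr)
        + document_translation_score(query_en, doc_fr)
    )


if __name__ == "__main__":
    query = ["house"]
    document = ["le", "domicile"]
    print("query translation:   ", query_translation_score(query, document))
    print("document translation:", document_translation_score(query, document))
    print("hybrid:              ", hybrid_score(query, document))
```

With this toy data, the query-translation path misses the document because "house" is rendered as "maison" while the document says "domicile", whereas the document-translation path maps "domicile" back to "house" and finds the match. The hybrid keeps part of that evidence, which is the kind of complementarity a combined system can exploit.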
