Scientific report: "Adaptation of Statistical Machine Translation Model for Cross-Lingual Information Retrieval in a Service Context"



One of the important observations made during the CLEF 2009 campaign (Ferro and Peters, 2009) related to CLIR was that the use of Statistical Machine Translation (SMT) systems (e.g. Google Translate) for query translation led to important improvements in cross-lingual retrieval performance (the best CLIR performance increased from ~55% of the monolingual baseline in 2008 to more than 90% in 2009 for French and German target languages). However, general-purpose SMT systems are not necessarily adapted for query translation. That is because SMT systems trained on a corpus of standard parallel phrases take into account the phrase structure implicitly

Adaptation of Statistical Machine Translation Model for Cross-Lingual Information Retrieval in a Service Context

Vassilina Nikoulina, Xerox Research Center Europe, vassilina.nikoulina@xrce.xerox.com
Bogomil Kovachev, Informatics Institute, University of Amsterdam, B.K.Kovachev@uva.nl
Nikolaos Lagos, Xerox Research Center Europe, nikolaos.lagos@xrce.xerox.com
Christof Monz, Informatics Institute, University of Amsterdam, C.Monz@uva.nl

Abstract

This work proposes to adapt an existing general SMT model to the task of translating queries that are subsequently used to retrieve information from a target-language collection. In the scenario that we focus on, access to the document collection itself is not available and changes to the IR model are not possible. We propose two ways to achieve the adaptation effect, both of which tune parameter weights on a set of parallel queries. The first approach uses a standard tuning procedure optimizing for BLEU score, and the second uses a reranking approach optimizing for MAP score. We also extend the second approach with syntax-based features. Our experiments show improvements of 1-2.5 in terms of MAP score over retrieval with the non-adapted translation. We show that these improvements are due to the integration of both the adaptation and the syntax features into the query translation task.

1 Introduction

Cross-Lingual Information Retrieval (CLIR) is an important feature for any digital content provider in today's multilingual environment. However, many content providers are unwilling either to change their existing, well-established document indexing and search tools or to give a third-party external service access to their document collection. The work presented in this paper assumes such a context of use, in which a query translation service translates queries posed to a content provider's search engine into several target languages without requiring changes to the underlying IR system.
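To make the reranking idea from the abstract more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the function names, the feature names (`tm`, `lm`, `syntax`) and the N-best data layout are all hypothetical and chosen only for illustration. It shows how Mean Average Precision (MAP) is computed from ranked retrieval results, and how a linear model over feature weights could pick one translation out of an N-best list.

```python
from typing import Dict, List, Sequence, Set


def average_precision(ranked_doc_ids: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list against its relevant doc ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, Sequence[str]], qrels: Dict[str, Set[str]]
) -> float:
    """MAP: the mean of average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(average_precision(docs, qrels.get(qid, set()))
                for qid, docs in runs.items())
    return total / len(runs)


def rerank(nbest: List[dict], weights: Dict[str, float]) -> dict:
    """Return the hypothesis with the highest weighted feature sum.

    Each hypothesis is assumed (hypothetically) to look like
    {"text": str, "features": {feature_name: value}}.
    """
    def score(hyp: dict) -> float:
        return sum(weights.get(name, 0.0) * value
                   for name, value in hyp["features"].items())
    return max(nbest, key=score)


# Toy data, invented for illustration only.
runs = {"q1": ["d1", "d2", "d3"], "q2": ["d5", "d4"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d4"}}
map_score = mean_average_precision(runs, qrels)

nbest = [
    {"text": "translation A", "features": {"tm": -2.0, "lm": -1.0, "syntax": 0.0}},
    {"text": "translation B", "features": {"tm": -2.5, "lm": -1.2, "syntax": 1.0}},
]
best = rerank(nbest, {"tm": 1.0, "lm": 1.0, "syntax": 2.0})
```

In a setup like the one the abstract describes, the weights passed to `rerank` would be the quantities tuned on the set of parallel queries, with the MAP of the resulting retrieval runs as the objective; the tuning loop itself is left out of this sketch.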

