TAILIEUCHUNG - Scientific report: "Creating Multilingual Translation Lexicons with Regional Variations Using Web Corpora"

For example, the company name “Sony” is transliterated into “新力” (xinli) in Taiwan and “索尼” (suoni) in mainland China. Such terms, in today’s increasingly internationalized world, are appearing more and more often.

Creating Multilingual Translation Lexicons with Regional Variations Using Web Corpora

Pu-Jen Cheng, Yi-Cheng Pan, Wen-Hsiang Lu, and Lee-Feng Chien
Institute of Information Science, Academia Sinica, Taiwan
Dept. of Computer Science and Information Engineering, National Cheng Kung Univ., Taiwan
Dept. of Information Management, National Taiwan University, Taiwan
{pjcheng, thomas02, whlu, lfchien}@

Abstract

The purpose of this paper is to automatically create multilingual translation lexicons with regional variations. We propose a transitive translation approach to determine translation variations across languages that have insufficient corpora for translation, via the mining of bilingual search-result pages and clues of geographic information obtained from Web search engines. The experimental results have shown the feasibility of the proposed approach in efficiently generating translation equivalents of various terms not covered by general translation dictionaries. They also revealed that the created translation lexicons can reflect different cultural aspects across regions such as Taiwan, Hong Kong, and mainland China.

1 Introduction

Compilation of translation lexicons is a crucial process for machine translation (MT) (Brown et al., 1990) and cross-language information retrieval (CLIR) systems (Nie et al., 1999). A lot of effort has been spent on constructing translation lexicons from domain-specific corpora in an automatic way (Melamed, 2000; Smadja et al., 1996; Kupiec, 1993).
However, such methods encounter two fundamental problems that are worthy of further investigation: the translation of regional variations, and the lack of up-to-date, high-lexical-coverage corpus sources. The first problem results from the fact that the translations of a term may vary across dialectal regions. Translation lexicons constructed with conventional methods may not adapt to regional usages. For example, a Chinese-English lexicon constructed using a Hong Kong corpus cannot be .
