Scientific report: "A High-Accurate Chinese-English NE Backward Translation System Combining Both Lexical Information and Web Statistics"

Conrad Chen, Hsin-Hsi Chen
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
drchen@ hhchen@

Abstract

Named entity translation is indispensable in cross-language information retrieval. We propose an approach that combines lexical information, web statistics, and inverse search based on Google to backward translate a Chinese named entity (NE) into English. Our system achieves a high Top-1 accuracy of [...], which is relatively good performance among the results reported in this area to date.

1 Introduction

Translation of named entities (NEs) attracts much attention due to its practical applications on the World Wide Web. The main challenges are that NEs span many genres, that NEs form an open vocabulary, and that their translations are very flexible.

Some previous approaches use phonetic similarity to identify corresponding transliterations, i.e., translations by phonetic value (Lin and Chen, 2002; Lee and Chang, 2003). Other approaches combine lexical (phonetic and meaning) and semantic information to find corresponding translations of NEs in bilingual corpora (Feng et al., 2004; Huang et al., 2004; Lam et al., 2004). These studies focus on the alignment of NEs in parallel or comparable corpora, which is called close-ended NE translation.

In open-ended NE translation, an arbitrary NE is given and we want to find its corresponding translations. Most previous approaches exploit web search engines to help find translation candidates on the Internet.
Al-Onaizan and Knight (2003) adopt language models to generate possible candidates first and then verify these candidates by web statistics. They achieve a Top-1 accuracy of about [...] on Arabic-to-English translation. Lu et al. (2004) use statistics of anchor texts in web search results to identify translations and obtain a Top-1 accuracy of about [...] in [...].
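
To make the generate-then-verify idea and the Top-1 accuracy measure mentioned above concrete, the following is a minimal sketch and not the method of any cited system. It assumes each candidate translation already comes with a web co-occurrence count. The names `rank_candidates` and `top1_accuracy`, the candidate labels, and all counts are hypothetical and made up purely for illustration. A real system would obtain such counts from a search engine.

```python
from typing import Dict, List


def rank_candidates(cooccurrence: Dict[str, int]) -> List[str]:
    """Order candidates by descending count; ties are broken alphabetically."""
    return sorted(cooccurrence, key=lambda cand: (-cooccurrence[cand], cand))


def top1_accuracy(ranked_lists: List[List[str]], references: List[str]) -> float:
    """Fraction of queries whose first-ranked candidate equals the reference."""
    if len(ranked_lists) != len(references):
        raise ValueError("ranked_lists and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        1
        for ranked, ref in zip(ranked_lists, references)
        if ranked and ranked[0] == ref
    )
    return hits / len(references)


# Hypothetical counts for two source NEs (illustrative values, not real data).
toy_counts = [
    {"Candidate A": 120, "Candidate B": 15},
    {"Candidate C": 3, "Candidate D": 40},
]
# Hypothetical reference translations for the same two source NEs.
toy_references = ["Candidate A", "Candidate C"]

ranked = [rank_candidates(counts) for counts in toy_counts]
accuracy = top1_accuracy(ranked, toy_references)
print(ranked)
print(accuracy)
```

In this toy setup the first query is ranked correctly and the second is not, so one of the two Top-1 predictions matches its reference. Raw counts are the simplest possible verification signal. The sketch does not attempt to model the lexical information or inverse search that the abstract describes.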
