Scientific report: "Compiling a Massive, Multilingual Dictionary via Probabilistic Inference"

Mausam, Stephen Soderland, Oren Etzioni, Daniel S. Weld, Michael Skinner, Jeff Bilmes
University of Washington, Seattle; Google, Seattle
{mausam, soderlan, etzioni, weld, bilmes}@ mskinner@

Abstract

Can we automatically compose a large set of Wiktionaries and translation dictionaries to yield a massive, multilingual dictionary whose coverage is substantially greater than that of any of its constituent dictionaries? The composition of multiple translation dictionaries leads to a transitive inference problem: if word A translates to word B, which in turn translates to word C, what is the probability that C is a translation of A? The paper introduces a novel algorithm that solves this problem for 10,000,000 words in more than 1,000 languages. The algorithm yields PanDictionary, a novel multilingual dictionary. PanDictionary contains more than four times as many translations as the largest Wiktionary at precision [...], and over 200,000,000 pairwise translations in over 200,000 language pairs at precision [...].

1 Introduction and Motivation

In the era of globalization, inter-lingual communication is becoming increasingly important.
Although nearly 7,000 languages are in use today (Gordon, 2005), most language resources are mono-lingual or [...]¹ This paper investigates whether Wiktionaries and other translation dictionaries available over the Web can be automatically composed to yield a massive multilingual dictionary with superior coverage at comparable precision. We describe the automatic construction of a massive multilingual translation dictionary, called [...]

¹ The English Wiktionary, a lexical resource developed by volunteers over the Internet, is one notable exception that contains translations of English words in about 500 languages.

Figure 1: A fragment of the translation graph for two senses of the English word "spring". Edges labeled 1 and 3 are for "spring" in the sense of a season, and 2 [...]
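
To make the transitive inference problem concrete, the following is a minimal sketch of a toy translation graph and of what goes wrong when translations are chained naively. It is not the paper's algorithm. The words, the sense IDs, and the names `EDGES`, `build_graph`, and `two_hop_candidates` are all hypothetical and chosen for illustration; they are not taken from Figure 1.

```python
from collections import defaultdict

# Hypothetical toy data: each edge links two (language, word) nodes and
# carries an illustrative sense ID (1 = season, 2 = coiled metal spring).
EDGES = [
    (("en", "spring"), ("fr", "printemps"), 1),
    (("en", "spring"), ("fr", "ressort"), 2),
    (("en", "spring"), ("es", "primavera"), 1),
    (("en", "spring"), ("es", "muelle"), 2),
]


def build_graph(edges):
    """Build an undirected adjacency map: node -> set of (neighbor, sense)."""
    graph = defaultdict(set)
    for a, b, sense in edges:
        graph[a].add((b, sense))
        graph[b].add((a, sense))
    return graph


def two_hop_candidates(graph, source):
    """Map every node reachable in exactly two hops to a boolean that is
    True only if some two-hop path keeps the same sense on both edges."""
    result = {}
    for mid, sense_in in graph[source]:
        for target, sense_out in graph[mid]:
            if target == source:
                continue
            same_sense = sense_in == sense_out
            result[target] = result.get(target, False) or same_sense
    return result


candidates = two_hop_candidates(build_graph(EDGES), ("fr", "printemps"))
for node, ok in sorted(candidates.items()):
    print(node, "plausible translation" if ok else "sense mismatch")
```

Chaining through the ambiguous English word "spring" proposes both "primavera" and "muelle" as Spanish translations of "printemps", but only the first preserves the season sense. The toy sense IDs make the answer trivial here; without such labels, one can only estimate how likely a two-hop pair is to be a true translation, which is the probability the abstract asks about.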
