Scientific report: "Exploiting Aggregate Properties of Bilingual Dictionaries For Distinguishing Senses of English Words and Inducing English Sense Clusters"

Charles SCHAFER and David YAROWSKY
Department of Computer Science and Center for Language and Speech Processing
Johns Hopkins University
Baltimore, MD 21218, USA
cschafer yarowsky @

Abstract

We propose a novel method for inducing monolingual semantic hierarchies and sense clusters from numerous foreign-language-to-English bilingual dictionaries. The method exploits patterns of non-transitivity in translations across multiple languages. No complex or hierarchical structure is assumed or used in the input dictionaries: each is initially parsed into the "lowest common denominator" form, which is to say, a list of pairs of the form (foreign word, English word). We then propose a monolingual synonymy measure derived from this aggregate resource, which is used to derive multilingually-motivated sense hierarchies for monolingual English words, with potential applications in word sense classification, lexicography and statistical machine translation.

1 Introduction

In this work we consider a learning resource comprising over 80 foreign-language-to-English bilingual dictionaries, collected by downloading electronic dictionaries from the Internet and also scanning and running optical character recognition (OCR) software on paper dictionaries. Such a diverse parallel lexical data set has not, to our knowledge, previously been assembled and examined in its aggregate form as a lexical semantics training resource. We show that this aggregate data set admits of some surprising applications, including discovery of synonymy relationships between words and automatic induction of high-quality hierarchical word sense clusterings for English. We perform and describe several experiments deriving synonyms and sense groupings from the aggregate bilingual dictionary and subsequently suggest some possible applications for the results. Finally we propose that sense
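
The "lowest common denominator" representation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' synonymy measure, which this excerpt does not define. As a stand-in, it simply counts the number of languages in which two English words share at least one foreign-language translation. The function name `shared_language_counts`, the added language tag on each pair, and the toy entries (placeholder language codes and foreign words) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def shared_language_counts(entries):
    """Count, per pair of English words, the languages linking them.

    entries: iterable of (language, foreign_word, english_word) triples,
    i.e. the flat pair list from the abstract plus a language tag
    (the tag is an assumption of this sketch).
    """
    # Group the English translations of each foreign word, per language.
    translations = defaultdict(set)
    for language, foreign, english in entries:
        translations[(language, foreign)].add(english)

    # For each pair of English words, collect the languages in which a
    # single foreign word translates to both of them.
    languages_for_pair = defaultdict(set)
    for (language, _foreign), english_words in translations.items():
        for a, b in combinations(sorted(english_words), 2):
            languages_for_pair[(a, b)].add(language)

    return {pair: len(langs) for pair, langs in languages_for_pair.items()}


# Toy data, invented for illustration only (not taken from any dictionary).
toy_entries = [
    ("lang_a", "f1", "fair"),
    ("lang_a", "f1", "blond"),
    ("lang_a", "f2", "fair"),
    ("lang_a", "f2", "just"),
    ("lang_b", "g1", "fair"),
    ("lang_b", "g1", "blond"),
    ("lang_b", "g2", "just"),
]

counts = shared_language_counts(toy_entries)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

In this toy data "fair" is linked to both "blond" and "just", but "blond" and "just" are never linked to each other. That is the kind of non-transitivity pattern the abstract refers to, here suggesting two distinct senses of "fair".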
