Scientific report: "Combining Unsupervised Lexical Knowledge Methods for Word Sense Disambiguation"


German Rigau, Jordi Atserias
Dept. de Llenguatges i Sist. Informàtics
Universitat Politècnica de Catalunya
Barcelona, Catalonia
{g.rigau, batalla}@lsi.upc.es

Eneko Agirre
Lengoaia eta Sist. Informatikoak saila
Euskal Herriko Unibertsitatea
Donostia, Basque Country
jibagbee@si.ehu.es

Abstract

This paper presents a method to combine a set of unsupervised algorithms that can accurately disambiguate word senses in a large, completely untagged corpus. Although most of the techniques for word sense resolution have been presented as stand-alone, it is our belief that full-fledged lexical ambiguity resolution should combine several information sources and techniques. The set of techniques has been applied in a combined way to disambiguate the genus terms of two machine-readable dictionaries (MRDs), enabling us to construct complete taxonomies for Spanish and French. Tested accuracy is above 80% overall and 95% for two-way ambiguous genus terms, showing that taxonomy building is not limited to structured dictionaries such as LDOCE.

1 Introduction

While in English the "lexical bottleneck" problem (Briscoe, 1991) seems to be softened (e.g. WordNet (Miller, 1990), Alvey Lexicon (Grover et al., 1993), COMLEX (Grishman et al., 1994), etc.), there are no available wide-range lexicons for natural language processing (NLP) for other languages. Manual construction of lexicons is the most reliable technique for obtaining structured lexicons, but it is costly and highly time-consuming. For this reason, many researchers have focused on the massive acquisition of lexical knowledge and semantic information from pre-existing structured lexical resources as automatically as possible.

As dictionaries are special texts whose subject matter is a language (or a pair of languages in the case of bilingual ...

Footnote: This research has been partially funded by CICYT TIC96-1243-C03-02 (ITEM project) and the European Commission LE-4003 (EuroWordNet project).

