Sense Disambiguation Using Semantic Relations and Adjacency Information

Anil S. Chakravarthy
MIT Media Laboratory, 20 Ames Street E15-468a, Cambridge MA 02139
anil@media.mit.edu

Abstract

This paper describes a heuristic-based approach to word-sense disambiguation. The heuristics that are applied to disambiguate a word depend on its part of speech, and on its relationship to neighboring salient words in the text. Parts of speech are found through a tagger, and related neighboring words are identified by a phrase extractor operating on the tagged text. To suggest possible senses, each heuristic draws on semantic relations extracted from a Webster's dictionary and the semantic thesaurus WordNet. For a given word, all applicable heuristics are tried, and those senses that are rejected by all heuristics are discarded. In all, the disambiguator uses 39 heuristics based on 12 relationships.

1 Introduction

Word-sense disambiguation has long been recognized as a difficult problem in computational linguistics. As early as 1960, Bar-Hillel [1] noted that a computer program would find it challenging to recognize the two different senses of the word "pen" in "The pen is in the box" and "The box is in the pen."
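The filtering rule stated in the abstract (try every applicable heuristic and discard the senses that all of them reject) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Sense` record, the two toy heuristics, the `HEURISTICS_BY_POS` table, and the sample entries for "pen" are hypothetical and are not the paper's 39 heuristics or its dictionary data. The choice to keep all candidates when no heuristic applies is also an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Sense:
    label: str               # short name for the sense
    gloss: str               # dictionary-style definition (toy data)
    related: FrozenSet[str]  # thesaurus-style related words (toy data)


# A heuristic returns True when it supports a sense, given one salient neighbor.
Heuristic = Callable[[Sense, str], bool]


def neighbor_in_gloss(sense: Sense, neighbor: str) -> bool:
    """Hypothetical heuristic: the neighbor occurs in the sense's definition."""
    return neighbor in sense.gloss.split()


def neighbor_is_related(sense: Sense, neighbor: str) -> bool:
    """Hypothetical heuristic: the neighbor is listed as a related word."""
    return neighbor in sense.related


# Heuristics are selected by part of speech (illustrative table only).
HEURISTICS_BY_POS: Dict[str, List[Heuristic]] = {
    "noun": [neighbor_in_gloss, neighbor_is_related],
}


def disambiguate(senses: List[Sense], pos: str, neighbor: str) -> List[Sense]:
    """Keep each sense that at least one applicable heuristic supports.

    A sense is discarded only when every applicable heuristic rejects it.
    Assumption: if no heuristic applies to this part of speech, all
    candidate senses are kept.
    """
    heuristics = HEURISTICS_BY_POS.get(pos, [])
    if not heuristics:
        return list(senses)
    return [s for s in senses if any(h(s, neighbor) for h in heuristics)]


# Toy sense inventory for "pen"; glosses and related words are made up.
PEN_SENSES = [
    Sense("writing instrument", "an instrument for writing with ink",
          frozenset({"paper"})),
    Sense("enclosure", "a small enclosure for animals",
          frozenset({"fence"})),
]
```

With these toy entries, `disambiguate(PEN_SENSES, "noun", "ink")` keeps only the writing-instrument sense, while a neighbor such as "box" is supported by neither heuristic and leaves no candidates.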
In recent years, there has been a resurgence of interest in word-sense disambiguation due to the availability of linguistic resources like dictionaries and thesauri, and due to the importance of disambiguation in applications like information retrieval and machine translation. The task of disambiguation is to assign a word to one or more senses in a reference by taking into account the context in which the word occurs. The reference can be a standard dictionary or thesaurus, or a lexicon constructed specially for some application. The context is provided by the text unit (paragraph, sentence, etc.) in which the word occurs. The disambiguator described in this paper is based on two reference sources: Webster's Seventh Dictionary and the semantic thesaurus WordNet [12]. Before the .