Homonymy and Polysemy in Information Retrieval

Robert Krovetz
NEC Research Institute
4 Independence Way
Princeton, NJ 08540
krovetz@research.nj.nec.com

Abstract

This paper discusses research on distinguishing word meanings in the context of information retrieval systems. We conducted experiments with three sources of evidence for making these distinctions: morphology, part-of-speech, and phrases. We have focused on the distinction between homonymy and polysemy (unrelated vs. related meanings). Our results support the need to distinguish homonymy and polysemy. We found: 1) grouping morphological variants makes a significant improvement in retrieval performance, 2) that more than half of all words in a dictionary that differ in part-of-speech are related in meaning, and 3) that it is crucial to assign credit to the component words of a phrase. These experiments provide a better understanding of word-based methods, and suggest where natural language processing can provide further improvements in retrieval performance.

1 Introduction

Lexical ambiguity is a fundamental problem in natural language processing, but relatively little quantitative information is available about the extent of the problem or about the impact that it has on specific applications. We report on our experiments to resolve lexical ambiguity in the context of information retrieval (IR).
This paper is based on work that was done at the Center for Intelligent Information Retrieval at the University of Massachusetts. It was supported by the National Science Foundation, Library of Congress, and Department of Commerce under cooperative agreement number EEC-9209623. I am grateful for their support.

Our approach to disambiguation is to treat the information associated with dictionary senses (morphology, part of speech, and phrases) as multiple sources of evidence.1 Experiments were designed to test each source of evidence independently, and to identify areas of interaction. Our hypothesis is:

Hypothesis 1 Resolving lexical