Learning Expressive Models for Word Sense Disambiguation

Lucia Specia
NILC/ICMC, University of São Paulo
Caixa Postal 668, 13560-970 São Carlos, SP, Brazil
lspecia@icmc.usp.br

Mark Stevenson
Department of Computer Science, University of Sheffield
Regent Court, 211 Portobello St., Sheffield, S1 4DP, UK
marks@dcs.shef.ac.uk

Maria das Graças V. Nunes
NILC/ICMC, University of São Paulo
Caixa Postal 668, 13560-970 São Carlos, SP, Brazil
gracan@icmc.usp.br

Abstract

We present a novel approach to the word sense disambiguation problem which makes use of corpus-based evidence combined with background knowledge. Employing an inductive logic programming algorithm, the approach generates expressive disambiguation rules which exploit several knowledge sources and can also model relations between them. The approach is evaluated in two tasks: identification of the correct translation for a set of highly ambiguous verbs in English-Portuguese translation, and disambiguation of verbs from the Senseval-3 lexical sample task. The average accuracy obtained for the multilingual task outperforms the other machine learning techniques investigated. In the monolingual task, the approach performs as well as the state-of-the-art systems which reported results for the same set of verbs.

1 Introduction

Word Sense Disambiguation (WSD) is concerned with the identification of the meaning of ambiguous words in context.
For example, among the possible senses of the verb "run" are "to move fast by using one's feet" and "to direct or control". WSD can be useful for many applications, including information retrieval, information extraction and machine translation. Sense ambiguity has been recognized as one of the most important obstacles to successful language understanding since the early 1960s, and many techniques have been proposed to solve the problem. Recent approaches focus on the use of various lexical resources and corpus-based techniques in order to avoid the substantial effort required to codify linguistic knowledge.