Detecting Compositionality in Multi-Word Expressions

Ioannis Korkontzelos
Department of Computer Science, The University of York, Heslington, York, YO10 5NG, UK
johnkork@cs.york.ac.uk

Suresh Manandhar
Department of Computer Science, The University of York, Heslington, York, YO10 5NG, UK
suresh@cs.york.ac.uk

Abstract

Identifying whether a multi-word expression (MWE) is compositional or not is important for numerous NLP applications. Sense induction can partition the context of MWEs into semantic uses and therefore aid in deciding compositionality. We propose an unsupervised system to explore this hypothesis on compound nominals, proper names and adjective-noun constructions, and evaluate the contribution of sense induction. The evaluation set is derived from WordNet in a semi-supervised way. Graph connectivity measures are employed for unsupervised parameter tuning.

1 Introduction and related work

Multi-word expressions (MWEs) are sequences of words that tend to cooccur more frequently than chance and are either idiosyncratic or decomposable into multiple simple words (Baldwin, 2006). Deciding idiomaticity of MWEs is highly important for machine translation, information retrieval, question answering, lexical acquisition, parsing and language generation.

Compositionality refers to the degree to which the meaning of a MWE can be predicted by combining the meanings of its components. Unlike syntactic compositionality (e.g. by and large), semantic compositionality is continuous (Baldwin, 2006).

In this paper, we propose a novel unsupervised approach that compares the major senses of a MWE and its semantic head, using distributional similarity measures, to test the compositionality of the MWE. These senses are induced by a graph-based sense induction system, whose parameters are estimated in an unsupervised manner exploiting a number of graph connectivity measures (Korkontzelos et al., 2009). Our method partitions the context space and only uses the major senses, filtering out minor senses. In our .