Latent Variable Models of Concept-Attribute Attachment

Joseph Reisinger*
Department of Computer Sciences
The University of Texas at Austin
Austin, Texas 78712
joeraii@cs.utexas.edu

Marius Paşca
Google Inc.
1600 Amphitheatre Parkway
Mountain View, California 94043
mars@google.com

* Contributions made during an internship at Google.

Abstract

This paper presents a set of Bayesian methods for automatically extending the WordNet ontology with new concepts and annotating existing concepts with generic property fields, or attributes. We base our approach on Latent Dirichlet Allocation and evaluate along two dimensions: (1) the precision of the ranked lists of attributes, and (2) the quality of the attribute assignments to WordNet concepts. In all cases we find that the principled LDA-based approaches outperform previously proposed heuristic methods, greatly improving the specificity of attributes at each concept.

1 Introduction

We present a Bayesian approach for simultaneously extending Is-A hierarchies, such as those found in WordNet (WN) (Fellbaum, 1998), with additional concepts and annotating the resulting concept graph with attributes, i.e., generic property fields shared by instances of that concept. Examples of attributes include height and eyecolor for the concept Person, or gdp and president for Country. Identifying and extracting such attributes relative to a set of flat (i.e., non-hierarchically organized) labeled classes of instances has been extensively studied, using a variety of data, e.g., Web search query logs (Paşca and Van Durme, 2008), Web documents (Yoshinaga and Torisawa, 2007), and Wikipedia (Suchanek et al., 2007; Wu and Weld, 2008).

Building on the current state of the art in attribute extraction, we propose a model-based approach for mapping flat sets of attributes annotated with class labels into an existing ontology. This inference problem is divided into two main components: (1) identifying the appropriate parent concept for each labeled class, and (2) learning the correct level of .