Semantic Retrieval for the Accurate Identification of Relational Concepts in Massive Textbases

Yusuke Miyao, Tomoko Ohta, Katsuya Masuda, Yoshimasa Tsuruoka, Kazuhiro Yoshida, Takashi Ninomiya, Jun'ichi Tsujii

Department of Computer Science, University of Tokyo
School of Informatics, University of Manchester
Information Technology Center, University of Tokyo
Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033 JAPAN
{yusuke,okap,kmasuda,tsuruoka,kyoshida,ninomi,tsujii}@is.s.u-tokyo.ac.jp

Abstract

This paper introduces a novel framework for the accurate retrieval of relational concepts from huge texts. Prior to retrieval, all sentences are annotated with predicate argument structures and ontological identifiers by applying a deep parser and a term recognizer. During the run time, user requests are converted into queries of region algebra on these annotations. Structural matching with pre-computed semantic annotations establishes the accurate and efficient retrieval of relational concepts. This framework was applied to a text retrieval system for MEDLINE. Experiments on the retrieval of biomedical correlations revealed that the cost is sufficiently small for real-time applications and that the retrieval precision is significantly improved.

1 Introduction

Rapid expansion of text information has motivated the development of efficient methods of accessing information in huge texts.
Furthermore, user demand has shifted toward the retrieval of more precise and complex information, including relational concepts. For example, biomedical researchers deal with a massive quantity of publications; MEDLINE contains approximately 15 million references to journal articles in life sciences, and its size is rapidly increasing at a rate of more than 10% yearly (National Library of Medicine, 2005). Researchers would like to be able to search this huge textbase for biomedical correlations, such as protein-protein or gene-disease associations (Blaschke and Valencia, 2002; Hao et al., 2005; Chun et al., 2006). However, the ...