Scientific report: "Integrating Information Extraction and Automatic Hyperlinking"

Stephan Busemann, Witold Drożdżyński, Hans-Ulrich Krieger, Jakub Piskorski, Ulrich Schäfer, Hans Uszkoreit, Feiyu Xu
German Research Center for Artificial Intelligence (DFKI GmbH)
Stuhlsatzenhausweg 3, D-66123 Saarbrücken, Germany
sprout@

Abstract

This paper presents a novel information system integrating advanced information extraction technology and automatic hyperlinking. Extracted entities are mapped into a domain ontology that relates concepts to a selection of hyperlinks. For information extraction, we use SProUT, a generic platform for the development and use of multilingual text processing components. By combining finite-state and unification-based formalisms, the grammar formalism used in SProUT offers both processing efficiency and a high degree of declarativeness. The ExtraLink demo system showcases the extraction of relevant concepts from German texts in the tourism domain, offering a direct connection to associated web documents on demand.

1 Introduction

The utilization of language technology for the creation of hyperlinks has a long history (Allen et al., 1993). Information extraction (IE) is a technology that can be applied to identifying both sources and targets of new hyperlinks. IE systems are becoming commercially viable in supporting diverse information discovery and management tasks.
Similarly, automatic hyperlinking is a maturing technology designed to interrelate pieces of information, using ontologies to define the relationships. With ExtraLink, we present a novel information system that integrates both technologies in order to reach an improved level of informativeness and comfort. Extraction and link generation occur completely in the background. Entities identified by the IE system are mapped into a domain ontology that relates concepts to a structured selection of predefined hyperlinks, which can be directly visualized on demand using a standard web browser. This way the user can
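
The mapping from extracted entities to ontology concepts and their predefined hyperlinks can be illustrated with a minimal Python sketch. This is not the ExtraLink or SProUT implementation: the `Entity` class, the `ONTOLOGY` dictionary, the `normalize` and `links_for` helpers, and the example.org URLs are all hypothetical names invented for illustration, and the real ontology is certainly richer than a flat dictionary.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    """An entity as an IE component might return it (hypothetical shape)."""
    text: str
    entity_type: str  # e.g. "location"

# Hypothetical miniature ontology: each concept instance carries a
# structured selection of predefined hyperlinks, grouped by category.
ONTOLOGY = {
    "saarbruecken": {
        "concept": "City",
        "links": {
            "Accommodation": ["https://example.org/saarbruecken/hotels"],
            "Sights": ["https://example.org/saarbruecken/sights"],
        },
    },
}

# Transliterate German umlauts so surface forms match the ontology keys.
_UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})

def normalize(text: str) -> str:
    """Lower-case and transliterate a surface form into an ontology key."""
    return text.strip().lower().translate(_UMLAUTS)

def links_for(entity: Entity) -> dict:
    """Return the hyperlink selection for an entity, or {} if it is unknown."""
    entry = ONTOLOGY.get(normalize(entity.text))
    return entry["links"] if entry else {}

if __name__ == "__main__":
    city = Entity("Saarbrücken", "location")
    for category, urls in links_for(city).items():
        print(category, urls)
```

Keeping the hyperlinks in the ontology rather than in the extraction rules means, under these assumptions, that link targets can be changed without touching the grammar that recognizes the entities.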
