Coupling Label Propagation and Constraints for Temporal Fact Extraction

Yafang Wang, Maximilian Dylla, Marc Spaniol and Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
{ywang,mdylla,mspaniol,weikum}@mpi-inf.mpg.de

Abstract

The Web and digitized text sources contain a wealth of information about named entities such as politicians, actors, companies, or cultural landmarks. Extracting this information has enabled the automated construction of large knowledge bases, containing hundreds of millions of binary relationships or attribute values about these named entities. However, in reality most knowledge is transient, i.e. changes over time, requiring a temporal dimension in fact extraction. In this paper we develop a methodology that combines label propagation with constraint reasoning for temporal fact extraction. Label propagation aggressively gathers fact candidates, and an Integer Linear Program is used to clean out false hypotheses that violate temporal constraints. Our method is able to improve recall while maintaining precision, which we demonstrate by experiments with biography-style Wikipedia pages and a large corpus of news articles.

1 Introduction

In recent years, automated fact extraction from Web contents has seen significant progress with the emergence of freely available knowledge bases such as DBpedia (Auer et al., 2007), YAGO (Suchanek et al., 2007), TextRunner (Etzioni et al., 2008), or ReadTheWeb (Carlson et al., 2010a). These knowledge bases are constantly growing; DBpedia, for example, currently contains several million entities and half a billion facts about them.
This wealth of data helps satisfy the information needs of advanced Internet users by raising queries from the level of keywords to the level of entities. This enables queries like "Who is married to Prince Charles?" or "Who are the teammates of Lionel Messi at FC Barcelona?". However, factual knowledge is highly ephemeral: royals get married and divorced, politicians hold positions only for a limited time, and