A Semi-Supervised Key Phrase Extraction Approach Learning from Title Phrases through a Document Semantic Network

Decong Li1, Sujian Li1, Wenjie Li2, Wei Wang1, Weiguang Qu3
1 Key Laboratory of Computational Linguistics, Peking University
2 Department of Computing, The Hong Kong Polytechnic University
3 School of Computer Science and Technology, Nanjing Normal University
{lidecong, lisujian, wwei}@pku.edu.cn, cswjli@comp.polyu.edu.hk, wgqu@njnu.edu.cn

Abstract

Extracting key phrases from documents is a fundamental and important task. Generally, the phrases in a document are not independent in delivering its content. To capture their relationships and make better use of them in key phrase extraction, we propose using Wikipedia knowledge to model a document as a semantic network in which both n-ary and binary relationships among phrases are formulated. Based on the commonly accepted assumption that the title of a document is always elaborated to reflect the content of the document, and that key phrases consequently tend to be semantically close to the title, we propose a novel semi-supervised key phrase extraction approach. It computes phrase importance in the semantic network, through which the influence of the title phrases is propagated to the other phrases iteratively. Experimental results demonstrate the remarkable performance of this approach.
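The propagation idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes only binary, weighted relations between phrases (the n-ary relations are omitted), and it swaps in a personalized-PageRank-style update in which title phrases act as the labeled seeds. The function name `propagate_title_influence`, the `damping` and `iterations` parameters, and the example weights are all hypothetical.

```python
from typing import Dict, List, Tuple


def propagate_title_influence(
    edges: Dict[Tuple[str, str], float],
    title_phrases: List[str],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Score phrases by iteratively spreading importance from title phrases.

    Assumption: edges are undirected with positive relatedness weights.
    """
    # Build a symmetric weighted adjacency map from the binary relations.
    neighbors: Dict[str, Dict[str, float]] = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight
    for phrase in title_phrases:
        neighbors.setdefault(phrase, {})

    phrases = sorted(neighbors)
    title_set = set(title_phrases)

    # Title phrases share the prior mass equally; all other phrases start at 0.
    prior = {p: (1.0 / len(title_set) if p in title_set else 0.0) for p in phrases}
    score = dict(prior)

    for _ in range(iterations):
        new_score = {}
        for p in phrases:
            # Each neighbor q passes on its score in proportion to edge weight.
            incoming = sum(
                score[q] * w / sum(neighbors[q].values())
                for q, w in neighbors[p].items()
            )
            new_score[p] = (1.0 - damping) * prior[p] + damping * incoming
        score = new_score
    return score


# Hypothetical relatedness weights between candidate phrases.
example_edges = {
    ("key phrase extraction", "semantic network"): 0.8,
    ("semantic network", "wikipedia"): 0.6,
    ("wikipedia", "weather"): 0.1,
}
scores = propagate_title_influence(example_edges, ["key phrase extraction"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, phrases strongly connected to the title accumulate more score than weakly connected ones, so `ranking` orders the candidates by their propagated importance.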
1 Introduction

Key phrases are defined as the phrases that express the main content of a document. Guided by the given key phrases, people can easily understand what a document describes, saving a great amount of time reading the whole text. Consequently, automatic key phrase extraction is in high demand. Meanwhile, it is also fundamental to many other natural language processing applications, such as information retrieval, text clustering, and so on. Key phrase extraction can normally be cast as a ranking problem solved by either supervised or unsupervised methods. Supervised learning requires a large amount of .