Scientific report: "Language Independent Extractive Summarization"

Rada Mihalcea
Department of Computer Science and Engineering
University of North Texas
rada@cs.unt.edu

Abstract

We demonstrate TextRank, a system for unsupervised extractive summarization that relies on the application of iterative graph-based ranking algorithms to graphs encoding the cohesive structure of a text. An important characteristic of the system is that it does not rely on any language-specific knowledge resources or any manually constructed training data, and thus it is highly portable to new languages or domains.

1 Introduction

Given the overwhelming amount of information available today, on the Web and elsewhere, techniques for efficient automatic text summarization are essential to improve access to such information. Algorithms for extractive summarization are typically based on techniques for sentence extraction, and attempt to identify the set of sentences that are most important for the understanding of a given document.

Some of the most successful approaches to extractive summarization consist of supervised algorithms that attempt to learn what makes a good summary by training on collections of summaries built for a relatively large number of training documents (e.g., Hirao et al., 2002; Teufel and Moens, 1997). However, the price paid for the high performance of such supervised algorithms is their inability to easily adapt to new languages or domains, as new training data are required for each new type of data.
TextRank (Mihalcea and Tarau, 2004; Mihalcea, 2004) is specifically designed to address this problem by using an extractive summarization technique that does not require any training data or any language-specific knowledge sources. TextRank can be effectively applied to the summarization of documents in different languages without any modifications of the algorithm and without any requirements for additional data. Moreover, results from experiments performed on standard data sets have demonstrated that ...
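
To make the idea of iterative graph-based ranking over sentences concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the system described in the paper: the excerpt above does not specify the edge weights or the ranking formula, so the token-overlap similarity, the PageRank-style update, the damping value of 0.85, the fixed iteration count, and the function names `tokenize`, `similarity`, `rank_sentences`, and `summarize` are all choices made for this example. The only preprocessing is lowercasing and splitting on word characters, so no language-specific resource is involved.

```python
import math
import re


def tokenize(sentence):
    # Lowercased word tokens only; no stemmer, stop list, or lexicon is used.
    return set(re.findall(r"\w+", sentence.lower()))


def similarity(a, b):
    # Assumed edge weight: number of shared tokens, normalized by the
    # log-lengths of both sentences. Very short sentences get weight 0.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    # Build a weighted, undirected sentence graph as an adjacency matrix.
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]

    # PageRank-style iteration (an assumption of this sketch): each sentence
    # receives score from its neighbors in proportion to the edge weight.
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    # Keep the k highest-scoring sentences, returned in document order.
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k=2)` on a list of sentence strings returns the two sentences that are most strongly connected to the rest of the text. A sentence that shares no tokens with any other sentence receives no incoming score and stays at the baseline value of `1 - damping`.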
