TAILIEUCHUNG - Scientific report: "Automatic Headline Generation using Character Cross-Correlation"

Fahad A. Alotaiby
Department of Electrical Engineering, College of Engineering, King Saud University, 800 Riyadh 11421, Saudi Arabia
falotaiby@

Abstract

Arabic is a morphologically complex language. Affixes and clitics are regularly attached to stems, which makes direct comparison between words impractical. In this paper we propose a new automatic headline generation technique that uses character cross-correlation to extract the best headlines and to overcome the complex morphology of Arabic. The system that uses character cross-correlation achieves a ROUGE-L score of [...], while exact word matching scores only [...] for the same set of documents.

1 Introduction

A headline can be regarded as a condensed summary of a document, and headline generation as the acme of text summarization. The need for automatic headline generation arises from the need to handle huge numbers of documents, which is a tedious and time-consuming process. Instead of reading every document, a reader can use the headlines to decide which documents contain important information.

There are two major approaches to automatic headline generation: extractive and abstractive. In the work of Douzidia and Lapalme (2004), an extractive method was used to produce a 10-word summary of an Arabic document, which can be considered a headline, and the summary was then automatically translated into English. The reported score therefore reflects the accuracy of both generation and translation, which makes it difficult to evaluate the headline generation step of this system on its own.
Hedge Trimmer (Dorr et al., 2003) is a system that creates a headline for an English newspaper story, using linguistically motivated heuristics to choose a potential headline. Jin and Hauptmann (2002) proposed a probabilistic model for headline generation in which they divide the headline generation process into two steps, namely the step of .
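The abstract contrasts character cross-correlation with exact word matching, but this excerpt does not give the paper's formula. The sketch below is therefore only one plausible reading, not the author's method: it slides one word across the other, counts the aligned characters that agree at each offset, and normalizes the peak count by the longer word's length. The function name `char_cross_correlation` and the normalization are assumptions made for illustration.

```python
def char_cross_correlation(a: str, b: str) -> float:
    """Peak character cross-correlation of two words, scaled to [0, 1].

    Illustrative assumption: the score is the largest number of
    position-wise character matches over all relative shifts of b
    against a, divided by the length of the longer word.
    """
    if not a or not b:
        return 0.0
    best = 0
    # lag is the index in a that b[0] is aligned with (may be negative).
    for lag in range(-(len(b) - 1), len(a)):
        matches = 0
        for j, ch in enumerate(b):
            i = lag + j
            if 0 <= i < len(a) and a[i] == ch:
                matches += 1
        best = max(best, matches)
    return best / max(len(a), len(b))


if __name__ == "__main__":
    stem = "كتاب"       # "book"
    prefixed = "الكتاب"  # "the book": same stem with the definite article attached
    print(stem == prefixed)                        # exact matching fails: False
    print(char_cross_correlation(stem, prefixed))  # 4 of 6 characters align
```

Under this reading, a word with an attached clitic still scores well against its bare stem, whereas exact matching treats the two forms as unrelated.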
