Scientific report: "How Many Words is a Picture Worth? Automatic Caption Generation for News Images"

Yansong Feng and Mirella Lapata
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB, UK
mlap@

Abstract

In this paper we tackle the problem of automatic caption generation for news images. Our approach leverages the vast resource of pictures available on the web and the fact that many of them are captioned. Inspired by recent work in summarization, we propose extractive and abstractive caption generation models. They both operate over the output of a probabilistic image annotation model that preprocesses the pictures and suggests keywords to describe their content. Experimental results show that an abstractive model defined over phrases is superior to extractive methods.

1 Introduction

Recent years have witnessed an unprecedented growth in the amount of digital information available on the Internet. Flickr, one of the best known photo sharing websites, hosts more than three billion images, with approximately […] million images being uploaded every […].¹ Many on-line news sites like CNN, Yahoo, and BBC publish images with their stories and even provide photo feeds related to current events.

Browsing and finding pictures in large-scale and heterogeneous collections is an important problem that has attracted much interest within information retrieval. Many of the search engines deployed on the web retrieve images without analyzing their content, simply by matching user queries against collocated textual information. Examples include meta-data (e.g., the image's file name and format), user-annotated tags, captions, and generally text surrounding the image. As this limits the applicability of search engines (images that do not coincide with textual data cannot be retrieved), a great deal of work has focused on the development of methods that generate description words for a picture

¹ http[…]/2008/11/03/three-billion-photos-at-flickr
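
The text-only retrieval described in the introduction, and the limitation it creates, can be illustrated with a minimal sketch. This is not the authors' system or any real search engine's algorithm: the names `ImageRecord`, `collocated_terms`, and `search` are hypothetical, and the exact-match rule (every query term must appear in the collocated text) is an assumption chosen for brevity, where a deployed engine would rank results instead.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical record: an image plus whatever text happens to accompany it."""
    file_name: str
    tags: list[str] = field(default_factory=list)
    caption: str = ""
    surrounding_text: str = ""

    def collocated_terms(self) -> set[str]:
        # Pool the file name, tags, caption, and surrounding text.
        # The image content itself is never inspected.
        stem = self.file_name.rsplit(".", 1)[0].replace("_", " ").replace("-", " ")
        text = " ".join([stem, *self.tags, self.caption, self.surrounding_text])
        tokens = {tok.strip(".,;:!?\"'()").lower() for tok in text.split()}
        return tokens - {""}


def search(query: str, images: list[ImageRecord]) -> list[ImageRecord]:
    """Return the images whose collocated text contains every query term."""
    terms = {t.lower() for t in query.split()}
    if not terms:
        return []
    return [img for img in images if terms <= img.collocated_terms()]


images = [
    ImageRecord(
        "flood_rescue.jpg",
        tags=["flood"],
        caption="Rescue workers reach stranded residents.",
    ),
    # No tags, caption, or surrounding text: only an uninformative file name.
    ImageRecord("IMG_0042.jpg"),
]

hits = search("flood rescue", images)
print([img.file_name for img in hits])
```

In this sketch the second image can never be returned for a query such as "flood", whatever it actually depicts, because it has no descriptive text to match against. That gap is what automatically generated description words are meant to close.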
