Scientific report: "Simple Supervised Document Geolocation with Geodesic Grids"

Benjamin P. Wing
Department of Linguistics, University of Texas at Austin, Austin, TX 78712, USA
ben@

Jason Baldridge
Department of Linguistics, University of Texas at Austin, Austin, TX 78712, USA
jbaldrid@

Abstract

We investigate automatic geolocation (i.e., identification of the location, expressed as latitude/longitude coordinates) of documents. Geolocation can be an effective means of summarizing large document collections, and it is an important component of geographic information retrieval. We describe several simple supervised methods for document geolocation using only the document's raw text as evidence. All of our methods predict locations in the context of geodesic grids of varying degrees of resolution. We evaluate the methods on geotagged Wikipedia articles and Twitter feeds. For Wikipedia, our best method obtains a median prediction error of just [...] kilometers. Twitter geolocation is more challenging: we obtain a median error of 479 km, an improvement on previous results for the dataset.

1 Introduction

There are a variety of applications that arise from connecting linguistic content (be it a word, phrase, document, or entire corpus) to geography. Leidner (2008) provides a systematic overview of geography-based language applications over the previous decade, with a special focus on the problem of toponym resolution: identifying and disambiguating the references to locations in texts. Perhaps the most obvious and far-reaching application is geographic information retrieval (Ding et al., 2000; Martins, 2009; Andogah, 2010), with applications like MetaCarta's geographic text search (Rauch et al., 2003) and NewsStand (Teitler et al., 2008); these allow users to browse and search for content through a geo-centric interface. The Perseus project performs automatic toponym resolution on historical texts in order to display a map with each text, showing the locations that are mentioned (Smith and Crane, 2001).
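
The abstract refers to predicting locations on a grid and to scoring predictions by median error in kilometers. As a rough illustration only, and not the authors' implementation, the Python sketch below assumes a uniform latitude/longitude grid with a fixed cell size. The names `cell_of`, `cell_center`, `great_circle_km`, `median_error_km`, and the `degrees_per_cell` parameter are hypothetical, and the haversine formula with a mean Earth radius is an assumed distance measure.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (an approximation)


def cell_of(lat, lon, degrees_per_cell=1.0):
    """Map a coordinate to the (row, col) index of a uniform grid cell.

    Assumption: cells are squares of `degrees_per_cell` degrees, indexed
    from the south-west corner (-90, -180). Edge cases at the poles and
    the antimeridian are not handled in this sketch.
    """
    row = math.floor((lat + 90.0) / degrees_per_cell)
    col = math.floor((lon + 180.0) / degrees_per_cell)
    return row, col


def cell_center(cell, degrees_per_cell=1.0):
    """Return the (lat, lon) of a cell's center, used as the predicted point."""
    row, col = cell
    lat = -90.0 + (row + 0.5) * degrees_per_cell
    lon = -180.0 + (col + 0.5) * degrees_per_cell
    return lat, lon


def great_circle_km(a, b):
    """Haversine great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def median_error_km(predicted, gold):
    """Median great-circle distance between predicted and true locations."""
    return median(great_circle_km(p, g) for p, g in zip(predicted, gold))
```

Under these assumptions, a classifier would pick a cell for each document, `cell_center` would turn that cell into a coordinate, and `median_error_km` would summarize how far the predictions fall from the true locations. A coarser grid gives each cell more training text but limits how precise any prediction can be.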
