A Latent Topic Extracting Method based on Events in a Document and its Application

Risa Kitajima, Ochanomizu University
Ichiro Kobayashi, Ochanomizu University, koba@

Abstract

Recently, several latent topic analysis methods such as LSI, pLSI, and LDA have been widely used for text analysis. However, those methods basically assign topics to words and do not account for the events in a document. Against this background, we propose a latent topic extracting method which assigns topics to events. We also show that the proposed method is useful for generating a document summary based on a latent topic.

1 Introduction

Recently, several latent topic analysis methods such as Latent Semantic Indexing (LSI) (Deerwester et al., 1990), Probabilistic LSI (pLSI) (Hofmann, 1999), and Latent Dirichlet Allocation (LDA) (Blei et al., 2003) have been widely used for text analysis. However, those methods basically assign topics to words and do not account for the events in a document. Here, we define a sentence-level unit that conveys the content of a document as an Event [1], and propose a model that treats a document as a set of Events. We use LDA as the latent topic analysis method and assign topics to the Events in a document. To examine how well the proposed method extracts latent topics from a document, we compare its accuracy with that of conventional methods on a common document retrieval task. Furthermore, as an application of our method, we apply it to query-biased document summarization (Tombros and Sanderson, 1998; Okumura and Mochizuki, 2000; Berger and Mittal, 2000) to verify that the method is useful for various applications.

[1] For the definition of an Event, see Section 3.

2 Related Studies

Suzuki et al. (2010) proposed a flexible latent topic inference method in which topics are assigned to phrases in a document. Matsumoto et al. (2005) showed that the accuracy of document classification will be improved by introducing a feature .
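
The idea stated in the Introduction, treating a document as a set of Events and letting LDA assign topics to those Events instead of to words, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the excerpt does not give the Event definition (it is deferred to Section 3), so each Event is represented as a hypothetical (predicate, argument) tuple, and the function `lda_gibbs`, its hyperparameters `alpha` and `beta`, and the toy data are not taken from the paper. Inference uses a generic collapsed Gibbs sampler, which may differ from the inference procedure the authors used.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over arbitrary hashable tokens.

    Here each token is a hypothetical Event tuple rather than a word.
    Returns per-document topic proportions and per-token topic assignments.
    """
    rng = random.Random(seed)
    vocab_size = len({tok for doc in docs for tok in doc})

    doc_topic = [[0] * num_topics for _ in docs]                # count of topic k in doc d
    topic_tok = [defaultdict(int) for _ in range(num_topics)]   # count of token w in topic k
    topic_total = [0] * num_topics                              # total tokens in topic k

    # Random initial topic assignment for every Event token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for tok in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_tok[k][tok] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, tok in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_tok[k][tok] -= 1
                topic_total[k] -= 1

                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_tok[j][tok] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_tok[k][tok] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, assign


# Toy corpus: each document is a list of hypothetical (predicate, argument) Events.
docs = [
    [("rise", "price"), ("cut", "rate"), ("rise", "rate")],
    [("win", "match"), ("score", "goal"), ("win", "title")],
    [("cut", "rate"), ("rise", "price")],
]

theta, assign = lda_gibbs(docs, num_topics=2)
for d, row in enumerate(theta):
    print(d, [round(p, 3) for p in row], assign[d])
```

Because the sampler only requires tokens to be hashable, the same routine runs unchanged on words, phrases, or Events; only the unit chosen to represent a document changes, which is the contrast the Introduction draws with word-based topic assignment.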
