Scientific report: "A systematic understanding of probabilistic semantic extraction in large corpus"

Topic Models, Latent Space Models, Sparse Coding, and All That: A systematic understanding of probabilistic semantic extraction in large corpus

Eric Xing
School of Computer Science, Carnegie Mellon University

Abstract

Probabilistic topic models have recently gained much popularity in information retrieval and related areas. Via such models, one can project high-dimensional objects such as text documents into a low-dimensional space where their latent semantics are captured and modeled; integrate multiple sources of information to "share statistical strength" among components of a hierarchical probabilistic model; and structurally display and classify otherwise unstructured object collections. However, many questions remain unclear to practitioners: how topic models work; what to expect, and what not to expect, from a topic model; how topic models differ from and relate to classical matrix algebraic techniques such as LSI and NMF in NLP; how to empower topic models to deal, in a principled way, with complex scenarios such as multimodal data, contractual text in social media, evolving corpora, or the presence of supervision such as labeling and rating; and how to make topic modeling computationally tractable even on web-scale data.
In this tutorial, I will demystify the conceptual, mathematical, and computational issues behind all such problems surrounding topic models and their applications by presenting a systematic overview of the mathematical foundation of topic modeling and its connections to a number of related methods popular in other fields, such as LDA, the admixture model, the mixed membership model, latent space models, and sparse coding. I will offer a simple and unifying view of all these techniques under the framework of multi-view latent space embedding, and outline the roadmap of model extension and algorithmic design toward different applications in IR and NLP. A main theme of this tutorial, which ties together a wide range of issues and problems, will build on the probabilistic graphical model formalism, a ...
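To make the comparison with classical matrix algebraic techniques concrete, here is a minimal sketch of LSI and NMF applied to a toy document-term matrix. It is not taken from the tutorial: the vocabulary, the counts, and the helper names `lsi` and `nmf` are all made up for illustration, and the multiplicative-update rule is just one common way to fit an NMF.

```python
import numpy as np

# Toy document-term count matrix (rows: documents, columns: vocabulary terms).
# The vocabulary and counts are invented for illustration only.
vocab = ["gene", "dna", "cell", "ball", "team", "game"]
X = np.array([
    [3, 2, 1, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 1, 2, 3, 3],
], dtype=float)


def lsi(X, k):
    """LSI: project documents onto the top-k singular directions of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Document coordinates in the k-dim latent space, and term loadings.
    return U[:, :k] * s[:k], Vt[:k]


def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """NMF: X ~= W @ H with W, H >= 0, fit by multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


docs_lsi, terms_lsi = lsi(X, k=2)   # coordinates may be negative
W, H = nmf(X, k=2)                  # coordinates are nonnegative

# Normalizing each row of W gives per-document mixing weights. These are only
# loosely analogous to a topic model's topic proportions: NMF defines no
# probabilistic generative model over the documents.
theta = W / W.sum(axis=1, keepdims=True)
```

Both methods place each document in a two-dimensional latent space. LSI coordinates can be negative and are hard to read as "topics", whereas the nonnegative NMF factors can be normalized into mixing weights, which is one informal way to see the link to the admixture view that probabilistic topic models make explicit.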
