Đang chuẩn bị nút TẢI XUỐNG, xin hãy chờ
Tải xuống
The problem of posterior inference for individual documents is particularly important in topic models. However, it is often intractable in practice. Many existing methods for posterior inference such as variational Bayes, collapsed variational Bayes and collapsed Gibbs sampling do not have any guarantee on either quality or rate of convergence. | Research and Development on Information and Communication Technology Some Methods for Posterior Inference in Topic Models Xuan Bui1 2 Tu Vu1 Khoat Than1 1 Hanoi University of Science and Technology Hanoi Vietnam 2 Thai Nguyen University of Information and Communication Technology Vietnam Correspondence Xuan Bui thanhxuan1581@gmail.com Communication received 27 February 2018 revised 10 July 2018 accepted 8 August 2018 Online early access 8 November 2018 Digital Object Identifier 10.32913 rd-ict.vol2.no15.687 The Area Editor coordinating the review of this article and deciding to accept it was Dr. Trinh Quoc Anh Abstract The problem of posterior inference for individual documents is particularly important in topic models. However it is often intractable in practice. Many existing methods for posterior inference such as variational Bayes collapsed variational Bayes and collapsed Gibbs sampling do not have any guarantee on either quality or rate of convergence. The online maximum a posteriori estimation OPE algorithm has more attractive properties than other inference approaches. In this paper we introduced four algorithms to improve OPE namely OpE1 OPE2 OPE3 and OPE4 by combining two stochastic bounds. Our new algorithms not only preserve the key advantages of OPE but also can sometimes perform significantly better than OPE. These algorithms were employed to develop new effective methods for learning topic models from massive streaming text collections. Empirical results show that our approaches were often more efficient than the state-of-the-art methods. Keywords Topic models posterior inference online maximum a posteriori estimation OPE large-scale learning. I. Introduction Topic modeling provides a framework to model highdimensional sparse data. It can also be seen as an unsupervised learning approach in machine learning. One of the most famous topic models latent Dirichlet allocation LDA 1 has been successfully applied in a wide range of areas including text .