Exploiting Latent Information to Predict Diffusions of Novel Topics on Social Networks

Tsung-Ting Kuo1, San-Chuan Hung1, Wei-Shih Lin1, Nanyun Peng1, Shou-De Lin1, Wei-Fen Lin2
1Graduate Institute of Networking and Multimedia, National Taiwan University, Taiwan
2MobiApps Corporation, Taiwan
d97944007@csie.ntu.edu.tw

Abstract

This paper brings a marriage of two seemingly unrelated topics: natural language processing (NLP) and social network analysis (SNA). We propose a new task in SNA, which is to predict the diffusion of a new topic, and design a learning-based framework to solve this problem. We exploit the latent semantic information among users, topics, and social connections as features for prediction. Our framework is evaluated on real data collected from the public domain. The experiments show a 16% AUC improvement over baseline methods. The source code and dataset are available at http://www.csie.ntu.edu.tw/~d97944007/diffusion/

1 Background

The diffusion of information on social networks has been studied for decades. Generally, the proposed strategies fall into two categories: model-driven and data-driven. Model-driven strategies, such as the independent cascade model (Kempe et al., 2003), rely on manually crafted, usually intuitive, models to fit the diffusion data without using diffusion history. Data-driven strategies usually use learning-based approaches to predict future propagation given historical diffusion records (Fei et al., 2011; Galuba et al., 2010; Petrovic et al., 2011).
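
To make the model-driven category concrete, the independent cascade model mentioned above can be sketched roughly as follows. This is only an illustrative sketch, not part of the paper's framework: the adjacency-dictionary graph, the single uniform activation probability p, and the function name independent_cascade are all assumptions made for the example.

```python
import random


def independent_cascade(graph, seeds, p, rng=None):
    """Simulate one run of an independent cascade (illustrative sketch).

    graph: dict mapping each node to a list of its out-neighbors (assumed format).
    seeds: iterable of initially active nodes.
    p: activation probability, assumed identical for every edge.
    Returns the set of nodes that are active when the cascade stops.
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # A newly active node gets exactly one chance per inactive neighbor.
                if neighbor not in active and rng.random() < p:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


if __name__ == "__main__":
    toy_graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": [], "e": ["a"]}
    print(independent_cascade(toy_graph, ["a"], p=0.5, rng=random.Random(0)))
```

Note that nothing in this simulation is fitted from past diffusion records; the probability p is set by hand, which is the property that separates model-driven strategies from data-driven ones.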
Data-driven strategies usually perform better than model-driven approaches because the past diffusion behavior is used during learning (Galuba et al., 2010). Recently, researchers started to exploit content information in data-driven diffusion models (Fei et al., 2011; Petrovic et al., 2011; Zhu et al., 2011). However, most of the data-driven approaches assume that, in order to train a model and predict the future diffusion of a topic, it is required to obtain historical records about how .