Abstract
Retweet prediction for microblogs is a key problem in the study of information dissemination and plays an important role in public opinion monitoring, advertising, and business decision making. Information dissemination is influenced by many factors, such as user interest, the influence of the post's author, and the content of the post. The main challenge in improving prediction performance is capturing the features that matter most for retweet prediction. In this paper, we propose a retweet prediction method based on hybrid feature learning. First, the method introduces and analyzes the impact of hybrid features, including social influence locality, user features, and microblog content features. Then, it builds a retweet prediction model using classification algorithms. Finally, it compares the results across different types of microblog posts. Experimental results on Sina Weibo datasets show that social influence locality features, user features, and microblog content features all affect retweet prediction, with content features having the greatest impact. Random forest performs best, reaching an accuracy of 83.1%. Compared with Naive Bayes, logistic regression, and SVM, accuracy increases by about 7.4% on average and by about 10.8% at most. In addition, the method performs better on topics concerning natural disasters, the environment, trials, and rights, which shows that these kinds of events exhibit stronger retweet patterns.
Keywords
Microblogging/Hybrid feature learning/Retweet prediction
Classification
Information Technology and Security Science