数据采集与处理 (Journal of Data Acquisition and Processing), 2018, Vol. 33, Issue 1: 186-194 (9 pages). DOI: 10.16337/j.1004-9037.2018.01.021
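Micro-blog Topic Detection in Frequent Word Networks
(Chinese title: 基于词共现网络的微博话题发现方法, literally "A Micro-blog Topic Detection Method Based on Word Co-occurrence Networks")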
Abstract
As an important information platform, micro-blog receives a large number of user visits every day, and important public-opinion events become hot topics on it. In this study, we propose a novel micro-blog topic detection method, named TDFWN (Topic Detection in Frequent Word Networks), to mine hot topics in a micro-blog corpus. First, frequent k-item sets (k ≥ 3) in the micro-blog text data are mined. Second, a word co-occurrence network is built from these frequent k-item sets. Third, the network is partitioned into communities using a community detection method, where each community represents a micro-blog hot topic. Finally, the micro-blog texts are clustered into groups by computing the similarity of each text to the detected topics. The empirical study shows that TDFWN can find hot topics in micro-blog text data and simultaneously cluster the texts by those topics.
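The four steps in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' TDFWN implementation. The abstract does not say which itemset miner, community detection method, or similarity measure is used, so the sketch substitutes brute-force counting of 3-item sets, connected components as a crude stand-in for community detection, and a simple word-overlap ratio as the similarity. All function names, the `min_support` threshold, and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_itemsets(docs, k=3, min_support=2):
    """Step 1: brute-force count of k-word sets found in >= min_support docs."""
    counts = Counter()
    for words in docs:
        for combo in combinations(sorted(set(words)), k):
            counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


def build_network(itemsets):
    """Step 2: link every pair of words that share a frequent item set."""
    graph = defaultdict(set)
    for combo in itemsets:
        for a, b in combinations(combo, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def communities(graph):
    """Step 3 (assumption): connected components stand in for community detection."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


def assign(docs, topics):
    """Step 4 (assumption): label each doc with the topic of largest word overlap."""
    labels = []
    for words in docs:
        scores = [len(set(words) & topic) / len(topic) for topic in topics]
        best = max(range(len(topics)), key=scores.__getitem__, default=None)
        # Docs sharing no word with any topic stay unlabeled (None).
        labels.append(best if best is not None and scores[best] > 0 else None)
    return labels


# Toy, already-tokenized corpus (hypothetical data for illustration only).
docs = [
    ["earthquake", "rescue", "sichuan", "team"],
    ["earthquake", "rescue", "sichuan", "donation"],
    ["earthquake", "sichuan", "rescue"],
    ["phone", "launch", "price", "camera"],
    ["phone", "launch", "price"],
    ["phone", "price", "launch", "battery"],
    ["weather", "today"],
]

topics = communities(build_network(frequent_itemsets(docs, k=3, min_support=2)))
labels = assign(docs, topics)
print([sorted(t) for t in topics])
print(labels)  # [0, 0, 0, 1, 1, 1, None]
```

Brute-force enumeration is only practical for short texts and small k. A real corpus would need a proper frequent-itemset miner and a genuine community detection algorithm, since connected components cannot separate two topics that share even one linking word.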
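Keywords
micro-blog / topic detection / short text / community detection
Classification
Information technology and security science
Cite this article
Li Wei (李伟), Jia Caiyan (贾彩燕). Micro-blog Topic Detection in Frequent Word Networks [J]. Journal of Data Acquisition and Processing, 2018, 33(1): 186-194 (9 pages).
Funding
National Natural Science Foundation of China, General Program (61473030)
Fundamental Research Funds for the Central Universities (2014JBM031)
Special project of the State Key Laboratory of Digital Publishing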