
Micro-blog Topic Detection Method Based on Word Co-occurrence Networks

Li Wei, Jia Caiyan

Journal of Data Acquisition and Processing, 2018, Vol. 33, Issue 1: 186-194, 9. DOI: 10.16337/j.1004-9037.2018.01.021

Micro-blog Topic Detection in Frequent Word Networks

Li Wei 1, Jia Caiyan 2

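Author information

  • 1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing, 100044
  • 2. Beijing Key Lab of Traffic Data Analysis and Mining, Beijing, 100044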

Abstract

As an important information platform, micro-blog receives a large number of user visits every day, and major public opinion events become hot topics on it. In this study, we propose a novel micro-blog topic detection method, named TDFWN (Topic Detection in Frequent Word Networks), to discover hot topics in a micro-blog corpus. First, frequent k-item sets (k ≥ 3) are mined from the micro-blog text data. Second, a word co-occurrence network is built from the mined frequent k-item sets. Third, the network is partitioned into communities using a community detection method, where each community represents a micro-blog hot topic. Finally, the micro-blog texts are clustered into groups by computing the similarity between each text and the discovered topics. The empirical study shows that TDFWN can find hot topics in micro-blog text data and, at the same time, cluster the texts by those topics.
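The four steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the texts are already tokenized, fixes k = 3, uses a hypothetical `min_support` threshold, substitutes connected components for the (unspecified) community detection method, and uses Jaccard overlap as the text-to-topic similarity. All function names are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_itemsets(docs, k=3, min_support=2):
    """Step 1: keep k-word sets that occur in at least min_support documents."""
    counts = Counter()
    for words in docs:
        for itemset in combinations(sorted(set(words)), k):
            counts[itemset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def build_network(itemsets):
    """Step 2: link every pair of words that share a frequent itemset."""
    graph = defaultdict(set)
    for itemset in itemsets:
        for a, b in combinations(itemset, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def communities(graph):
    """Step 3 (stand-in): connected components instead of a real
    community detection algorithm, which the abstract does not name."""
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        result.append(comp)
    return result


def assign(docs, topics):
    """Step 4: give each document the topic with the highest Jaccard
    similarity (an assumed measure); None when nothing overlaps."""
    labels = []
    for words in docs:
        ws = set(words)
        best, best_sim = None, 0.0
        for i, topic in enumerate(topics):
            sim = len(ws & topic) / len(ws | topic)
            if sim > best_sim:
                best, best_sim = i, sim
        labels.append(best)
    return labels


# Toy corpus, invented for illustration only.
docs = [
    ["earthquake", "rescue", "sichuan", "team"],
    ["earthquake", "rescue", "sichuan"],
    ["phone", "launch", "price", "new"],
    ["phone", "launch", "price"],
    ["weather", "today"],
]
topics = communities(build_network(frequent_itemsets(docs)))
labels = assign(docs, topics)
```

On this toy corpus the two frequent 3-item sets form two disconnected triangles, so each becomes one topic, and the last text matches neither. Enumerating all k-combinations is only practical because micro-blog texts are short; a larger corpus would call for a proper frequent itemset miner.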

Key words

micro-blog/topic detection/short text/community detection

Classification

Information Technology and Security Science

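Cite this article

Li Wei, Jia Caiyan. Micro-blog Topic Detection Method Based on Word Co-occurrence Networks [J]. Journal of Data Acquisition and Processing, 2018, 33(1): 186-194, 9.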

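Funding

National Natural Science Foundation of China, General Program (61473030)

Fundamental Research Funds for the Central Universities (2014JBM031)

Special Project of the State Key Laboratory of Digital Publishing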

Journal of Data Acquisition and Processing

Open access; indexed in PKU Core, CSCD, CSTPCD

ISSN 1004-9037
