
A Twitter hotspot mining method based on semantic clustering of word vectors

Liu Peilei, Tang Jintao, Wang Ting, Xie Songxian, Yue Dapeng, Liu Haichi

Computer Engineering & Science, 2018, Vol. 40, Issue (2): 313-319, 7. DOI: 10.3969/j.issn.1007-130X.2018.02.017

A Twitter hotspot mining method based on semantic clustering of word vectors

Liu Peilei 1, Tang Jintao 1, Wang Ting 1, Xie Songxian 1, Yue Dapeng 1, Liu Haichi 1

Author information

  • 1. College of Computer, National University of Defense Technology, Changsha, Hunan 410073

Abstract

With the rapid development of social media, information overload has become a challenge. As a result, automatically mining hotspots from large volumes of short, noisy data is an important problem. Social data are real-time and geographic in nature, and they usually carry plentiful meta-information. Based on these characteristics, this paper proposes a hotspot mining method that combines users' behavior patterns with text content analysis. In the content analysis step, we cluster text at the word scale rather than the message scale. In addition, semantic clustering of word vectors is used to improve the performance of keyword extraction. Experimental results on real datasets show that this method performs better than traditional methods. Specifically, the keywords it extracts have strong semantic relevance and good topic segmentation, and the method outperforms traditional hotspot mining methods on the main evaluation metrics.
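As a rough illustration of the word-scale semantic clustering mentioned in the abstract, the sketch below groups words whose vectors are close in cosine similarity. It is not the authors' algorithm: the greedy single-pass scheme, the helper names `cosine` and `cluster_words`, the 0.8 threshold, and the toy three-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from a trained word embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold=0.8):
    """Greedy single-pass clustering (hypothetical helper).

    Each word joins the first cluster whose centroid is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []  # each entry: {"words": [...], "centroid": [...]}
    for word, vec in vectors.items():
        for c in clusters:
            if cosine(vec, c["centroid"]) >= threshold:
                c["words"].append(word)
                n = len(c["words"])
                # Update the running mean of the cluster's vectors.
                c["centroid"] = [
                    (m * (n - 1) + x) / n for m, x in zip(c["centroid"], vec)
                ]
                break
        else:
            clusters.append({"words": [word], "centroid": list(vec)})
    return [c["words"] for c in clusters]


# Toy vectors invented for illustration only; not data from the paper.
TOY_VECTORS = {
    "earthquake": [0.9, 0.1, 0.0],
    "aftershock": [0.8, 0.2, 0.1],
    "rescue": [0.7, 0.3, 0.0],
    "concert": [0.1, 0.9, 0.1],
    "singer": [0.0, 0.8, 0.2],
}

if __name__ == "__main__":
    for group in cluster_words(TOY_VECTORS):
        print(group)
```

With these toy vectors, the disaster-related words fall into one group and the music-related words into another, which is the kind of topic segmentation the abstract describes for extracted keywords.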

Keywords

hotspot mining/social media/word vector/semantic clustering

Key words

hotspot mining/Twitter/word embedding/semantic clustering

Classification

Information technology and security science

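Cite this article

Liu Peilei, Tang Jintao, Wang Ting, Xie Songxian, Yue Dapeng, Liu Haichi. A Twitter hotspot mining method based on semantic clustering of word vectors [J]. Computer Engineering & Science, 2018, 40(2): 313-319, 7.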

Funding

National Natural Science Foundation of China (61532001, 61472436)

Computer Engineering & Science

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN 1007-130X
