Computer & Digital Engineering, Issue (10): 1729-1735, 7. DOI: 10.3969/j.issn.1672-9722.2015.10.003
基于词共现网络的海量文本分析与热点话题发现
Large Text Corpus Analysis and Hot Topic Discovery Based on Word Co-occurrence Network
Abstract
In linguistics, the relationships between words exhibit strong regularity. Representing text as a word co-occurrence network captures as much of the text's structural information as possible and reveals its internal organizing principles and linguistic rules. In this paper, working with massive military corpora from ifeng.com, sina.com.cn and china.cn, a word co-occurrence network for the military domain is constructed, its small-world effect and scale-free property are revealed, and hot topic discovery is realized through the analysis of word frequency statistics, node degree distribution and named entity extraction. The experimental results show the effectiveness of the methods.
Keywords
word co-occurrence network / small-world property / scale-free property / hot topic discovery
Cite this article
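ZHANG Hailong (张海龙), GAN Wenyan (淦文燕), CHEN Gang (陈刚), JIANG Bo (姜博). Large Text Corpus Analysis and Hot Topic Discovery Based on Word Co-occurrence Network [J]. Computer & Digital Engineering, 2015, (10): 1729-1735, 7.
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60974086).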
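Illustrative sketch

The abstract names three network measurements (node degree distribution, the small-world effect and the scale-free property) without giving the construction details. The following is a minimal sketch of how such a network might be built and measured, not the authors' implementation. It assumes that two words are linked when they appear within a fixed sliding window inside the same sentence; the window size, the toy corpus and all function names are hypothetical. A small-world network would typically show a high average clustering coefficient together with a short average path length, and a scale-free network would show a degree distribution that falls roughly on a straight line in a log-log plot.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def build_cooccurrence_network(sentences, window=2):
    """Link two distinct words that occur within `window` tokens of each other.

    The sliding-window rule is an assumption; the abstract does not state
    how co-occurrence is defined.
    """
    graph = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + window + 1]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return graph


def degree_distribution(graph):
    """Map each degree k to the number of nodes that have degree k."""
    return Counter(len(neighbours) for neighbours in graph.values())


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(graph) if graph else 0.0


def average_path_length(graph):
    """Mean shortest-path length over all reachable node pairs, via BFS."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy corpus of already tokenized sentences.
sentences = [
    ["navy", "fleet", "exercise"],
    ["navy", "carrier", "fleet"],
    ["air", "force", "exercise"],
]
graph = build_cooccurrence_network(sentences, window=2)
print(sorted(degree_distribution(graph).items()))
print(average_clustering(graph), average_path_length(graph))
```

On a real corpus the sentences would first need word segmentation and stop-word filtering, and the high-degree nodes could then be cross-checked against word frequency counts and extracted named entities, as the abstract outlines.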