计算机技术与发展, Issue 9: 84-88, 5. DOI: 10.3969/j.issn.1673-629X.2014.09.019
改进的话题检测和跟踪算法研究
Improved Algorithm Study on Topic Detection and Tracking
Abstract
Topic detection can identify hot topics in Internet public opinion as well as emergencies, and continuous tracking of a detected topic gives a real-time grasp of public opinion trends. This paper proposes an improved detection and tracking algorithm based on topic clustering. First, the document feature vectors are improved by adding backbone vectors derived from sentence trunks. Then two center vectors are extracted from each detected topic: a basic center vector and a main center vector. On this basis, cluster analysis is performed by calculating the distance between each document vector and the corresponding center vector, which ensures the cohesion of the documents within a topic. Meanwhile, based on keyword extraction, the theme correlation between different topics is calculated to improve the accuracy of topic detection and tracking. Using more than two weeks of data from 5 channels of the top 10 sites as test data, the experimental results show that the method improves the accuracy of topic detection and tracking to some extent and has a certain degree of adaptability and generalization.
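The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the vectors or the distance measure, so everything concrete here is an assumption: each document is a pair of sparse term-weight dictionaries (a full feature vector and a backbone vector), the basic center is taken as the mean of the full vectors in a topic, the main center as the mean of the backbone vectors, cosine similarity stands in for the distance, and the names `assign_topics`, `alpha`, and `threshold` are hypothetical. The keyword-based theme correlation between topics is not covered.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def assign_topics(docs, threshold=0.5, alpha=0.5):
    """Single-pass clustering over (full_vector, backbone_vector) pairs.

    Assumption: a document joins the best-matching topic when the weighted
    similarity to that topic's two centers reaches `threshold`; otherwise
    it starts a new topic. Returns one topic index per document.
    """
    topics = []  # each topic is a list of (full_vector, backbone_vector)
    labels = []
    for full, backbone in docs:
        best_index, best_score = None, -1.0
        for index, members in enumerate(topics):
            # Assumed definitions of the two center vectors.
            basic_center = mean_vector([m[0] for m in members])
            main_center = mean_vector([m[1] for m in members])
            score = (alpha * cosine(full, basic_center)
                     + (1 - alpha) * cosine(backbone, main_center))
            if score > best_score:
                best_index, best_score = index, score
        if best_index is not None and best_score >= threshold:
            topics[best_index].append((full, backbone))
            labels.append(best_index)
        else:
            topics.append([(full, backbone)])
            labels.append(len(topics) - 1)
    return labels
```

Calling `assign_topics` on a list of such vector pairs returns the topic index assigned to each document in arrival order, so new documents on an existing topic are tracked by landing in its cluster. Recomputing both centers on every comparison keeps the sketch short; a practical version would update them incrementally.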
Keywords
topic detection and tracking / cluster algorithm / feature vector / network public opinion
Classification
Information Technology and Security Science
Cite this article
肖红, 许少华. 改进的话题检测和跟踪算法研究 [J]. 计算机技术与发展, 2014, (9): 84-88, 5.
Funding
National Natural Science Foundation of China (61170132)