Computer and Modernization (计算机与现代化), Issue (6): 30-33, 39, 5. DOI: 10.3969/j.issn.1006-2475.2017.06.006
一种基于语义分析的热点新闻发现方法
A News Hot Spot Detection Method Based on Semantic Analysis
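Cao Tong (曹通) 1
Author information
- 1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094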
Abstract
With the development and popularization of the Internet, online news reports have become the main means by which people obtain social information. How to identify hot topics in Internet news quickly and accurately is an urgent problem. This paper uses the LDA (Latent Dirichlet Allocation) and BTM (Biterm Topic Model) topic models and, taking into account the different impacts of news headlines and news content on hot spot detection, performs semantic analysis on headlines and content separately. The BTM model is applied to news headlines and the LDA model to news content to extract topic feature vectors, and the two semantic features are combined to form the semantic feature of the whole text. An improved clustering algorithm is then used to count the number of documents belonging to each topic. On this basis, news heat is defined and computed with a news heat formula, and the most recent hot news is obtained by ranking the news heat values. Experiments on crawled news data verify the validity and practicality of the method.
Keywords
latent semantic analysis/news heat/topic detection/LDA and BTM model
Classification
Information Technology and Security Science
Cite this article
Cao Tong. A News Hot Spot Detection Method Based on Semantic Analysis [J]. Computer and Modernization, 2017, (6): 30-33, 39, 5.