数据采集与处理 (Journal of Data Acquisition and Processing), 2018, Vol. 33, Issue 3: 530-537 (8 pages). DOI: 10.16337/j.1004-9037.2018.03.017
Chinese-Vietnamese Bilingual News Topic Detection Methods Based on Graph Clustering (基于图聚类的汉越双语新闻话题发现)
Abstract
Cross-language topic detection aims to automatically group news texts written in different languages by topic. Because the languages differ, the texts are hard to represent in a shared feature space, which makes it difficult to identify texts on the same topic. When a particular news event is reported, however, its news elements are the same regardless of the language used to describe it, so news elements can reflect the relevance among different news texts. This paper therefore proposes a Chinese-Vietnamese bilingual news topic detection method based on graph clustering. First, Chinese-Vietnamese bilingual news elements are extracted, and the similarity between news texts is computed from the similarity of their news elements to build a Chinese-Vietnamese bilingual news graph model. Second, exploiting the propagation characteristics of this graph model, the similarity matrix is adjusted with a random walk algorithm. Finally, the affinity propagation algorithm is used to cluster the texts into topics. Experimental results show that the proposed method is effective.
Keywords
Chinese-Vietnamese bilingual / event elements / topic detection / graph clustering
Classification
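Information Technology and Security Science
Cite this article
Wang Yusen (王禹森), Yu Zhengtao (余正涛), Gao Shengxiang (高盛祥), Zhou Chao (周超), Hong Xudong (洪旭东). Chinese-Vietnamese Bilingual News Topic Detection Methods Based on Graph Clustering [J]. Journal of Data Acquisition and Processing, 2018, 33(3): 530-537.
Funding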
National Natural Science Foundation of China (61472168, 61175068, 61672271)
Key Project of the Natural Science Foundation of Yunnan Province (2013FA130)
Yunnan Province Science and Technology Innovation Talent Fund (2014HE001)
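
The three steps named in the abstract (an element-based news graph, random-walk adjustment of the similarity matrix, affinity propagation clustering) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption: Jaccard overlap as the element similarity, a random walk with restart as the adjustment, a median preference for affinity propagation, and all function and parameter names. The sketch also assumes Chinese and Vietnamese news elements have already been mapped to a shared vocabulary.

```python
def element_similarity(a, b):
    """Jaccard overlap between two sets of news elements (assumed measure)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_graph(docs):
    """Similarity matrix of the news graph; each doc is a set of elements."""
    n = len(docs)
    return [[1.0 if i == j else element_similarity(docs[i], docs[j])
             for j in range(n)] for i in range(n)]


def random_walk_adjust(S, steps=3, restart=0.5):
    """Smooth S with a few steps of a random walk with restart (assumed variant)."""
    n = len(S)
    P = [[v / sum(row) for v in row] for row in S]  # row-stochastic transitions
    M = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(steps):
        M = [[restart * (i == j)
              + (1 - restart) * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    # Symmetrize so the result can be used as a similarity matrix.
    return [[(M[i][j] + M[j][i]) / 2 for j in range(n)] for i in range(n)]


def affinity_propagation(S, preference=None, damping=0.7, iters=200):
    """Standard affinity propagation message passing; returns exemplar index per doc."""
    n = len(S)
    if n < 2:
        return list(range(n))
    if preference is None:  # assumed default: (upper) median off-diagonal similarity
        off = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off[len(off) // 2]
    S = [[preference if i == k else S[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]
    A = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        # Responsibilities: r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k')).
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=lambda k: vals[k])
            max1 = vals[first]
            max2 = max(v for k, v in enumerate(vals) if k != first)
            for k in range(n):
                other = max2 if k == first else max1
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - other)
        # Availabilities: built from the positive responsibilities sent to k.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback for this sketch: keep the single best candidate
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


if __name__ == "__main__":
    # Hypothetical element sets; real input would come from element extraction.
    docs = [{"Hanoi", "summit", "trade"}, {"Hanoi", "summit"},
            {"summit", "trade", "tariff"}, {"flood", "Mekong"},
            {"flood", "Mekong", "rescue"}]
    adjusted = random_walk_adjust(build_graph(docs))
    print(affinity_propagation(adjusted))
```

The random-walk step is what lets two reports with no directly shared elements still receive a nonzero similarity when a third report links them, which is one plausible reading of the "propagation characteristics" mentioned in the abstract.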