Research on microblog public opinion analysis based on an improved K-means algorithm [OA; Peking University Core; CSCD; CSTPCD]

Microblogging opinion analysis based on an improved K-means algorithm

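Abstract (translated from Chinese)

Selecting isolated points as initial cluster centers tends to trap the clustering result in a local optimum. To overcome this shortcoming, a density-based method for selecting the initial cluster centers of K-means (a clustering algorithm) is proposed. The method first computes the average similarity between each data object and all other data objects and treats the objects whose average similarity exceeds a fixed threshold as core objects. It then selects the mutually least similar core objects as the initial cluster centers. Using a self-built Sina Weibo crawler, several thousand posts in different categories were collected. After word segmentation, preprocessing, and weight calculation, the improved K-means algorithm was used to cluster them, and the precision/recall compared with the traditional K-mea…

Abstract (English)

To avoid selecting isolated points as the initial clustering centers, which can cause the clustering results to fall into a local optimum, we propose a new density-based method for selecting the initial clustering centers of K-means (a clustering algorithm). The algorithm first calculates the average similarity between each data object and the others, and finds the core objects whose average similarities are higher than a fixed threshold. The least similar core object …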
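The center-selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say which similarity measure is used or how "least similar" is resolved, so cosine similarity, the greedy max-min selection, the function name `density_based_seeds`, the sample vectors, and the threshold value 0.3 are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def density_based_seeds(points, k, threshold):
    """Pick k initial centers from 'core' objects (hypothetical sketch)."""
    n = len(points)
    sim = [[cosine(points[i], points[j]) for j in range(n)] for i in range(n)]
    # Average similarity of each object to all *other* objects.
    avg = [(sum(sim[i]) - sim[i][i]) / (n - 1) for i in range(n)]
    # Core objects: average similarity above the fixed threshold.
    core = [i for i in range(n) if avg[i] > threshold]
    if len(core) < k:
        raise ValueError("threshold leaves fewer than k core objects")
    # Assumption: start from the densest core object, then greedily add the
    # core object whose highest similarity to the chosen seeds is lowest.
    seeds = [max(core, key=lambda i: avg[i])]
    while len(seeds) < k:
        rest = [i for i in core if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(sim[i][s] for s in seeds)))
    return [points[i] for i in seeds]


# Toy term-weight vectors: two tight groups plus one isolated point.
docs = [
    (1.0, 0.1, 0.0), (1.0, 0.2, 0.0), (0.9, 0.1, 0.0),
    (0.1, 1.0, 0.0), (0.2, 1.0, 0.0), (0.1, 0.9, 0.0),
    (0.0, 0.0, 1.0),  # isolated point: similarity 0 to every other vector
]
seeds = density_based_seeds(docs, k=2, threshold=0.3)
print(seeds)
```

In this toy data the isolated vector has an average similarity of zero, so it falls below the threshold and cannot become an initial center. The two seeds come from different groups. The resulting seeds would then be passed to an ordinary K-means loop as its starting centers.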

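Authors: Xie Xiujuan (谢修娟); Li Xiangju (李香菊); Mo Lingfei (莫凌飞)

Affiliations: Department of Computer Engineering, Chengxian College, Southeast University, Nanjing, Jiangsu 210000; Department of Computer Engineering, Chengxian College, Southeast University, Nanjing, Jiangsu 210000; School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu 210000

Category: Information Technology and Security Science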

Keywords: microblog; clustering center; K-means clustering algorithm; density

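Journal: Computer Engineering & Science (《计算机工程与科学》), 2018 (1)

Pages: 155-158 (4 pages)

Funding: Philosophy and Social Science Fund of Jiangsu Higher Education Institutions (2016SJD880186); Jiangsu Province Modern Educational Technology Research Project (2016-R-46509); National Science and Technology Support Program of the 12th Five-Year Plan (2013BAJ05B02-2)

DOI: 10.3969/j.issn.1007-130X.2018.01.023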
