Text Clustering Based on Improved k-means Algorithm

Jiang Li, Xue Shanliang

Computer and Modernization, Issue (4): 17-21, 5.

DOI: 10.3969/j.issn.1006-2475.2018.04.004

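Author information

1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China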

Abstract

To address the sensitivity of the original k-means algorithm to the number of clusters k, an improved k-means algorithm is proposed. The algorithm first computes the similarity between word vectors based on the principle of co-occurrence words and divides the data into k + x clusters according to a similarity threshold; it then applies the k-means algorithm to these k + x clusters. The proposed algorithm is applied to text clustering. The experimental results show that the proposed algorithm is more accurate than the original algorithm.
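
The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the co-occurrence word vectors are built, which similarity measure is used, or how the threshold grouping works, so everything below is an assumption made for illustration: documents are taken to be numeric vectors already, cosine similarity is used for the threshold stage, the grouping is a simple greedy pass, the number of groups it produces plays the role of k + x, and standard Euclidean k-means is then started from the group centroids. All function names are hypothetical and are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def threshold_groups(vectors, threshold):
    """Assumed pre-grouping step: each vector joins the first group whose
    centroid is at least `threshold` similar, otherwise it opens a new group."""
    groups = []
    for v in vectors:
        for g in groups:
            if cosine(v, mean(g)) >= threshold:
                g.append(v)
                break
        else:
            groups.append([v])
    return groups

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, centroids, max_iter=100):
    """Standard k-means iterations, started from the given centroids."""
    centroids = [list(c) for c in centroids]  # do not mutate the caller's list
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return labels

def improved_kmeans(vectors, threshold):
    """Threshold grouping fixes the cluster count and initial centroids,
    then k-means refines the assignment."""
    groups = threshold_groups(vectors, threshold)
    return kmeans(vectors, [mean(g) for g in groups])

# Toy vectors standing in for document representations (made-up data).
docs = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.9, 0.1], [0, 0, 1]]
print(improved_kmeans(docs, threshold=0.8))  # [0, 0, 1, 1, 2]
```

In this sketch the threshold determines the number of clusters, so the caller does not have to supply k up front. This is one plausible reading of how the approach reduces sensitivity to k, not a reproduction of the authors' method.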

Key words

k-means algorithm/co-occurrence word/word vector/similarity

Classification

Information Technology and Safety Science

Cite this article

Jiang Li, Xue Shanliang. Text Clustering Based on Improved k-means Algorithm [J]. Computer and Modernization, 2018, (4): 17-21, 5.

Computer and Modernization

OA, CSTPCD

ISSN: 1006-2475
