Improved KNN text classification algorithm based on clustering (基于聚类改进的 KNN 文本分类算法)

Zhou Qingping¹, Tan Changgeng¹, Wang Hongjun¹, Zhan Miaoxiang¹ (周庆平, 谭长庚, 王宏君, 湛淼湘)

Application Research of Computers (计算机应用研究), 2016, Vol. 33, Issue 11: 3374-3377, 3382 (5 pages). DOI: 10.3969/j.issn.1001-3695.2016.11.038

Author information

1. School of Software, Central South University, Changsha 410075, China

Abstract

The traditional KNN text classification algorithm is a simple, popular, non-parametric classification method that is easy to implement. However, it must repeatedly compute the similarity between each test text and the texts in the sample set, so its efficiency degrades sharply as the number of texts grows. To improve the classification efficiency of the traditional KNN algorithm, this paper proposes an improved KNN algorithm based on clustering. The algorithm first uses an improved χ² statistic to extract text features, then partitions the text set into several clusters with a clustering method, and finally classifies texts with the improved KNN method. Experimental results and analysis show that the algorithm handles text classification better.
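The three stages named in the abstract (feature extraction, clustering, KNN restricted by clusters) can be illustrated with a minimal Python sketch. The abstract does not specify the improved χ² statistic, the clustering method, or the improved KNN rule, so this sketch substitutes the standard χ² statistic, one cluster per class label, and a plain majority vote. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def chi_square(a, b, c, d):
    """Standard chi-square score of a term for a class (2x2 contingency table).

    a: docs in the class containing the term, b: docs outside the class containing it,
    c: docs in the class without it,          d: docs outside the class without it.
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, top_n):
    """Keep the top_n terms ranked by their maximum chi-square score over classes."""
    n = len(docs)
    doc_sets = [set(doc) for doc in docs]
    class_sizes = Counter(labels)
    df = Counter(t for s in doc_sets for t in s)  # document frequency per term
    df_in_class = Counter((t, y) for s, y in zip(doc_sets, labels) for t in s)
    scores = {}
    for t in df:
        best = 0.0
        for y, size in class_sizes.items():
            a = df_in_class[(t, y)]
            b = df[t] - a
            c = size - a
            d = n - size - b
            best = max(best, chi_square(a, b, c, d))
        scores[t] = best
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:top_n])


def vectorize(doc, vocab):
    """Term-count vector restricted to the selected features."""
    return Counter(t for t in doc if t in vocab)


def cosine(u, v):
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def build_clusters(vectors, labels):
    """Stand-in for the clustering step (an assumption): one cluster per label,
    summarised by the sum of its member vectors (cosine ignores the scale)."""
    clusters = defaultdict(lambda: {"centroid": Counter(), "members": []})
    for vec, y in zip(vectors, labels):
        clusters[y]["centroid"].update(vec)
        clusters[y]["members"].append((vec, y))
    return list(clusters.values())


def classify(vec, clusters, k=3, n_clusters=2):
    """Majority vote among the k nearest members of the n_clusters closest clusters."""
    nearest = sorted(clusters, key=lambda c: cosine(vec, c["centroid"]),
                     reverse=True)[:n_clusters]
    candidates = [m for c in nearest for m in c["members"]]
    top = sorted(candidates, key=lambda m: cosine(vec, m[0]), reverse=True)[:k]
    return Counter(y for _, y in top).most_common(1)[0][0]


# Toy usage with made-up, pre-tokenised documents.
docs = [["goal", "match", "team"], ["team", "score", "goal"],
        ["stock", "market", "price"], ["market", "bank", "price"]]
labels = ["sports", "sports", "finance", "finance"]
vocab = select_features(docs, labels, top_n=4)
clusters = build_clusters([vectorize(d, vocab) for d in docs], labels)
print(classify(vectorize(["goal", "team", "price"], vocab), clusters))
```

The efficiency argument shows up in `classify`: a test text is compared against a few cluster summaries first, and full similarity computation is limited to the members of the closest clusters instead of the whole training set.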

Key words

text classification/KNN/clustering/training set

Classification

Information Technology and Security Science

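Cite this article

Zhou Qingping, Tan Changgeng, Wang Hongjun, Zhan Miaoxiang. Improved KNN text classification algorithm based on clustering[J]. Application Research of Computers, 2016, 33(11): 3374-3377, 3382.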

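Funding

National Natural Science Foundation of China (61379057, 61309001, 61379110, 61103202, 61301136); Priority Development Area Project of the Doctoral Program Fund of the Ministry of Education of China ()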

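Application Research of Computers (计算机应用研究)

Open access (OA); indexed in Peking University Core Journals (北大核心), CSCD, CSTPCD

ISSN: 1001-3695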
