Application Research of Computers, 2012, Vol. 29, Issue 4: 1297-1299, 3. DOI: 10.3969/j.issn.1001-3695.2012.04.026
基于改进的最大熵均值聚类方法在文本分类中的应用
Application of text categorization based on improved maximum entropy means clustering algorithm
Abstract
Traditional text classification algorithms treat all features as having the same influence on the classification result, which leads to low classification accuracy and increased time complexity. To address these problems, this paper presents an improved maximum entropy C-means clustering method for text classification. The method combines the C-means clustering algorithm with the maximum entropy algorithm: it uses Shannon entropy as the maximum entropy model in the objective function, simplifies the form of the classifier, and then applies the C-means clustering algorithm to obtain the optimal features for classification. Simulation results show that, compared with traditional text classification methods, the proposed method quickly obtains the optimal classification feature subset and greatly improves the accuracy of text classification.
Key words
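text classification / maximum entropy / C-means clustering / feature selection
Classification
Information Technology and Security Science
Cite this article
Zhang Aike (张爱科). Application of text categorization based on improved maximum entropy means clustering algorithm [J]. Application Research of Computers, 2012, 29(4): 1297-1299, 3.
Funding
Supported by the Scientific Research Project Fund of the Guangxi Department of Education (200911LX486, 201106LX745)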