Computer Applications and Software, 2013, Vol. 30, Issue (3): 255-257, 270, 4. DOI: 10.3969/j.issn.1000-386x.2013.03.067
IMPROVEMENT OF MUTUAL INFORMATION FEATURE SELECTION BASED ON THE TF-CA-CI ALGORITHM
Abstract
Mutual information (MI) feature selection for text tends to favor rare feature terms. To address this limitation, this paper introduces term frequency, the inter-class coupling factor, and the intra-class cohesion degree into the MI algorithm, and proposes an improved MI approach based on the TF-CA-CI algorithm. Experimental results show that the improved method effectively controls the randomness that the MI method exhibits during low-dimensional feature selection and achieves better classification results, which demonstrates its effectiveness and feasibility.
Key words
Feature selection / Text classification / Mutual information / Term frequency
Classification
Information Technology and Security Science
Cite this article
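Chai Jiajia, Zhang Dexian, Geng Ruihuan. Improvement of mutual information feature selection based on the TF-CA-CI algorithm [J]. Computer Applications and Software, 2013, 30(3): 255-257, 270, 4.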
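
The abstract's starting point, that MI-based selection tends to favor rare terms, can be illustrated with a small sketch. It assumes the common document-count estimate of pointwise mutual information, MI(t, c) = log(P(t, c) / (P(t) P(c))), which is not quoted from the paper. The toy corpus, the labels, and the function name `pointwise_mi` are invented for illustration. The TF-CA-CI correction factors are not reproduced here because the abstract does not give their formulas.

```python
import math


def pointwise_mi(docs, labels, term, cls):
    """Estimate log(P(term, cls) / (P(term) * P(cls))) from document counts.

    Assumption: the standard document-frequency estimate of pointwise MI,
    not a formula taken from the paper.
    """
    n = len(docs)
    n_term = sum(1 for d in docs if term in d)    # docs containing the term
    n_cls = sum(1 for y in labels if y == cls)    # docs in the class
    n_both = sum(1 for d, y in zip(docs, labels) if term in d and y == cls)
    if n_both == 0:
        return float("-inf")  # term never co-occurs with the class
    return math.log((n_both * n) / (n_term * n_cls))


# Hypothetical toy corpus: each document is a set of terms.
docs = [
    {"match", "goal", "team"},
    {"match", "team", "coach"},
    {"match", "goal", "offside"},
    {"market", "stock", "team"},
    {"market", "stock", "bank"},
    {"market", "bank", "match"},
]
labels = ["sport", "sport", "sport", "finance", "finance", "finance"]

# "offside" occurs in one sport document; "match" occurs in all three
# sport documents and in one finance document.
mi_rare = pointwise_mi(docs, labels, "offside", "sport")
mi_common = pointwise_mi(docs, labels, "match", "sport")

print(f"MI(offside, sport) = {mi_rare:.3f}")
print(f"MI(match, sport)   = {mi_common:.3f}")
```

In this toy corpus the term that appears in a single document scores higher than the term that covers every document of the class, which is the kind of bias the improved method is meant to counter by taking term frequency and class distribution into account.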