Computer Engineering and Applications, 2012, Vol. 48, Issue (25): 1-4, 97, 5. DOI: 10.3778/j.issn.1002-8331.2012.25.001
Improved mutual information method of feature selection in text categorization
(Original title: 一种基于互信息的改进文本特征选择)
Abstract
This paper proposes an improved Mutual Information (MI) method for text feature selection. To address the deficiencies of MI, it makes three improvements. First, weight factors are used to distinguish positive features from negative ones. Second, a correction factor is introduced to suppress low-frequency words. Third, features are further weighted according to their position in the text. Together, these changes improve the efficiency of the MI model. Subsequent text classification experiments show that the improved MI method is effective and reasonable.
Keywords
Text Categorization (TC) / feature selection / Mutual Information (MI) / feature reduction
Classification
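Information Technology and Security Science
Cite this article
LIU Haifeng, CHEN Qi, ZHANG Yihao. Improved mutual information method of feature selection in text categorization [J]. Computer Engineering and Applications, 2012, 48(25): 1-4, 97, 5.
Funding
National Natural Science Foundation of China (No. 71071161).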