
An improved text feature selection method based on mutual information

Liu Haifeng, Chen Qi, Zhang Yihao

Computer Engineering and Applications (计算机工程与应用), 2012, Vol. 48, Issue 25: 1-4,97,5. DOI: 10.3778/j.issn.1002-8331.2012.25.001

Improved mutual information method of feature selection in text categorization

Liu Haifeng (1), Chen Qi (1), Zhang Yihao (2)

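Author information

  • 1. College of Science, PLA University of Science and Technology, Nanjing 210007
  • 2. Institute of Command Automation, PLA University of Science and Technology, Nanjing 210007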

Abstract

This paper proposes an optimized Mutual Information (MI) method for text feature selection. To address the deficiencies of MI, it introduces three improvements. First, positive and negative features are distinguished by means of weight factors. Second, a correction factor is introduced to suppress low-frequency words. Third, features are further weighted according to their position in the text. Together, these changes improve the efficiency of the MI model. Text classification experiments show that the proposed optimized MI method is effective and reasonable.
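The abstract does not give the paper's formulas, so the following is only a minimal sketch of the classic MI criterion that the paper sets out to improve, not the authors' optimized method. It assumes the usual document-count estimate MI(t, c) = log(A·N / ((A + C)(A + B))); the function name, parameter names, and counts are hypothetical and chosen for illustration. It shows the two behaviours the abstract targets: a rare term can outscore a frequent, informative one, and negatively correlated terms receive negative scores.

```python
import math


def mutual_information(in_class_with, out_class_with, in_class_without, total_docs):
    """Classic MI score log(P(t, c) / (P(t) * P(c))) estimated from document counts.

    in_class_with    (A): documents of class c that contain term t
    out_class_with   (B): documents outside class c that contain term t
    in_class_without (C): documents of class c that do not contain term t
    total_docs       (N): total number of documents
    """
    if in_class_with == 0:
        # The term never co-occurs with the class, so the log is undefined.
        return float("-inf")
    term_docs = in_class_with + out_class_with
    class_docs = in_class_with + in_class_without
    return math.log((in_class_with * total_docs) / (class_docs * term_docs))


# Hypothetical corpus: 1000 documents, 100 of them in the class of interest.
rare = mutual_information(1, 0, 99, 1000)        # term seen in a single class document
frequent = mutual_information(80, 20, 20, 1000)  # term seen in 80 of the 100 class documents
negative = mutual_information(1, 199, 99, 1000)  # term seen mostly outside the class

print(f"rare={rare:.3f} frequent={frequent:.3f} negative={negative:.3f}")
# prints: rare=2.303 frequent=2.079 negative=-2.996
```

In this toy corpus the term that occurs once scores higher than the term that covers 80% of the class, which is the low-frequency bias a correction factor is meant to suppress. The negative score illustrates why positive and negative features might be treated separately.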

Key words

Text Categorization (TC) / feature selection / Mutual Information (MI) / feature reduction

Classification

Information Technology and Security Science

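Cite this article

Liu Haifeng, Chen Qi, Zhang Yihao. An improved text feature selection method based on mutual information [J]. Computer Engineering and Applications, 2012, 48(25): 1-4,97,5.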

Funding

National Natural Science Foundation of China (No. 71071161).

Computer Engineering and Applications

OA / CSCD / CSTPCD

ISSN 1002-8331
