
Improving the Information Gain Feature Selection Algorithm in Text Classification

Guo Song (郭颂), Ma Fei (马飞)

Computer Applications and Software (计算机应用与软件), 2013, Vol. 30, Issue 8: 139-142, 4. DOI: 10.3969/j.issn.1000-386x.2013.08.037

IMPROVING THE ALGORITHM OF INFORMATION GAIN FEATURE SELECTION IN TEXT CLASSIFICATION

Guo Song 1, Ma Fei 2

Author information

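  • 1. School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, Henan, China
  • 2. School of Software, Pingdingshan University, Pingdingshan 467000, Henan, China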

Abstract

The feature selection algorithm has a great impact on the precision of a text classification system. The traditional information gain feature selection algorithm tends to select features that are low-frequency in the designated category but high-frequency in other categories. To overcome this shortcoming, and based on an in-depth analysis of the traditional algorithm and related improved algorithms, we introduce a feature distribution difference factor together with inter-category and intra-category weighting factors, and propose an improved information gain algorithm based on feature distribution weighting. We evaluate it with two classification algorithms, the naive Bayes classifier and the support vector machine classifier. Experimental results demonstrate that the proposed algorithm outperforms the other improved algorithms.
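The shortcoming described in the abstract can be illustrated with the traditional (unweighted) information gain score. The sketch below is not the improved algorithm proposed in the paper, whose weighting factors are not given on this page; it is the textbook definition IG(t) = H(C) - H(C | t present/absent), computed over document-level term presence. The function names and the toy corpus `docs` are assumptions made only for illustration.

```python
import math
from typing import List, Set, Tuple

Doc = Tuple[str, Set[str]]  # (category label, set of terms in the document)


def entropy(counts: List[int]) -> float:
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(term: str, docs: List[Doc]) -> float:
    """Traditional information gain of `term`: H(C) - H(C | term present/absent)."""
    categories = sorted({label for label, _ in docs})
    n = len(docs)

    def class_counts(labels: List[str]) -> List[int]:
        return [labels.count(c) for c in categories]

    all_labels = [label for label, _ in docs]
    with_term = [label for label, terms in docs if term in terms]
    without_term = [label for label, terms in docs if term not in terms]

    h_prior = entropy(class_counts(all_labels))
    h_conditional = (
        (len(with_term) / n) * entropy(class_counts(with_term))
        + (len(without_term) / n) * entropy(class_counts(without_term))
    )
    return h_prior - h_conditional


# Hypothetical toy corpus: two categories, two documents each.
docs: List[Doc] = [
    ("sports", {"ball", "goal"}),
    ("sports", {"ball", "team"}),
    ("finance", {"stock", "bank"}),
    ("finance", {"stock", "team"}),
]

for t in ("ball", "stock", "team"):
    print(t, information_gain(t, docs))
```

In this toy corpus, "stock" never occurs in the sports documents, yet it receives the same score as "ball" because its absence separates the categories just as well as the presence of "ball" does. Seen from the sports category, the score therefore favors a term that is rare in the designated category and frequent elsewhere, which is the behavior the abstract sets out to correct.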

Key words

Text classification / Feature selection / Information gain / Feature distribution weighting

Classification

Information Technology and Security Science

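Cite this article

Guo Song, Ma Fei. Improving the algorithm of information gain feature selection in text classification [J]. Computer Applications and Software, 2013, 30(8): 139-142, 4.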

Funding

Basic and Frontier Technology Research Program of the Henan Provincial Department of Science and Technology (122300410281).

Computer Applications and Software

Indexed in: OA, Peking University Core Journals, CSCD, CSTPCD

ISSN: 1000-386X
