Research on dynamic self-adaptive term weighting for multi-class text classification algorithm

Pei Songwen, Wu Baifeng

Application Research of Computers, 2011, Vol. 28, Issue 11: 4092-4096, 5. DOI: 10.3969/j.issn.1001-3695.2011.11.024

Research on dynamic self-adaptive term weighting for multi-class text classification algorithm

Pei Songwen (1), Wu Baifeng (2)

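Author information

  • 1. Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093
  • 2. School of Computer Science, Fudan University, Shanghai 200433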

Abstract

Text classification plays an important role in text data mining and information retrieval, and computing and assigning term weights is the key step in classifying text. This paper therefore proposes a dynamic self-adaptive term weighting (DATW) algorithm for multi-class text classification that overcomes the disadvantages of the traditional term weighting algorithm TF-IDF. DATW considers not only the term frequency within a text and the number of texts in the whole training set that contain the term, but also the distribution coefficient and the gradient descent of the term, so that the weighting adapts dynamically during text classification. Validation shows that classification with DATW outperforms classification with TF-IDF.
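For orientation, the TF-IDF baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the common textbook form (term frequency normalized by document length, multiplied by log(N / df)), which may differ from the exact variant used in the paper. It does not model DATW's distribution coefficient or gradient descent terms, since the abstract does not give their formulas. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed textbook form: tf = count / document length,
    idf = log(N / df), where df is the number of documents
    containing the term. Not the paper's DATW weighting.
    """
    n_docs = len(docs)
    # df[t]: number of documents that contain term t at least once
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized
docs = [
    ["stock", "market", "stock"],
    ["market", "football"],
    ["football", "match", "goal"],
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in every document receives weight zero, and the weight depends only on per-document counts and document frequency, which are the two factors the abstract attributes to the traditional scheme.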

Key words

text classification / term weighting / TF-IDF / distribution coefficient / gradient descent

Classification

Information technology and security science

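Cite this article

Pei Songwen, Wu Baifeng. Research on dynamic self-adaptive term weighting for multi-class text classification algorithm [J]. Application Research of Computers, 2011, 28(11): 4092-4096, 5.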

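Funding

Shanghai Municipal Education Commission Research Fund for Outstanding Young Teachers (SLG10005)

University of Shanghai for Science and Technology Research Innovation Fund (GDCX-Y-102)

AMD University Cooperation Program Special Fund (BOW-02)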

Application Research of Computers

Indexing: OA, PKU Core, CSCD, CSTPCD

ISSN 1001-3695
