基于改进 TFIDF 算法的文本分类研究

郑霖 徐德华

计算机与现代化 (Computer and Modernization), Issue (9): 6-9,14,5. DOI: 10.3969/j.issn.1006-2475.2014.09.002

Research on Text Categorization Based on Improved TFIDF Algorithm

Author information

  • 1. School of Economics and Management, Tongji University, Shanghai 200092

Abstract

Text categorization is widely applied in information retrieval, email filtering, Web page classification, personalized recommendation, and other fields, and it has attracted extensive attention from scholars since the concept was first proposed. Researchers have adopted many methods for text classification, and TFIDF is one of the most commonly used algorithms for calculating the weights of feature items. However, the traditional TFIDF algorithm ignores how feature items are distributed within and among classes, so it assigns high weights to many items with little discriminative power. To improve the traditional TFIDF algorithm, this paper modifies the calculation of IDF by adding factors that reflect the distribution of feature items within and among classes. In the experiment, the improved TFIDF algorithm is applied to text categorization, and the resulting classification performance verifies that the improvement is effective.
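As a minimal sketch of the limitation described above, the following Python computes the classic TF-IDF weight on a hypothetical four-document corpus. The corpus, the function names (`idf`, `tfidf`, `class_counts`), and the unsmoothed natural-log IDF variant are assumptions made for illustration; the modified IDF proposed in the paper is not given in this abstract and is not reproduced here.

```python
import math

# Hypothetical toy corpus: (class label, tokenized document).
corpus = [
    ("sports", ["match", "goal", "team"]),
    ("sports", ["match", "team", "coach"]),
    ("finance", ["stock", "market", "team"]),
    ("finance", ["stock", "bank", "coach"]),
]


def idf(term, docs):
    """Classic IDF: log(N / df), where df counts documents containing the term."""
    n_docs = len(docs)
    df = sum(1 for _, tokens in docs if term in tokens)
    return math.log(n_docs / df)


def tfidf(term, tokens, docs):
    """Classic TF-IDF weight of a term in one document (length-normalized TF)."""
    tf = tokens.count(term) / len(tokens)
    return tf * idf(term, docs)


def class_counts(term, docs):
    """Number of documents containing the term, broken down by class label."""
    counts = {}
    for label, tokens in docs:
        if term in tokens:
            counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    for term in ("match", "coach"):
        print(term, round(idf(term, corpus), 4), class_counts(term, corpus))
```

Here `match` occurs only in the `sports` class while `coach` is split evenly across both classes, yet both terms appear in two of the four documents and therefore receive the same IDF. Class-distribution information of this kind is invisible to the classic formula, which is the gap the abstract says the improved IDF is meant to address.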

Key words

TFIDF algorithm/feature selection/text categorization

Classification

Information Technology and Security Science

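Cite this article

郑霖, 徐德华. 基于改进 TFIDF 算法的文本分类研究 (Research on Text Categorization Based on Improved TFIDF Algorithm) [J]. 计算机与现代化 (Computer and Modernization), 2014, (9): 6-9,14,5.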

计算机与现代化 (Computer and Modernization)

OA / CSTPCD

ISSN 1006-2475
