Computer Engineering and Applications, 2011, Vol. 47, Issue 29: 124-126, 3. DOI: 10.3778/j.issn.1002-8331.2011.29.034
Blog posts classification method based on analysis of article elements
Abstract
Traditional text classification methods are applied to blog posts directly, without considering the characteristics of blog posts, so this paper proposes a method that improves classification results by considering the impact of article elements. A simple method is proposed for removing noisy posts in order to ensure the reliability of the posts; blog tags are used to extend the thesaurus so as to improve word segmentation and the accuracy of blog classification; and the G1 method, proposed in the comprehensive evaluation model, is used to calculate the weights of the title, tag, label, first paragraph, last paragraph, and remaining parts, which are the elements analyzed in blog classification. Experimental results show that this method achieves better classification performance than the traditional TF-IDF method.
Keywords
blog posts classification / blog text filtering / blog tags / article elements / G1 method
Classification
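Information Technology and Security Science
Cite this article
Lu Mengping, Huang Han, Cai Zhaoquan, Zhu Yifan, He Yiyu, Xu Zhenyu. Blog posts classification method based on analysis of article elements [J]. Computer Engineering and Applications, 2011, 47(29): 124-126, 3.
Funding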
National Natural Science Foundation of China (Grant No. 61003066, No. 61070033)
Doctoral Program Fund of the Ministry of Education (No. 20090172120035)
Natural Science Foundation of Guangdong Province (No. 9151008901000165, No. 10151601501000015)
Science and Technology Planning Project of Guangdong Province (No. 2009B010800026)
Huizhou Special Fund Project for the Modern Information Service Industry
Huizhou Science and Technology Planning Project (No. 2009G024)
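
The element-weighting step summarized in the abstract can be illustrated with a minimal Python sketch. It assumes the standard formulation of the G1 (order relation analysis) method: elements are ranked by decreasing importance, an importance ratio r_k = w_(k-1) / w_k is assigned to each adjacent pair, and the resulting weights sum to 1. The element ordering, the ratio values, and the function names `g1_weights` and `weighted_term_frequencies` are hypothetical; the abstract does not report the values the authors used or how the weights enter their classifier.

```python
from collections import Counter
from math import prod


def g1_weights(ratios):
    """Return weights [w_1, ..., w_m] for m = len(ratios) + 1 ranked elements.

    ratios[j] is the assumed importance ratio w_(j+1) / w_(j+2) between
    adjacent elements, with elements ordered from most to least important.
    """
    # Weight of the least important element:
    # w_m = 1 / (1 + sum over k of the product r_k * r_(k+1) * ... * r_m)
    tail_products = [prod(ratios[j:]) for j in range(len(ratios))]
    w_last = 1.0 / (1.0 + sum(tail_products))

    # Walk back up the ranking: w_(k-1) = r_k * w_k
    weights = [w_last]
    for r in reversed(ratios):
        weights.append(r * weights[-1])
    weights.reverse()
    return weights


def weighted_term_frequencies(post, element_weights):
    """Sum per-element term counts, scaling each occurrence by its element weight."""
    scores = Counter()
    for element, tokens in post.items():
        for token in tokens:
            scores[token] += element_weights[element]
    return scores


# Hypothetical ranking and ratios, for illustration only.
elements = ["title", "tag", "label", "first_paragraph", "last_paragraph", "other"]
ratios = [1.2, 1.0, 1.4, 1.2, 1.6]
element_weights = dict(zip(elements, g1_weights(ratios)))

# A toy, already-segmented post (hypothetical data).
post = {
    "title": ["python", "tutorial"],
    "tag": ["python"],
    "other": ["python", "java", "code"],
}
term_scores = weighted_term_frequencies(post, element_weights)
```

In this sketch a term that appears in the title contributes more to its score than the same term in the body, which is the intuition behind weighting article elements; a full pipeline would still need an inverse document frequency factor and a classifier on top of these scores.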