Computer Engineering and Applications, 2013, Vol. 49, Issue (5): 123-126, 173, 5. DOI: 10.3778/j.issn.1002-8331.1107-0441
结构特征和内容分析融合的博客文章分类
Structural characteristics and content analysis fusion for blog post classification
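Zhang Yong 1, Wang Fang 1, Zhang Yiyun 1
Author information
- 1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China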
Abstract
Blog posts often cover multiple themes, lack an obvious category membership, and carry many of the author's subjective views. Structurally, they also include tags, which differ from the body text. As a result, common text classification methods perform poorly on them. To address these problems, a new blog post classification method based on structural characteristics and content analysis is presented. For the content features of a blog post, the method iterates two different feature extraction methods to strengthen the representative ability of the feature set, and classifies the post using its main body and its title. For the structural features, it classifies the post using its tags. Finally, the three classification results are fused. The experimental results show that the improved method performs markedly better than common text classification methods.
Keywords
text classification / blog post classification / structural characteristics / content analysis
Classification
Information technology and security science
Cite this article
Zhang Yong, Wang Fang, Zhang Yiyun. Structural characteristics and content analysis fusion for blog post classification [J]. Computer Engineering and Applications, 2013, 49(5): 123-126, 173, 5.
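
The abstract says that the body, title, and tag classification results are fused, but it does not state the fusion rule. As a rough illustration only, the sketch below assumes each of the three classifiers returns a per-category score and combines them with a weighted sum. The function names `fuse_scores` and `predict`, the weights, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]  # category name -> classifier score


def fuse_scores(body: Scores, title: Scores, tags: Scores,
                weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> Scores:
    """Combine three per-category score maps with a weighted sum.

    The weighted sum and the default weights are assumptions made for
    illustration; the paper's actual fusion rule is not given in the abstract.
    """
    w_body, w_title, w_tags = weights
    categories = set(body) | set(title) | set(tags)
    # A category missing from one classifier's output contributes a score of 0.
    return {
        c: w_body * body.get(c, 0.0)
           + w_title * title.get(c, 0.0)
           + w_tags * tags.get(c, 0.0)
        for c in categories
    }


def predict(body: Scores, title: Scores, tags: Scores,
            weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> str:
    """Return the category with the highest fused score."""
    fused = fuse_scores(body, title, tags, weights)
    # Iterate in sorted order so that ties are broken deterministically.
    return max(sorted(fused), key=fused.get)


if __name__ == "__main__":
    # Hypothetical scores for a single blog post.
    body_scores = {"sports": 0.6, "tech": 0.4}
    title_scores = {"sports": 0.3, "tech": 0.7}
    tag_scores = {"tech": 1.0}
    print(predict(body_scores, title_scores, tag_scores))
```

In this made-up example the body classifier alone would choose "sports", while the title and tag evidence shift the fused decision to "tech", which is the kind of effect a fusion step is meant to capture.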