Computer Applications and Software (计算机应用与软件), 2011, Vol. 28, Issue 1: 32-34, 3.
结合粗糙集与集成学习的中文文本分类方法研究
ON CHINESE TEXT CATEGORIZATION BASED ON ROUGH SET AND ENSEMBLE LEARNING
Abstract
This paper introduces the workflow of Chinese text categorization and the relevant technologies. After analyzing the shortcomings of traditional feature selection, it proposes a text categorization approach that combines rough sets with ensemble learning: feature selection on the text is performed with rough sets, and the ensemble learning algorithm AdaBoost.M1 is used to improve the performance of a weak classifier that categorizes the Chinese text. Experiments indicate that this method classifies better, with a higher F1 value than the C4.5 and kNN classifiers.
Keywords
Chinese text categorization / rough set / ensemble learning / AdaBoost.M1
Cite this article
张翔, 周明全, 董丽丽, 闫清波. 结合粗糙集与集成学习的中文文本分类方法研究[J]. 计算机应用与软件, 2011, 28(1): 32-34, 3.
Funding
National Natural Science Foundation of China (60873094).