Computer Engineering & Science, 2012, Vol. 34, Issue 9: 160-165 (6 pages). DOI: 10.3969/j.issn.1007-130X.2012.09.029
Classification of Microblog Sentiment Based on Naïve Bayesian
Abstract
This paper uses a two-pass ("twice") sentiment feature extraction approach: syntactic dependency serves as the first extraction method and a semantic lexicon as the second. A naive Bayesian sentiment classifier is constructed to classify the sentiment polarity of hot-topic data collected from Chinese microblogs and of hotel reviews. The experiments compare the classification performance of different combinations of emoticons, punctuation, semantic-lexicon-based feature extraction, and twice sentiment feature extraction, in order to find better preprocessing methods for sentiment classification of microblog text. The experiments also compare the sentiment classification results on microblog text with those on hotel reviews, to identify the factors that influence classification performance on microblog sentiment. The results indicate that twice sentiment feature extraction achieves the higher F1, and that "emoticons + punctuation + twice sentiment feature extraction + BOOL" is the best-performing preprocessing combination. They also suggest that the naive Bayesian classifier performs better on hotel reviews probably because microblog topics are more varied.
Key words
microblog; text sentiment classification; twice sentiment feature extraction; naive Bayes
Classification
Information Technology and Security Science
Cite this article
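Lin Jianghao, Yang Aimin, Zhou Yongmei, Chen Jin, Cai Zejian. Classification of Microblog Sentiment Based on Naïve Bayesian [J]. Computer Engineering & Science, 2012, 34(9): 160-165 (6 pages).
Funding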
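National Social Science Foundation of China (12BYY045)
Ministry of Education Humanities and Social Sciences Research Youth Project (10YJCZH247)
Guangdong Provincial Science and Technology Plan Project (2010B031000014)
Guangdong University of Foreign Studies Graduate Research Innovation Project
Guangdong University of Foreign Studies Undergraduate Innovation Experiment Project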
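
Illustrative sketch

The abstract describes a naive Bayesian classifier over boolean ("BOOL") features that include emoticons and punctuation. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a Bernoulli event model with Laplace smoothing, skips the dependency-based and lexicon-based extraction steps entirely, and uses hypothetical function names (`train`, `predict`) and made-up English tokens in place of real microblog data.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Fit a Bernoulli naive Bayes model from (feature_list, label) pairs."""
    label_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, label in docs:
        present = set(feats)  # BOOL weighting: only presence matters
        label_counts[label] += 1
        for f in present:
            feat_counts[label][f] += 1
        vocab |= present
    return {"labels": label_counts, "feats": feat_counts, "vocab": vocab}


def predict(model, feats):
    """Return the label with the highest log-posterior for the given features."""
    present = set(feats) & model["vocab"]  # features unseen in training are ignored
    total = sum(model["labels"].values())
    best_label, best_score = None, -math.inf
    for label, n in model["labels"].items():
        score = math.log(n / total)  # log prior
        for f in model["vocab"]:
            # Laplace-smoothed probability that feature f appears in this class
            p = (model["feats"][label][f] + 1) / (n + 2)
            score += math.log(p if f in present else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, already-extracted features (words, emoticons, punctuation).
train_data = [
    (["great", "room", ":)"], "pos"),
    (["love", "it", "!", ":)"], "pos"),
    (["terrible", "service", ":("], "neg"),
    (["hate", "it", "!", ":("], "neg"),
]
model = train(train_data)
print(predict(model, ["great", ":)"]))
print(predict(model, ["terrible", ":("]))
```

Because each feature is reduced to present or absent, a repeated emoticon or exclamation mark counts once per message, which is one plausible reading of the "BOOL" setting named in the abstract.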