Microblog Sentiment Classification Based on Naive Bayes

Lin Jianghao, Yang Aimin, Zhou Yongmei, Chen Jin, Cai Zejian

Computer Engineering & Science, 2012, Vol. 34, Issue 9: 160-165 (6 pages). DOI: 10.3969/j.issn.1007-130X.2012.09.029

Classification of Microblog Sentiment Based on Naïve Bayesian

Lin Jianghao 1, Yang Aimin 2, Zhou Yongmei 2, Chen Jin 3, Cai Zejian 2

Author information

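  • 1. School of International Business Administration, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006
  • 2. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006
  • 3. Faculty of English Language and Culture, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006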

Abstract

This paper adopts a two-pass ("twice") sentiment feature extraction approach, using syntactic dependency as the first extraction method and a semantic lexicon as the second. A naive Bayes sentiment classifier is built to classify the sentiment polarity of hot-topic data collected from Chinese microblogs and of hotel reviews. The experiments compare the classification performance of different combinations of emoticons, punctuation, semantic-lexicon-based feature extraction, and twice sentiment feature extraction, in order to find better preprocessing methods for sentiment classification of microblog text. They also compare and analyze the classification results on microblog text and on hotel reviews to identify what limits the performance of microblog sentiment classification. The results indicate that twice sentiment feature extraction achieves the higher F1, and that "emoticons + punctuation + twice sentiment feature extraction + BOOL" is the best-performing preprocessing combination. They also suggest that the naive Bayes classifier performs better on hotel reviews probably because microblog topics are more varied.
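As a rough illustration of the kind of classifier the abstract describes, the sketch below trains a naive Bayes model on boolean (present/absent) features, with emoticons and punctuation kept as ordinary tokens. It is not the authors' implementation: the Bernoulli variant, the Laplace smoothing constant `alpha`, the function names, and the toy data are all assumptions made for this example, and the paper's dependency-based and lexicon-based feature extraction steps are not reproduced (the token lists are assumed to be already extracted).

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on boolean token features.

    docs   -- list of token lists (words, emoticons, punctuation marks)
    labels -- one class label per document
    alpha  -- Laplace smoothing constant (an assumption, not from the paper)
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for doc, label in zip(docs, labels):
        class_counts[label] += 1
        # Boolean weighting: each token counts at most once per document.
        for tok in set(doc):
            feature_counts[label][tok] += 1

    n_docs = len(docs)
    model = {"vocab": vocab, "log_prior": {}, "p_feature": {}}
    for label, n_c in class_counts.items():
        model["log_prior"][label] = math.log(n_c / n_docs)
        # Smoothed estimate of P(token present | class).
        model["p_feature"][label] = {
            tok: (feature_counts[label][tok] + alpha) / (n_c + 2 * alpha)
            for tok in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior for one token list."""
    present = set(doc)
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        for tok in model["vocab"]:
            p = model["p_feature"][label][tok]
            # Absent features also contribute, via log(1 - p).
            score += math.log(p) if tok in present else math.log(1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, hand-made data (hypothetical; not taken from the paper's corpora).
train_docs = [
    ["great", "room", ":)", "!"],
    ["love", "it", ":)"],
    ["terrible", "service", ":("],
    ["bad", "room", ":(", "!"],
]
train_labels = ["pos", "pos", "neg", "neg"]

model = train_bernoulli_nb(train_docs, train_labels)
print(predict(model, ["room", ":)"]))
print(predict(model, ["service", ":("]))
```

Tokens that never appeared in training are ignored at prediction time, since only the training vocabulary is scored. Whether the paper's "BOOL" setting corresponds to a Bernoulli model or to a multinomial model with binarized counts cannot be determined from the abstract.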

Key words

microblog / text sentiment classification / twice sentiment feature extraction / naive Bayes

Classification

Information Technology and Security Science

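Cite this article

Lin Jianghao, Yang Aimin, Zhou Yongmei, Chen Jin, Cai Zejian. Classification of Microblog Sentiment Based on Naïve Bayesian [J]. Computer Engineering & Science, 2012, 34(9): 160-165, 6.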

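Funding

National Social Science Fund of China project (12BYY045)

Ministry of Education Humanities and Social Sciences Research Youth Project (10YJCZH247)

Guangdong Provincial Science and Technology Plan project (2010B031000014)

Guangdong University of Foreign Studies Graduate Research Innovation project

Guangdong University of Foreign Studies Undergraduate Innovation Experiment project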

Computer Engineering & Science

OA / Peking University Core Journals / CSCD / CSTPCD

ISSN 1007-130X
