Computer Engineering, 2017, Vol. 43, Issue (7): 187-192, 6. DOI: 10.3969/j.issn.1000-3428.2017.07.031
Method of Microblog Emotional Tendency Classification Based on AWCRF Model
Abstract
To address the classification of Chinese microblog sentiment data whose class distribution is imbalanced, this paper presents a method that combines the Affinity Propagation (AP) algorithm, Word2vec, and the Conditional Random Field (CRF) model. The microblog data are first clustered with the AP algorithm, which divides the majority-class samples into several clusters according to a similarity measure that maximizes the distance between classes and minimizes the distance within a class. An undersampling technique is then used to construct a training set with balanced emotional tendency. Word2vec is used to retrieve the texts with the highest sentiment similarity, which expand each microblog sentence and enrich its sentiment information. The CRF model is then used to compute the label sequence of the balanced and expanded training set. As a result, the emotional tendency of microblogs can be classified accurately even when its distribution is imbalanced. Experimental results show that the proposed method outperforms the ACRF, CRF, and SCRF methods on the recall and G-mean evaluation criteria.
Keywords
sentiment analysis/sentiment classification/Affinity Propagation (AP) algorithm/undersampling technique/Word2vec technique/Conditional Random Field (CRF)
Classification
Information Technology and Security Science
Cite this article
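CHEN Bingfeng (陈炳丰), HAO Zhifeng (郝志峰), CAI Ruichu (蔡瑞初), WEN Wen (温雯), LIANG Lixin (梁礼欣). Method of Microblog Emotional Tendency Classification Based on AWCRF Model [J]. Computer Engineering, 2017, 43(7): 187-192, 6.
Funding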
National Natural Science Foundation of China (U1501254, 61472089, 61572143)
Natural Science Foundation of Guangdong Province (2014A030306004, 2014A030308008)
Science and Technology Planning Project of Guangdong Province (2015B010108006)
Young Innovative Talents Project of the Guangdong Provincial Department of Education (2015KQNCX027)
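
The abstract names two ideas that are easy to illustrate in isolation: undersampling the majority class cluster by cluster, and scoring with G-mean. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The cluster assignments are assumed to have been computed beforehand (by AP in the paper); the proportional per-cluster quota and the function names `undersample_by_cluster` and `g_mean` are hypothetical choices made here. G-mean is taken in its common binary form, the square root of the product of the positive-class and negative-class recalls, which may differ from the exact definition used in the paper.

```python
import math
import random
from collections import defaultdict


def undersample_by_cluster(majority, cluster_ids, target_size, seed=0):
    """Pick `target_size` majority samples, spread across their clusters.

    `cluster_ids[i]` is the cluster of `majority[i]`. The clustering itself
    is assumed to exist already; this function only does the sampling.
    """
    if target_size >= len(majority):
        return list(majority)

    rng = random.Random(seed)
    clusters = defaultdict(list)
    for sample, cid in zip(majority, cluster_ids):
        clusters[cid].append(sample)

    total = len(majority)
    picked, leftovers = [], []
    for cid in sorted(clusters):
        members = clusters[cid]
        rng.shuffle(members)
        # Proportional quota, rounded down (an assumption of this sketch).
        quota = (len(members) * target_size) // total
        picked.extend(members[:quota])
        leftovers.extend(members[quota:])

    # Rounding down can leave a shortfall; fill it from the unused samples.
    rng.shuffle(leftovers)
    picked.extend(leftovers[: target_size - len(picked)])
    return picked


def g_mean(y_true, y_pred, positive):
    """Binary G-mean: sqrt(recall on positives * recall on negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall_pos * recall_neg)
```

Because G-mean multiplies the two per-class recalls, a classifier that ignores the minority class scores zero regardless of its overall accuracy, which is why the metric suits imbalanced sentiment data.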