Journal of Suzhou University of Science and Technology (Natural Science Edition), 2018, Vol. 35, Issue 1: 71-75, 5. DOI: 10.12084/j.issn.2096-3289.2018.01.014
Sentiment analysis of massive text comments based on Spark
Implementation of sentiment analysis in comments based on Spark
Abstract
To analyze the sentiment of comments in batches and mine their value more efficiently, we propose a text semantic analysis tool built on the Spark platform. First, we preprocessed the data to keep the different categories of training data balanced and segmented the comments into words with jieba. Second, we used the Word2Vec model to transform the segmented comments into vectors, which served as input for training the classifiers. We also designed client/server (C/S) applications for submitting data and retrieving results quickly. Finally, we compared the performance of different classifiers on Spark and found that, of the three classification algorithms, the multilayer perceptron performed best and can judge the polarity of emotions.
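The step from segmented comments to classifier input can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how word vectors were combined into a comment vector, so averaging is an assumption here, and the toy vectors, the function names `comment_vector` and `polarity`, and the linear decision rule are all hypothetical stand-ins for a trained Word2Vec model and a trained classifier. The tokens are assumed to be already segmented (for Chinese text, for example with `jieba.lcut`).

```python
from typing import Dict, List

# Hypothetical 2-dimensional word vectors; a trained Word2Vec model
# would supply real, higher-dimensional ones.
TOY_VECTORS: Dict[str, List[float]] = {
    "great": [1.0, 0.5],
    "service": [0.0, 0.5],
    "terrible": [-1.0, 0.5],
}


def comment_vector(tokens: List[str], vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of the known tokens (assumed scheme); skip unknown tokens."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        # No known token: fall back to the zero vector.
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def polarity(vec: List[float], weights: List[float], bias: float = 0.0) -> str:
    """Hypothetical linear rule standing in for a trained classifier."""
    score = sum(w * x for w, x in zip(weights, vec)) + bias
    return "positive" if score > 0 else "negative"


# Tokens are assumed to come from a segmenter run on the raw comments.
pos = comment_vector(["great", "service"], TOY_VECTORS, 2)
neg = comment_vector(["terrible", "service", "unknown"], TOY_VECTORS, 2)

print(pos, polarity(pos, [1.0, 0.0]))
print(neg, polarity(neg, [1.0, 0.0]))
```

In a Spark setting, the fixed-length vectors produced this way would form the feature column on which the classifiers are trained.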
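Keywords
Spark / sentiment analysis / SVM / multilayer perceptron
Classification
Information Technology and Security Science
Cite this article
WANG Lei, ZENG Cheng, XI Xuefeng, PI Zhou, GU Jianwei, ZHUO Wenjie, CHEN Shuaitian. Sentiment analysis of massive text comments based on Spark [J]. Journal of Suzhou University of Science and Technology (Natural Science Edition), 2018, 35(1): 71-75, 5.
Funding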
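National Natural Science Foundation of China (61472264, 61472267, 61673290)
Suzhou Science and Technology Development Program (Key Laboratory, SZS201609)
Jiangsu Province Postgraduate Practice Innovation Program (SJCX17_0681)
2017 Jiangsu Province College Students' Innovation and Entrepreneurship Training Program
Suzhou Science and Technology Development Program (Industry Foresight Project, SYG201707)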