经济数学, 2016, Vol. 33, Issue 1: 36-41, 6.
基于 KNN 和 Bayes 算法的组合分类器的垃圾评论识别研究
Research on Identifying Product Review Spam Based on Combination Classification of KNN and Bayesian Algorithms
Abstract
Product review spam reduces the reference value of review information. This paper builds a model that removes spam product reviews and retains genuine ones. First, it analyzes the characteristics of product reviews and extracts 14 features through data collection, text preprocessing, mutual information testing, and text representation. Second, it establishes a combination classifier based on the KNN and Bayes algorithms, using the principle of maximum complementarity. Finally, it applies cross validation to product reviews of the iPhone 6 Plus. The model achieves a relatively high correct recognition rate of 75.3%, a recall rate of 82.1%, and an F1 value of 77.5%.
Keywords
KNN algorithm/Bayes algorithm/combination classification/mutual information/cross validation
Classification
Mathematical sciences
Cite this article
梁曌, 陈思宇, 梁小林, 康欣. 基于 KNN 和 Bayes 算法的组合分类器的垃圾评论识别研究[J]. 经济数学, 2016, 33(1): 36-41, 6.
Funding
Graduate Innovation Project of Changsha University of Science and Technology (CX2015SS20)
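
The abstract describes combining a KNN classifier with a Bayes classifier but does not state the combination rule. The following is a minimal sketch only, not the authors' method: it assumes binary review features, a Bernoulli naive Bayes model, Hamming distance for KNN, and a simple average of the two spam scores. The toy data, the function names, and the averaging rule are all hypothetical, and the mutual information and cross-validation steps are omitted.

```python
import math
from collections import Counter

def knn_spam_prob(train, x, k=3):
    """Fraction of the k nearest training reviews (Hamming distance) labeled spam."""
    nearest = sorted(train, key=lambda item: sum(a != b for a, b in zip(item[0], x)))[:k]
    return sum(label for _, label in nearest) / k

def fit_bernoulli_nb(train):
    """Estimate class priors and per-feature probabilities with Laplace smoothing.

    Assumes both classes (0 and 1) occur in the training data.
    """
    n_features = len(train[0][0])
    counts = Counter(label for _, label in train)
    model = {}
    for c in (0, 1):
        rows = [x for x, label in train if label == c]
        prior = counts[c] / len(train)
        theta = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in range(n_features)]
        model[c] = (prior, theta)
    return model

def nb_spam_prob(model, x):
    """Posterior probability of the spam class (label 1) under naive Bayes."""
    log_post = {}
    for c, (prior, theta) in model.items():
        lp = math.log(prior)
        for xj, t in zip(x, theta):
            lp += math.log(t if xj else 1 - t)
        log_post[c] = lp
    m = max(log_post.values())  # subtract the max for numerical stability
    exp = {c: math.exp(v - m) for c, v in log_post.items()}
    return exp[1] / (exp[0] + exp[1])

def combined_predict(train, model, x, k=3):
    """Average the two spam scores; the averaging rule is an assumption."""
    score = 0.5 * knn_spam_prob(train, x, k) + 0.5 * nb_spam_prob(model, x)
    return 1 if score >= 0.5 else 0

# Hypothetical toy data: 4 binary features per review, label 1 = spam.
train = [
    ((1, 1, 0, 0), 1), ((1, 0, 0, 0), 1), ((1, 1, 1, 0), 1),
    ((0, 0, 1, 1), 0), ((0, 0, 0, 1), 0), ((0, 1, 1, 1), 0),
]
model = fit_bernoulli_nb(train)
```

Calling `combined_predict(train, model, (1, 1, 0, 0))` classifies a review whose feature vector matches the spam-like rows of the toy data. Averaging the two scores is only one way to combine classifiers; a weighted vote or a rule that defers to whichever classifier is more confident would fit the same interface.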