Journal of Qingdao University (Natural Science Edition), 2018, Vol. 31, Issue (1): 109-114, 6. DOI: 10.3969/j.issn.1006-1037.2018.02.20
基于TextRank的网评产品特征提取方法
Extraction Method of Product Feature from Network Comment Based on TextRank
何金金 (He Jinjin) 1, 郭振波 (Guo Zhenbo) 2, 王开西 (Wang Kaixi) 2
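Author information
- 1. College of Data Science and Software Engineering, Qingdao University, Qingdao 266071
- 2. College of Computer Science and Technology, Qingdao University, Qingdao 266071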
Abstract
To address the low extraction accuracy that results from the classical TF-IDF algorithm ignoring the connections between words, a TextRank-based feature word extraction method weighted by word2vec is proposed. First, a product review corpus is obtained with a web crawler and preprocessed by word segmentation, POS tagging, and noun extraction. Second, word2vec is used to compute word-to-word similarities, which form the elements of a similarity matrix. Finally, these word2vec similarities are taken as the influence between words and used as the weights that improve the classic TextRank product feature extraction method. The experimental data show that, compared with the traditional TextRank product feature extraction method, the improved method raises the precision ratio by 5% and the recall ratio by 2.9%.
Keywords
comments/feature extraction/TF-IDF/Word2vec/TextRank
Classification
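Information Technology and Security Science
Cite this article
He Jinjin, Guo Zhenbo, Wang Kaixi. Extraction Method of Product Feature from Network Comment Based on TextRank [J]. Journal of Qingdao University (Natural Science Edition), 2018, 31(1): 109-114, 6.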
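
The pipeline summarized in the abstract (candidate nouns, a word2vec similarity matrix, then TextRank with similarities as edge weights) can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the code below assumes the standard weighted TextRank update with cosine similarity as the edge weight. The names `cosine`, `weighted_textrank`, and `toy_vectors` are hypothetical, and the hand-made 3-dimensional vectors stand in for word2vec embeddings that would normally be trained on the crawled review corpus.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_textrank(vectors, damping=0.85, iterations=50):
    """Rank candidate words by weighted TextRank.

    `vectors` maps each candidate noun to its embedding. Edge weights are
    pairwise cosine similarities (an assumption; the paper's exact weighting
    is not stated in the abstract). Negative similarities are dropped.
    """
    words = sorted(vectors)

    # Build the symmetric similarity matrix as a nested dict.
    weights = {w: {} for w in words}
    for a, b in combinations(words, 2):
        sim = cosine(vectors[a], vectors[b])
        if sim > 0:
            weights[a][b] = sim
            weights[b][a] = sim

    # Total edge weight leaving each word, used to normalize its influence.
    out_sum = {w: sum(weights[w].values()) for w in words}

    scores = {w: 1.0 for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            # Each neighbor v passes on its score in proportion to the
            # share of v's total edge weight that points at w.
            rank = sum(weights[v][w] / out_sum[v] * scores[v] for v in weights[w])
            updated[w] = (1 - damping) + damping * rank
        scores = updated

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy embeddings for candidate nouns (illustrative values, not real word2vec output).
toy_vectors = {
    "screen": (0.9, 0.1, 0.0),
    "display": (0.8, 0.2, 0.1),
    "battery": (0.1, 0.9, 0.1),
    "charger": (0.2, 0.8, 0.0),
    "box": (0.0, 0.1, 0.9),
}

for word, score in weighted_textrank(toy_vectors):
    print(f"{word}\t{score:.4f}")
```

In this sketch a word that is only weakly similar to the other candidates receives little score from its neighbors, which is the intuition behind replacing uniform co-occurrence edges with similarity weights. In practice the vectors would come from a trained word2vec model rather than a hand-written dictionary.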