基于词向量的文本分类研究 (OA, CSTPCD)
Research on Text Classification Based on Word Embedding
To address the low classification accuracy of traditional feature selection algorithms, an improved text feature selection algorithm based on word vectors is proposed. Taking microblog data as the research object for sentiment classification, the paper hypothesizes that terms similar to a feature item with strong category-discriminating power also have strong category-discriminating power. Word vectors trained with Word2vec are applied to the traditional feature selection process, and the feature set is expanded according to the similarity relations among the word vectors. Experimental results show that the proposed feature selection algorithm improves classification accuracy to some extent over the original feature selection algorithm.
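The expansion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's exact expansion rule, so everything below is an assumption made for illustration: the toy vectors stand in for vectors trained with Word2vec, `selected` stands in for the output of a conventional feature selection step, and the `top_k` and `min_sim` parameters are hypothetical knobs for how many neighbours are added and how similar they must be.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_features(selected, vectors, top_k=2, min_sim=0.8):
    """Append to `selected` the nearest neighbours of each selected term.

    Assumed rule (not taken from the paper): for each selected term, add up to
    `top_k` unseen vocabulary terms whose cosine similarity is at least `min_sim`.
    """
    expanded = list(selected)
    seen = set(selected)
    for term in selected:
        if term not in vectors:
            continue  # no embedding available for this term
        scored = [
            (cosine(vectors[term], vec), other)
            for other, vec in vectors.items()
            if other not in seen
        ]
        scored.sort(reverse=True)  # most similar candidates first
        for sim, other in scored[:top_k]:
            if sim >= min_sim:
                expanded.append(other)
                seen.add(other)
    return expanded


# Toy 3-dimensional vectors, invented for this example only.
TOY_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.85, 0.15, 0.05],
    "sad": [0.1, 0.9, 0.0],
    "gloomy": [0.15, 0.85, 0.1],
    "table": [0.0, 0.1, 0.95],
}

expanded = expand_features(["happy", "sad"], TOY_VECTORS, top_k=1, min_sim=0.9)
print(expanded)
```

With these toy vectors, "glad" and "gloomy" are added as neighbours of the two selected sentiment terms, while the unrelated "table" is left out. In practice the similarity lookup would run against a trained Word2vec model rather than a hand-written dictionary.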
马力 (Ma Li); 李沙沙 (Li Shasha)
Xi'an University of Posts and Telecommunications, Xi'an 710061 (both authors)
Information Technology and Security Science
word embedding; feature expansion; Word2vec; text classification
《计算机与数字工程》 (Computer & Digital Engineering), 2019 (2)
pp. 281-284, 303 (5 pages)