Software Guide (软件导刊), 2024, Vol. 23, Issue 12: 36-43, 8. DOI: 10.11907/rjdk.232292
Data Augmentation Method for Sentiment Analysis Based on ChatGLM
Abstract
Sentiment analysis is a popular task in natural language processing. Because training data are difficult and costly to annotate, sentiment analysis with limited samples has attracted growing attention. Data augmentation is one of the primary approaches to learning from limited samples. However, traditional data augmentation methods do not take the characteristics of sentiment analysis into account, which can lead to semantic inconsistency, sentiment bias, and over-generation in the augmented data. To address these problems, a multi-stage data augmentation strategy based on the ChatGLM model is proposed specifically for sentiment analysis. The strategy first applies simple word-level augmentation using EDA methods, then filters the generated data with a sentiment lexicon, and finally augments the filtered data at the sentence level using the ChatGLM model. Experimental results show that, compared with the best-performing traditional data augmentation method, this method improves accuracy by 1.9%, 2.1%, and 2.2% on three different datasets, confirming its effectiveness for limited-sample sentiment analysis.
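The three stages named in the abstract (word-level EDA, lexicon-based filtering, sentence-level rewriting) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the toy `SYNONYMS` and `LEXICON` tables, the rule of keeping only variants whose lexicon polarity matches the label, and the function names are all assumptions made for illustration. The ChatGLM stage is represented by a caller-supplied `rewrite` function, since the prompt and model call used in the paper are not stated here.

```python
import random
from typing import Callable, Dict, List

# Toy resources invented for this illustration (assumptions, not from the paper).
SYNONYMS: Dict[str, List[str]] = {"good": ["great", "fine"], "movie": ["film"]}
LEXICON: Dict[str, int] = {"good": 1, "great": 1, "fine": 1, "bad": -1, "awful": -1}


def eda_variants(tokens: List[str], rng: random.Random, n: int = 4) -> List[List[str]]:
    """Stage 1: word-level augmentation with three EDA-style operations."""
    variants = []
    for _ in range(n):
        out = list(tokens)
        op = rng.choice(["synonym", "swap", "delete"])
        if op == "synonym":
            # Replace one word that has an entry in the toy synonym table.
            idxs = [i for i, t in enumerate(out) if t in SYNONYMS]
            if idxs:
                i = rng.choice(idxs)
                out[i] = rng.choice(SYNONYMS[out[i]])
        elif op == "swap" and len(out) > 1:
            # Swap two randomly chosen positions.
            i, j = rng.sample(range(len(out)), 2)
            out[i], out[j] = out[j], out[i]
        elif op == "delete" and len(out) > 1:
            # Delete one randomly chosen word.
            del out[rng.randrange(len(out))]
        variants.append(out)
    return variants


def lexicon_polarity(tokens: List[str]) -> int:
    """Sign of the summed lexicon scores: 1, -1, or 0 if neutral or unknown."""
    score = sum(LEXICON.get(t, 0) for t in tokens)
    return (score > 0) - (score < 0)


def augment(text: str, label: int, rewrite: Callable[[str], str], seed: int = 0) -> List[str]:
    """Run the three stages; `label` is 1 (positive) or -1 (negative)."""
    rng = random.Random(seed)
    candidates = eda_variants(text.split(), rng)
    # Stage 2 (assumed rule): drop variants whose lexicon polarity disagrees with the label.
    kept = [v for v in candidates if lexicon_polarity(v) == label]
    # Stage 3: sentence-level rewriting; `rewrite` stands in for a ChatGLM call.
    return [rewrite(" ".join(v)) for v in kept]
```

Calling `augment("a good movie", 1, rewrite=lambda s: s)` runs the first two stages with an identity rewriter. In a real setting, `rewrite` would wrap a call to a ChatGLM model that paraphrases the sentence while preserving its sentiment.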
Keywords
few-shot learning/sentiment analysis/data augmentation/pre-trained models/natural language processing
Classification
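Information Technology and Security Science
Cite this article
高新周, 叶宁, 徐康, 王甦, 王汝传. Data Augmentation Method for Sentiment Analysis Based on ChatGLM [J]. Software Guide, 2024, 23(12): 36-43, 8.
Funding
Jiangsu Province Major Science and Technology Demonstration Project (Social Development) (BE2022798)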