Risk Identification Method for News Public Opinion Driven by Prompt Learning

Indexing: OA, CSTPCD

Abstract

Identifying corporate risks from news reports makes it possible to quickly pinpoint the risk categories a company is exposed to, which helps the company respond in time. News public opinion risk identification is generally framed as a multi-class classification task over risk labels. Deep learning methods represented by BERT, which follow the pre-training + fine-tuning paradigm, perform strongly on text classification. However, labeled data in the news public opinion domain is scarce, which makes this a small-sample (few-shot) machine learning problem. The newer paradigm represented by prompt learning offers another route to improving small-sample classification, and existing studies show that it outperforms pre-training + fine-tuning on many tasks. Inspired by this work, the paper proposes a news public opinion risk identification method based on prompt learning. Building on a pre-trained BERT model, it designs prompt templates for news public opinion risk; after training with the MLM (masked language model) objective, the predicted labels are mapped to the existing risk labels through answer engineering. Experimental results show that, on small-sample subsets of different sizes drawn from the news public opinion datasets, training with prompt learning consistently outperforms training with fine-tuning.
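The pipeline described in the abstract (prompt template, masked-token prediction, answer engineering) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the template wording, the three risk labels, the answer words in `VERBALIZER`, and `toy_mlm`, a keyword-based stand-in for the scores a BERT masked language model would assign to candidate words at the `[MASK]` position.

```python
from typing import Callable, Dict, List

MASK = "[MASK]"

# Hypothetical prompt template; the paper's actual template is not given in the abstract.
TEMPLATE = "This news involves {mask} risk. {text}"

# Hypothetical verbalizer: each risk label is tied to a few answer words
# that the masked language model may predict at the [MASK] position.
VERBALIZER: Dict[str, List[str]] = {
    "financial risk": ["debt", "loss"],
    "legal risk": ["lawsuit", "penalty"],
    "operational risk": ["shutdown", "recall"],
}


def build_prompt(text: str) -> str:
    """Wrap a news text in the prompt template with a single mask slot."""
    return TEMPLATE.format(mask=MASK, text=text)


def toy_mlm(prompt: str) -> Dict[str, float]:
    """Stand-in for an MLM head (assumption): score each answer word
    1.0 if it occurs in the prompt, 0.1 otherwise."""
    lowered = prompt.lower()
    words = [w for ws in VERBALIZER.values() for w in ws]
    return {w: (1.0 if w in lowered else 0.1) for w in words}


def map_to_label(mask_scores: Dict[str, float],
                 verbalizer: Dict[str, List[str]] = VERBALIZER) -> str:
    """Answer engineering: average the scores of each label's answer words
    and return the label with the highest average."""
    def label_score(label: str) -> float:
        ws = verbalizer[label]
        return sum(mask_scores.get(w, 0.0) for w in ws) / len(ws)
    return max(verbalizer, key=label_score)


def classify(text: str,
             mlm: Callable[[str], Dict[str, float]] = toy_mlm) -> str:
    """Prompt -> masked-word scores -> risk label."""
    return map_to_label(mlm(build_prompt(text)))


print(classify("The company faces a lawsuit and a regulatory penalty."))
# prints "legal risk"
```

In a real setting, `toy_mlm` would be replaced by a trained masked language model that returns probabilities for the answer words at the mask position; the mapping step stays the same.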

Authors: ZENG Huiling (曾慧玲); LI Lin (李琳); LYU Siyang (吕思洋); HE Zheng (何铮)

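Affiliations:
School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070
School of Economics, Wuhan University of Technology, Wuhan 430070
Deloitte Consulting (Shanghai) Co., Ltd., Shanghai 510623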

Subject: Computing and Automation

Keywords: risk label; multi-class classification; pretrained model; prompt learning

Computer Engineering and Applications (《计算机工程与应用》), 2024 (001)

Pages 182-188 (7 pages)

Funding: Key Research and Development Program of Hubei Province (2021BAA030).

DOI: 10.3778/j.issn.1002-8331.2208-0295
