News Topic Text Classification Based on RoBERTa-RCNN and Attention Pooling

Journal of Zhengzhou University (Natural Science Edition), 2024, Vol. 56, Issue 2: 43-50, 8. DOI: 10.13705/j.issn.1671-6841.2022168

Wang Qian (1), Zeng Cheng (2), He Peng (2), Zhang Haifeng (1), Yu Xinyan (1)

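Author information

  • 1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, Hubei
  • 2. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, Hubei; Hubei Province Engineering Technology Research Center for Software Engineering, Wuhan 430062, Hubei; Hubei Engineering Research Center for Smart Government Affairs and Artificial Intelligence Application, Wuhan 430062, Hubei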

Abstract

To address the semantic ambiguity and poorly standardized wording caused by the lack of context in Chinese news topics, a news topic text classification method based on RoBERTa-RCNN and a multi-head attention pooling mechanism was proposed. A data augmentation technique was used to back-translate part of the training data. The autoencoding pre-trained model and RCNN were used to extract preliminary and deep text features, and the idea of multi-head attention was applied to improve the max pooling layer. This fusion mechanism remedied a weakness of RCNN, whose max pooling strategy was limited to a single fixed operation and could not be dynamically optimized. Experiments were conducted on three news topic datasets, with the Mish function, which was better suited to news topic classification, used in place of the ReLU function. Label smoothing was used to mitigate overfitting. The results showed that the proposed method was more effective than traditional classification methods, and ablation experiments verified the feasibility of the model for the classification task.
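The abstract names three components with standard definitions: the Mish activation, label smoothing, and attention-weighted pooling as an alternative to max pooling. The sketch below illustrates each in plain Python so the formulas are concrete. It is not the authors' implementation. The function names, the smoothing value `eps=0.1`, and the use of a single `query` vector (one head, standing in for a learned parameter) are assumptions made for illustration; the paper uses multiple heads and does not give these details in the abstract.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^{-|x|})
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(target: int, num_classes: int, eps: float = 0.1):
    """Label smoothing: (1 - eps) * one_hot + eps / num_classes."""
    base = eps / num_classes
    return [base + (1.0 - eps if k == target else 0.0)
            for k in range(num_classes)]


def cross_entropy(logits, target_dist):
    """Cross-entropy between softmax(logits) and a target distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs))


def max_pool(hidden):
    """Element-wise max over time steps (the fixed strategy in plain RCNN)."""
    return [max(column) for column in zip(*hidden)]


def attention_pool(hidden, query):
    """Single-head attention pooling (assumed form, not the paper's exact one).

    Each time step h_t is scored by its dot product with `query`; the
    output is the softmax-weighted sum of the time steps.
    """
    scores = [sum(h * q for h, q in zip(step, query)) for step in hidden]
    weights = softmax(scores)
    return [sum(w * step[j] for w, step in zip(weights, hidden))
            for j in range(len(query))]


# Toy sequence: two time steps, two features each.
hidden_states = [[1.0, 5.0], [3.0, 2.0]]
pooled_max = max_pool(hidden_states)
pooled_att = attention_pool(hidden_states, query=[0.0, 0.0])
smoothed = smooth_labels(target=1, num_classes=4, eps=0.1)
loss = cross_entropy([0.2, 1.5, -0.3, 0.1], smoothed)
```

With an all-zero query every time step receives the same weight, so attention pooling reduces to mean pooling. A trained query shifts weight toward informative positions, which is the sense in which such a layer can adapt where max pooling cannot.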

Key words

pre-trained language model / text classification / recurrent convolutional neural network / attention mechanism / label smoothing / data augmentation

Classification

Information Technology and Security Science

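Cite this article

Wang Qian, Zeng Cheng, He Peng, Zhang Haifeng, Yu Xinyan. News topic text classification based on RoBERTa-RCNN and attention pooling[J]. Journal of Zhengzhou University (Natural Science Edition), 2024, 56(2): 43-50, 8.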

Funding

National Natural Science Foundation of China (61977021, 61902114)

Key Research and Development Program of Hubei Province (2021BAA184)

Open access; indexed in the Peking University Core Journals list and CSTPCD

ISSN 1671-6841
