
Multi-label Classification Model of Chinese Short Texts Based on Deep Learning

Cao Zhen (曹珍) 1, Guo Panfeng (郭攀峰) 1

Computer & Digital Engineering, 2024, Vol. 52, Issue 6: 1809-1814 (6 pages). DOI: 10.3969/j.issn.1672-9722.2024.06.036

Author information

  • 1. Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074

Abstract

Conventional multi-label classification algorithms struggle to distinguish short Chinese texts effectively because such texts are brief, structurally diverse, and lacking in context. To address these problems, this paper proposes CRC-MHA, a deep-learning-based multi-label classification model for Chinese short texts. In the text representation layer, CRC-MHA abandons the conventional static word embeddings of Word2vec and instead uses BERT to produce dynamic word embeddings for the input sentence; drawing on massive pre-training text, it can better characterize the contextual semantics of the text. In the feature extraction layer, the model adopts a parallel feature extraction strategy that combines CNN, RCNN, and a multi-head self-attention mechanism, which strengthens the capture of key features within the text and improves multi-label classification. Experimental results show that the weighted-average F1 score of the CRC-MHA model is 1.95% higher than that of the BERT model, 0.42% higher than that of the BERT-CNN model, and 0.34% higher than that of the BERT-RCNN model, which verifies the effectiveness of the model.
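The abstract reports results as a weighted-average F1 score but does not spell out the formula. The sketch below shows one common definition for multi-label data, assumed here rather than taken from the paper: compute F1 separately for each label, then average the per-label scores weighted by each label's support (its number of true instances). The function names `per_label_f1` and `weighted_f1` and the toy label matrices are hypothetical, introduced only for illustration.

```python
from typing import Sequence


def per_label_f1(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]],
                 label: int) -> float:
    """F1 for one label column of binary indicator matrices."""
    tp = fp = fn = 0
    for truth_row, pred_row in zip(y_true, y_pred):
        t, p = truth_row[label], pred_row[label]
        if t and p:
            tp += 1          # label present and predicted
        elif p and not t:
            fp += 1          # predicted but not present
        elif t and not p:
            fn += 1          # present but missed
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the label never occurs
    # in either the truth or the predictions.
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true: Sequence[Sequence[int]],
                y_pred: Sequence[Sequence[int]]) -> float:
    """Support-weighted mean of per-label F1 scores."""
    n_labels = len(y_true[0])
    supports = [sum(row[j] for row in y_true) for j in range(n_labels)]
    total = sum(supports)
    if total == 0:
        return 0.0
    return sum(per_label_f1(y_true, y_pred, j) * supports[j]
               for j in range(n_labels)) / total


# Toy example (hypothetical data): 4 short texts, 3 candidate labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
score = weighted_f1(y_true, y_pred)
```

Weighting by support means frequent labels dominate the score, which is why this average is often preferred over an unweighted (macro) mean when label frequencies are imbalanced.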

Key words

multi-label classification / Chinese short text / dynamic word embedding / feature extraction

Classification

Information Technology and Security Science

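Cite this article

Cao Zhen, Guo Panfeng. Multi-label Classification Model of Chinese Short Texts Based on Deep Learning [J]. Computer & Digital Engineering, 2024, 52(6): 1809-1814.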

Computer & Digital Engineering

OA / CSTPCD

ISSN 1672-9722
