Software Guide (软件导刊), 2025, Vol. 24, Issue 5: 79-86, 8. DOI: 10.11907/rjdk.241141

Research on Text Classification Model Based on Improved BERT Ensemble Structure

Liu Zhenzong ¹, Wang Chaoqun ², Chen Le ², Tao Yonghui ³, Wang Dan ³

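Author information

1. School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201300
2. Aerospace Smart Energy Research Institute
3. Shanghai Aerospace Energy Co., Ltd., Shanghai 201100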

Abstract

To improve the extraction of contextual text features, capture semantic relationships between texts, and effectively fuse global and local information, a parallel BERT-Transformer-TextCNN text classification model is proposed. The model first passes the input text through BERT to obtain text feature vectors. A Transformer encoder layer then extracts global information from these vectors; L2 regularization, residual connections, and cosine similarity are introduced in the encoder layer to counter overfitting, vanishing gradients, and the influence of vector length. TextCNN extracts local information from the text feature vectors, with residual connections, He initialization, and an average pooling layer introduced to address vanishing gradients and insufficient use of information. Finally, the global and local information are combined, and the text is classified by a Softmax classifier to obtain the final result. Experimental results show that, compared with the traditional model on the THUCNews dataset, the improved model's accuracy increased by 12% and its F1 score by 8%. On the IMDB dataset, accuracy and F1 score increased by 13% and 8%, respectively, demonstrating the model's effectiveness in extracting global and local information and integrating semantic relationships.
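The parallel structure described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not say where cosine similarity is applied, so the sketch assumes it replaces the dot product as the attention score in the global branch. The local branch uses a fixed averaging window as a stand-in for learned TextCNN filters, and the token vectors stand in for BERT outputs. All function names, the window size, and the classifier weights are hypothetical.

```python
import math

def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between u and v; rescaling either vector leaves it unchanged."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def global_branch(tokens):
    """Toy self-attention: cosine scores (assumed), residual connection, mean over tokens."""
    dim = len(tokens[0])
    outputs = []
    for q in tokens:
        weights = softmax([cosine_similarity(q, k) for k in tokens])
        attended = [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        outputs.append([a + x for a, x in zip(attended, q)])  # residual connection
    return [sum(o[d] for o in outputs) / len(outputs) for d in range(dim)]

def local_branch(tokens, window=2):
    """Stand-in for TextCNN: uniform sliding-window kernel, then average pooling."""
    dim = len(tokens[0])
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    feats = [[sum(t[d] for t in w) / window for d in range(dim)] for w in windows]
    return [sum(f[d] for f in feats) / len(feats) for d in range(dim)]

def classify(tokens, weights, bias):
    """Concatenate global and local features, apply a linear layer and softmax."""
    fused = global_branch(tokens) + local_branch(tokens)  # list concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b for row, b in zip(weights, bias)]
    return softmax(logits)

# Usage: three 2-dimensional "token vectors" and a hypothetical 2-class linear layer.
tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
probs = classify(tokens, weights=[[0.1] * 4, [-0.1] * 4], bias=[0.0, 0.0])
print(probs)
```

Because the attention scores are cosines, scaling a token vector by a constant changes its contribution as a value but not how strongly other tokens attend to it, which is one way to read the abstract's remark about vector length.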

Keywords

text classification / feature extraction / Transformer / TextCNN

Classification

Information Technology and Security Science

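Cite this article

Liu Zhenzong, Wang Chaoqun, Chen Le, Tao Yonghui, Wang Dan. Research on Text Classification Model Based on Improved BERT Ensemble Structure [J]. Software Guide, 2025, 24(5): 79-86, 8.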

Software Guide

ISSN: 1672-7800
