Communication & Information Technology, Issue (5): 63-67, 5.
Research on a hierarchical multi-label classification method based on ELMo-BERT for complaint and appeal (控申) legal documents
Abstract
At present, the complaints and appeals (控申) departments of procuratorates carry a heavy workload in case triage, and the routing of complaint and appeal legal documents relies solely on manual identification, which makes service to citizens inefficient. Because the text data involved in this business is complex in type and unevenly distributed, extracting features from these legal documents with a large model such as BERT alone loses information and does not achieve satisfactory classification results. To address this problem, this paper proposes a multi-label text classification model based on ELMo-BERT, which uses the ELMo and BERT models to extract word-vector and sentence-vector features of the text, respectively. The hierarchical labels are encoded with Graphormer to obtain feature vectors that carry label information. Finally, feature fusion is performed to prevent key information from being lost during information extraction. After adding the ELMo module, the accuracy, recall, Micro-F1, and Macro-F1 of the model increase by 3.33%, 1.78%, 2.48%, and 3.58%, respectively, which indicates that the ELMo module allows the semantic information of the legal documents to be extracted more comprehensively than single feature extraction. Comparative experiments on a self-built dataset show that the Micro-F1 and Macro-F1 of the ELMo-BERT model are 79.74% and 69.85%, respectively, higher than those of other mainstream models, so multi-scale feature extraction with the ELMo-BERT model yields better classification than single feature extraction.
Keywords
Complaint and appeal (控申) business / Multi-label text classification / BERT model / ELMo model / Multi-scale feature extraction / Hierarchical multi-label
Classification
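Information Technology and Security Science
Cite this article
CHEN Lulu, CHEN Liang, WANG Junlin. Research on a hierarchical multi-label classification method based on ELMo-BERT for complaint and appeal (控申) legal documents [J]. Communication & Information Technology, 2025, (5): 63-67, 5.
Funding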
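Youth Project of the Basic Scientific Research Program for Higher Education Institutions, Department of Education of Liaoning Province (Project No. 1030040000668); High-level Talent Introduction Project of Shenyang Ligong University (Project No. 1010147001228)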