Research on Text Classification Methods for Court Electronic Case Files

王霄 万玉晴

Computer Applications and Software (计算机应用与软件), 2024, Vol. 41, Issue 6: 101-107, 133, 8. DOI: 10.3969/j.issn.1000-386x.2024.06.015

TEXT CLASSIFICATION METHOD FOR COURT ELECTRONIC FILE

王霄¹, 万玉晴¹

Author information

  • 1. 太极计算机股份有限公司, Beijing 100102

Abstract

This paper addresses the main problems in classifying the text of court electronic case files. We propose a multi-dimensional semantic representation of case files to capture more accurate and comprehensive text features. A kernel extreme learning machine (KELM) with a Gaussian kernel is used as the text classifier, which yields a globally optimal solution while greatly improving training efficiency. A sequential optimization model, KOS-ELM, based on recursive least squares (RLS), iteratively updates the model parameters as new samples arrive, so that the classifier can learn online and depends less on the initial samples. Comparative experiments show that the accuracy of the Gaussian-kernel KELM classifier is 2.66 and 4.43 percentage points higher than that of the BP network model and LSSVM, respectively, while its training time is only 1/6 and 1/10 of theirs. With the multi-dimensional semantic representation as model input, accuracy is 8.84 and 2.33 percentage points higher than with the text-vector and word-vector representations, respectively. When the RLS-based KOS-ELM is used to iteratively optimize a weak classifier, classification accuracy improves significantly after 20 iterations with four different types of step size.
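
As a rough illustration of the Gaussian-kernel KELM classifier mentioned in the abstract, the sketch below uses the standard closed-form KELM solution, in which the output weights are obtained by solving (I/C + K) alpha = T for a kernel matrix K and one-hot targets T. It is not the authors' implementation: the class name KernelELM, the parameters C and gamma, their default values, and the one-hot target encoding are all assumptions made for illustration, and the abstract does not state the hyperparameters or features actually used.

```python
import numpy as np


def gaussian_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dists = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


class KernelELM:
    """Minimal kernel extreme learning machine (hypothetical helper class)."""

    def __init__(self, C=10.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # Gaussian kernel width parameter (assumed value)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T with one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        K = gaussian_kernel(X, X, self.gamma)
        # Closed-form solution: solve (I / C + K) alpha = T in a single step,
        # with no iterative gradient-based training.
        self.alpha_ = np.linalg.solve(np.eye(len(X)) / self.C + K, T)
        self.X_train_ = X
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        K = gaussian_kernel(X, self.X_train_, self.gamma)
        scores = K @ self.alpha_
        # Pick the class whose output score is largest.
        return self.classes_[np.argmax(scores, axis=1)]
```

In this sketch, training reduces to one linear solve over the kernel matrix, which is the property usually credited for KELM's short training time relative to iteratively trained networks. In practice the rows of X would be document feature vectors, such as the semantic representations the abstract describes.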

Keywords

Court electronic file; text classification; semantic representation; kernel extreme learning machine; recursive least squares

Classification

Information Technology and Security Science

Cite this article

王霄, 万玉晴. Text classification method for court electronic file [J]. Computer Applications and Software (计算机应用与软件), 2024, 41(6): 101-107, 133, 8.

Funding

National Key Research and Development Program of China (2018YFC0807700).

Computer Applications and Software (计算机应用与软件)

Open access (OA); Peking University Core Journal (北大核心); CSTPCD

ISSN 1000-386X
