A Divergence-Based Core Dataset Selection Algorithm (基于分歧的核心数据集筛选算法)

王纵驰, 刘健, 王培, 赵兴博, 于佳耕, 陶青川

Computer & Digital Engineering (计算机与数字工程), 2024, Vol. 52, Issue 5: 1304-1309, 1316, 7. DOI: 10.3969/j.issn.1672-9722.2024.05.008

An Efficient Core-set Selection Algorithm Based on Difference

王纵驰¹, 刘健², 王培², 赵兴博³, 于佳耕⁴, 陶青川³

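Author information

1. China National Aviation Fuel Group Limited, Beijing 100088
2. Aerospace Shenzhou Smart System Technology Co., Ltd., Beijing 100029
3. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065
4. Institute of Software, Chinese Academy of Sciences, Beijing 100190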

Abstract

With the development of deep learning, datasets are growing at an unprecedented rate, which makes training inefficient. It is therefore often necessary to reduce the original dataset while preserving a similar training effect. To this end, a core-set selection algorithm based on divergence is proposed. The algorithm learns iteratively in a supervised manner, computes a divergence value for each data sample through a voting network framework, and then ranks the samples by these values to select the core set. Core-set selection experiments are carried out on the CIFAR, Fashion-MNIST and SVHN datasets. The results show that the proposed algorithm obtains a core set one fifth the size of the original dataset, while the accuracy of the trained model drops by less than 5%. In addition, the generalization error of the core dataset is only 0.13, which makes it more widely applicable.
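
The rank-and-select step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the divergence value is computed or in which direction the samples are ranked, so the sketch below makes two assumptions: it uses vote entropy, a standard disagreement measure from query-by-committee active learning, as a stand-in for the divergence value, and it keeps the samples with the highest scores. The names `vote_entropy`, `select_core_set` and the example votes are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement for one sample: entropy of the voters' predicted labels.

    Assumption: vote entropy stands in for the paper's divergence value.
    """
    total = len(votes)
    counts = Counter(votes)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def select_core_set(committee_votes, fraction=0.2):
    """Return (indices of the selected samples, per-sample scores).

    committee_votes[i] holds the labels predicted for sample i by each voter.
    Assumption: samples with the highest disagreement are kept.
    """
    scores = [vote_entropy(v) for v in committee_votes]
    keep = max(1, round(fraction * len(scores)))
    # Sort by descending score; the stable sort keeps ties in original order.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return ranked[:keep], scores


# Hypothetical votes from three voters on five samples.
votes = [
    [0, 0, 0],  # full agreement
    [0, 1, 2],  # full disagreement
    [1, 1, 0],  # partial disagreement
    [2, 2, 2],  # full agreement
    [0, 0, 0],  # full agreement
]
core, scores = select_core_set(votes, fraction=0.4)
print(core)
```

In the iterative setting the abstract describes, a step like this would presumably be repeated after each round of retraining the voters; the default `fraction=0.2` mirrors the one-fifth core-set size reported in the abstract.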

Key words

convolutional neural network / core-set selection / supervised learning / active learning

Classification

Information technology and security science

Cite this article

王纵驰, 刘健, 王培, 赵兴博, 于佳耕, 陶青川. 基于分歧的核心数据集筛选算法[J]. 计算机与数字工程, 2024, 52(5): 1304-1309, 1316, 7.

Computer & Digital Engineering (计算机与数字工程)

OA, CSTPCD

ISSN: 1672-9722
