
SVM Sample Reduction Algorithm Based on Fuzzy Membership Functions

Open access (OA); indexed in CSTPCD

Abstract

Support vector machines (SVMs) generalize well, but their learning efficiency drops significantly as the number of training samples grows. On large-scale training sets, a traditional SVM solved with standard optimization methods suffers from excessive memory requirements and slow training. To alleviate this problem, and because different data points contribute differently to the decision plane, this paper computes a fuzzy membership for each sample with a fuzzy membership function, uses that membership to assess the sample's importance, and removes the samples with low membership values. Based on three different fuzzy membership functions, three SVM sample reduction algorithms are proposed, using class center distance, kernel target alignment, and centered kernel alignment, respectively. Extensive comparative experiments on UCI (University of California, Irvine) and Kaggle data sets against the traditional SVM and the recently proposed Newton-type Sparse SVM (NSSVM) validate the advantages of the proposed algorithms in terms of accuracy, F-measure, and Hinge loss. For example, the algorithm based on the centered kernel alignment fuzzy membership function achieves the highest accuracy and F-measure and the smallest Hinge loss on the diabetes data set. Compared with the SVM, accuracy and F-measure increase by 13.71% and 9.55%, respectively, and the Hinge loss decreases by 3.28%; compared with NSSVM, accuracy and F-measure increase by 24.54% and 9.38%, respectively, and the Hinge loss decreases by 21.54%.
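The abstract does not give the membership formulas, so the following is only a minimal sketch of the class-center-distance variant, assuming the commonly used form s_i = 1 - d_i / (r + delta), where d_i is the distance from sample i to its class center and r is the class radius. The function names, the `keep_ratio` parameter, and the global ranking rule are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from collections import defaultdict


def class_center_memberships(X, y, delta=1e-6):
    """Assumed membership: s_i = 1 - d_i / (r_c + delta).

    d_i is the Euclidean distance from sample i to the mean of its class,
    r_c is the largest such distance within that class, and delta is a small
    positive constant that keeps every membership strictly positive.
    """
    # Group sample indices by class label.
    groups = defaultdict(list)
    for i, label in enumerate(y):
        groups[label].append(i)

    memberships = [0.0] * len(X)
    for idx in groups.values():
        dim = len(X[idx[0]])
        # Class center = coordinate-wise mean of the class samples.
        center = [sum(X[i][k] for i in idx) / len(idx) for k in range(dim)]
        dists = {i: math.dist(X[i], center) for i in idx}
        radius = max(dists.values())
        for i in idx:
            memberships[i] = 1.0 - dists[i] / (radius + delta)
    return memberships


def reduce_samples(X, y, keep_ratio=0.75, delta=1e-6):
    """Keep the keep_ratio fraction of samples with the highest membership.

    Returns the reduced samples, their labels, and the kept indices
    (in original order). keep_ratio is a hypothetical tuning parameter.
    """
    s = class_center_memberships(X, y, delta)
    n_keep = math.ceil(keep_ratio * len(X))
    ranked = sorted(range(len(X)), key=lambda i: s[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [X[i] for i in kept], [y[i] for i in kept], kept


if __name__ == "__main__":
    # Two classes on a line; the last point of each class is far from its center.
    X = [[0.0], [1.0], [2.0], [9.0], [20.0], [21.0], [22.0], [29.0]]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    X_red, y_red, kept = reduce_samples(X, y, keep_ratio=0.75)
    print(kept)
```

The reduced set `X_red`, `y_red` would then be passed to any standard SVM solver in place of the full training set. The kernel-target-alignment and centered-kernel-alignment variants would replace only the membership function, which is not reproduced here because its exact form is not stated in the abstract.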

Zhang Daili (张代俐); Wang Tinghua (汪廷华); Zhu Xinglin (朱兴淋)

School of Mathematics and Computer Science, Gannan Normal University, Ganzhou, Jiangxi 341000, China

Computer Science and Automation

Keywords: machine learning; support vector machine (SVM); sample reduction; fuzzy membership function

Journal of Shanxi University (Natural Science Edition), 2024 (001)

Research on Multiple Kernel Learning Based on Nonlinear Lasso and Its Integration with Deep Learning

Pages 18-29 (12 pages)

Funding: National Natural Science Foundation of China (61966002); Jiangxi Provincial Graduate Innovation Special Fund (YC2022-s944)

DOI: 10.13451/j.sxu.ns.2023138
