Tobacco Science & Technology (烟草科技), 2025, Vol. 58, Issue (8): 19-27, 65, 10. DOI: 10.16135/j.issn1002-0861.2025.0236
Construction of an origin discrimination model for flue-cured tobacco from imbalanced multi-regions based on a Support Vector Machine with hybrid kernel algorithm
Abstract
To establish a robust and accurate model for discriminating the origin of flue-cured tobacco samples from imbalanced multi-regions, tobacco strip samples aged for three years were selected from a specific company. The growing areas of these samples cover 12 domestic and international regions. The contents of 68 chemical components, the pH values, and the dichloromethane extract yield of the samples were obtained by rapid near-infrared analysis of chemical components. The Particle Swarm Optimization (PSO) algorithm was used to optimize the parameters of the Support Vector Machine (SVM) kernels in order to construct the imbalanced multi-region flue-cured tobacco origin discrimination model. This model was then compared and evaluated against Backpropagation Neural Network (BPNN), Random Forest (RF), and Fisher Discriminant Analysis (FDA) models. The results showed that: 1) The origin discrimination model based on the SVM with hybrid kernel algorithm effectively learned key features and achieved high discrimination accuracy for samples from different regions. The overall discrimination accuracies on the training set and test set reached 99.69% and 99.59%, respectively. 2) Compared with the BPNN, RF, and FDA models, the overall discrimination accuracy of the SVM with hybrid kernel model on the test set was higher by 4.55, 6.20, and 6.61 percentage points, respectively. 3) When predicting samples from 12 regions with highly imbalanced sample counts, the macro recall, macro precision, and macro F1 score of the SVM with hybrid kernel model were 0.9951, 0.9985, and 0.9968, respectively. Compared with the BPNN, RF, and FDA models, the macro recall increased by 0.2991, 0.3264, and 0.4065; the macro precision increased by 0.3476, 0.2913, and 0.4124; and the macro F1 score increased by 0.3241, 0.3094, and 0.4095, respectively. The flue-cured tobacco origin discrimination model based on the SVM with hybrid kernel algorithm outperformed the BPNN, RF, and FDA models when discriminating tobacco samples from imbalanced multi-regions.
Key words
Tobacco leaf / Chemical component / Imbalanced multi-region / Origin discrimination / Model construction / Support Vector Machine with hybrid kernel algorithm
Classification
Light industry and textiles
Cite this article
寇冉冉, 郭军伟, 王洪波, 赵乐, 付瑜锋, 许衡, 刘泽春, 聂聪, 王聪, 杨松, 苏明亮, 宛然, 郭榕, 张建平, 李庆祥, 毕一鸣. Construction of an origin discrimination model for flue-cured tobacco from imbalanced multi-regions based on a Support Vector Machine with hybrid kernel algorithm [J]. Tobacco Science & Technology (烟草科技), 2025, 58(8): 19-27, 65, 10.
Funding
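Science and technology project of China Tobacco Fujian Industrial Co., Ltd., "Research on Digital-Assisted Design and Maintenance Technology for Cigarettes" (No. 2022-12)
Science and technology project of China Tobacco Zhejiang Industrial Co., Ltd., "Research on a Digital Characterization Model of Tobacco Leaf Style Based on Chemical Components" (ZJZY2023C022)
Major science and technology project of China National Tobacco Corporation, "Product-Design-Oriented Multidimensional Digital Characterization and Application of Tobacco Leaf Raw Material Quality" [110202401030(SZ-04)]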
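
Illustrative sketch

The abstract names two technical ingredients without giving their formulas: an SVM hybrid kernel and macro-averaged recall, precision, and F1 for imbalanced classes. The sketch below shows one common way to realize each, purely as an illustration. It assumes the hybrid kernel is a convex combination of an RBF kernel and a polynomial kernel, which the abstract does not confirm. The names `hybrid_kernel` and `macro_scores` and all parameter values (`weight`, `gamma`, `degree`, `coef0`) are hypothetical, and fixed values stand in for the parameters that the paper tunes with PSO. Macro F1 is taken here as the mean of per-class F1 scores; some papers instead use the harmonic mean of macro precision and macro recall.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||a - b||^2) for every row pair."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def poly_kernel(A, B, degree, coef0):
    """Polynomial kernel: (a . b + coef0) ** degree for every row pair."""
    return (A @ B.T + coef0) ** degree


def hybrid_kernel(A, B, weight=0.7, gamma=0.05, degree=2, coef0=1.0):
    """Assumed hybrid kernel: weight * RBF + (1 - weight) * polynomial.

    With 0 <= weight <= 1 and coef0 >= 0 this is a non-negative combination
    of two valid kernels, so the result is itself a valid kernel.
    """
    return weight * rbf_kernel(A, B, gamma) + (1.0 - weight) * poly_kernel(A, B, degree, coef0)


def macro_scores(y_true, y_pred):
    """Macro recall, precision, and F1: unweighted means over classes."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls, precisions, f1s = [], [], []
    for label in np.unique(np.concatenate([y_true, y_pred])):
        tp = int(np.sum((y_true == label) & (y_pred == label)))
        fn = int(np.sum((y_true == label) & (y_pred != label)))
        fp = int(np.sum((y_true != label) & (y_pred == label)))
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f1s.append(f1)
    return {
        "recall": float(np.mean(recalls)),
        "precision": float(np.mean(precisions)),
        "f1": float(np.mean(f1s)),
    }


if __name__ == "__main__":
    # A classifier that ignores a 5% minority class is 95% accurate,
    # yet its macro recall is only 0.5 because every class counts equally.
    y_true = [0] * 95 + [1] * 5
    y_pred = [0] * 100
    print(macro_scores(y_true, y_pred))
```

A kernel matrix built this way could be handed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. The toy case in the `__main__` block shows why macro-averaged metrics are reported alongside overall accuracy when regional sample counts are highly imbalanced.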