
Journal of Henan Agricultural University, 2025, Vol. 59, Issue 3: 516-527 (12 pages). DOI: 10.16445/j.cnki.1000-2340.20241009.001

Influence of dataset partitioning and spectral pre-processing methods on the near infrared quantitative model of chemical ingredients in tobacco leaves

Fu Bo¹, Yang Yongfeng², Liu Xiangzhen², Niu Yangyang², Liu Maolin², Zhao Sensen², Yu Jianjun³, Peng Guixin², Ji Xiaoming³

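Author information

  • 1. College of Tobacco Science, Henan Agricultural University, Zhengzhou 450002, Henan, China; Technology Center, China Tobacco Henan Industrial Co., Ltd., Zhengzhou 450016, Henan, China
  • 2. Technology Center, China Tobacco Henan Industrial Co., Ltd., Zhengzhou 450016, Henan, China
  • 3. College of Tobacco Science, Henan Agricultural University, Zhengzhou 450002, Henan, China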

Abstract

[Objective] This study aimed to identify suitable dataset partitioning methods, partition proportions, and spectral pre-processing methods for model construction, laying a foundation for accurate and stable analysis models of chemical components in tobacco leaves.

[Method] A total of 210 tobacco leaf samples were analyzed for total sugar, reducing sugar, nitrogen, nicotine, potassium, and chlorine content, and the spectral data of these samples were collected. The study examined how four partitioning methods, namely random sampling (RS), uniform-level sampling (LS), sample set partitioning based on joint x-y distances (SPXY), and Kennard-Stone (KS), together with individual and combined spectral pre-processing methods, affect the prediction accuracy of partial least squares (PLS) quantitative models of conventional chemical components in tobacco leaves.

[Result] The calibration set and prediction set were more evenly distributed when the dataset was partitioned by SPXY. Models built with a prediction set proportion of 24% had stronger predictive ability. The optimal pre-processing combination for the total sugar and chlorine models was multiplicative scatter correction (MSC) + moving average smoothing (MA) + wavelet transform (WAVE), with rp values of 0.9840 and 0.9860, respectively. The optimal combination for the reducing sugar and nicotine models was max-min scaling (MAXMIN) + MSC + WAVE, with rp values of 0.9900 and 0.9852, respectively. The optimal combination for potassium was MSC + WAVE (rp = 0.9694). For nitrogen, however, the model built on raw spectral data had the strongest predictive ability (rp = 0.9709).

[Conclusion] The accuracy of near infrared quantitative models for conventional chemical components in tobacco leaves improved significantly after dataset partitioning and pre-processing were optimized. The results provide a reference for constructing near infrared quantitative models of other chemical components in tobacco leaves.
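The abstract names SPXY partitioning and MSC pre-processing without describing how they are computed. The block below is a minimal sketch of the commonly described forms of both techniques, not the authors' implementation. The function names (`pairwise_dist`, `spxy_split`, `msc`), the rounding used to turn the 24% proportion into a sample count, and the choice of the mean spectrum as the MSC reference are all assumptions made for illustration.

```python
import numpy as np


def pairwise_dist(a):
    """Euclidean distance matrix between the rows of a 2-D array."""
    sq = np.sum(a ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (a @ a.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off


def spxy_split(X, y, pred_fraction=0.24):
    """Hypothetical helper: SPXY-style split into calibration/prediction indices.

    Distances in spectral space (X) and in reference-value space (y) are each
    scaled by their maximum and summed; samples are then picked with the
    Kennard-Stone max-min rule. Picked samples form the calibration set.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    n = len(X)
    n_cal = n - int(round(n * pred_fraction))  # assumption: simple rounding

    dx = pairwise_dist(X)
    dy = pairwise_dist(y)
    d = dx / dx.max() + dy / dy.max()

    # Start from the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]

    while len(selected) < n_cal:
        # For each remaining sample, distance to its nearest selected sample.
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)

    return np.array(selected), np.array(remaining)


def msc(X, reference=None):
    """Hypothetical helper: multiplicative scatter correction.

    Each spectrum is regressed on a reference spectrum (assumed here to be
    the mean spectrum) and corrected as (spectrum - intercept) / slope.
    """
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    out = np.empty_like(X)
    for k, spectrum in enumerate(X):
        slope, intercept = np.polyfit(ref, spectrum, 1)
        out[k] = (spectrum - intercept) / slope
    return out


if __name__ == "__main__":
    # Synthetic stand-in data only; these are not the study's spectra.
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(210, 50))
    y_demo = rng.normal(size=210)
    cal_idx, pred_idx = spxy_split(X_demo, y_demo, pred_fraction=0.24)
    print(len(cal_idx), len(pred_idx))
```

In a real workflow the reference spectrum for MSC would usually be taken from the calibration set only and reused for the prediction set, so that the prediction samples do not influence the correction.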

Key words

tobacco/near infrared spectroscopy/dataset partitioning/data pre-processing/quantitative model

Classification

Light industry

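Cite this article

Fu Bo, Yang Yongfeng, Liu Xiangzhen, Niu Yangyang, Liu Maolin, Zhao Sensen, Yu Jianjun, Peng Guixin, Ji Xiaoming. Influence of dataset partitioning and spectral pre-processing methods on the near infrared quantitative model of chemical ingredients in tobacco leaves [J]. Journal of Henan Agricultural University, 2025, 59(3): 516-527.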

Funding

Henan Province Science and Technology Research Project (232102110168)

Science and Technology Project of China Tobacco Henan Industrial Co., Ltd. (C202023)

Journal of Henan Agricultural University

Open access (OA); Peking University Core Journal

ISSN 1000-2340
