
Near Infrared Spectral Analysis Based on Data Augmentation Strategy and Convolutional Neural Network

Zheng Yun (郑运) 1, Yang Siyu (杨思雨) 1, Wang Tao (王涛) 1, Deng Zhuowen (邓焯文) 1, Lan Weijie (兰维杰) 2, Yun Yonghuan (云永欢) 1, Pan Leiqing (潘磊庆) 2

分析化学 (Chinese Journal of Analytical Chemistry), 2024, Vol. 52, Issue 9: 1266-1276 (11 pages). DOI: 10.19756/j.issn.0253-3820.241155

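Author information

  • 1. School of Food Science and Engineering, Hainan University, Haikou 570228
  • 2. College of Food Science and Technology, Nanjing Agricultural University, Nanjing 430000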

Abstract

Near infrared spectroscopy (NIRS) combined with chemometric algorithms has been widely used for quantitative and qualitative analysis of food and medicine. However, traditional chemometric methods, especially linear classification methods, often yield unsatisfactory results on multi-class classification problems. Convolutional neural networks (CNNs) are adept at extracting deep-level features from data and are well suited to non-linear relationships. The modeling performance of a CNN depends on the size and diversity of the sample set, while collecting and preprocessing NIRS sample data is often time-consuming and labor-intensive. This study proposed a NIRS qualitative analysis method based on data augmentation strategies and a CNN. The data augmentation strategy comprised two steps. First, Bootstrap resampling and a generative adversarial network (GAN) were applied to augment three NIRS datasets (medicine, coffee and grape). Second, the original samples (Y) were combined with the Bootstrap-augmented samples (B) and the GAN-augmented samples (G) to obtain three augmented datasets (Y-B, Y-G and Y-B-G). On this basis, a CNN model structure suitable for these datasets was designed, consisting of two one-dimensional convolutional layers, one max-pooling layer, and one fully connected layer. The results showed that, compared with the optimal models of partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), and back propagation neural network (BP), the CNN model based on the Y-B dataset improved average accuracy by 3.998%, 9.364%, and 4.689%, respectively, for medicine (binary classification); the CNN model based on the Y-B-G dataset improved average accuracy by 6.001%, 2.004%, and 7.523%, respectively, for coffee (7-class classification); and the CNN model based on the Y-B dataset improved average accuracy by 33.408%, 51.994%, and 34.378%, respectively, for grapes (20-class classification). The models built on data augmentation strategies and a CNN thus demonstrated better classification accuracy and generalization performance across different datasets and numbers of classes.
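
The two-step augmentation described in the abstract (Bootstrap resampling, then merging original and augmented samples into a dataset such as Y-B) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say how the Bootstrap samples are formed, so the sketch assumes that each new spectrum is the point-wise mean of spectra drawn with replacement from a single class. The names `bootstrap_augment`, `combine`, `n_new_per_class` and `draw_size` are hypothetical, the toy spectra are made up, and the GAN step is omitted.

```python
import random
from collections import defaultdict


def bootstrap_augment(spectra, labels, n_new_per_class, draw_size=None, seed=0):
    """Create synthetic spectra by class-wise bootstrap resampling.

    Assumption (not from the paper): each new spectrum is the point-wise
    mean of `draw_size` spectra drawn with replacement from one class.
    """
    rng = random.Random(seed)

    # Group the original spectra by class label.
    by_class = defaultdict(list)
    for spectrum, label in zip(spectra, labels):
        by_class[label].append(spectrum)

    new_spectra, new_labels = [], []
    for label, members in by_class.items():
        k = draw_size or len(members)
        for _ in range(n_new_per_class):
            # Draw k spectra with replacement from this class.
            draw = [rng.choice(members) for _ in range(k)]
            # Average the draw wavelength by wavelength.
            mean = [sum(column) / k for column in zip(*draw)]
            new_spectra.append(mean)
            new_labels.append(label)
    return new_spectra, new_labels


def combine(*datasets):
    """Concatenate (spectra, labels) pairs, e.g. Y and B into Y-B."""
    spectra, labels = [], []
    for part_spectra, part_labels in datasets:
        spectra.extend(part_spectra)
        labels.extend(part_labels)
    return spectra, labels


# Toy data: 2 classes, 3 spectra each, 4 wavelengths (values are made up).
Y = (
    [
        [0.10, 0.20, 0.30, 0.40],
        [0.12, 0.22, 0.28, 0.41],
        [0.09, 0.19, 0.31, 0.39],
        [0.50, 0.40, 0.30, 0.20],
        [0.52, 0.41, 0.29, 0.18],
        [0.49, 0.38, 0.31, 0.21],
    ],
    [0, 0, 0, 1, 1, 1],
)
B = bootstrap_augment(*Y, n_new_per_class=5)
Y_B = combine(Y, B)
print(len(Y_B[0]), "spectra in Y-B")  # 6 original + 10 augmented = 16
```

Averaging a resampled draw keeps every synthetic spectrum inside the per-wavelength range of its class, which is one reason this variant is a conservative choice. A Y-G or Y-B-G dataset would be assembled the same way by passing GAN-generated samples to `combine`.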

Key words

Data augmentation/Near infrared spectroscopy/Convolutional neural network/Chemometrics

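Cite this article

Zheng Yun, Yang Siyu, Wang Tao, Deng Zhuowen, Lan Weijie, Yun Yonghuan, Pan Leiqing. Near Infrared Spectral Analysis Based on Data Augmentation Strategy and Convolutional Neural Network [J]. 分析化学 (Chinese Journal of Analytical Chemistry), 2024, 52(9): 1266-1276.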

Funding

Supported by the Key Research and Development Project of Hainan Province (No. ZDYF2024XDNY197), the Natural Science Foundation of Hainan Province (Nos. 323QN202, 322CXTD523), the National Natural Science Foundation of China (No. 22164008), and the Innovation Center Platform for Academicians of Hainan Province.

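Journal: 分析化学 (Chinese Journal of Analytical Chemistry), ISSN 0253-3820. Labels: OA, 北大核心 (Peking University Core Journals), CSTPCD.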
