Scientia Agricultura Sinica (中国农业科学), Issue (12): 2374-2383, 10. DOI: 10.3864/j.issn.0578-1752.2014.12.010
基于可见-近红外光谱变量选择的土壤全氮含量估测研究
Study on Soil Total N Estimation by Vis-NIR Spectra with Variable Selection
Abstract
[Objective] Variable selection, or feature selection, is a critical step in the analysis of visible-near infrared (Vis-NIR) spectral data. This study aimed to estimate soil total nitrogen (TN) content by building models on the absorption features of soil TN, using variable selection methods combined with Vis-NIR spectroscopy, and to provide a basis for the rapid estimation of soil TN content.
[Method] A total of 120 representative soil samples were collected from the typical red soil area of Ji’an County, Jiangxi Province. TN content and Vis-NIR spectra were measured in the laboratory. Several variable selection methods, including principal component analysis (PCA), uninformative variable elimination (UVE) and UVE coupled with the successive projections algorithm (SPA), were applied to the Vis-NIR data. Partial least squares regression (PLSR) with leave-one-out cross-validation, least squares-support vector machine (LS-SVM), back-propagation neural network (BPNN), and BPNN with thresholds and weights optimized by a genetic algorithm (GA-BPNN) were each combined with the different variable selection methods, and the resulting models were calibrated and validated using independent data sets.
[Result] UVE reduced the number of wavelengths from the original 200 to 59, some of which lay in the visible range and the rest in the overtone and combination region of the near-infrared range. Applying SPA to the wavelengths preselected by UVE further reduced them to only 5 for TN: 820, 940, 1040, 1060 and 1990 nm. Based on the 59 wavelengths, the LS-SVM models achieved competitive prediction performance compared with PLSR, BPNN and GA-BPNN, with a coefficient of determination (R²) of 0.7492, a root mean square error of prediction (RMSEp) of 0.2921 and a residual prediction deviation (RPD) of 1.8904 for soil TN. Based on the 5 wavelengths selected by UVE-SPA, the LS-SVM models achieved excellent prediction performance compared with PLSR, BPNN and GA-BPNN, with an R² of 0.7945, an RMSEp of 0.2499 and an RPD of 2.0009 for soil TN. In contrast, the LS-SVM, BPNN and GA-BPNN models based on 7 principal components were invalid.
[Conclusion] Overall, the results indicated that SPA was a powerful tool for variable selection, and that Vis-NIR spectroscopy combined with SPA-LS-SVM was successful for the accurate determination of soil TN.
Key words
soil total nitrogen/uninformative variable elimination (UVE)/successive projections algorithm (SPA)/partial least squares regression (PLSR)/least squares-support vector machine (LS-SVM)/back-propagation neural network optimized by genetic algorithm (GA-BPNN)
Cite this article
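YANG Meihua (杨梅花), ZHAO Xiaomin (赵小敏), FANG Qian (方倩), XIE Biyu (谢碧裕). Study on Soil Total N Estimation by Vis-NIR Spectra with Variable Selection [J]. Scientia Agricultura Sinica, 2014, (12): 2374-2383, 10.
Funding
National Natural Science Foundation of China (41361039); Open Fund of the State Key Laboratory of Soil and Sustainable Agriculture (0812201202); Natural Science Foundation of Jiangxi Province ()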
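Note on the reported metrics
The abstract reports R², RMSEp and RPD for the validation set but does not give their formulas. The sketch below shows one common way these three statistics are computed, as an aid to reading the numbers. It is an assumption, not the authors' procedure: R² is taken as 1 minus the ratio of residual to total sum of squares, RMSEp as the root of the mean squared residual, and RPD as the sample standard deviation of the observed values divided by RMSEp. The function name `prediction_metrics` and the sample values are hypothetical and are not data from the study.

```python
import math
import statistics


def prediction_metrics(y_obs, y_pred):
    """Return (R2, RMSEP, RPD) for paired observed and predicted values.

    Assumed definitions (not confirmed by the source abstract):
      R2    = 1 - SS_res / SS_tot
      RMSEP = sqrt(mean squared residual)
      RPD   = sample standard deviation of y_obs / RMSEP
    """
    if len(y_obs) != len(y_pred) or len(y_obs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [o - p for o, p in zip(y_obs, y_pred)]
    ss_res = sum(r * r for r in residuals)

    mean_obs = statistics.fmean(y_obs)
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)

    rmsep = math.sqrt(ss_res / len(y_obs))
    r2 = 1.0 - ss_res / ss_tot
    rpd = statistics.stdev(y_obs) / rmsep  # stdev uses the n - 1 denominator
    return r2, rmsep, rpd


# Hypothetical observed and predicted TN values, for illustration only.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]

r2, rmsep, rpd = prediction_metrics(y_obs, y_hat)
print(f"R2={r2:.4f}, RMSEP={rmsep:.4f}, RPD={rpd:.4f}")
# prints: R2=0.8000, RMSEP=0.5000, RPD=2.5820
```

Under these definitions a larger RPD means the prediction error is small relative to the spread of the measured values. Other conventions exist (for example, R² as the squared correlation coefficient, or a population standard deviation in RPD), so values computed this way may differ slightly from those reported by a given software package.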