高师理科学刊, 2025, Vol. 45, Issue 7: 24-30, 7. DOI: 10.3969/j.issn.1007-9831.2025.07.006
Comparison and application of methods for filling in missing data for air quality monitoring
Abstract
Air quality forecasting is critical for controlling air pollution; however, the data obtained from air quality monitoring stations contain a large number of missing values. Using mass concentration monitoring data for PM2.5, PM10, SO2, CO, NO2, and O3 from 369 cities and counties in China from December 2014 to April 2021 as samples, seven imputation methods (MissForest, K-nearest neighbors, multiple imputation, Bagging, random forest, mean filling, and median filling) were employed to fill in the missing data. The accuracy of the seven methods was compared using mean absolute error, root mean square error, and mean absolute percentage error as evaluation metrics, in order to identify the most suitable imputation method for each pollutant. The experimental results indicate that either K-nearest neighbors or MissForest can be chosen for PM2.5; the MissForest algorithm has the smallest error when filling in missing values for PM10, SO2, and O3; multiple imputation and random forest are most suitable for CO; and the K-nearest neighbors algorithm outperforms the other methods for NO2. Furthermore, based on the imputed air quality monitoring data for Shenyang City, the impact of the mass concentrations of CO and NO2 on the mass concentration of PM2.5 was analyzed using statistical graphics for variable correlation analysis.
Keywords
air pollution/monitoring data/missing value filling
Classification
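Information technology and safety science
Cite this article
Liu Jiaxing (刘佳星). Comparison and application of methods for filling in missing data for air quality monitoring [J]. 高师理科学刊, 2025, 45(7): 24-30, 7.
Funding
2023 Heilongjiang Provincial Department of Education project (145309606)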
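The abstract compares imputation methods by mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). A minimal sketch of how such a comparison might be set up is given here for two of the simplest methods named, mean filling and median filling. It assumes a common protocol in which known values are hidden, imputed, and then scored against the truth; the abstract does not state the paper's exact procedure. The readings, the hidden positions, and the helper name `evaluate_fill` are made up for illustration and are not taken from the paper.

```python
import math
import statistics


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def evaluate_fill(series, hidden_idx, fill_fn):
    """Hide the values at hidden_idx, fill them with fill_fn(observed), and score.

    Hypothetical helper: fill_fn maps the observed values to one constant,
    which covers mean and median filling only.
    """
    observed = [v for i, v in enumerate(series) if i not in hidden_idx]
    fill_value = fill_fn(observed)
    actual = [series[i] for i in hidden_idx]
    predicted = [fill_value] * len(actual)
    return {
        "MAE": mae(actual, predicted),
        "RMSE": rmse(actual, predicted),
        "MAPE": mape(actual, predicted),
    }


# Made-up PM2.5 readings with one pollution spike (illustrative values only).
pm25 = [35.0, 42.0, 38.0, 120.0, 40.0, 36.0, 44.0, 39.0]
hidden = [1, 6]  # positions treated as "missing" for the evaluation

scores = {
    "mean": evaluate_fill(pm25, hidden, statistics.mean),
    "median": evaluate_fill(pm25, hidden, statistics.median),
}

for name, result in scores.items():
    print(name, {metric: round(value, 2) for metric, value in result.items()})
```

On this toy series the spike pulls the mean upward, so median filling scores lower on all three metrics. Model-based methods such as K-nearest neighbors or MissForest would replace `fill_fn` with a predictor that uses the other pollutants or neighboring time points, which this sketch does not attempt.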