Journal of Dalian University of Technology, 2024, Vol. 64, Issue 4: 353-360 (8 pages). DOI: 10.7511/dllgxb202404004
Study of PM2.5 concentration prediction model based on improved machine learning
Abstract
Existing machine learning models for predicting PM2.5 concentration lose performance because they are overly complex, ignore spatio-temporal information, and impute missing values inaccurately. To address these problems, random forest is used instead of statistical methods to fill in missing values, and spatio-temporal factors are incorporated to improve model accuracy. Combining remote sensing data, meteorological data, and co-pollutant data, a model (the K-means-RF-XGBoost model) suitable for PM2.5 concentration prediction in coastal cities is established; its prediction time is only 4% of that of a BP neural network. The model is trained and tested on real-time monitoring data from Dalian in 2019. The results show that the K-means-RF-XGBoost model predicts PM2.5 concentration with high accuracy: compared with the same model without spatio-temporal information, the root mean square error (e_rms) decreases by about 48% and the coefficient of determination (R2) increases by about 10%. The model effectively predicts high PM2.5 concentrations and handles large fluctuation ranges; for example, the spring model achieves an R of 0.935 on the testing set. It also performs well in daily prediction, with an R2 of 0.819. This study provides a new approach to predicting PM2.5 concentration in coastal cities.
Key words
PM2.5 concentration prediction/spatio-temporal information/missing values imputation/machine learning
Classification
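Resources and Environment
Cite this article
Ding Chengliang (丁成亮), Zheng Hongbo (郑洪波). Study of PM2.5 concentration prediction model based on improved machine learning [J]. Journal of Dalian University of Technology, 2024, 64(4): 353-360.
Funding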
National Natural Science Foundation of China (42071273)
Fundamental Research Funds for the Central Universities (DUT22LAB132)
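Note on the evaluation metrics

The abstract reports model quality as a relative change in root mean square error and in the coefficient of determination. As a minimal sketch of how such figures can be computed, the Python snippet below uses the standard textbook definitions of both metrics. The function names (`rmse`, `r2`, `relative_change`) and all numbers are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_change(baseline, improved):
    """Signed fractional change from a baseline value to an improved one."""
    return (improved - baseline) / baseline


# Made-up PM2.5-like values for illustration only; not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
pred_baseline = [14.0, 16.0, 34.0, 36.0]  # hypothetical model without spatio-temporal features
pred_improved = [12.0, 18.0, 32.0, 38.0]  # hypothetical model with spatio-temporal features

rmse_change = relative_change(rmse(observed, pred_baseline), rmse(observed, pred_improved))
r2_change = relative_change(r2(observed, pred_baseline), r2(observed, pred_improved))

print(f"RMSE change: {rmse_change:+.1%}")
print(f"R2 change:   {r2_change:+.1%}")
```

A negative RMSE change and a positive R2 change both indicate improvement, which is the sense in which the abstract's "decreases by about 48%" and "increases by about 10%" should be read.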