FORECASTING HOURLY PM2.5 CONCENTRATIONS USING STL DECOMPOSITION WITH MACHINE LEARNING AND DEEP LEARNING ENSEMBLE MODEL
Awanindra Kumar Singh
Paper Contents
Abstract
Fine particulate matter (PM2.5), which originates from both anthropogenic and natural sources, poses significant health risks because it can penetrate the respiratory system and enter the bloodstream. Accurate hourly forecasting is essential for public health warnings and emission control. This study develops a hybrid model for hourly PM2.5 concentration prediction that uses Seasonal-Trend decomposition via LOESS (STL) to separate the series into trend, seasonal, and residual components. Data for Talkatora, Lucknow (India), obtained from the Central Pollution Control Board, were preprocessed with missing-value imputation and outlier removal. The trend component was forecast with Linear Regression (LR) using 24-hour lags, the seasonal component with eXtreme Gradient Boosting (XGB), also using 24-hour lags, and the residual component with a Long Short-Term Memory (LSTM) neural network (64 units, Adam optimizer, MSE loss). The three component forecasts were then aggregated to produce the final prediction. The model was compared against standalone LR, XGB, and LSTM models and against STL-based variants using mean absolute error (MAE), root mean square error (RMSE), Pearson's correlation coefficient (r), and the coefficient of determination (R²) on test data. The hybrid STL-XGB-LR-LSTM model outperformed all of the others, achieving an MAE of 8.4736, an RMSE of 13.0953, an r of 0.9541, and an R² of 0.9098, which indicates superior accuracy in capturing temporal patterns. This approach improves PM2.5 forecasting in support of proactive environmental management.
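The three mechanical steps named in the abstract (building 24-hour lag features, aggregating the component forecasts, and scoring the result) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the function names make_lagged, aggregate, and evaluate are hypothetical, and the sketch assumes that "aggregated" means a plain sum of the trend, seasonal, and residual forecasts, which matches the additive form of an STL decomposition but is not stated in the abstract. The component models themselves (LR, XGB, LSTM) are omitted.

```python
import math


def make_lagged(series, n_lags=24):
    """Build lag features: each row of X holds the previous n_lags values of y.

    n_lags=24 mirrors the 24-hour lags mentioned in the abstract.
    """
    X = [list(series[i - n_lags:i]) for i in range(n_lags, len(series))]
    y = list(series[n_lags:])
    return X, y


def aggregate(trend_hat, seasonal_hat, resid_hat):
    """Recombine component forecasts (assumption: simple additive sum)."""
    return [t + s + r for t, s, r in zip(trend_hat, seasonal_hat, resid_hat)]


def evaluate(y_true, y_pred):
    """Return MAE, RMSE, Pearson's r, and the coefficient of determination."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)

    r = cov / math.sqrt(ss_t * ss_p)               # linear association only
    r2 = 1.0 - sum(e * e for e in errors) / ss_t   # penalizes bias as well
    return {"MAE": mae, "RMSE": rmse, "r": r, "R2": r2}
```

Pearson's r and the coefficient of determination are reported separately because they measure different things: a forecast that is offset from the observations by a constant still has r equal to 1, while its coefficient of determination drops, so the two values need not agree.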
Copyright
Copyright © 2025 Awanindra Kumar Singh. This is an open access article distributed under the Creative Commons Attribution License.