Abstract
Air pollution affects not only the outdoor air in cities but also indoor environments (homes, offices, schools, public places, transportation, etc.), where we spend between 80% and 90% of our time. Both indoor and outdoor air quality have emerged as significant health concerns and are integral to the national strategies implemented by health and environmental institutes in each country. Complaints about air quality have recently risen in cities, both outdoors, primarily because of automobile traffic and industrial activities in urban areas, and indoors, in homes, offices, and schools. This paper presents a methodology for calibrating low-cost monitoring stations, based on measurements in a couple of cities in Colombia. The work is part of a project that aims to reduce the environmental awareness gap in urban areas by estimating air quality with a low-cost, flexible, modular, and mobile monitoring station design that can be used to assess air pollution in different indoor and outdoor environments. We calibrated the stations and evaluated their performance using standard linear regression methods, and we also explored estimation with machine learning algorithms, specifically Random Forest estimators. All the models offer a significant improvement in terms of RMSE. The simple linear regression model reduces RMSE by up to 70%, and the multiple regression model by up to 73%. Random Forest shows the most remarkable improvement, with a reduction in RMSE of up to 86%.
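To make the reported metric concrete, the following minimal sketch shows one way a simple linear regression calibration and its percentage RMSE reduction could be computed. It is an illustration under stated assumptions, not the pipeline used in this work: the data are synthetic, and the helper names (`rmse`, `fit_line`, `rmse_reduction`), the sensor bias coefficients, and the train/test split are all hypothetical. The Random Forest calibration is not reproduced here.

```python
import math
import random


def rmse(pred, ref):
    """Root mean squared error between predictions and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def fit_line(x, y):
    """Ordinary least squares fit of y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse_reduction(raw, calibrated, ref):
    """Percentage by which calibration lowers RMSE relative to raw readings."""
    return 100.0 * (1.0 - rmse(calibrated, ref) / rmse(raw, ref))


# Synthetic data (assumption): a reference monitor and a biased, noisy
# low-cost sensor. The gain 1.6, offset 8.0 and noise level are made up.
rng = random.Random(0)
reference = [rng.uniform(5.0, 60.0) for _ in range(200)]
raw = [1.6 * r + 8.0 + rng.gauss(0.0, 2.0) for r in reference]

# Fit the calibration on the first 150 samples, evaluate on the last 50.
train_raw, test_raw = raw[:150], raw[150:]
train_ref, test_ref = reference[:150], reference[150:]

slope, intercept = fit_line(train_raw, train_ref)
calibrated = [slope * x + intercept for x in test_raw]
reduction = rmse_reduction(test_raw, calibrated, test_ref)

print(f"RMSE raw:        {rmse(test_raw, test_ref):.2f}")
print(f"RMSE calibrated: {rmse(calibrated, test_ref):.2f}")
print(f"RMSE reduction:  {reduction:.1f}%")
```

The reduction is evaluated on held-out samples so that the figure reflects how the calibration generalizes rather than how well it fits the training data. A multiple regression or Random Forest calibration would replace `fit_line` with a model that also takes additional covariates as inputs; which covariates to use is not specified here.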