Abstract
Dengue and chikungunya fever are two viral diseases of great public health concern in Colombia and other tropical countries, as both are transmitted by Aedes mosquitoes, which are endemic to this region. In recent years, there have been unprecedented outbreaks of these infections. Computational models that forecast the number of cases from available epidemiological data would therefore help public health surveillance systems take effective action to prevent and mitigate these events. In this work, we present the application of machine learning algorithms to predict the morbidity dynamics of dengue and chikungunya in Colombia using time-series forecasting methods. Available weekly incidence data for dengue (2007–2016) and chikungunya (2014–2016) from the National Health Institute of Colombia were gathered and used as input to generate and validate the models. Kernel Ridge Regression and Gaussian Processes were used to forecast the number of cases of both diseases at horizons of one and four weeks. To assess the performance of the algorithms, rolling-origin cross-validation was carried out, and the mean absolute percentage error (MAPE), mean absolute error (MAE), R² and percentage of explained variance were calculated for each model. Kernel Ridge Regression with a one-step-ahead horizon was found to be superior to the other models in forecasting the weekly number of cases of both dengue and chikungunya. However, predictive power was higher for dengue incidence, as more epidemiological data are available for this disease than for chikungunya. The results are promising and call for further research and development toward a tool that public health officials could use to manage the epidemiological dynamics of these diseases more adequately. © Springer International Publishing AG 2017.
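The rolling-origin evaluation described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the `min_train` parameter and the weekly counts are hypothetical, and a simple persistence forecaster stands in for the Kernel Ridge Regression and Gaussian Process models. At each origin, the forecaster sees only the weeks before that origin and is scored against the observation `horizon` weeks ahead.

```python
from typing import Callable, List, Sequence, Tuple

# A forecaster maps (observed history, horizon in weeks) to one predicted count.
Forecaster = Callable[[Sequence[float], int], float]


def persistence_forecast(history: Sequence[float], horizon: int) -> float:
    """Stand-in model (assumption): repeat the last observed weekly count."""
    return history[-1]


def rolling_origin_forecasts(
    series: Sequence[float],
    forecaster: Forecaster,
    horizon: int,
    min_train: int,
) -> List[Tuple[float, float]]:
    """Return (actual, predicted) pairs, one per forecast origin.

    The origin is the number of weeks available for fitting. It moves
    forward one week at a time, so no future data reaches the forecaster.
    """
    pairs = []
    for origin in range(min_train, len(series) - horizon + 1):
        history = series[:origin]
        predicted = forecaster(history, horizon)
        actual = series[origin + horizon - 1]
        pairs.append((actual, predicted))
    return pairs


def mae(pairs: Sequence[Tuple[float, float]]) -> float:
    """Mean absolute error over all (actual, predicted) pairs."""
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


def mape(pairs: Sequence[Tuple[float, float]]) -> float:
    """Mean absolute percentage error; weeks with zero cases are skipped
    because the percentage error is undefined there."""
    valid = [(a, p) for a, p in pairs if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in valid) / len(valid)


# Made-up weekly case counts, for illustration only.
weekly_cases = [10, 12, 15, 20, 18, 25, 30, 28]

one_week = rolling_origin_forecasts(weekly_cases, persistence_forecast, 1, 4)
four_week = rolling_origin_forecasts(weekly_cases, persistence_forecast, 4, 4)

print("1-week horizon: MAE", mae(one_week), "MAPE", mape(one_week))
print("4-week horizon: MAE", mae(four_week), "MAPE", mape(four_week))
```

Any model with the same call signature could replace `persistence_forecast`, so the same loop would serve to compare models at the one-week and four-week horizons.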