Informatics and Applications

2019, Volume 13, Issue 3, pp 34-40

HYBRID EXTREME GRADIENT BOOSTING MODELS TO IMPUTE THE MISSING DATA IN PRECIPITATION RECORDS

  • A. K. Gorshenin
  • O. P. Martynov

Abstract

The article compares the classical method of extreme gradient boosting, as implemented in the XGBoost (eXtreme Gradient Boosting) framework, with the newer modification CatBoost (Categorical Boosting), which is still rarely used in scientific research. Several hybrid classification-regression models are proposed to improve the accuracy of imputing missing values in real precipitation data from 14 meteorological stations in Germany. The classification accuracy reaches up to 92%, and the root-mean-square errors are moderate. The hybrid methods outperform both the standalone classification models and the standalone regression models in prediction accuracy. The proposed approaches can be used for the analysis of meteorological data by machine learning methods, as well as for improving the forecasting accuracy of physical models of atmospheric processes.
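
The abstract does not state how the classification and regression parts are combined, so the following is only a minimal sketch under one common assumption: a two-stage scheme in which a classifier first decides whether a missing record is wet or dry, and a regressor estimates the amount only for wet records. All names here (`hybrid_impute`, `toy_is_wet`, `toy_amount`) and the toy data are hypothetical; in practice the two callables might wrap fitted XGBoost or CatBoost models.

```python
import math
from typing import Callable, List, Sequence

Features = Sequence[float]  # e.g. readings from neighbouring stations (assumed)


def hybrid_impute(
    rows: Sequence[Features],
    is_wet: Callable[[Features], bool],
    amount: Callable[[Features], float],
) -> List[float]:
    """Two-stage imputation: classify wet/dry, regress only for wet rows."""
    filled = []
    for row in rows:
        if is_wet(row):
            # Precipitation cannot be negative, so clip the regression output.
            filled.append(max(0.0, amount(row)))
        else:
            filled.append(0.0)
    return filled


def accuracy(true: Sequence[bool], pred: Sequence[bool]) -> float:
    """Share of records whose wet/dry label is predicted correctly."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)


def rmse(true: Sequence[float], pred: Sequence[float]) -> float:
    """Root-mean-square error between observed and imputed amounts."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))


# Hypothetical stand-ins for trained models (not the authors' models).
def toy_is_wet(row: Features) -> bool:
    return sum(row) / len(row) > 0.5


def toy_amount(row: Features) -> float:
    return sum(row) / len(row)


# Made-up illustration data: three records, three neighbouring stations each.
rows = [(0.0, 0.0, 0.0), (2.0, 4.0, 3.0), (0.0, 0.3, 0.0)]
observed = [0.0, 3.5, 0.2]

filled = hybrid_impute(rows, toy_is_wet, toy_amount)
wet_accuracy = accuracy([v > 0 for v in observed], [v > 0 for v in filled])
error = rmse(observed, filled)
```

The two metrics mirror those reported in the abstract: classification accuracy for the wet/dry decision and root-mean-square error for the imputed amounts. Keeping the models as plain callables means either boosting framework could be substituted without changing the combination logic.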
