Temporal and geographic extrapolation of soil moisture using machine learning algorithms

by Efthymios Chrysanthopoulos, Andreas Kallioras

CATENA, 257, 109156. https://doi.org/10.1016/j.catena.2025.109156

Abstract

Machine learning algorithms inherently extrapolate when new, unseen instances expand the convex hull of the training data, a characteristic that can be exploited in soil moisture prediction for both temporal and geographic extrapolation. This study describes the implementation of a machine learning framework that evaluates the performance of both individual (Support Vector Regressor) and ensemble (Random Forests and Voting Regressor) algorithms in temporal and geographic extrapolation of soil moisture beyond the feature space of the calibration data. While most studies focus on temporal extrapolation and spatial interpolation of soil moisture within the framework of calibration stations, this study provides important insights into soil moisture prediction at distinct locations of a catchment where target variables are available, using models pre-calibrated at an individual station. The approach is based on calibrating each machine learning algorithm with the soil moisture data from each agro-meteorological station of the monitoring networks and then evaluating it in two contexts: temporal extrapolation, with future data from the same station, and geographic extrapolation, with data from the locations of the remaining stations. Overall, the results indicate that in the context of temporal extrapolation the algorithms achieve adequate accuracy, with performance metrics of R² > 0.75, RMSE < 0.042 cm³ cm⁻³ and MAE < 0.001 cm³ cm⁻³, while in the context of geographic extrapolation, algorithms trained using soil moisture data from a distinct agro-meteorological station are capable of predicting soil moisture with enhanced efficiency when applied to previously unseen datasets. These results indicate the applicability of the framework to unmonitored sites.
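The calibrate-then-evaluate protocol described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. It is not the authors' implementation. The synthetic stations, the three input features (daily rainfall, 7-day rainfall sum, air temperature), the toy water-balance rule, the hyperparameters and the 75/25 chronological split are all assumptions made for illustration, and the helper names `make_station`, `build_models` and `score` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_station(n_days, rain_scale, seed):
    """Synthetic stand-in for one station (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    day = np.arange(n_days)
    rain = rng.gamma(0.4, rain_scale, n_days)  # mm per day
    temp = 15 + 10 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 2, n_days)
    theta = np.empty(n_days)  # volumetric soil moisture, cm3 cm-3
    level = 0.25
    for t in range(n_days):
        # Toy water balance: rain wets the soil, warm days dry it faster.
        loss = 0.03 * (level - 0.05) * (1 + temp[t] / 30)
        level = float(np.clip(level + 0.004 * rain[t] - loss, 0.05, 0.45))
        theta[t] = level
    # Trailing 7-day rainfall sum as a simple antecedent-wetness feature.
    rain_7d = np.convolve(rain, np.ones(7), mode="full")[:n_days]
    return np.column_stack([rain, rain_7d, temp]), theta


def build_models():
    """One individual learner and two ensembles, as named in the abstract."""
    def svr():
        return make_pipeline(StandardScaler(), SVR(C=1.0, epsilon=0.01))

    def rf():
        return RandomForestRegressor(n_estimators=50, random_state=0)

    return {
        "SVR": svr(),
        "RF": rf(),
        "Voting": VotingRegressor([("svr", svr()), ("rf", rf())]),
    }


def score(model, X, y):
    """R2, RMSE and MAE of a fitted model on one evaluation set."""
    pred = model.predict(X)
    rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
    return {
        "R2": float(r2_score(y, pred)),
        "RMSE": rmse,
        "MAE": float(mean_absolute_error(y, pred)),
    }


# Station A is the calibration station; station B is treated as unmonitored.
X_a, y_a = make_station(730, rain_scale=8.0, seed=1)
X_b, y_b = make_station(730, rain_scale=5.0, seed=2)
split = int(0.75 * len(y_a))  # chronological split, no shuffling

results = {}
for name, model in build_models().items():
    model.fit(X_a[:split], y_a[:split])
    results[name] = {
        # Temporal extrapolation: later data from the same station.
        "temporal": score(model, X_a[split:], y_a[split:]),
        # Geographic extrapolation: data from a different station.
        "geographic": score(model, X_b, y_b),
    }

for name, res in results.items():
    for context, metrics in res.items():
        print(name, context, {k: round(v, 3) for k, v in metrics.items()})
```

The chronological split matters in this sketch: shuffling before splitting would leak future conditions into calibration and turn the temporal test into interpolation. Scores from the synthetic data say nothing about the values reported in the study.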

Keywords: Soil moisture, Machine learning, Extrapolation, Unmonitored sites, Soil moisture modeling, Generalization