Investigating the mechanisms of flood susceptibility with the use of multi-basin machine learning models in data-scarce environments in Cyprus
Journal
Journal of Hydrology: Regional Studies
Date Issued
January 6, 2026
DOI
10.1016/j.ejrh.2025.103075
Abstract
Study region: The island of Cyprus is dominated by small-scale watersheds that favor the occurrence of flash floods. Climate projections indicate an increase in the frequency and intensity of these events.
Study focus: The development of rapid flood screening tools is essential for better urban planning. This study uses four machine learning algorithms, namely support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), and multilayer perceptron (MLP), to build models based on data collected from eight watersheds to enhance their within-region (Cyprus) generalization. Seven features were selected for tuning and testing the performance of these models. Confidence intervals based on the t-distribution were calculated to quantify uncertainty.
New hydrological insights for the region: All models achieved good agreement with the inventory database. The RF model was selected to build multi-level susceptibility maps. Half of the Georskipou watershed, mostly urban and semi-urban regions, is classified as highly susceptible to flooding, whereas 38 % of the test watershed is not expected to experience severe flood events. Simplified RF models were developed by selecting different combinations of the most important features, revealing that land use, terrain slope, terrain elevation, and flow accumulation are sufficient to achieve good accuracy (95 %) against the flood inventory data. The results highlight the ability of simple, computationally efficient data-driven models to provide rapid predictions, thus avoiding the compilation of fully detailed physically-based models.
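As an illustration of the t-based confidence intervals mentioned under the study focus: the abstract does not say which quantity the intervals were computed for. Assuming they summarize n repeated performance scores x_1, ..., x_n (for example, accuracy over resampling runs), the standard two-sided interval would take the form below. The symbols are illustrative and are not taken from the paper.

```latex
\bar{x} \;\pm\; t_{1-\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^{2}}
```

Here \bar{x} is the sample mean of the scores, s is the sample standard deviation, and t_{1-\alpha/2, n-1} is the Student t quantile with n-1 degrees of freedom. A 95 % interval corresponds to \alpha = 0.05.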
File(s)
Name
1-s2.0-S2214581825009048-main.pdf
Size
8.27 MB
Format
Adobe PDF
Checksum (MD5)
6783bdc7ae5eb855276de0a925fa7ae1

