Product: Learning from the Rare: Overcoming Class Imbalance in Archaeological Object Detection with Boosting Methods
Abstract
Detecting surface potsherds with low-altitude remote sensing is challenging because of severe class imbalance and limited training data. This study explores two boosting algorithms, Adaptive Boosting (AdaBoost) and Extreme Gradient Boosting (XGBoost), to maximize detection recall for archaeological prospection in the Western Megaris landscape, Greece. Models were trained on only 15% of the available data to simulate realistic field conditions. Evaluation emphasized recall-oriented metrics for the minority class (precision, recall, F1-score, AUC), addressing the accuracy paradox, in which high overall accuracy masks poor rare-class performance. With threshold optimization, AdaBoost and XGBoost achieved substantially higher recall than baseline methods, with detection-to-ground-truth ratios of 2.5 and 3.2, respectively, reflecting a deliberate prioritization of recall over precision for exploratory survey purposes. The results demonstrate that boosting methods with application-specific threshold optimization offer practical screening tools for flagging high-probability areas in archaeological landscape survey, improving field survey efficiency in data-constrained environments where expert validation is required.
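The threshold optimization and the accuracy paradox described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the toy labels and scores are invented for illustration, the helper names (`confusion`, `metrics`, `pick_threshold`) are hypothetical, and the detection-to-ground-truth ratio is assumed here to mean the number of flagged detections divided by the number of true objects. The scores stand in for the positive-class probabilities that a boosting classifier such as AdaBoost or XGBoost would produce.

```python
def confusion(y_true, scores, threshold):
    """Count (tp, fp, fn, tn) when every score >= threshold is flagged positive."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        flagged = s >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(y_true, scores, threshold):
    """Minority-class metrics plus overall accuracy at a given threshold."""
    tp, fp, fn, tn = confusion(y_true, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Assumed definition: flagged detections per ground-truth object.
        "detection_ratio": (tp + fp) / (tp + fn) if tp + fn else 0.0,
    }


def pick_threshold(y_true, scores, target_recall):
    """Return the highest threshold whose minority-class recall meets the target."""
    for t in sorted(set(scores), reverse=True):
        if metrics(y_true, scores, t)["recall"] >= target_recall:
            return t
    return min(scores)


# Invented toy data: 4 rare positives (sherds) among 20 samples.
y_true = [1, 1, 1, 1] + [0] * 16
scores = [0.62, 0.41, 0.35, 0.22,
          0.55, 0.38, 0.30, 0.28, 0.25, 0.20, 0.15, 0.12,
          0.10, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02]

default = metrics(y_true, scores, 0.5)       # conventional 0.5 cut-off
best_t = pick_threshold(y_true, scores, 1.0)  # demand full recall on this toy set
tuned = metrics(y_true, scores, best_t)

print("default:", default)
print("tuned threshold:", best_t, tuned)
```

On this toy set the default 0.5 cut-off gives 80% accuracy but only 25% recall, which is the accuracy paradox in miniature. Lowering the threshold to 0.22 recovers all four positives at the cost of five false alarms, so the detection ratio rises to 2.25: more candidates than real objects, which is acceptable when the output is a screening list for expert validation.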


