Physics-Informed Data Augmentation for Sediment Concentration Prediction in Data-Scarce Himalayan Rivers

Authors

  • Usan Adhikari, IOE, Tribhuvan University, Pulchowk, Lalitpur, Nepal
  • Mukesh Raj Kafle, IOE, Tribhuvan University, Pulchowk, Lalitpur, Nepal
  • Sushan Adhikari, Kathmandu University, Dhulikhel, Kavre, Nepal

DOI:

https://doi.org/10.3126/injet-indev.v2i2.95706

Keywords:

Sediment Transport Prediction, Data Augmentation, Physics Informed Modeling, Himalayan Hydrology, Hydropower Engineering, Rating Curves

Abstract

Accurate prediction of sediment concentration is essential for sustainable hydropower development in the Himalayan region, where large sediment loads damage turbines and shorten reservoir life. This study addresses data scarcity and physical complexity through a systematic comparison of conventional statistical and advanced physics-based data augmentation techniques that can be reused for ML-driven sediment prediction in data-scarce Himalayan rivers. It pursued two interrelated objectives: first, a systematic comparison of conventional and advanced data augmentation schemes; and second, the development of new physics-informed schemes for sediment transport, yielding a reproducible analytics framework suitable for data-scarce Himalayan watersheds. Ten data augmentation techniques were evaluated using the same number of observed monthly sediment data points: five classical statistical techniques (forward-backward fill, linear interpolation, seasonal mean approach, simple rating curve models, and ensemble averaging) and five advanced physics-based models (seasonal stochastic rating curve models, k-nearest neighbor discharge analogs, STL decomposition models, physics-based constraint models, and a weighted ensemble). Conservative pre-processing preserved about 99.8% of the data points through a consensus of three outlier detection methods. Under rigorous 5-fold cross-validation, the advanced physics-based models clearly outperformed the classical statistical models for augmenting sediment concentrations, improving the value performance metric by 5.5%, the Root Mean Squared Error (RMSE) by 24.3%, and the Mean Absolute Error (MAE) by 47.6%.
This study makes several contributions. First, it contributes to hydrology by addressing data scarcity challenges in that discipline. In addition, it lays the groundwork for further machine learning analysis by ensuring that missing sediment data can be imputed effectively with minimal imputation error.
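
As an illustration of the simplest rating-curve technique named in the abstract, the sketch below fits a power-law curve C = a * Q^b in log-log space and uses it to fill gaps in a concentration series. It is a minimal sketch under stated assumptions, not the authors' implementation: the power-law form, the function names fit_rating_curve and fill_missing, and the sample values are all hypothetical and chosen only for illustration.

```python
import math

def fit_rating_curve(discharge, concentration):
    """Fit C = a * Q**b by ordinary least squares on log(Q), log(C).

    Assumption: a power-law rating curve; pairs with a missing (None)
    or non-positive value are skipped.
    """
    pairs = [
        (math.log(q), math.log(c))
        for q, c in zip(discharge, concentration)
        if q is not None and c is not None and q > 0 and c > 0
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two complete (Q, C) pairs")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("discharge values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept, back-transformed
    return a, b

def fill_missing(discharge, concentration):
    """Replace None concentrations with the rating-curve estimate."""
    a, b = fit_rating_curve(discharge, concentration)
    filled = []
    for q, c in zip(discharge, concentration):
        if c is not None:
            filled.append(c)            # observed values are kept as-is
        elif q is not None and q > 0:
            filled.append(a * q ** b)   # gap filled from discharge
        else:
            filled.append(None)         # no usable discharge, leave gap
    return filled

# Hypothetical monthly values (discharge in m^3/s, concentration in mg/L).
q_obs = [10.0, 20.0, 40.0, 80.0]
c_obs = [63.0, 180.0, None, 1430.0]
print(fill_missing(q_obs, c_obs))
```

A fit in log space tends to underestimate mean concentration after back-transformation, so a bias correction is often applied in practice; it is omitted here for brevity.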

Published

2026-06-12

How to Cite

Adhikari, U., Kafle, M. R., & Adhikari, S. (2026). Physics-Informed Data Augmentation for Sediment Concentration Prediction in Data-Scarce Himalayan Rivers. International Journal on Engineering Technology and Infrastructure Development, 2(2), 113–123. https://doi.org/10.3126/injet-indev.v2i2.95706

Section

Articles