Machine Learning Analysis of Phosphorus Dynamics in Surface Waters Across Diverse Climate Zones

Jalali, Mehrnoush | 2025

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 58223 (09)
  4. University: Sharif University of Technology
  5. Department: Civil Engineering
  6. Advisor(s): Sheikholeslami, Razi
  7. Abstract: Water quality is a fundamental pillar of public health, sustainable development, and food security, and monitoring pollutants such as total phosphorus (TP) plays a crucial role in the effective management of surface water resources. Phosphorus is an essential element for agricultural production; however, its loss from soil to aquatic systems can lead to environmental problems such as eutrophication, biodiversity loss, and degradation of water quality. Although monitoring TP concentrations is vital for water quality management, many countries lack sufficient and consistent data. Numerical models, particularly data-driven approaches based on machine learning, can help bridge this data gap. In this study, five data-driven models based on the Random Forest algorithm were developed to simulate TP concentrations in surface waters at the global scale, with each model tailored to one of the five major climate zones: Temperate, Arid, Continental, Polar, and Tropical. TP concentration data were compiled and integrated from several reliable global sources, including the CESI, WQP, WaterBase, GEMStat, and DWS databases. The inputs consisted of 25 variables covering a broad range of influencing factors, categorized into six main groups: Anthropogenic, Climatic, Topographic, Temporal, Land Use, and Nutrient Source factors. The models were trained on a monthly time scale and at a 0.5-degree spatial resolution for the period from 2012 to 2021, using data from key river basins located within the five climate zones. Model performance was evaluated using cross-validation and statistical metrics including R² and RMSE. Results indicated that the models performed well in predicting TP concentrations, with the coefficient of determination (R²) ranging from 0.46 to 0.83 across climate zones; the highest accuracy was observed in temperate climates and the lowest in polar regions.
To identify the most important factors influencing TP concentration, permutation feature importance and partial dependence plots were used. The permutation importance results indicated that temporal and topographic features, along with climatic variables, had the greatest overall impact on TP levels; however, the importance of these variables varied across climate zones. Topographic features and temporal patterns dominated in tropical, temperate, and arid regions, while land use factors and topographic features were more influential in continental zones, and climatic variables were most significant in polar regions. Moreover, partial dependence analysis revealed that the proportion of cropland had a linear positive relationship with TP concentration across all climate zones, while the effects of precipitation and forest cover were nonlinear and climate-dependent. These variables acted as both mitigating and aggravating factors in different regions, highlighting the need for region-specific nutrient management strategies. Additionally, critical hotspots of phosphorus pollution were identified in major basins such as the Orange, Murray–Darling, Ganges, and Mississippi. The results of the Mann-Kendall test also showed that approximately 48% of the grid cells exhibited a significant increasing trend, highlighting the necessity of effective water quality management. By leveraging up-to-date datasets, extensive sensitivity analyses, inter-climate comparisons, and data-driven modeling, this study helps address existing research gaps. The presented dataset and findings not only support future research but also provide valuable insights for high-level policy-making in the field of water resource management.
  8. Keywords: Phosphorus Pollution; Machine Learning; Random Forest Algorithm; Large-Scale Modeling; Water Quality
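
The abstract reports a Mann-Kendall trend test applied per grid cell. As a rough illustration of what such a test computes, the sketch below implements the plain (non-seasonal) Mann-Kendall test with a tie correction and a two-sided normal approximation. It is not the thesis's code: the function name `mann_kendall`, the 0.05 significance level, and the sample values are assumptions made for illustration, and the thesis may have used a seasonal or otherwise modified variant.

```python
import math


def mann_kendall(series, alpha=0.05):
    """Plain Mann-Kendall trend test (illustrative sketch, not the thesis's code).

    Returns (S, Z, p_value, trend), where trend is one of
    "increasing", "decreasing", or "no trend".
    alpha=0.05 is an assumed significance level.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S statistic: sum of signs of all pairwise later-minus-earlier differences.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S, corrected for groups of tied values.
    counts = {}
    for value in series:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standardized statistic.
    if var_s == 0 or s == 0:
        z = 0.0
    elif s > 0:
        z = (s - 1) / math.sqrt(var_s)
    else:
        z = (s + 1) / math.sqrt(var_s)

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    if p_value < alpha and z > 0:
        trend = "increasing"
    elif p_value < alpha and z < 0:
        trend = "decreasing"
    else:
        trend = "no trend"
    return s, z, p_value, trend


if __name__ == "__main__":
    # Hypothetical TP values (mg/L) for one grid cell, for illustration only.
    tp_series = [0.10, 0.12, 0.11, 0.15, 0.16, 0.18, 0.17, 0.21, 0.22, 0.25]
    print(mann_kendall(tp_series))
```

In a gridded setting, a function like this would be applied to each cell's time series, and the share of cells labeled "increasing" would correspond to the kind of percentage quoted in the abstract. Monthly series usually carry seasonality, which this plain version does not account for.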