Using Machine Learning to Select Watershed Monitoring Sites and Understand Interactions Among Snow, Soil, and Plants

Satellite images and computer simulations guide watershed monitoring and generate hypotheses.

A is a principal component analysis biplot with the first and second components, and B is the zonation map with five zones.

(a) Variability of watershed drought sensitivity, annual potential radiation, snowmelt timing variability, soil moisture variability, and elevation. (b) Map of the watershed divided into zones based on these covarying features.

[Reprinted under a Creative Commons Attribution 4.0 International License (CC BY 4.0) from Wainwright, H. M., et al. "Model and Remote-Sensing-Guided Experimental Design and Hypothesis Generation for Monitoring Snow-Soil–Plant Interactions." Frontiers in Water 5, 1220146 (2024). DOI:10.3389/frwa.2023.1220146.]

The Science

Hydrological simulations and machine learning (ML) offer a systematic way to guide the placement of watershed monitoring locations, site characterization, and experimental research. With well-placed monitoring, disturbances associated with climate change, such as droughts, can be tracked to determine their impact on downstream water availability and quality. Advanced computational technology could enable scientists to answer complex questions, such as where best to place sensors and experimental plots or how representative a particular location is of an entire watershed.

The Impact

A multi-institutional team of scientists developed a new ML-based approach that provides a systematic way to combine results from watershed simulations in order to study disturbances alongside other key environmental factors, such as snowmelt and soil moisture variability. The approach groups watershed areas with similar environmental characteristics to identify and map zones that capture bedrock-to-canopy properties, and it identifies the most representative hillslopes. This approach highlights the power of ML to extract critical information from multiple types of watershed data, including both simulation and satellite products, supporting more accurate model-guided monitoring design and hypothesis generation.
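The grouping step described above can be illustrated with a small sketch. It assumes k-means clustering on standardized features, which is only one possible choice; this summary does not name the clustering algorithm the team used. The feature values (elevation, snowmelt day, soil moisture variability) and the function names `standardize` and `kmeans` are hypothetical and serve only as an illustration.

```python
import math
import random

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (an assumed stand-in method); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical hillslopes: (elevation in m, snowmelt day of year,
# soil moisture variability). These are not values from the study.
hillslopes = [
    (2800, 140, 0.10), (2850, 142, 0.11), (2900, 145, 0.12),
    (3500, 170, 0.05), (3550, 172, 0.04), (3600, 175, 0.05),
]
labels, centroids = kmeans(standardize(hillslopes), k=2)
print(labels)  # one zone label per hillslope
```

Standardizing first keeps a large-valued feature such as elevation from dominating the distance calculation. The figure for this highlight shows five zones; the sketch uses two only to keep the toy data small.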

Summary

To optimize the selection of sites most representative of specific factors or conditions in a watershed, a multi-institutional research team developed a systematic ML method that combines simulation and satellite data to identify the most appropriate watershed monitoring locations. The team applied the ML approach to study interactions among snow, soil, and plants using data from the East River watershed in Colorado. Results showed that drought sensitivity is significantly correlated with model-derived soil moisture and snowmelt over space and time. The approach also identified watershed locations with high or low sensitivity to drought and, within each of these areas, the most representative locations accessible by trail or road. These findings can help scientists select the most suitable sites for monitoring watershed characteristics.
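One way to picture the selection of representative, accessible locations is the sketch below. It assumes that "most representative" means closest to the zone's mean in standardized feature space, which is an illustrative definition rather than the study's stated criterion. The function name `representative_sites`, the feature values, and the accessibility flags are all hypothetical.

```python
import math

def representative_sites(features, zones, accessible):
    """For each zone, return the index of the accessible site nearest the zone mean."""
    picks = {}
    for zone in sorted(set(zones)):
        members = [i for i, z in enumerate(zones) if z == zone]
        # Zone mean over all members, accessible or not.
        center = [sum(features[i][d] for i in members) / len(members)
                  for d in range(len(features[0]))]
        candidates = [i for i in members if accessible[i]]
        if candidates:  # a zone with no reachable site gets no pick
            picks[zone] = min(candidates,
                              key=lambda i: math.dist(features[i], center))
    return picks

# Hypothetical, already standardized features for six sites in two zones.
features = [[-1.1, -1.0], [-1.0, -1.0], [-0.7, -0.9],
            [0.8, 1.0], [1.0, 1.0], [1.3, 0.9]]
zones = [0, 0, 0, 1, 1, 1]
# Assumed flags: True if the site can be reached by trail or road.
accessible = [True, False, True, True, True, False]

picks = representative_sites(features, zones, accessible)
print(picks)  # {0: 0, 1: 4}
```

In zone 0, site 1 lies nearest the zone mean but is flagged inaccessible, so the sketch falls back to site 0, the closest site that can actually be reached.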

Principal Investigator

Haruko Wainwright
Massachusetts Institute of Technology

Co-Principal Investigator

Eoin Brodie
Lawrence Berkeley National Laboratory

Program Manager

Paul Bayer
U.S. Department of Energy, Biological and Environmental Research (SC-33)
Environmental System Science

Funding

This material is based upon work supported as part of the Watershed Function Science Focus Area funded by the Biological and Environmental Research program within the U.S. Department of Energy’s Office of Science.

References

Wainwright, H. M., et al. "Model and Remote-Sensing-Guided Experimental Design and Hypothesis Generation for Monitoring Snow-Soil–Plant Interactions." Frontiers in Water 5, 1220146 (2024). https://doi.org/10.3389/frwa.2023.1220146.