Assessing the Transferability and Expandability of Random Forest Snow Distribution Models Over Two NGEE-Arctic Study Sites

Authors

Claire Bachand¹* (cbachand@lanl.gov), Katrina E. Bennett¹, Colleen Iversen²

Institutions

¹Earth and Environmental Sciences Division, Los Alamos National Laboratory, Los Alamos, NM; ²Oak Ridge National Laboratory, Oak Ridge, TN

URLs

https://ngee-arctic.ornl.gov/

Abstract

Random forests are a flexible and interpretable machine learning (ML) technique that has been used to model snow water equivalent at fine scales and to assess the drivers of snow distribution. Unlike “black box” ML techniques such as neural networks, decisions made by random forests can be traced back to splits along specific variables, which makes their decision-making process relatively transparent and allows feature importance to be quantified. However, previous work applying these approaches has often focused on very small study domains. This raises questions about the transferability and expandability of these models, and about whether random forest feature importance can support conclusions about snow distribution drivers in broader areas or whether such results are highly localized. Researchers tested the random forest created by Bennett et al. (2022), modified to predict snow depth rather than snow water equivalent, on a broad expanse surrounding two field study sites (NGEE-Arctic Kougarok 64 Hillslope and Teller 27 watershed). To better understand the expandability of the model, they evaluated the results against a 2022 lidar-derived snow depth product (NCALM 2022). Further, researchers trained the random forest model on only one NGEE site and tested it on the other to evaluate the model’s performance across sites and over larger spatial scales outside of the training domain. Lastly, the team retrained the random forest model on the lidar data and examined the updated model’s feature importance and out-of-sample performance to see how these results change when the model is trained on a larger domain and with different training data. Ultimately, this assessment of random forest transferability provides context for other random forest snow distribution studies and can help inform future snow modeling efforts.
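The cross-site test described in the abstract (train on one site, predict at the other, then inspect feature importance) can be illustrated with a minimal sketch. This is not the authors' model or data. It assumes scikit-learn's RandomForestRegressor, fully synthetic inputs, and hypothetical predictor names (elevation, slope, curvature, veg_height) chosen only for illustration. The toy relationship between the predictors and snow depth is also an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical predictor names; the study's actual predictors are not listed here.
FEATURES = ["elevation", "slope", "curvature", "veg_height"]


def make_synthetic_site(n, elevation_shift, seed):
    """Generate synthetic predictors and snow depth for one illustrative site."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    X[:, 0] += elevation_shift  # the two sites cover different elevation ranges
    # Assumed toy relationship: deeper snow in concavities and taller vegetation.
    y = 1.0 - 0.4 * X[:, 2] + 0.3 * X[:, 3] + 0.1 * rng.normal(size=n)
    return X, y


def evaluate_transfer(X_train, y_train, X_test, y_test, seed=0):
    """Fit a random forest on one site and score it on another site."""
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return model, rmse, float(r2_score(y_test, pred))


# "Site A" is used for training and "site B" for out-of-sample testing.
X_a, y_a = make_synthetic_site(500, elevation_shift=0.0, seed=1)
X_b, y_b = make_synthetic_site(500, elevation_shift=1.5, seed=2)

model, rmse, r2 = evaluate_transfer(X_a, y_a, X_b, y_b)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = dict(zip(FEATURES, model.feature_importances_))

print(f"Transfer RMSE: {rmse:.3f}, R^2: {r2:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:>12s}: {value:.3f}")
```

Swapping the roles of the two sites, or pooling both sites for training, gives the other comparisons the abstract describes. In this toy setup the sites share the same underlying relationship, so transfer works well. Real sites may not, which is the question the study examines.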

References

Bennett, K. et al. 2022. “End-of-Winter Snow Depth, Temperature, Density, and SWE Measurements at Teller Road Site, Seward Peninsula, Alaska,” Data from 2019 Next Generation Ecosystem Experiments Arctic Data Collection. Accessed on February 24, 2022. DOI:10.5440/1798170.

Bennett, K. et al. 2022. “End-of-Winter Snow Depth, Temperature, Density, and SWE Measurements at Teller Road Site, Seward Peninsula, Alaska,” Data from 2022 Next Generation Ecosystem Experiments Arctic Data Collection. Accessed on February 24, 2022. DOI:10.5440/1887250.

Bennett, K. et al. 2022. “End-of-Winter Snow Depth, Temperature, Density, and SWE Measurements at Kougarok Road Site, Seward Peninsula, Alaska,” Data from 2022 Next Generation Ecosystem Experiments Arctic Data Collection. Accessed on February 24, 2022. DOI:10.5440/1888533.

Bennett, K. et al. 2022. “Spatial Patterns of Snow Distribution in the Sub-Arctic,” The Cryosphere 16, 3269–3293. DOI:10.5194/tc-16-3269-2022.

NCALM. 2021. “National Center for Airborne Laser Mapping (NCALM) Lidar DEM data from Five NGEE Arctic Sites,” Data from August 2021 Next Generation Ecosystem Experiments Arctic Data Collection, Accessed on February 24, 2022. DOI:10.5440/1832016.