High-Resolution Maps of Near-Surface Permafrost on the Seward Peninsula, Alaska, Generated with Machine Learning

Authors

Evan A. Thaler¹* (thaler@lanl.gov), Sebastian Uhlemann², Joel C. Rowland¹, Stijn Wielandt², Ian Shirley², John Lamb², Baptiste Dafflon², Katrina E. Bennett¹, Colleen Iversen³

Institutions

¹Earth and Environmental Sciences Division, Los Alamos National Laboratory, Los Alamos, NM; ²Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA; ³Oak Ridge National Laboratory, Oak Ridge, TN

URLs

https://ngee-arctic.ornl.gov/

Abstract

Permafrost soils are a critical component of Arctic ecosystems and the global carbon cycle because they store vast amounts of carbon. They are also important at regional and local scales, in part because they regulate water flow from uplands to rivers and oceans. Hence, identifying the distribution of permafrost soils is crucial to understanding local- to global-scale impacts of permafrost thaw. Regional and hemispherical maps of permafrost distributions are typically generated at resolutions of tens of meters to kilometers, which is too coarse to resolve permafrost distributions at the scale needed to identify processes driving degradation, to assess infrastructure stability, or to illuminate geomorphic impacts of permafrost thaw such as soil carbon transport dynamics. Here, the team developed machine learning (ML) models to generate meter-scale maps of near-surface permafrost distributions for watersheds within the discontinuous permafrost region of the Seward Peninsula, Alaska. Ground-truth observations of near-surface permafrost extent were determined in three watersheds using measurements of soil temperature from 0.75 to 1.2 m depth, electrical resistivity tomography, and observations from soil pits and frost probing. To predict the full distribution of near-surface permafrost at each of the three watersheds, two ML models, extremely randomized trees (ERTr) and a support vector machine (SVM), were trained using the ground-truth observations, topographic and vegetation metrics derived from lidar point clouds, and multispectral indices for snow cover derived from high-resolution satellite imagery. Transferability was tested by applying each trained model to sites where it had not been trained. The ERTr produced the highest balanced accuracy (BA) for predictions at the training site (“at-a-site”), with BA > 90%. However, the transferability of the ERTr to other sites was low, with BA ranging from 50 to 60%. The SVM had lower at-a-site accuracy (BA = 70 to 80%) but transferred better than the ERTr, with BA of 70 to 80% at non-training sites. The accuracy of these models demonstrates that integrating geophysical measurements with topographic and multispectral data into an ML model provides a promising approach to generating fine-scale maps of permafrost distributions where sufficient ground-truth data exist. However, producing high-resolution permafrost extent maps across larger spatial scales remains challenging because of site-specific variability in the drivers of permafrost degradation.
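
The abstract reports balanced accuracy (BA) without defining it. The sketch below assumes the standard definition, the mean of per-class recall, and is not taken from the authors' code. The function name `balanced_accuracy` and the label lists are hypothetical and chosen only for illustration (1 = near-surface permafrost present, 0 = absent).

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, with classes taken from y_true.

    Assumed standard definition; the abstract does not spell one out.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one observation")

    total = defaultdict(int)    # observations per true class
    correct = defaultdict(int)  # correct predictions per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels: 2 permafrost points, 8 permafrost-free points.
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A model that never predicts permafrost is right 80% of the time,
# yet its balanced accuracy is only 0.5.
always_absent = [0] * 10
print(balanced_accuracy(observed, always_absent))  # 0.5

# A model that finds one of the two permafrost points and raises one
# false alarm: recall is 1/2 for class 1 and 7/8 for class 0.
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(balanced_accuracy(observed, mixed))  # 0.6875
```

Under this definition, a two-class model that always predicts the same class scores a BA of 50%, which gives context for the 50 to 60% reported when the ERTr was transferred to other sites.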