February 07, 2022

New Guidelines for Publishing Terrestrial Model Data

Recommendations for archiving model data in a public repository to enable their reuse for a variety of purposes.

Decision tree for grouping model-related files for public archiving.

[Reprinted under a Creative Commons Attribution 4.0 International License (CC BY 4.0) from Simmonds, M. B., et al. “Guidelines for Publicly Archiving Terrestrial Model Data to Enhance Usability, Intercomparison, and Synthesis.” Data Science Journal 21(1), 3 (2022). DOI:10.5334/dsj-2022-003.]

The Science

U.S. Department of Energy (DOE) researchers use a variety of “terrestrial” models (models of the processes that occur on land and their interactions with climate). However, scientists lack guidelines for making the data from these models public in a manner that enables their reuse. This study examined (1) which aspects of terrestrial model data are considered scientifically useful and (2) the purposes served by publishing the data. Based on the results, the study provides guidelines for archiving model data, including inputs and testing data, model code, and workflow scripts. It also suggests easier ways to store and reuse model data.

The Impact

Model predictions from Earth science research are valuable for climate, water, land, and energy resource management. This research provides scientists with data publication guidelines to make their research more visible and valuable. In particular, datasets published with these guidelines will be easier to reuse for a variety of purposes. For example, it would be easier to compare observations with model predictions. It would also be easier to compare models with each other in what scientists call “model intercomparison studies.” Finally, publishing model data with these guidelines will increase research transparency and reproducibility.

Summary

Earth science models provide valuable information that can be used to guide resource management and policy. Scientists and other stakeholders can more easily reuse model data if the data are made public with adequate information on how to interpret and use them. However, to date, no practical, established guidelines exist for how modelers should publish their data. Terrestrial models (models of processes on land and their interactions with climate) pose a particular challenge because they are very diverse, with several types of models being used at different spatial and temporal scales. This study researched how, what, where, when, and why to publish model data and found that archiving model data for scientific purposes requires publishing different data components, including inputs and testing data, model code, and workflow scripts. A set of guidelines was created not only to offer practical suggestions to scientists seeking to publish their data but also to provide greater visibility to their research, making it easier to discover, access, and reuse the data. These guidelines are transferable to other model types and will enable efficient reuse of simulation data for purposes such as model intercomparisons, new model spin-up, and field observation comparisons.

Principal Investigator

Deborah Agarwal
Lawrence Berkeley National Laboratory

Co-Principal Investigator

Charuleka Varadharajan
Lawrence Berkeley National Laboratory

Program Manager

Daniel Stover
U.S. Department of Energy, Biological and Environmental Research (SC-33)
Environmental System Science

Funding

This research was supported by the Biological and Environmental Research (BER) Program within the U.S. Department of Energy’s (DOE) Office of Science. It was also supported by DOE’s Advanced Scientific Computing Research (ASCR) and National Energy Research Scientific Computing Center (NERSC).

References

Simmonds, M. B., et al. “Guidelines for Publicly Archiving Terrestrial Model Data to Enhance Usability, Intercomparison, and Synthesis.” Data Science Journal 21(1), 3 (2022). http://doi.org/10.5334/dsj-2022-003.