Sample Identifiers and Metadata to Support Data Management and Reuse in Multidisciplinary Ecosystem Sciences

Guidance and tools for studies with a variety of biological and environmental samples.

Tracking multidisciplinary samples throughout the cycle of field collection, transport to collaborators and other labs, various analyses, and digital records.

[Reprinted under a CC BY 4.0 License from Damerow, J. E., C. Varadarajan, K. Boye, and E. L. Brodie, et al. (2021) "Sample Identifiers and Metadata to Support Data Management and Reuse in Multidisciplinary Ecosystem Sciences," Data Science Journal 20(1), 11. DOI: 10.5334/dsj-2021-011.]

The Science

Many Environmental System Science projects have complicated sample analysis workflows and need an efficient system for tracking samples as they are sent to different collaborators, labs, and user facilities and published online. Work to improve sample tracking for multidisciplinary projects was driven by the user community of the US Department of Energy’s (DOE’s) data repository for Earth and environmental sciences, the Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE). We provide recommendations for assigning sample identifiers and associated metadata to describe a variety of sample types for multidisciplinary ecosystem science projects.

The Impact

Persistent identifiers for samples, along with common metadata to describe a variety of biological and environmental sample types, provide essential information that improves the efficiency of sample tracking and make sample data more findable, accessible, interoperable, and reusable (FAIR).

Summary

Physical samples are foundational entities for research across biological, Earth, and environmental sciences. Data generated from sample-based analyses are not only the basis of individual studies, but can also be integrated with other data to answer new and broader-scale questions. Ecosystem studies increasingly rely on multidisciplinary team science to study climate and environmental changes. While there are widely adopted conventions within certain domains to describe sample data, these have gaps when applied in a multidisciplinary context. In this study, we reviewed existing practices for identifying, characterizing, and linking related environmental samples. We then tested the practicalities of assigning persistent identifiers to samples, with standardized metadata, in a pilot field test involving eight United States Department of Energy projects. Participants collected a variety of sample types, with analyses conducted across multiple facilities. We address terminology gaps for multidisciplinary research and make recommendations for assigning identifiers and metadata that support sample tracking, integration, and reuse. Our goal is to provide a practical approach to sample management, geared towards ecosystem scientists who contribute and reuse sample data.

Principal Investigator

Deb Agarwal
Lawrence Berkeley National Laboratory
daagarwal@lbl.gov

Program Manager

Paul Bayer
U.S. Department of Energy, Biological and Environmental Research (SC-33)
Environmental System Science
paul.bayer@science.doe.gov

Funding

JED, CV, MB, RCO, HE, VH, and DA were funded through the ESS-DIVE repository by the U.S. DOE’s Office of Science Biological and Environmental Research under contract number DE-AC02-05CH11231 to LBNL as part of its Earth and Environmental Systems Science Division Data Management program. KSE was supported by the United States Department of Energy contract No. DE-SC0012704 to Brookhaven National Laboratory. RJEA was supported by the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research under Contract No. DE-AC02-05CH11231 to LBNL as part of the Terrestrial Ecosystem Science Program. Contributions from ELB, PS, and ZK were supported as part of the Watershed Function Scientific Focus Area funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research under Award Number DE-AC02-05CH11231. AEG and JCS were supported by the U.S. DOE-BER, as part of BER’s Subsurface Biogeochemistry Research (SBR) Program at Pacific Northwest National Laboratory (PNNL), which is operated by Battelle Memorial Institute for the U.S. DOE under Contract No. DE-AC05-76RL01830. ABK, NM, and MZ were supported by the Department of Energy, Office of Science, Biological and Environmental Research, Subsurface Biogeochemical Research program (SCW1053), and their work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. RLW’s contribution was supported by National Science Foundation grant 2004562.

References

Damerow, J. E., C. Varadarajan, K. Boye, and E. L. Brodie, et al. "Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences." Data Science Journal 20(1), 11 (2021). http://doi.org/10.5334/dsj-2021-011.