Community-Developed (Meta)Data Reporting Formats to Enable Data Reuse in ESS-DIVE


Charuleka Varadharajan1* ([email protected]), Robert Crystal-Ornelas1, Dylan O’Ryan1, Kathleen Beilsmith2, Ben Bond-Lamberty3, Kristin Boye4, Madison Burrus1, Shreyas Cholia5, Danielle S. Christianson5, Michael Crow6, Joan Damerow1, Kim S. Ely7, Amy E. Goldman8, Susan Heinz6, Valerie C. Hendrix5, Zarine Kakalia1, Kayla Mathes9, Fianna O’Brien5, Stephanie Pennington3, Emily Robles1, Alistair Rogers7, Shawn Serbin7, Maegen Simmonds1,10, Terri Velliquette6, Pamela Weisenhorn2, Jessica Nicole Welch6, Karen Whitenack1, Deborah A. Agarwal5


1Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA; 2Argonne National Laboratory, Lemont, IL; 3Joint Global Change Research Institute, Pacific Northwest National Laboratory, College Park, MD; 4Geochemistry and Biogeochemistry Group, SLAC National Accelerator Laboratory, Menlo Park, CA; 5Scientific Data Division, Lawrence Berkeley National Laboratory, Berkeley, CA; 6Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN; 7Environmental and Climate Sciences Department, Brookhaven National Laboratory, Upton, NY; 8Pacific Northwest National Laboratory, Richland, WA; 9Integrated Life Sciences, Virginia Commonwealth University, Richmond, VA; 10Now at: Pivot Bio, Berkeley, CA



Earth science data are diverse and multidisciplinary, which makes it difficult for researchers to identify and use the standards or formats that apply to their data. The Findable, Accessible, Interoperable, and Reusable (FAIR) principles are intended to enable the reuse of Earth and environmental science data beyond the purpose for which the data were originally collected. One pathway to making data more reusable is for repositories to encourage contributors to organize and publish data that follow established standards and guidelines. Researchers have developed 12 reporting formats, which encompass instructions, templates, and tools for consistently formatting a diverse set of Earth science (meta)data. These formats were developed through a partnership between ESS-DIVE’s repository and researchers from the ESS science community. The formats cover a broad range of Earth science (meta)data that includes cross-domain metadata (dataset metadata, location metadata, sample metadata), file-formatting guidelines (file-level metadata, CSV files, terrestrial model data), and domain-specific formats for biological, geochemical, and hydrological data types (amplicon abundance tables, leaf-level gas exchange, soil respiration, water and sediment chemistry, sensor-based hydrologic measurements, and Unoccupied Aerial System data). The team developed these formats through a community consensus process that drew on extensive input, resulting in a pragmatic set of reporting formats based on scientific use cases. Such community-developed reporting formats lend themselves to easy adoption and make it easier for data contributors to provide (meta)data that are more FAIR, which in turn enables scientific data synthesis and knowledge discovery.