2024 Abstracts

AmeriFlux BASE Data Pipeline to Support Network Growth and Data Sharing


Housen Chu¹* (hchu@lbl.gov), Danielle S. Christianson¹, You-Wei Cheah¹, Gilberto Pastorello¹, Fianna O’Brien¹, Joshua Geden¹, Sy Toan Ngo¹, Rachel Hollowgrass², Karla Leibowitz³, Norman Beekwilder⁴, Megha Sandesh¹, Sigrid Dengel¹, Stephen Chan¹, André Santos¹, Kyle B. Delwiche², Koong Yi¹, Christin Buechner¹, Dennis D. Baldocchi², Dario Papale⁵, Trevor F. Keenan², Sebastien Biraud¹, Deb Agarwal¹, Margaret S. Torn¹,²


¹Lawrence Berkeley National Laboratory, Berkeley, CA; ²University of California–Berkeley, CA; ³HyperArts, Inc., Oakland, CA; ⁴University of Virginia, Charlottesville, VA; ⁵University of Tuscia, Viterbo, Italy



AmeriFlux is a network of independent research sites that use the eddy covariance technique to measure carbon, water, energy, and momentum fluxes between ecosystems and the atmosphere, supporting a variety of Earth science questions. The network’s diversity of ecosystems, instruments, and data processing routines creates challenges for data standardization, quality assurance, and sharing. To address these challenges, the AmeriFlux Management Project (AMP) designed and implemented the BASE data processing pipeline. The pipeline begins with data uploaded by the site teams, followed by the AMP team’s quality assurance and quality control (QA/QC), ingestion of site metadata, and publication of the BASE data product. By automating and facilitating communication, tracking, QA/QC, and publication, the pipeline has enabled the team to keep pace with the rapid growth of the network. As of January 2024, the AmeriFlux BASE data product contained 3,360 site years of data from 478 sites, with standardized units and variable names for more than 60 common variables, making it the largest long-term repository of flux-met data in the world. AMP further applied the Open Network–Enabled Flux (ONEFlux) processing codes to generate the FLUXNET data product for AmeriFlux sites. The FLUXNET product contains footprint-aggregated data that have been gap-filled, partitioned, and corrected for energy balance closure, and it includes uncertainty analysis. As of January 2024, the FLUXNET data product contained 1,381 site years of data from 195 sites.
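
The upload, QA/QC, and publication stages described above can be pictured with a minimal sketch. This is an illustration only, not the AMP implementation: the `Submission` class, the function names, the site ID, the plausible-range table, and the use of -9999 as a missing-value sentinel are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed missing-value sentinel for this sketch.
MISSING = -9999.0

# Hypothetical plausible ranges, chosen for illustration only.
PLAUSIBLE_RANGES = {
    "TA": (-60.0, 60.0),  # air temperature, deg C
    "RH": (0.0, 100.0),   # relative humidity, %
}


@dataclass
class Submission:
    """A hypothetical site upload moving through the pipeline stages."""
    site_id: str
    records: list                      # one dict of {variable: value} per time step
    issues: list = field(default_factory=list)
    published: bool = False


def run_qaqc(sub: Submission) -> Submission:
    """Flag unknown variable names and values outside the plausible range."""
    for i, rec in enumerate(sub.records):
        for var, value in rec.items():
            if var not in PLAUSIBLE_RANGES:
                sub.issues.append(f"row {i}: unknown variable {var}")
                continue
            if value == MISSING:
                continue  # missing values are allowed, not flagged
            low, high = PLAUSIBLE_RANGES[var]
            if not low <= value <= high:
                sub.issues.append(f"row {i}: {var}={value} outside [{low}, {high}]")
    return sub


def publish(sub: Submission) -> Submission:
    """Publish only when QA/QC raised no issues; otherwise hold for the site team."""
    sub.published = not sub.issues
    return sub


def process(sub: Submission) -> Submission:
    """Run the stages in order: QA/QC first, then the publication decision."""
    return publish(run_qaqc(sub))
```

In this sketch a submission with flagged issues is held back rather than published, which stands in for the communication and tracking loop between the AMP team and the site teams.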