September 11, 2023
AmeriFlux BASE Data Pipeline to Support Network Growth and Data Sharing
The semi-automated pipeline enables the management team to keep pace with the rapid growth of the network and facilitates multisite comparisons and model evaluations.

AmeriFlux BASE data-processing pipeline showing the role of site teams, the AmeriFlux Management Project team, and data products available to users.
[Reprinted under a Creative Commons Attribution International License (CC BY 4.0) from Chu, H., et al. "AmeriFlux BASE Data Pipeline to Support Network Growth and Data Sharing." Scientific Data 10, 614 (2023). DOI:10.1038/s41597-023-02531-2.]
The Science
AmeriFlux is a network of hundreds of research sites established by individual site teams driven by diverse research questions. In 2012, the U.S. Department of Energy established the AmeriFlux Management Project (AMP) at Lawrence Berkeley National Laboratory to support data standardization, quality assurance, and data sharing across the network and the broader AmeriFlux community. AMP presents the new BASE data-processing pipeline, which (1) standardizes the flux and meteorological (flux-met) data formats, (2) ensures data quality, (3) facilitates regular and frequent data submissions and publications, and (4) tracks the data and communications with site teams through the pipeline.
The Impact
Between the pipeline's implementation in May 2017 and December 2022, AmeriFlux received 3,468 data uploads containing 6,195 files of flux-met data from 385 sites. The pipeline enables the management team to keep up with this growth, publishing an average of around 48 new sites and 330 new site years annually. As of 2024, there are 3,628 site years of AmeriFlux BASE data from 499 sites, representing the world’s largest data repository for flux-met data. The BASE pipeline facilitates more frequent data uploads and releases and allows data users to access recent-year data.
Summary
AmeriFlux is a group of research sites that measure carbon, water, and energy exchanges between ecosystems and the atmosphere using a method called eddy covariance. The variety of ecosystems, tools, and data methods in AmeriFlux makes it hard to standardize data, assure their quality, and share them. Therefore, AMP created the BASE data-processing system. This system starts with site teams uploading data, followed by (1) AMP’s quality checks, (2) the addition of site metadata, and (3) data publication. By 2022, the BASE system held 3,130 site years of data from 444 sites, making it the largest long-term repository of flux-met data. These data are used for multisite comparisons, model testing, and data syntheses.
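The upload, quality check, metadata, and publication sequence described above can be illustrated with a minimal Python sketch. This is not AMP's implementation: the class, function names, field names ("timestamp", "co2_flux"), site ID, and the single completeness check are all hypothetical and chosen only to show how data might move through the stages, with problem uploads held back for follow-up with the site team rather than published.

```python
from dataclasses import dataclass, field


@dataclass
class Upload:
    """One hypothetical data upload from a site team."""
    site_id: str
    records: list                      # list of dicts, one per time step
    metadata: dict = field(default_factory=dict)
    issues: list = field(default_factory=list)
    published: bool = False


# Illustrative required fields; NOT the actual BASE variable names.
REQUIRED_FIELDS = ("timestamp", "co2_flux")


def quality_check(upload: Upload) -> Upload:
    """Stage 1: record an issue for every missing required field."""
    for i, record in enumerate(upload.records):
        for name in REQUIRED_FIELDS:
            if name not in record:
                upload.issues.append(f"record {i}: missing {name}")
    return upload


def add_site_metadata(upload: Upload, registry: dict) -> Upload:
    """Stage 2: attach site-level metadata from a (hypothetical) registry."""
    upload.metadata = dict(registry.get(upload.site_id, {}))
    return upload


def publish(upload: Upload) -> Upload:
    """Stage 3: mark the upload as published."""
    upload.published = True
    return upload


def run_pipeline(upload: Upload, registry: dict) -> Upload:
    """Run the stages in order; stop before publication if checks fail."""
    upload = quality_check(upload)
    if upload.issues:
        # In this sketch, a failed upload goes back to the site team.
        return upload
    upload = add_site_metadata(upload, registry)
    return publish(upload)


registry = {"XX-Demo": {"ecosystem": "forest"}}
good = run_pipeline(
    Upload("XX-Demo", [{"timestamp": "202001010000", "co2_flux": 1.2}]),
    registry,
)
bad = run_pipeline(Upload("XX-Demo", [{"co2_flux": 1.2}]), registry)
```

In this sketch, `good` ends up published with its site metadata attached, while `bad` stays unpublished and carries a list of issues that could be communicated back to the site team.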
Principal Investigator
Housen Chu
Lawrence Berkeley National Laboratory
Program Manager
Daniel Stover
U.S. Department of Energy, Biological and Environmental Research (SC-33)
Environmental System Science
Funding
The AmeriFlux data portal and processing pipeline were supported by funding provided to the AmeriFlux Management Project by the U.S. Department of Energy’s Office of Science under contract no. DE-AC02-05CH11231.
References
Chu, H., et al. "AmeriFlux BASE Data Pipeline to Support Network Growth and Data Sharing." Scientific Data 10, 614 (2023). https://doi.org/10.1038/s41597-023-02531-2.