Science Focus Area Data and Wireless Communications Infrastructure Enabling Data-Field and Data-Model Integration


Charuleka Varadharajan1*, Madison Burrus1, Stijn Wielandt1, Danielle Christianson1, Hesham Elbeshandy1, Boris Faybishenko1, Val Hendrix1, Doug Jones2, Dylan O’Ryan1, Roelof Versteeg2, Andrew Wiedlea3, Eoin Brodie1


1Lawrence Berkeley National Laboratory, Berkeley, CA; 2Subsurface Insights, Hanover, NH; 3Energy Sciences Network, Berkeley, CA



The Watershed Function SFA project generates large, heterogeneous datasets at its East River field site, spanning hydrological, geochemical, geophysical, vegetation, microbiological, and remote sensing measurements. Field data are collected by the project team and collaborators, by wireless sensors, and from external sources (e.g., the U.S. Geological Survey and the USDA’s Natural Resources Conservation Service). Researchers will present the SFA’s data and wireless communications infrastructure, which addresses the challenges associated with environmental data acquisition, curation, quality assurance (QA), quality control (QC), integration, visualization, and download.

The SFA data management framework provides infrastructure and services that support the project’s data lifecycle. It establishes wireless connectivity for sensors in the field, performs automated quality checks of sensor datasets, and enables integration of diverse time-series data for use in analyses and models. The framework also includes tools for data management, preservation, discovery, and visualization. Sensor and geochemical data from the SFA project have been curated in a queryable database since the start of field observations at the East River site. A statistical QA/QC workflow is applied to priority meteorological, hydrological, and geochemical data streams. Machine learning (ML) and statistical algorithms are being explored for basic screening of incoming data in near real time, with the goal of detecting sensor malfunctions early. An open-source brokering service, BASIN-3D (Broker for Assimilation, Synthesis and Integration of eNvironmental Diverse, Distributed Datasets), enables on-demand integration of diverse time-series datasets into a common format based on the Open Geospatial Consortium’s Observations and Measurements standard. The software has also been used to integrate data for ML applications and can be extended to support other data streams in the future.
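The abstract does not specify which checks the QA/QC workflow applies, so the following is only an illustrative sketch of the kind of basic statistical screening described above, not the SFA's implementation. It assumes a plain list of readings and flags missing values, out-of-range values, spikes (using a median and median-absolute-deviation test over neighboring readings), and flatlined sensors. The function name `screen_series`, the flag labels, and all thresholds are hypothetical.

```python
from statistics import median


def screen_series(values, valid_min, valid_max, window=5,
                  spike_threshold=5.0, flatline_run=4):
    """Return one flag per reading: 'ok', 'missing', 'range', 'spike', or 'flatline'.

    All parameter names and default thresholds are illustrative assumptions.
    """
    flags = []
    for i, v in enumerate(values):
        if v is None:
            flags.append("missing")
            continue
        if not (valid_min <= v <= valid_max):
            flags.append("range")
            continue
        # Compare the reading with valid neighbors inside the window,
        # excluding the reading itself so a spike cannot mask itself.
        lo, hi = max(0, i - window), min(len(values), i + window + 1)
        neighbors = [x for j, x in enumerate(values[lo:hi], start=lo)
                     if j != i and x is not None and valid_min <= x <= valid_max]
        if len(neighbors) >= 3:
            med = median(neighbors)
            mad = median([abs(x - med) for x in neighbors])
            scale = 1.4826 * mad  # MAD scaled to approximate a standard deviation
            # When the neighbors are (nearly) constant the scale is zero and
            # the spike test is skipped rather than dividing by zero.
            if scale > 0 and abs(v - med) / scale > spike_threshold:
                flags.append("spike")
                continue
        flags.append("ok")

    # Second pass: mark runs of identical consecutive readings as a flatline,
    # a common symptom of a stuck sensor.
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or values[i] is None or values[i] != values[start]:
            if values[start] is not None and i - start >= flatline_run:
                for j in range(start, i):
                    if flags[j] == "ok":
                        flags[j] = "flatline"
            start = i
    return flags


# Hypothetical soil temperature readings in degrees Celsius.
readings = [1.0, 1.2, 1.1, 1.3, 25.0, 1.2, 1.4, 1.1, 1.3]
flags = screen_series(readings, valid_min=-40.0, valid_max=60.0)
```

In this example only the 25.0 reading is flagged as a spike. A production workflow would likely use variable-specific limits and time-aware windows instead of fixed index windows.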

To facilitate real-time data streams from the field, researchers deployed an advanced wireless communication system at the East River field site in partnership with the DOE’s Energy Sciences Network user facility. The system leverages Starlink and 5G Citizens Broadband Radio Service technology to create a standalone, private cellular network. Low-power 5G hubs in the field connect to this network and provide local sensor connectivity over Wi-Fi, long-range radio, and Ethernet. In this pilot, researchers demonstrated real-time imagery as well as live data transfers from a wireless network of distributed snow and soil temperature sensors.

The combined systems, workflows, and tools are used to build the crosscutting data products needed for hypothesis testing and numerical modeling of hydrological and biogeochemical processes in the East River watershed. Together, they take a holistic view of data collection, assessment, and integration, which improves the resulting data products and enables a codesign approach in which data collection is informed by model results and vice versa.