October 08, 2024
Implementing Reproducible Cookbooks for Advanced Analyses on Science Gateways
Researchers are integrating science gateways with reproducible environments to reduce barriers to entry.

Architectural design of the cookbook workflow.
[Reprinted under a Creative Commons Attribution 4.0 International License (CC BY 4.0) from Mobley, W., et al. "Implementing Reproducible Cookbook Environments for Advanced Analyses on Science Gateways." Science Gateways 2024 (2024). DOI:10.5281/zenodo.13869305.]
The Science
A team of researchers designed and built workflows and infrastructure that let researchers more easily share reproducible analyses on Texas Advanced Computing Center (TACC) systems. The cookbook workflow provides four levels of security that control who can use an application: it can be completely private, shared with individuals, shared with an allocation, or shared publicly with the greater portal community. This approach advances the science gateways domain and supports reproducibility because the environments remain static once installed.
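As a rough illustration only, the four sharing levels described above can be pictured as a small access check. This is a minimal sketch, not the TACC or Tapis implementation: the names `Visibility`, `Cookbook`, and `can_access`, and the idea of passing in a user's allocations as a set, are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Visibility(Enum):
    """Hypothetical labels for the four sharing levels named in the text."""
    PRIVATE = "private"          # only the owner
    INDIVIDUALS = "individuals"  # owner plus explicitly named users
    ALLOCATION = "allocation"    # anyone on a given allocation
    PUBLIC = "public"            # the whole portal community


@dataclass
class Cookbook:
    """Illustrative record for a shared analysis (not a real gateway schema)."""
    name: str
    owner: str
    visibility: Visibility = Visibility.PRIVATE
    shared_users: Set[str] = field(default_factory=set)
    allocation: Optional[str] = None


def can_access(cookbook: Cookbook, user: str,
               user_allocations: Set[str] = frozenset()) -> bool:
    """Return True if `user` may use `cookbook` under its sharing level."""
    if user == cookbook.owner:
        return True  # the owner always has access
    if cookbook.visibility is Visibility.PUBLIC:
        return True
    if cookbook.visibility is Visibility.INDIVIDUALS:
        return user in cookbook.shared_users
    if cookbook.visibility is Visibility.ALLOCATION:
        return (cookbook.allocation is not None
                and cookbook.allocation in user_allocations)
    return False  # PRIVATE: nobody but the owner


# Example: a cookbook shared with one allocation (names are made up).
nlp = Cookbook("nlp-interviews", owner="alice",
               visibility=Visibility.ALLOCATION, allocation="PROJ-1")
print(can_access(nlp, "carol", {"PROJ-1"}))
print(can_access(nlp, "dave", {"PROJ-2"}))
```

In this sketch the default is the most restrictive level, so a cookbook stays private until its owner deliberately widens the audience.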
The Impact
The cookbook workflow will significantly lower barriers to accessing and using high-performance computing (HPC) resources, particularly for users with limited technical expertise. This work also streamlines developing and sharing analyses. With complex environments, large datasets, and large compute clusters, sharing information is not always straightforward. To reduce lag, analysis often needs to run in close physical proximity to the datasets. The cookbooks make these analyses possible and, more importantly, let researchers share them with less technical users.
Summary
Researchers developed “reproducible computational cookbooks” to simplify HPC environments for scientific research. These cookbooks integrate standardized tools like Docker and Binder with science gateways, allowing researchers to create, share, and reuse complex software setups without needing advanced technical skills. Hosted by TACC, the cookbooks streamline the process of configuring research environments, especially for nontechnical users, by offering pre-designed templates and flexible workflows. This approach addresses existing barriers such as complicated installations and dependency management, enhancing collaboration and accessibility of HPC resources.
Natural language processing is highlighted as a key application: the cookbooks have supported research involving sensitive data and users with diverse levels of expertise. By creating a secure and reproducible environment, the system enabled efficient analysis of text data such as community interviews and engineering manuals, which informed projects like “Sites and Stories,” an artificial intelligence–driven modeling tool. Built on platforms like Tapis and JupyterLab, the cookbook framework also supports a wide range of scientific experiments, ensuring consistency and reproducibility while reducing costs. This innovation represents a significant advance in democratizing access to HPC resources for diverse research needs.
Principal Investigator
William Mobley
University of Texas
Program Manager
Sally McFarlane
U.S. Department of Energy, Biological and Environmental Research (SC-33)
Urban Integrated Field Laboratories
Funding
This material is based upon work supported by the Biological and Environmental Research program within the U.S. Department of Energy Office of Science under Award Number DE-SC0023216.
References
Mobley, W., et al. "Implementing Reproducible Cookbook Environments for Advanced Analyses on Science Gateways." Science Gateways 2024 (2024). https://doi.org/10.5281/zenodo.13869305.