Jeffery S. Horsburgh


2021

Toward open and reproducible environmental modeling by integrating online data repositories, computational environments, and model Application Programming Interfaces
Young-Don Choi, Jonathan L. Goodall, Jeffrey M. Sadler, Anthony M. Castronova, Andrew Bennett, Zhiyu Li, Bart Nijssen, Shaowen Wang, Martyn P. Clark, Daniel P. Ames, Jeffery S. Horsburgh, Yi Hong, C. Bandaragoda, M. Seul, Richard Hooper, David G. Tarboton
Environmental Modelling & Software, Volume 135

Cyberinfrastructure needs to be advanced to enable open and reproducible environmental modeling research. Recent efforts toward this goal have focused on advancing online repositories for data and model sharing, online computational environments along with containerization technology and notebooks for capturing reproducible computational studies, and Application Programming Interfaces (APIs) for simulation models to foster intuitive programmatic control. The objective of this research is to show how these efforts can be integrated to support reproducible environmental modeling. We present first the high-level concept and general approach for integrating these three components. We then present one possible implementation that integrates HydroShare (an online repository), CUAHSI JupyterHub and CyberGIS-Jupyter for Water (computational environments), and pySUMMA (a model API) to support open and reproducible hydrologic modeling. We apply the example implementation for a hydrologic modeling use case to demonstrate how the approach can advance reproducible environmental modeling through the seamless integration of cyberinfrastructure services.

• New approaches are needed to support open and reproducible environmental modeling.
• Efforts should focus on integrating existing cyberinfrastructure to build new systems.
• Our focus is on integrating repositories, computational environments, and model APIs.
• An example implementation is shown using HydroShare, JupyterHub, and pySUMMA.
• We demonstrate how the approach fosters reproducibility using a modeling case study.