Geoscientific Model Development An interactive open-access journal of the European Geosciences Union
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.
Development and technical paper
22 Mar 2018
Review status
This discussion paper is a preprint. It is a manuscript under review for the journal Geoscientific Model Development (GMD).
Requirements for a global data infrastructure in support of CMIP6
Venkatramani Balaji1,2, Karl E. Taylor3, Martin Juckes4, Michael Lautenschlager5, Chris Blanton2,6, Luca Cinquini7, Sebastien Denvil8, Paul J. Durack3, Mark Elkington9, Francesca Guglielmo8, Eric Guilyardi8,10, David Hassell10, Slava Kharin11, Stefan Kindermann5, Bryan N. Lawrence4,10, Sergey Nikonov1,2, Aparna Radhakrishnan2,6, Martina Stockhause5, Tobias Weigel5, and Dean Williams3
1Princeton University, Cooperative Institute of Climate Science, Princeton NJ, USA
2NOAA/Geophysical Fluid Dynamics Laboratory, Princeton NJ, USA
3PCMDI, Lawrence Livermore National Laboratory, Livermore, CA, USA
4Science and Technology Facilities Council, Abingdon, UK
5Deutsches KlimaRechenZentrum GmbH, Hamburg, Germany
6Engility Inc., NJ, USA
7Jet Propulsion Laboratory (JPL), 4800 Oak Grove Drive, Pasadena, CA 91109, USA
8Institut Pierre-Simon Laplace, CNRS/UPMC, Paris, France
9Met Office, FitzRoy Road, Exeter, EX1 3PB, UK
10National Center for Atmospheric Science and University of Reading, UK
11Canadian Centre for Climate Modelling and Analysis, Atmospheric Environment Service, University of Victoria, BC, Canada
Abstract. The World Climate Research Programme (WCRP)'s Working Group on Climate Modeling (WGCM) Infrastructure Panel (WIP) was formed in 2014 in response to the explosive growth in size and complexity of Coupled Model Intercomparison Projects (CMIPs) between CMIP3 (2005-06) and CMIP5 (2011-12). This article presents the WIP recommendations for the global data infrastructure needed to support CMIP design, future growth and evolution. Developed in close coordination with those who build and run the existing infrastructure (the Earth System Grid Federation), the recommendations are based on several principles, beginning with the need to separate requirements, implementation, and operations. Other important principles include the consideration of data as a commodity in an ecosystem of users, the importance of provenance, the need for automation, and the obligation to measure costs and benefits. This paper concentrates on requirements, recognising the diversity of communities involved (modelers, analysts, software developers, and downstream users). Such requirements include the need for scientific reproducibility and accountability, alongside the need to record and track data usage for the purpose of assigning credit. One key element is to adopt a dataset-centric rather than system-centric focus, with the aim of making the infrastructure less prone to systemic failure. With these overarching principles and requirements, the WIP has produced a set of position papers, which are summarized here. They provide specifications for managing and delivering model output, including strategies for replication and versioning, licensing, data quality assurance, citation, long-term archival, and dataset tracking. They also describe a new and more formal approach for specifying what data, and associated metadata, should be saved, which enables future data volumes to be estimated.
The paper concludes with a future-facing consideration of the global data infrastructure evolution that follows from the blurring of boundaries between climate and weather, and the changing nature of published scientific results in the digital age.
Citation: Balaji, V., Taylor, K. E., Juckes, M., Lautenschlager, M., Blanton, C., Cinquini, L., Denvil, S., Durack, P. J., Elkington, M., Guglielmo, F., Guilyardi, E., Hassell, D., Kharin, S., Kindermann, S., Lawrence, B. N., Nikonov, S., Radhakrishnan, A., Stockhause, M., Weigel, T., and Williams, D.: Requirements for a global data infrastructure in support of CMIP6, Geosci. Model Dev. Discuss., in review, 2018.
Venkatramani Balaji et al.


Total article views: 593 (including HTML, PDF, and XML)

HTML PDF XML Total BibTeX EndNote
450 137 6 593 8 12

Latest update: 26 Apr 2018
Short summary
We present recommendations for the global data infrastructure needed to support CMIP scientific design, and its future growth and evolution. We follow a dataset-centric design less prone to systemic failure. Scientific publication in the digital age is evolving to make data a primary scientific output, alongside articles. We design toward that future scientific data ecosystem, informed by the need for reproducibility, data provenance, future data technologies, and measures of costs and benefits.