Overview of WDCC Data Workflow

The WDCC is a data repository which offers long-term archival and publication of datasets relevant to Climate and Earth System Research.
The WDCC follows the guidelines of the OAIS reference model across the whole workflow. The WDCC-internal workflow is shown in the figure below and comprises four steps:
  1. provision of the metadata by the data provider
  2. ingestion, checks and updates of metadata
  3. archiving of the data itself and connection of data with metadata
  4. quality assurance (QA) of data and curation of data and metadata
The metadata (MD) are provided by the data provider in one of three ways: entered through MetaXA, the WDCC MD tool with a graphical user interface; generated from specific CSV files; or extracted from external resources (e.g. CMIP6 metadata are retrieved from ESGF). In the first step, the MD are stored in a temporary WDCC database (WDCC temp) and checked against the WDCC metadata schema before ingestion into the production WDCC MD database. After that, the data filling is requested and the data are archived in the WDCC tape archive. The technical quality assurance (TQA) is a prerequisite for assigning DataCite DOIs to data collections and is part of the data curation. Curation processes such as TQA and DOI assignment generate new MD, which are added to the WDCC MD database immediately.
Figure: WDCC workflow overview
MD = Metadata; TQA = Technical Quality Assurance; MetaXA = Metadata ApeX Application (graphical user interface)
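
The four steps above can be illustrated with a small sketch. It is not WDCC code: the class, its method names, the required metadata fields and the toy TQA rule (every archived file has a non-zero size) are all assumptions made for illustration, and plain dictionaries stand in for WDCC temp, the production MD database and the tape archive.

```python
from dataclasses import dataclass, field

# Hypothetical required fields; the real WDCC metadata schema is not given here.
REQUIRED_FIELDS = ("title", "creator", "summary")


@dataclass
class Workflow:
    """Toy model of the four workflow steps (all names are illustrative)."""

    temp_db: dict = field(default_factory=dict)        # stands in for WDCC temp
    production_db: dict = field(default_factory=dict)  # stands in for the production MD database
    archive: dict = field(default_factory=dict)        # stands in for the tape archive

    def provide_metadata(self, entry_id, metadata):
        """Step 1: stage the provider's metadata in the temporary database."""
        self.temp_db[entry_id] = dict(metadata)

    def ingest_metadata(self, entry_id):
        """Step 2: check the staged metadata, then move it to production."""
        metadata = self.temp_db[entry_id]
        missing = [name for name in REQUIRED_FIELDS if not metadata.get(name)]
        if missing:
            raise ValueError(f"metadata for {entry_id} is missing: {missing}")
        self.production_db[entry_id] = self.temp_db.pop(entry_id)

    def archive_data(self, entry_id, files):
        """Step 3: archive the data and connect it with its metadata."""
        if entry_id not in self.production_db:
            raise KeyError(f"no ingested metadata for {entry_id}")
        self.archive[entry_id] = list(files)
        self.production_db[entry_id]["archived_files"] = len(files)

    def run_tqa(self, entry_id):
        """Step 4: toy technical QA; the result is stored as new metadata."""
        files = self.archive.get(entry_id, [])
        passed = bool(files) and all(size > 0 for _name, size in files)
        self.production_db[entry_id]["tqa_passed"] = passed
        return passed
```

A run through all four steps could then look like this, with files given as hypothetical (name, size in bytes) pairs:

```python
wf = Workflow()
wf.provide_metadata("exp1", {"title": "Example", "creator": "A. Provider", "summary": "Demo"})
wf.ingest_metadata("exp1")
wf.archive_data("exp1", [("tas.nc", 1024)])
wf.run_tqa("exp1")
```

Writing the TQA result back into the production record mirrors the statement that curation processes generate new MD that are added to the MD database immediately.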