We present Cloudify, a modular application that serves Earth System Model (ESM) simulation output from a range of storage and format backends as datasets that behave like cloud-optimized data. Based on an adapted version of Xpublish and extended via its plugin architecture, Cloudify exposes simulation data through RESTful Zarr endpoints using FastAPI, effectively emulating the behavior of truly cloud-native datasets. By abstracting heterogeneous file formats and storage systems into a unified interface, Cloudify enables efficient, scalable, and convenient data access. This includes fast random access to individual chunks of complete datasets without relying on additional software such as file-based catalogs, which makes the data well suited for AI and machine learning workflows regardless of the source format.
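To make the random-access claim concrete, the following minimal sketch shows how a client could work out which chunk object to request from a Zarr endpoint for a single array element. It assumes the Zarr v2 key convention (chunk indices joined by dots); the host name, the dataset and variable names, the chunk sizes, and the `/datasets/<id>/zarr/` URL layout are illustrative assumptions, not confirmed details of a Cloudify deployment.

```python
def chunk_key(variable, index, chunks):
    """Return the Zarr v2 key of the chunk that contains the element at `index`."""
    if len(index) != len(chunks):
        raise ValueError("index and chunks must have the same number of dimensions")
    # Integer division maps an element index to the index of its chunk.
    chunk_ids = [i // c for i, c in zip(index, chunks)]
    return variable + "/" + ".".join(str(n) for n in chunk_ids)


def chunk_url(base_url, dataset_id, variable, index, chunks):
    """Build the URL of one chunk (the URL layout is an assumption)."""
    key = chunk_key(variable, index, chunks)
    return f"{base_url.rstrip('/')}/datasets/{dataset_id}/zarr/{key}"


if __name__ == "__main__":
    # Hypothetical variable "tas" with dimensions (time, lat, lon).
    url = chunk_url(
        "https://cloudify.example.org",  # hypothetical host
        "example-run",                   # hypothetical dataset id
        "tas",
        index=(130, 45, 200),
        chunks=(48, 90, 180),
    )
    print(url)
```

Because each chunk has its own address, a client can fetch only the chunks it needs with plain HTTP GET requests, independent of the format in which the source files are stored.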
By introducing a lightweight Kerchunk plugin, designed to stream raw data as is, Cloudify enhances Xpublish's data provision and simplifies access while reducing dependencies on server-side resources and client-side software. At the same time, Cloudify provides asynchronous access to Dask-backed Xarray datasets, laying the foundation for Data-as-a-Service workflows. It facilitates server-side computation, making hosted data fit for use even for clients with limited compute or storage capabilities. With its dynamic plugins enabled, Cloudify allows datasets to be changed at runtime, supporting online data streaming and the registration of diagnostics.
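The idea of streaming raw data as is can be illustrated with the Kerchunk reference format, in which each Zarr key maps either to inline content or to a `[path, offset, length]` triple pointing into an existing file. The sketch below is a simplified illustration of that lookup and not Cloudify's implementation; the file path, offset, and length are hypothetical, and base64-encoded inline entries are left out for brevity.

```python
# A tiny reference set in the Kerchunk version-1 layout (values are hypothetical).
REFERENCES = {
    "version": 1,
    "refs": {
        ".zgroup": '{"zarr_format": 2}',
        "tas/0.0.0": ["/work/esm/example/tas_1850.nc", 20480, 1048576],
    },
}


def resolve(references, key):
    """Map a Zarr key to inline bytes or to a byte range in a source file."""
    entry = references["refs"][key]
    if isinstance(entry, str):
        # Small metadata objects are stored inline in the reference set.
        return {"kind": "inline", "data": entry.encode("utf-8")}
    if len(entry) == 1:
        # A one-element entry refers to a whole file.
        return {"kind": "file", "path": entry[0]}
    path, offset, length = entry
    # HTTP byte ranges are inclusive on both ends.
    byte_range = f"bytes={offset}-{offset + length - 1}"
    return {"kind": "range", "path": path, "range": byte_range}


if __name__ == "__main__":
    print(resolve(REFERENCES, ".zgroup"))
    print(resolve(REFERENCES, "tas/0.0.0"))
```

Under this scheme, a server only has to read the referenced bytes and forward them unchanged, which is why such a plugin needs few server-side resources; decoding happens on the client.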
A plugin that provides SpatioTemporal Asset Catalog (STAC) endpoints enhances the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of ESM output hosted through Xpublish, enabling seamless discovery and integration across infrastructures. Cloudify thus acts as a bridge between traditional High-Performance Computing (HPC) environments and modern, cloud-native access paradigms, offering a powerful approach to modernizing and optimizing ESM data services.