Cloud Optimized Raster Encoding (CORE) format

DISCLAIMER: CORE is still in development. Interested parties are warmly invited to join the common development, to comment, discuss, report bugs, etc.

Acknowledgement: The CORE format was proudly inspired by the Cloud Optimized GeoTIFF (COG) format, by considering how to leverage the ability of clients to issue HTTP GET range requests for a time series of remote sensing and aerial imagery (instead of just one image).

License: The Cloud Optimized Raster Encoding (CORE) specifications are released to the public domain under a Creative Commons CC0 1.0 "No Rights Reserved" international license. You can reuse the information contained herein in any way you want, for any purpose and without restrictions.


Summary: The Cloud Optimized Raster Encoding (CORE) format is being developed for the efficient storage and management of gridded data by applying video encoding algorithms. It is mainly designed for the exchange and preservation of large time series data in environmental data repositories, while at the same time enabling more efficient workflows in the cloud. It can be applied to any large number of similar (in pixel size and image dimensions) raster data layers. CORE is not designed to replace COG but to work together with COG for a collection of many layers (e.g. by offering a fast preview when switching between layers of a time series).

WARNING: Currently only applicable to RGB/Byte imagery. The final CORE specifications may differ substantially from what is written herein, or CORE may never reach production use for a myriad of reasons (see also 'Major issues to be solved').

With this early public sharing of the format we explicitly support the Open Science agenda, which implies "shifting from the standard practices of publishing research results in scientific publications towards sharing and using all available knowledge at an earlier stage in the research process" (quote from: European Commission, Directorate General for Research and Innovation, 2016. Open innovation, open science, open to the world).

CORE Specifications:

1) an MP4 or WebM digital multimedia container format (or any future video container playable as HTML video in major browsers)

2) a free to use or open video compression codec such as H.264, VP9, or AV1 (or any future video codec that is open sourced or free to use for end users)

Note: H.264 is currently recommended because of its wide usage with support in all major browsers, its fast encoding thanks to hardware acceleration (which is currently not the case for AV1 or VP9), and the fact that MPEG LA has allowed free use for streaming video that is free to end users. Please note, however, that H.264 is restricted by patents and that its use in proprietary or commercial software requires the payment of royalties to MPEG LA. Once AV1 matures and hardware-accelerated encoding becomes available, AV1 is expected to offer 30% to 50% smaller file sizes than H.264 at the same quality.

3) the encoding frame rate should be one frame per second (fps), with each layer segmented into internal tiles, similar to COG, ordered according to the main use case when accessing the data: either layer contiguous or tile contiguous

Note: The internal tile arrangement should support an easy navigation inside the CORE video format, depending on the use case.

4) a CORE file is optimised for streaming with the moov atom at the beginning of the file (e.g. with -movflags faststart) and optional additional optimisations depending on the codec used (e.g. -tune fastdecode -tune zerolatency for H.264)

5) metadata tags inside the moov atom for describing and using geographic image data (preferably compatible with the OGC GeoTIFF standard or any future standard accepted by the geospatial community), as well as a list of the original file names corresponding to each CORE layer

6) it needs to encode similar source rasters (such as time series of rasters with the same extent and resolution, or different tiles of the same product; each input raster must have the same image dimensions and pixel size)

7) it provides a mechanism for addressing and requesting overviews (lower resolution data) for fast display in a web browser depending on the map scale (currently external overviews)
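
To illustrate the two tile orderings named in item 3, the following Python sketch maps a (layer, tile) pair to a frame number and, at one frame per second, to a seek time. It assumes that every internal tile is stored as exactly one video frame and that all layers have the same number of tiles. The function names and the `order` argument are hypothetical and not part of the CORE specifications.

```python
def frame_index(layer: int, tile: int, n_layers: int, n_tiles: int,
                order: str = "layer") -> int:
    """Return the zero-based frame number that holds `tile` of `layer`.

    Assumption (not from the CORE text): one tile is stored per frame.
    """
    if not (0 <= layer < n_layers and 0 <= tile < n_tiles):
        raise IndexError("layer or tile out of range")
    if order == "layer":
        # Layer contiguous: all tiles of layer 0, then all tiles of layer 1, ...
        return layer * n_tiles + tile
    if order == "tile":
        # Tile contiguous: all layers of tile 0, then all layers of tile 1, ...
        return tile * n_layers + layer
    raise ValueError("order must be 'layer' or 'tile'")


def seek_seconds(frame: int, fps: float = 1.0) -> float:
    """Timestamp of a frame; at 1 fps the frame number equals the second."""
    return frame / fps
```

With 50 layers of 4 tiles each, tile 1 of layer 2 sits at frame 9 in the layer-contiguous ordering and at frame 52 in the tile-contiguous ordering. The first ordering favours reading a whole layer, the second favours reading the time series of one tile.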

Major issues to be solved:

  • Internal overviews (similar to COG), by chaining lower resolution videos in the same MP4 container so that overviews can be accessed first. Currently, overviews are kept in separate files, as external overviews.

  • Metadata encoding: how to best encode spatial extent, layer names, and so on, for each of the layers inside the series, which may have different geographical extents. Known issues: adding many tags with FFmpeg that are not part of the standard MP4 moov atom is problematic, and metadata tags have a limited string length.

  • Applicability beyond RGB/Byte datasets: defining a standard way of converting cell values from Int16/UInt16/UInt32/Int32/Float32/Float64 data types into multi-band Byte values (and reconstructing the original data type within acceptable thresholds).
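
As a minimal illustration of the last issue, the sketch below splits UInt16 cell values into a high-byte band and a low-byte band and merges them back. This is only one naive packing scheme chosen for illustration, not a method defined by CORE, and the function names are hypothetical. The round trip is exact only before video compression: a lossy codec perturbs the byte values, and any error in the high-byte band is multiplied by 256 on reconstruction, which is why the question of acceptable thresholds remains open.

```python
from typing import Iterable, List, Tuple


def split_uint16(values: Iterable[int]) -> Tuple[bytes, bytes]:
    """Split UInt16 values into two Byte bands (high byte, low byte)."""
    values = list(values)
    for v in values:
        if not 0 <= v <= 0xFFFF:
            raise ValueError(f"{v} does not fit into UInt16")
    high = bytes(v >> 8 for v in values)    # most significant 8 bits
    low = bytes(v & 0xFF for v in values)   # least significant 8 bits
    return high, low


def merge_uint16(high: bytes, low: bytes) -> List[int]:
    """Rebuild UInt16 values from the two Byte bands."""
    if len(high) != len(low):
        raise ValueError("bands must have the same length")
    return [(h << 8) | l for h, l in zip(high, low)]
```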

Example

Notice: The provided CORE (.mp4) examples contain modified Copernicus Sentinel data [2018-2021].

To generate the provided CORE examples, 50 original Sentinel-2 (S-2) TCI images of an area located inside Switzerland were downloaded from www.copernicus.eu and then transformed into the CORE format using FFmpeg with H.264 encoding via the x264 library.
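
The encoding step can be sketched as follows. This is not the script used to produce the examples; it only assembles an FFmpeg command line that matches the specification items above (one frame per second, H.264 via libx264, moov atom at the start of the file). The input file pattern, the output name, the `crf` value and the pixel format are assumptions made for illustration.

```python
from typing import List


def build_core_ffmpeg_command(input_pattern: str, output_path: str,
                              fps: int = 1, crf: int = 23) -> List[str]:
    """Assemble an FFmpeg argument list for a CORE-like MP4 (illustrative only)."""
    return [
        "ffmpeg",
        "-framerate", str(fps),     # read the image sequence at `fps` frames per second
        "-i", input_pattern,        # numbered images, e.g. frames/tci_%03d.png (hypothetical)
        "-c:v", "libx264",          # H.264 encoding with the x264 library
        "-crf", str(crf),           # quality setting; value is an assumption
        "-tune", "fastdecode",      # optional codec-specific optimisation
        "-pix_fmt", "yuv420p",      # widely supported pixel format for browser playback
        "-movflags", "+faststart",  # move the moov atom to the beginning of the file
        output_path,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed.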

For full reproducibility, we provide the original data set and results, as well as scripts for data encoding and extraction (see resources).
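
After downloading an example file, the streaming requirement from the specifications (moov atom before the media data) can be checked with a short script. The sketch below is independent of the provided scripts. It walks the top-level boxes of an MP4 file using the generic box layout (4-byte big-endian size followed by a 4-byte type, with the usual 64-bit and to-end-of-file size conventions) and reports whether `moov` precedes `mdat`. The function names are hypothetical.

```python
import struct
from typing import BinaryIO, List


def top_level_boxes(stream: BinaryIO) -> List[str]:
    """Return the types of the top-level MP4 boxes in file order."""
    boxes = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break                            # end of file
        size, box_type = struct.unpack(">I4s", header)
        boxes.append(box_type.decode("latin-1"))
        if size == 0:
            break                            # box extends to the end of the file
        header_len = 8
        if size == 1:                        # 64-bit size stored after the type
            large = stream.read(8)
            if len(large) < 8:
                break
            size = struct.unpack(">Q", large)[0]
            header_len = 16
        if size < header_len:
            raise ValueError("corrupt box size")
        stream.seek(size - header_len, 1)    # skip the payload of this box
    return boxes


def is_faststart(stream: BinaryIO) -> bool:
    """True if the moov box appears before the mdat box."""
    boxes = top_level_boxes(stream)
    return ("moov" in boxes and "mdat" in boxes
            and boxes.index("moov") < boxes.index("mdat"))
```

A typical call would be `with open("example.mp4", "rb") as f: print(is_faststart(f))`, where the file name is a placeholder.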

Identifier
DOI https://doi.org/10.16904/envidat.230
Metadata Access https://www.envidat.ch/api/action/package_show?id=387cd755-33df-4543-ae50-529df462b94f
Provenance
Creator Ionut, Iosifescu Enescu, 0000-0002-1770-7833; Dominik, Haas-Artho,; Lucia, de Espona, 0000-0002-1477-6999; Marius, Rüetschi,
Publisher EnviDat
Publication Year 2021
Funding Reference WSL, EnviDat
Rights CC0-1.0; Creative Commons Zero - No Rights Reserved (CC0 1.0)
OpenAccess true
Contact envidat(at)wsl.ch
Representation
Language English
Resource Type Dataset
Version 0.1
Discipline Environmental Sciences
Spatial Coverage (8.455 LON, 47.361 LAT); Switzerland