.. _stac: Publishing files to a SpatioTemporal Asset Catalog ================================================== The `SpatioTemporal Asset Catalog (STAC)`_ family of specifications aim to standardize the way geospatial asset metadata is structured and queried. A "spatiotemporal asset" is any file that represents information about the Earth at a certain place and time. The original focus was on scenes of satellite imagery, but the specifications now cover a broad variety of uses, including sources such as aircraft and drone and data such as hyperspectral optical, synthetic aperture radar (SAR), video, point clouds, lidar, digital elevation models (DEM), vector, machine learning labels, and composites like NDVI and mosaics. STAC is intentionally designed with a minimal core and flexible extension mechanism to support a broad set of use cases. This specification has matured over the past several years, and is used in numerous production deployments. pygeoapi built-in providers to browse STAC catalogs are described below: Static catalog -------------- FileSystem Provider ^^^^^^^^^^^^^^^^^^^ The FileSystem Provider implements STAC as a geospatial file browser through the server's file system, supporting any level of file/directory nesting/hierarchy. Configuring STAC in pygeoapi is done by simply pointing the ``data`` provider property to the given directory and specifying allowed file types: Connection examples ******************* .. code-block:: yaml my-stac-resource: type: stac-collection ... providers: - type: stac name: FileSystem data: /Users/tomkralidis/Dev/data/gdps file_types: - .grib2 .. note:: ``rasterio`` and ``fiona`` are required for describing geospatial files. pygeometa metadata control files ******************************** pygeoapi's STAC filesystem functionality supports `pygeometa`_ MCF files residing in the same directory as data files. If an MCF file is found, it will be used as part of generating the STAC item metadata (e.g. a file named ``birds.csv`` having an associated ``birds.yml`` file). If no MCF file is found, then pygeometa will generate the STAC item metadata from configuration and by reading the data's properties. Publishing ESRI Shapefiles ************************** ESRI Shapefile publishing requires to specify all required component file extensions (``.shp``, ``.shx``, ``.dbf``) with the provider ``file_types`` option. Data access examples ******************** * STAC root page * http://localhost:5000/stac From here, browse the filesystem accordingly. Azure Blob Storage Provider ^^^^^^^^^^^^^^^^^^^^^^^^^^^ The AzureBlobStorage Provider implements STAC as a geospatial file browser through Azure Blob Storage, supporting any level of file/directory nesting/hierarchy. Configuring STAC in pygeoapi is done by simply pointing the ``data`` provider property to the given container and specifying allowed file types: Connection examples ******************* .. code-block:: yaml my-stac-resource: type: stac-collection ... providers: - type: stac name: AzureBlobStorage data: my-container-name file_types: - .grib2 .. note:: The `AZURE_STORAGE_CONNECTION_STRING` environment variable is required and should be set accordingly. .. note:: ``rasterio`` and ``fiona`` are required for describing geospatial files. Hateoas Provider ^^^^^^^^^^^^^^^^ HATEOAS (Hypermedia as the Engine of Application State) is a way of implementing a REST application that allows the client to dynamically navigate to the appropriate resources by browsing hypermedia links. This type of navigation is similar to WEB navigation and requires a very precise data structure that must be respected to allow the HATEOAS Provider to behave correctly. There are three component specifications (Catalog, Collection, Item) that together make up the core SpatioTemporal Asset Catalog specification. An Item represents a single spatiotemporal asset as GeoJSON. The Catalog specification provides structural elements, to group Items and Collections. Collections are catalogs, that add more required metadata and describe a group of related Items. The full catalog structure of links down to sub-catalogs and Items, and their links back to their parents and roots, must be done with **relative** URL's for the HATEOAS Provider work correctly. The structural *rel* types include *root*, *parent*, *child*, *item*, and *collection*. Assets links must be **absolute** URL's. Other links can be absolute, especially if they describe a resource that makes less sense in the catalog, like derived_from or even license (it can be nice to include the license in the catalog, but some licenses live at a canonical online location which makes more sense to refer to directly). This enables the full catalog (excluding the assets) to be downloaded or copied to another location and to still be valid. This also implies no self link, as that link must be absolute. So, the following rules must be respected: 1. Root documents (Catalogs / Collections) must be at the root of a directory tree containing the static catalog. 2. Catalogs must be named catalog.json and Collections must be named collection.json. 3. Sub-Catalogs or sub-Collections must be stored in subdirectories of their parent (and only 1 subdirectory deeper than a document's parent, e.g. .../sample/sub1/catalog.json). 4. Limit the number of Items in a Catalog or Collection, grouping / partitioning as relevant to the dataset. 5. Use structural elements (Catalog and Collection) consistently across each 'level' of your hierarchy. For example, if levels 2 and 4 of the hierarchy only contain Collections, don't add a Catalog at levels 2 and 4. 6. Items must be named <*id*>.json. 7. Items must be stored in subdirectories (1 level deeper) of their parent Catalog or Collection. The subdirectory must have the same name (<*id*>) as the Item without the *.json* extension. This means that each Item are contained in a unique subdirectory. 8. The links to the actual assets must be an absolute URL. ------------- File examples ************* Structure of the ``catalog.json`` file: .. code-block:: json { "id": "STAC-Catalog", "type": "Catalog", "stac_version": "1.0.0", "description": "A description of the STAC Catalog", "links": [ { "rel": "root", "href": "./catalog.json", "type": "application/json" }, { "rel": "child", "href": "./eo4ce/catalog.json", "type": "application/json" }, { "rel": "child", "href": "./dem/catalog.json", "type": "application/json" } ], "stac_extensions": [], "title": "STAC Catalog" } The code above shows the root catalog. The sub-catalogs have an additional ``rel`` entry pointing to the parent. .. code-block:: json { "id": "dem", "type": "Catalog", "stac_version": "1.0.0", "description": "Digital Elevation Data", "links": [ { "rel": "root", "href": "../catalog.json", "type": "application/json" }, { "rel": "child", "href": "./hrdsm/collection.json", "type": "application/json" }, { "rel": "parent", "href": "../catalog.json", "type": "application/json" } ], "stac_extensions": [], "title": "DEM" } ------------------------------------- Structure of the ``collection.json`` file: Collections are similar to Catalogs with extra fields. .. code-block:: json { "id": "hrdsm", "stac_version": "1.0.0", "description": "High Resolution Digital Surface Model", "links": [ { "rel": "root", "href": "../../catalog.json", "type": "application/json" }, { "rel": "item", "href": "./arcticdem-frontiere-0/arcticdem-frontiere-0.json", "type": "application/json" }, { "rel": "item", "href": "./arcticdem-frontiere-9/arcticdem-frontiere-9.json", "type": "application/json" }, { "rel": "parent", "href": "../catalog.json", "type": "application/json" } ], "stac_extensions": [], "extent": { "spatial": { "bbox": [ [ -142.76516601842533, 59.65274347822059, -138.41658819177135, 69.81052152420365 ] ] }, "temporal": { "interval": [ [ "2014-09-03T14:00:00Z", "2020-09-28T15:49:00.559166Z" ] ] } }, "license": "proprietary" } Structure of the Item ``.json`` file: The example below shows the content of a file named ``arcticdem-frontiere-0.json``: .. code-block:: json { "type": "Feature", "stac_version": "1.0.0", "id": "arcticdem-frontiere-0", "properties": { "layer:ids": [ "dem-hrdsm" ], "collection": "hrdsm", "datetime": "2020-09-28T15:48:56.483794Z" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -140.27389595735178, 59.65274347822059 ], [ -138.41658819177135, 59.65274347822059 ], [ -138.41658819177135, 60.579416456816496 ], [ -140.27389595735178, 60.579416456816496 ], [ -140.27389595735178, 59.65274347822059 ] ] ] }, "links": [ { "rel": "root", "href": "../../../catalog.json", "type": "application/json" }, { "rel": "collection", "href": "../collection.json", "type": "application/json" }, { "rel": "parent", "href": "../collection.json", "type": "application/json" } ], "assets": { "image": { "href": "https://example.com/path/to/resource/arcticdem-frontiere-0.tif", "type": "image/tiff; application=geotiff; profile=cloud-optimized", "roles": [] } }, "bbox": [ -140.27389595735178, 59.65274347822059, -138.41658819177135, 60.579416456816496 ], "stac_extensions": [], "collection": "hrdsm" } HATEOAS configuration ********************* Configuring HATEOAS STAC Provider in pygeoapi is done by simply pointing the ``data`` provider property to the local directory or remote URL and specifying the root file name (``catalog.json`` or ``collection.json``) in the ``file_types`` property: Connection examples ******************* .. code-block:: yaml my-remote-stac-resource: type: stac-collection ... providers: - type: stac name: Hateoas data: https://datacube-dev-data-public.s3.ca-central-1.amazonaws.com/catalog/water file_types: catalog.json my-local-stac-resource: type: stac-collection ... providers: - type: stac name: Hateoas data: tests/stac file_types: catalog.json STAC API -------- `STAC API`_ support is provided as a wrapper on top of resources that have feature or record providers configured. pygeoapi implements the following conformance classes: * `STAC API - Core`_ * `STAC API - Item Search`_ To enable STAC API support, configure a resource with a feature or record provider, and set the resource ``type`` to ``stac-collection``: .. code-block:: yaml canada-metadata: type: stac-collection title: en: Open Canada sample data fr: Exemple de donn\u00e9es Canada Ouvert description: en: Sample metadata records from open.canada.ca fr: Exemples d'enregistrements de m\u00e9tadonn\u00e9es sur ouvert.canada.ca keywords: en: - canada - open data fr: - canada - donn\u00e9es ouvertes links: - type: text/html rel: canonical title: information href: https://open.canada.ca/en/open-data hreflang: en-CA - type: text/html rel: alternate title: informations href: https://ouvert.canada.ca/fr/donnees-ouvertes hreflang: fr-CA extents: spatial: bbox: [-180,-90,180,90] crs: http://www.opengis.net/def/crs/OGC/1.3/CRS84 providers: - type: record name: TinyDBCatalogue data: tests/data/open.canada.ca/sample-records.tinydb id_field: externalId time_field: created title_field: title STAC API queries will search all feature or record based resources configured as ``stac-collection``. Results are decorated with the required STAC elements (unless they already exist). .. note:: pygeoapi STAC API support is minimally designed to leverage the OGC API - Features and OGC API - Records implementations. A typical setup would be a features or records backend of STAC Items. pygeoapi does not add or implement any STAC Catalog/Item relationships beyond what is encoded in a STAC resource. Data access examples -------------------- * landing page * http://localhost:5000/stac-api * query all STAC resources * http://localhost:5000/stac-api/search * query features (spatial) * http://localhost:5000/stac-api/search?bbox=-142,52,-140,55 * paging * http://localhost:5000/stac-api/search?offset=10&limit=10 * query features (temporal) * http://localhost:5000/stac-api/search?datetime=2019-11-11T11:11:11Z/.. * http://localhost:5000/stac-api/search?datetime=2018-11-11T11:11:11Z/2019-11-11T11:11:11Z * http://localhost:5000/stac-api/search?datetime=../2019-11-11T11:11:11Z .. code-block:: bash # query features (spatial) curl -X POST http://localhost:5000/stac-api/search \ -H "Content-Type: application/json" \ -d "{\"bbox\": [-142, 52, -140, 55]}" # paging curl -X POST http://localhost:5000/stac-api/search \ -H "Content-Type: application/json" \ -d "{\"offset\": 10, \"limit\": 10}" # query features (temporal) curl -X POST http://localhost:5000/stac-api/search \ -H "Content-Type: application/json" \ -d "{\"datetime\": \"2019-11-11T11:11:11Z/..\"}" .. _`SpatioTemporal Asset Catalog (STAC)`: https://stacspec.org .. _`pygeometa`: https://geopython.github.io/pygeometa .. _`STAC API`: https://github.com/radiantearth/stac-api-spec .. _`STAC API - Core`: https://github.com/radiantearth/stac-api-spec/blob/v1.0.0/core .. _`STAC API - Item Search`: https://github.com/radiantearth/stac-api-spec/blob/v1.0.0/item-search