Publishing processes via OGC API - Processes

OGC API - Processes provides geospatial data processing functionality in a standards-based fashion, with explicitly defined inputs and outputs.

pygeoapi implements OGC API - Processes functionality by providing a plugin architecture, thereby allowing developers to implement custom processing workflows in Python.

pygeoapi offers two processes: a default hello-world process, which lets you quickly explore the capabilities of processes, and an optional shapely-functions process with more advanced features, which uses Shapely to expose various geometric processing functions.

Configuration

The below configuration is an example of a process defined within the pygeoapi internal plugin registry:

processes:
    # enabled by default
    hello-world:
        processor:
            name: HelloWorld

The below configuration is an example of a process defined as part of a custom Python process:

processes:
    # enabled by default
    hello-world:
        processor:
            # refer to a process in the standard PYTHONPATH
            # e.g. my_package/my_module/my_file.py (class MyProcess)
            # the MyProcess class must subclass pygeoapi.process.base.BaseProcessor
            name: my_package.my_module.my_file.MyProcess

See Example: custom pygeoapi processing plugin for processing plugin examples.

Processing and response handling

pygeoapi processing plugins must return a tuple of media type and native outputs. Multipart responses are not supported at this time, and it is up to the process plugin implementor to return a single payload defining multiple artifacts (or references to them).
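The return contract described above can be illustrated with a minimal sketch. It deliberately avoids importing pygeoapi so that it runs standalone; in a real plugin this logic would live inside the processor class. The function name `execute_hello` and the shape of the output dictionary are illustrative assumptions, not part of pygeoapi.

```python
import json


def execute_hello(data):
    """Return (media type, native outputs), the pair a processing plugin hands back."""
    name = data.get('name')
    if name is None:
        raise ValueError('missing required input: name')
    # A single payload; several artifacts would be bundled into this one object
    outputs = {'id': 'echo', 'value': f'Hello {name}!'}
    return 'application/json', outputs


mimetype, outputs = execute_hello({'name': 'World'})
print(mimetype, json.dumps(outputs))
```

Because multipart responses are not supported, a process producing several artifacts would place all of them (or references to them) inside the single outputs object.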

By default (or via the OGC API - Processes response: raw execution parameter), pygeoapi provides processing responses in their native encoding and media type, as defined by a given plugin (which needs to set the response content type and payload accordingly).

pygeoapi also supports a JSON-based response type (via the OGC API - Processes response: document execution parameter). When this mode is requested, the response will always be a JSON encoding, embedding the resulting payload (part of which may be Base64 encoded for binary data, for example).
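As a rough illustration of how a client might select between the two response modes, the sketch below builds the JSON body of an execution request. The helper name `build_execute_body` is hypothetical; only the `inputs` and `response` keys come from the text and examples on this page.

```python
import json


def build_execute_body(inputs, response=None):
    """Build the JSON body for a POST to /processes/{id}/execution."""
    if response not in (None, 'raw', 'document'):
        raise ValueError("response must be 'raw' or 'document'")
    body = {'inputs': inputs}
    # Omitting the key leaves the server on its default (raw) behaviour
    if response is not None:
        body['response'] = response
    return json.dumps(body)


print(build_execute_body({'name': 'hi there2'}, response='document'))
```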

Asynchronous support

By default, pygeoapi executes processes (jobs) in synchronous mode: when a job is submitted, the process is executed and the result is returned in the same response. Processes that take a long time to execute, or that are delegated to a scheduler/queue, are better suited to an asynchronous design pattern. When a job is submitted in asynchronous mode, the server responds immediately with a reference to the job, which allows the client to periodically poll the server for the job's processing status.
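The polling pattern on the client side might look like the following sketch. It takes any callable that returns the job's current status string, so it runs without a server; the helper name `poll_job` and its parameters are assumptions made for illustration. The final status values are those defined by OGC API - Processes.

```python
import time

# Status values defined by OGC API - Processes that mean a job is finished
FINAL_STATUSES = {'successful', 'failed', 'dismissed'}


def poll_job(fetch_status, interval=1.0, max_attempts=30):
    """Call fetch_status() until the job reaches a final status.

    fetch_status is any callable returning the job's current status string,
    for example a function that GETs the job URL and reads its "status" field.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status in FINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError('job did not finish within the allotted attempts')


# Stand-in for real HTTP calls: the job finishes on the third poll
fake_statuses = iter(['accepted', 'running', 'successful'])
print(poll_job(lambda: next(fake_statuses), interval=0))
```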

In keeping with the OGC API - Processes specification, asynchronous process execution can be requested by including the Prefer: respond-async HTTP header in the request.
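For clients written in Python, a request carrying this header could be prepared as in the sketch below, which uses only the standard library. The helper name `build_async_request` is hypothetical, and the base URL and process identifier are taken from the examples further down this page.

```python
import json
import urllib.request


def build_async_request(base_url, process_id, inputs):
    """Prepare (but do not send) an asynchronous execution request."""
    url = f"{base_url.rstrip('/')}/processes/{process_id}/execution"
    body = json.dumps({'inputs': inputs}).encode('utf-8')
    return urllib.request.Request(
        url,
        data=body,
        method='POST',
        headers={
            'Content-Type': 'application/json',
            'Prefer': 'respond-async',
        },
    )


request = build_async_request('http://localhost:5000', 'hello-world',
                              {'name': 'hi there2'})
print(request.get_method(), request.full_url)
# Sending it requires a running server: urllib.request.urlopen(request)
```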

Job management is required for asynchronous functionality.

Job management

pygeoapi provides job management through a ‘manager’ concept, which controls job execution. The manager concept is implemented as part of the pygeoapi plugin architecture (see Customizing pygeoapi: plugins). pygeoapi provides a default manager implementation based on TinyDB for simplicity. Custom manager plugins can be developed for more advanced job management capabilities (e.g. Kubernetes, databases, etc.).

Job managers

TinyDB

TinyDB is the default job manager for pygeoapi when enabled.

server:
    manager:
        name: TinyDB
        connection: /tmp/pygeoapi-process-manager.db
        output_dir: /tmp/

MongoDB

As an alternative to the default, a manager employing MongoDB can be used. The connection to a MongoDB instance must be provided in the configuration. MongoDB uses localhost and port 27017 by default. Jobs are stored in a collection named job_manager_pygeoapi.

server:
    manager:
        name: MongoDB
        connection: mongodb://host:port
        output_dir: /tmp/

PostgreSQL

As another alternative to the default, a manager employing PostgreSQL can be used. The connection to a PostgreSQL database must be provided in the configuration. PostgreSQL uses localhost and port 5432 by default. Jobs are stored in a table named jobs.

server:
    manager:
        name: PostgreSQL
        connection:
            host: localhost
            port: 5432
            database: test
            user: postgres
            password: ${POSTGRESQL_PASSWORD:-postgres}
        # Alternative accepted connection definition:
        # connection: postgresql://postgres:postgres@localhost:5432/test
        # connection: postgresql://postgres:${POSTGRESQL_PASSWORD:-postgres}@localhost:5432/test
        output_dir: /tmp
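The two connection forms above carry the same information. The sketch below shows, under stated assumptions, how the mapping form corresponds to the URL form; it is not pygeoapi's own code, and the helper name `connection_url` is hypothetical. Percent-encoding the credentials is a general URL precaution added here, not something the configuration above requires.

```python
import os
from urllib.parse import quote


def connection_url(conn):
    """Build a postgresql:// URL from the mapping form of the connection."""
    user = quote(conn['user'], safe='')
    password = quote(conn['password'], safe='')
    return (f"postgresql://{user}:{password}"
            f"@{conn['host']}:{conn['port']}/{conn['database']}")


conn = {
    'host': 'localhost',
    'port': 5432,
    'database': 'test',
    'user': 'postgres',
    # Mirrors ${POSTGRESQL_PASSWORD:-postgres}: use the variable, else the default
    'password': os.environ.get('POSTGRESQL_PASSWORD') or 'postgres',
}
print(connection_url(conn))
```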

Putting it all together

To summarize how pygeoapi processes and managers work together:

  • process plugins implement the core processing / workflow functionality

  • manager plugins control and manage how processes are executed
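The division of labour between the two plugin types can be pictured with a toy example. Everything in it (`hello_process`, `ToyManager`, the in-memory job dictionary) is a hypothetical stand-in written for this illustration; it does not use or reproduce pygeoapi's actual classes.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


def hello_process(inputs):
    """Stand-in process: the core processing logic."""
    return 'application/json', {'value': f"Hello {inputs['name']}!"}


class ToyManager:
    """Stand-in manager: decides how a process runs and tracks its job."""

    def __init__(self):
        self.jobs = {}
        self._pool = ThreadPoolExecutor(max_workers=2)

    def _run(self, job_id, process, inputs):
        self.jobs[job_id]['status'] = 'running'
        try:
            self.jobs[job_id]['result'] = process(inputs)
            self.jobs[job_id]['status'] = 'successful'
        except Exception as err:
            self.jobs[job_id]['result'] = str(err)
            self.jobs[job_id]['status'] = 'failed'

    def execute(self, process, inputs, asynchronous=False):
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {'status': 'accepted', 'result': None}
        if asynchronous:
            # Return immediately; the caller checks self.jobs[job_id] later
            future = self._pool.submit(self._run, job_id, process, inputs)
            self.jobs[job_id]['future'] = future
        else:
            self._run(job_id, process, inputs)
        return job_id


manager = ToyManager()
job_id = manager.execute(hello_process, {'name': 'World'}, asynchronous=True)
manager.jobs[job_id]['future'].result()  # wait for completion (demo only)
print(manager.jobs[job_id]['status'])
```

The process function knows nothing about jobs, and the manager knows nothing about what the process computes, which is the separation the two bullet points describe.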

Processing examples

Hello World (Default)

# list all processes
curl http://localhost:5000/processes

# describe the ``hello-world`` process
curl http://localhost:5000/processes/hello-world

# show all jobs
curl http://localhost:5000/jobs

# execute a job for the ``hello-world`` process
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}}"

# execute a job for the ``hello-world`` process with a raw response (default)
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}}"

# execute a job for the ``hello-world`` process with a response document
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"},\"response\":\"document\"}"

# execute a job for the ``hello-world`` process in asynchronous mode
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -H "Prefer: respond-async" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}}"

# execute a job for the ``hello-world`` process with a success subscriber
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}, \
        \"subscriber\": {\"successUri\": \"https://www.example.com/success\"}}"

Shapely Functions (Optional)

The shapely-functions process exposes a selection of Shapely functions as a sample process. The selection cuts across different categories of operations in Shapely. To avoid name collisions, each function is namespaced by its category. For example, the union operation in the set category is exposed as set:union.

The process accepts a list of geometry inputs (WKT and/or GeoJSON geometry), an operation and an optional output_format. It performs the specified operation and returns the result in the specified output_format. If the operation does not return a geometry, output_format is ignored.

Configuration

processes:
    shapely-functions:
        processor:
            name: ShapelyFunctions

Supported operations

  • measurement:bounds - Computes the bounds (extent) of a geometry.

  • measurement:area - Computes the area of a (multi)polygon.

  • measurement:distance - Computes the Cartesian distance between two geometries.

  • predicates:covers - Returns True if no point in geometry B is outside geometry A.

  • predicates:within - Returns True if geometry A is completely inside geometry B.

  • set:difference - Returns the part of geometry A that does not intersect with geometry B.

  • set:union - Merges geometries into one.

  • constructive:buffer - Computes the buffer of a geometry for positive and negative buffer distance.

  • constructive:centroid - Computes the geometric center (center-of-mass) of a geometry.
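A client could assemble a request body for this process and check the operation name against the list above before sending it. The sketch below is one way to do so; the helper name `build_shapely_request` is hypothetical, while the input names `operation`, `geoms` and `output_format` are the ones used in the curl examples on this page.

```python
import json

SUPPORTED_OPERATIONS = {
    'measurement:bounds', 'measurement:area', 'measurement:distance',
    'predicates:covers', 'predicates:within',
    'set:difference', 'set:union',
    'constructive:buffer', 'constructive:centroid',
}


def build_shapely_request(operation, geoms, output_format=None):
    """Build the JSON execution body for the shapely-functions process."""
    if operation not in SUPPORTED_OPERATIONS:
        raise ValueError(f'unsupported operation: {operation}')
    if not geoms:
        raise ValueError('at least one geometry is required')
    inputs = {'operation': operation, 'geoms': list(geoms)}
    if output_format is not None:
        inputs['output_format'] = output_format
    return json.dumps({'inputs': inputs})


# WKT strings and GeoJSON geometry objects can be mixed in the same list
body = build_shapely_request(
    'set:union',
    ['POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))',
     {'type': 'Point', 'coordinates': [2.0, 2.0]}],
    output_format='wkt',
)
print(body)
```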

Limitation

Passing optional function arguments is not yet supported. For example, when computing the buffer of a geometry, there is no way to pass in the buffer distance.

# describe the ``shapely-functions`` process
curl http://localhost:5000/processes/shapely-functions

# execute a job for the ``shapely-functions`` process that computes the bounds of a WKT
curl -X POST http://localhost:5000/processes/shapely-functions/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"operation\": \"measurement:bounds\",\"geoms\": [\"POINT(83.27651071580385 22.593553859283745)\"]}}"

# execute a job for the ``shapely-functions`` process that calculates the area of a WKT Polygon
curl -X POST http://localhost:5000/processes/shapely-functions/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"operation\": \"measurement:area\",\"geoms\": [\"POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))\"]}}"

# execute a job for the ``shapely-functions`` process that calculates the distance between two WKTs
curl -X POST http://localhost:5000/processes/shapely-functions/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"operation\": \"measurement:distance\",\"geoms\": [\"POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))\",\"POINT(83.27651071580385 22.593553859283745)\"]}}"

# execute a job for the ``shapely-functions`` process that calculates the set difference between two WKTs and returns GeoJSON
curl -X POST http://localhost:5000/processes/shapely-functions/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"operation\": \"set:difference\",\"geoms\": [\"POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))\",\"POINT(83.27651071580385 22.593553859283745)\"],\"output_format\":\"geojson\"}}"

# execute a job for the ``shapely-functions`` process that calculates the set difference between two WKTs and returns a WKT
curl -X POST http://localhost:5000/processes/shapely-functions/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"operation\": \"set:difference\",\"geoms\": [\"POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))\",\"POINT(83.27651071580385 22.593553859283745)\"],\"output_format\":\"wkt\"}}"

# execute a job for the ``shapely-functions`` process that computes the buffer of a GeoJSON geometry and returns a WKT
curl -X POST http://localhost:5000/processes/shapely-functions/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"operation\": \"constructive:buffer\",\"geoms\": [{\"type\": \"LineString\",\"coordinates\": [[102.0,0.0],[103.0, 1.0],[104.0,0.0]]}],\"output_format\":\"wkt\"}}"