Publishing processes via OGC API - Processes

OGC API - Processes provides geospatial data processing functionality in a standards-based fashion, with formally described inputs and outputs.

pygeoapi implements OGC API - Processes functionality by providing a plugin architecture, thereby allowing developers to implement custom processing workflows in Python.

A sample hello-world process is provided with the pygeoapi default configuration.

Configuration

processes:
    hello-world:
        processor:
            name: HelloWorld

Asynchronous support

By default, pygeoapi executes processes (jobs) synchronously: when a job is submitted, the process runs and its result is returned in the same response. Processes that take a long time to execute, or that are delegated to a scheduler/queue, are better suited to an asynchronous design pattern. When a job is submitted in asynchronous mode, the server responds immediately with a reference to the job, which the client can then use to poll the server periodically for the job's processing status.
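
The client-side polling described above can be sketched in Python. This is a minimal illustration, not pygeoapi code: `wait_for_job` and `fetch_status` are hypothetical names, and the status strings are the job states defined by OGC API - Processes (`accepted`, `running`, `successful`, `failed`, `dismissed`). The status lookup is passed in as a callable so the loop can be tried without a running server.

```python
import time
from typing import Callable

# Job states after which no further change is expected.
TERMINAL_STATES = {"successful", "failed", "dismissed"}


def wait_for_job(fetch_status: Callable[[], str],
                 interval: float = 1.0,
                 max_polls: int = 60) -> str:
    """Poll until the job reaches a terminal state or max_polls is used up.

    fetch_status is a hypothetical callable returning the current status
    string, for example by requesting the job's status URL.
    """
    status = fetch_status()
    for _ in range(max_polls - 1):
        if status in TERMINAL_STATES:
            break
        time.sleep(interval)
        status = fetch_status()
    return status


# Simulated server answers, for illustration only.
responses = iter(["accepted", "running", "successful"])
final_status = wait_for_job(lambda: next(responses), interval=0)
print(final_status)
```

In a real client, `fetch_status` would read the `status` field from the job's status document, and the caller should treat a non-terminal return value as a timeout.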

pygeoapi provides asynchronous support through a ‘manager’ concept, which manages job execution. The manager concept is implemented as part of the pygeoapi plugin architecture (see Customizing pygeoapi: plugins). pygeoapi provides a default manager implementation based on TinyDB for simplicity. Custom manager plugins can be developed for more advanced job management capabilities (e.g. Kubernetes, databases).

In keeping with the OGC API - Processes specification, asynchronous process execution can be requested by including the Prefer: respond-async HTTP header in the request.

server:
    manager:
        name: TinyDB
        connection: /tmp/pygeoapi-process-manager.db
        output_dir: /tmp/
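
The Prefer: respond-async header mentioned above can also be set from a Python client. The following is a minimal sketch using only the standard library; `build_async_request` is a hypothetical helper, and the base URL assumes a local pygeoapi instance on port 5000, as in the curl examples in this section. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request


def build_async_request(base_url: str, process_id: str,
                        inputs: dict) -> urllib.request.Request:
    """Build (but do not send) an asynchronous execution request."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/processes/{process_id}/execution",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Ask the server to answer immediately with a job reference.
            "Prefer": "respond-async",
        },
    )


request = build_async_request(
    "http://localhost:5000", "hello-world", {"name": "hi there2"}
)
print(request.get_method(), request.full_url)

# Sending is left commented out because it needs a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.headers.get("Location"))
```

If the server accepts the job asynchronously, the specification expects the response to carry a `Location` header pointing at the job, which is why the commented-out lines print it.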

MongoDB

As an alternative to the default, a manager backed by MongoDB can be used. The connection to a running MongoDB instance must be provided in the configuration. By default, MongoDB listens on localhost, port 27017. Jobs are stored in a collection named job_manager_pygeoapi.

server:
    manager:
        name: MongoDB
        connection: mongodb://host:port
        output_dir: /tmp/

Putting it all together

To summarize how pygeoapi processes and managers work together:

* process plugins implement the core processing / workflow functionality
* manager plugins control and manage how processes are executed
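
The division of labour in the two points above can be illustrated with a toy example. This is a sketch of the idea only and does not use or reproduce pygeoapi's plugin classes: `hello_process` and `InMemoryManager` are hypothetical names, and jobs are kept in a plain dictionary rather than TinyDB or MongoDB.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def hello_process(inputs: dict) -> dict:
    """Toy process: contains the processing logic and nothing else."""
    return {"value": f"Hello {inputs['name']}!"}


class InMemoryManager:
    """Toy manager: decides how a process runs and tracks job state."""

    def __init__(self) -> None:
        self.jobs: Dict[str, dict] = {}
        self._pool = ThreadPoolExecutor(max_workers=2)

    def _run(self, job_id: str, process: Callable[[dict], dict],
             inputs: dict) -> None:
        job = self.jobs[job_id]
        job["status"] = "running"
        try:
            job["result"] = process(inputs)
            job["status"] = "successful"
        except Exception as err:
            job["status"] = "failed"
            job["message"] = str(err)

    def execute(self, process: Callable[[dict], dict], inputs: dict,
                asynchronous: bool = False) -> str:
        """Register a job, run it now or in the background, return its id."""
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"status": "accepted"}
        if asynchronous:
            # Return right away; the caller checks self.jobs[job_id] later.
            self._pool.submit(self._run, job_id, process, inputs)
        else:
            self._run(job_id, process, inputs)
        return job_id

    def shutdown(self) -> None:
        """Wait for background jobs to finish."""
        self._pool.shutdown(wait=True)


manager = InMemoryManager()
sync_id = manager.execute(hello_process, {"name": "World"})
async_id = manager.execute(hello_process, {"name": "World"}, asynchronous=True)
manager.shutdown()
print(manager.jobs[sync_id]["status"], manager.jobs[async_id]["status"])
```

The process function knows nothing about jobs, and the manager knows nothing about what the process computes, so either side can be swapped independently.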

Processing examples

# list all processes
curl http://localhost:5000/processes

# describe the ``hello-world`` process
curl http://localhost:5000/processes/hello-world

# show all jobs
curl http://localhost:5000/jobs

# execute a job for the ``hello-world`` process
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}}"

# execute a job for the ``hello-world`` process with a raw response (default)
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}}"

# execute a job for the ``hello-world`` process with a response document
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"},\"response\":\"document\"}"

# execute a job for the ``hello-world`` process in asynchronous mode
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -H "Prefer: respond-async" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}}"

# execute a job for the ``hello-world`` process with a success subscriber
curl -X POST http://localhost:5000/processes/hello-world/execution \
    -H "Content-Type: application/json" \
    -d "{\"inputs\":{\"name\": \"hi there2\"}, \
         \"subscriber\": {\"successUri\": \"https://www.example.com/success\"}}"
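
Escaping JSON by hand inside shell quotes is easy to get wrong. As an alternative, the request bodies used in the examples above can be generated in Python. This is a minimal sketch; `build_execute_body` is a hypothetical helper, and only the fields shown in the curl examples (`inputs`, `response`, `subscriber.successUri`) are covered.

```python
import json
from typing import Optional


def build_execute_body(inputs: dict,
                       response: Optional[str] = None,
                       success_uri: Optional[str] = None) -> str:
    """Return the JSON body for an execution request as a string."""
    body: dict = {"inputs": inputs}
    if response is not None:
        # e.g. "document" to request a response document
        body["response"] = response
    if success_uri is not None:
        # URL the server should notify when the job succeeds
        body["subscriber"] = {"successUri": success_uri}
    return json.dumps(body)


plain_body = build_execute_body({"name": "hi there2"})
document_body = build_execute_body({"name": "hi there2"}, response="document")
subscriber_body = build_execute_body(
    {"name": "hi there2"}, success_uri="https://www.example.com/success"
)
print(subscriber_body)
```

The resulting string can be written to a file and passed to curl with `-d @body.json`, which avoids the nested escaping entirely.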