Publishing vector data to OGC API - Features
OGC API - Features provides access to geospatial vector data.
To add vector data to pygeoapi, you can use the dataset example in Configuration as a baseline and modify accordingly.
Providers
pygeoapi core feature providers are listed below, along with a matrix of supported query parameters.
| Provider | property filters/display | resulttype | bbox | datetime | sortby | skipGeometry | CQL | transactions | crs |
|---|---|---|---|---|---|---|---|---|---|
| | ✅/✅ | results/hits | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| | ✅/✅ | results/hits | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| | ❌/❌ | results/hits | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | |
| | ✅/✅ | results/hits | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| | ✅/✅ | results/hits | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| | ✅/❌ | results | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| | ✅/❌ | results/hits | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| | ✅/✅ | results/hits | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| | ✅/❌ | results/hits | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| | ✅/✅ | results/hits | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| | ✅/✅ | results/hits | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
Note
All Providers that support bbox also support the bbox-crs parameter. bbox-crs is handled within pygeoapi core.
All Providers support the crs parameter to reproject (transform) response data. Some, like PostgreSQL and OGR, perform this natively: ‘✅n’.
Connection examples
Below are specific connection examples based on supported providers. To support crs on queries, configure both a list of supported CRSs and a 'Storage CRS'. See also CRS support and Configuration. When no CRS information is configured, the default CRS/'Storage CRS' value http://www.opengis.net/def/crs/OGC/1.3/CRS84 is assumed, that is, WGS84 with lon,lat axis ordering as in standard GeoJSON.
CSV
To publish a CSV file, the file must have columns for x and y geometry, which need to be specified in the geometry section of the provider definition.
providers:
    - type: feature
      name: CSV
      data: tests/data/obs.csv
      id_field: id
      geometry:
          x_field: long
          y_field: lat
      crs:
          - http://www.opengis.net/def/crs/EPSG/0/28992
          - http://www.opengis.net/def/crs/OGC/1.3/CRS84
          - http://www.opengis.net/def/crs/EPSG/0/4326
      storage_crs: http://www.opengis.net/def/crs/EPSG/0/28992
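Before pointing the provider at a CSV file, it can help to confirm that the header really contains the columns named in x_field and y_field. The following is a minimal sketch using only the Python standard library; the function name missing_geometry_columns and the sample data are hypothetical and not part of pygeoapi.

```python
import csv
import io


def missing_geometry_columns(csv_text, x_field, y_field):
    """Return the configured geometry column names absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []  # fieldnames is None for empty input
    return [name for name in (x_field, y_field) if name not in header]


# Hypothetical sample data; the column names mirror the configuration above.
sample = "id,stn_id,long,lat\n1,35,-75.0,45.0\n"
print(missing_geometry_columns(sample, "long", "lat"))  # []
print(missing_geometry_columns(sample, "lon", "lat"))   # ['lon']
```

An empty result means both configured columns are present in the header; it says nothing about whether the values themselves are valid coordinates.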
GeoJSON
To publish a GeoJSON file, the file must be a valid GeoJSON FeatureCollection.
providers:
    - type: feature
      name: GeoJSON
      data: tests/data/file.json
      id_field: id
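The FeatureCollection requirement can be checked roughly before publishing. The sketch below is a structural check only, not a full validation against the GeoJSON specification; the function name featurecollection_problems is hypothetical.

```python
import json


def featurecollection_problems(text):
    """Return a list of structural problems; an empty list means none were found."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(doc, dict) or doc.get("type") != "FeatureCollection":
        return ["top-level object is not a FeatureCollection"]
    features = doc.get("features")
    if not isinstance(features, list):
        return ["'features' is missing or not a list"]
    problems = []
    for index, feature in enumerate(features):
        if not isinstance(feature, dict) or feature.get("type") != "Feature":
            problems.append(f"features[{index}] is not a Feature")
        elif "geometry" not in feature or "properties" not in feature:
            problems.append(f"features[{index}] lacks geometry or properties")
    return problems
```

To check a file, read its contents as text and pass them to the function; geometry coordinates are deliberately not inspected here.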
Elasticsearch
Note
Requires Python packages elasticsearch and elasticsearch-dsl
Note
Elasticsearch 8 or greater is supported.
To publish an Elasticsearch index, the following are required in your index:
indexes must be documents of valid GeoJSON Features
index mappings must define the GeoJSON geometry as a geo_shape
providers:
    - type: feature
      name: Elasticsearch
      editable: true|false  # optional, default is false
      data: http://localhost:9200/ne_110m_populated_places_simple
      id_field: geonameid
      time_field: datetimefield
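The geo_shape requirement on the index mapping can be illustrated with a small index body. This is a sketch under assumptions: only the geo_shape mapping for geometry comes from the requirement above, the variable name index_body is hypothetical, and all other fields are left to Elasticsearch's dynamic mapping.

```python
import json

# Hypothetical index body: only the geo_shape mapping for "geometry" reflects
# the requirement stated above; everything else is left to dynamic mapping.
index_body = {
    "mappings": {
        "properties": {
            "geometry": {"type": "geo_shape"},
        }
    }
}

# Serialized form, suitable as the JSON body of an index-creation request.
print(json.dumps(index_body, indent=2))
```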
This provider supports CQL queries, as indicated in the table above.
See also
CQL support for more details on how to use Common Query Language (CQL) to filter the collection with specific queries.
ESRI Feature Service
To publish an ESRI Feature Service or ESRI Map Service, specify the URL for the service layer in the data field.
id_field will often be OBJECTID, objectid, or FID.
If the map or feature service is not shared publicly, the username and password fields can be set in the configuration to authenticate into the service.
providers:
    - type: feature
      name: ESRI
      data: https://sampleserver5.arcgisonline.com/arcgis/rest/services/NYTimes_Covid19Cases_USCounties/MapServer/0
      id_field: objectid
      time_field: date_in_your_device_time_zone  # Optional time field
      crs: 4326  # Optional crs (default is EPSG:4326)
      username: username  # Optional ArcGIS username
      password: password  # Optional ArcGIS password
OGR
Note
Requires Python package gdal
GDAL/OGR supports a wide range of spatial file formats, such as shapefile, dxf, gpx, kml, but also services such as WFS. Read the full list and configuration options at https://gdal.org/drivers/vector. Additional formats and features are available via the virtual format; use this driver, for example, for flat database files (CSV).
The OGR provider requires a recent (3+) version of GDAL to be installed.
providers:
    - type: feature
      name: OGR
      data:
          source_type: ESRI Shapefile
          source: tests/data/dutch_addresses_shape_4326/inspireadressen.shp
          source_options:
              ADJUST_GEOM_TYPE: FIRST_SHAPE
          gdal_ogr_options:
              SHPT: POINT
      id_field: fid
      layer: inspireadressen
providers:
    - type: feature
      name: OGR
      data:
          source_type: WFS
          source: WFS:https://geodata.nationaalgeoregister.nl/rdinfo/wfs?
          source_options:
              VERSION: 2.0.0
              OGR_WFS_PAGING_ALLOWED: YES
              OGR_WFS_LOAD_MULTIPLE_LAYER_DEFN: NO
          gdal_ogr_options:
              GDAL_CACHEMAX: 64
              GDAL_HTTP_PROXY: (optional proxy)
              GDAL_PROXY_AUTH: (optional auth for remote WFS)
              CPL_DEBUG: NO
      crs:
          - http://www.opengis.net/def/crs/OGC/1.3/CRS84
          - http://www.opengis.net/def/crs/EPSG/0/4326
          - http://www.opengis.net/def/crs/EPSG/0/4258
          - http://www.opengis.net/def/crs/EPSG/0/28992
      storage_crs: http://www.opengis.net/def/crs/EPSG/0/28992
      id_field: gml_id
      layer: rdinfo:stations
providers:
    - type: feature
      name: OGR
      data:
          source_type: ESRIJSON
          source: https://map.bgs.ac.uk/arcgis/rest/services/GeoIndex_Onshore/boreholes/MapServer/0/query?where=BGS_ID+%3D+BGS_ID&outfields=*&orderByFields=BGS_ID+ASC&f=json
          source_capabilities:
              paging: True
          open_options:
              FEATURE_SERVER_PAGING: YES
          gdal_ogr_options:
              EMPTY_AS_NULL: NO
              GDAL_CACHEMAX: 64
              # GDAL_HTTP_PROXY: (optional proxy)
              # GDAL_PROXY_AUTH: (optional auth for remote WFS)
              CPL_DEBUG: NO
      id_field: BGS_ID
      layer: ESRIJSON
providers:
    - type: feature
      name: OGR
      data:
          source_type: PostgreSQL
          source: "PG: host=127.0.0.1 dbname=test user=postgres password=postgres"
      id_field: osm_id
      layer: osm.hotosm_bdi_waterways  # Value follows a 'my_schema.my_table' structure
      geom_field: foo_geom
Note
Formerly, the config parameters source_srs and target_srs could be used to transform/reproject the data for every request. Starting with pygeoapi release 0.15.0, these fields are no longer supported, because pygeoapi now supports CRS handling as per the OGC API - Features Standard "Part 2".
storage_crs is essentially the same as source_srs, but complies with the standard (including axis ordering). It should be set to the actual or default CRS of the source data/service. When omitted, the default http://www.opengis.net/def/crs/OGC/1.3/CRS84 is assumed.
crs is an array of supported CRSs; the same default applies when omitted.
The crs or bbox-crs query parameter can now be used, and its value must be present in the crs array (or the default applies).
The crs query parameter is used as follows, e.g. http://localhost:5000/collections/foo/items?crs=http%3A%2F%2Fwww.opengis.net%2Fdef%2Fcrs%2FEPSG%2F0%2F28992.
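Because the crs value is itself a URI, it has to be percent-encoded when placed in a query string. A minimal sketch with the Python standard library follows; the helper name items_url_with_crs is hypothetical, and the host and collection are the placeholders used in the example above.

```python
from urllib.parse import quote


def items_url_with_crs(base_url, collection, crs_uri):
    """Build an items URL whose crs value is fully percent-encoded."""
    # safe='' makes quote() encode ':' and '/' as well.
    return f"{base_url}/collections/{collection}/items?crs={quote(crs_uri, safe='')}"


url = items_url_with_crs(
    "http://localhost:5000", "foo", "http://www.opengis.net/def/crs/EPSG/0/28992"
)
print(url)
```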
MongoDB
Note
Requires Python package pymongo
Note
Mongo 5 or greater is supported.
MongoDB is a powerful and versatile NoSQL database that provides numerous advantages, making it a preferred choice for many applications. One of the main reasons to use MongoDB is its ability to handle large volumes of unstructured data, making it ideal for managing diverse data types such as text, geospatial, and multimedia data. Additionally, MongoDB’s flexible document model allows for easy schema evolution, enabling developers to iterate quickly and adapt to changing requirements.
GeoJSON is officially supported by MongoDB, so a GeoJSON file can be added to MongoDB using the following command:
mongoimport --db test -c points --file "path/to/file.geojson" --jsonArray
Here test is the name of the database and points is the target collection name.
Each document must be a GeoJSON Feature with a valid geometry.
providers:
    - type: feature
      name: MongoDB
      data: mongodb://localhost:27017/testdb
      collection: testplaces
Oracle
Note
Requires Python package oracledb
providers:
    - type: feature
      name: OracleDB
      data:
          host: 127.0.0.1
          port: 1521  # defaults to 1521 if not provided
          service_name: XEPDB1
          # sid: XEPDB1
          user: geo_test
          password: geo_test
          # external_auth: wallet
          # tns_name: XEPDB1
          # tns_admin: /opt/oracle/client/network/admin
          # init_oracle_client: True
      id_field: id
      table: lakes
      geom_field: geometry
      title_field: name
      # sql_manipulator: tests.test_oracle_provider.SqlManipulator
      # sql_manipulator_options:
      #     foo: bar
      # mandatory_properties:
      #     - bbox
      # source_crs: 31287  # defaults to 4326 if not provided
      # target_crs: 31287  # defaults to 4326 if not provided
The provider supports connection over host and port with SID or SERVICE_NAME. For TNS naming, the system environment variable TNS_ADMIN or the configuration parameter tns_admin must be set.
The provider supports external authentication. At the moment, only wallet authentication is implemented.
Sometimes it is necessary to use the Oracle client for the connection. In this case, init_oracle_client must be set to True.
The provider supports a SQL manipulator plugin class, with which the SQL statement can be manipulated. This is useful, for example, for authorization at row level or for manipulating the explain plan with hints.
An example and more information about this feature can be found in the test class in tests/test_oracle_provider.py.
PostgreSQL
Note
Requires Python packages sqlalchemy, geoalchemy2 and psycopg2-binary
Must have PostGIS installed.
Note
Geometry must be using EPSG:4326
providers:
    - type: feature
      name: PostgreSQL
      data:
          host: 127.0.0.1
          port: 3010  # Default 5432 if not provided
          dbname: test
          user: postgres
          password: postgres
          search_path: [osm, public]
      id_field: osm_id
      table: hotosm_bdi_waterways
      geom_field: foo_geom
A number of database connection options can also be configured in the provider to tune the SQLAlchemy engine client. These are optional; if not specified, the engine defaults are used. See also the SQLAlchemy docs.
providers:
    - type: feature
      name: PostgreSQL
      data:
          host: 127.0.0.1
          port: 3010  # Default 5432 if not provided
          dbname: test
          user: postgres
          password: postgres
          search_path: [osm, public]
      options:
          # Maximum time to wait while connecting, in seconds.
          connect_timeout: 10
          # Number of *milliseconds* that transmitted data may remain
          # unacknowledged before a connection is forcibly closed.
          tcp_user_timeout: 10000
          # Whether client-side TCP keepalives are used. 1 = use keepalives,
          # 0 = don't use keepalives.
          keepalives: 1
          # Number of seconds of inactivity after which TCP should send a
          # keepalive message to the server.
          keepalives_idle: 5
          # Number of TCP keepalives that can be lost before the client's
          # connection to the server is considered dead.
          keepalives_count: 5
          # Number of seconds after which a TCP keepalive message that is not
          # acknowledged by the server should be retransmitted.
          keepalives_interval: 1
      id_field: osm_id
      table: hotosm_bdi_waterways
      geom_field: foo_geom
The PostgreSQL provider is also able to connect to Cloud SQL databases.
providers:
    - type: feature
      name: PostgreSQL
      data:
          host: /cloudsql/INSTANCE_CONNECTION_NAME  # e.g. 'project:region:instance'
          dbname: reference
          user: postgres
          password: postgres
      id_field: id
      table: states
This is what a configuration for a Google Cloud SQL connection looks like. The host block contains the necessary socket connection information.
This provider supports CQL queries, as indicated in the Provider table above.
See also
CQL support for more details on how to use Common Query Language (CQL) to filter the collection with specific queries.
SQLiteGPKG
Note
Requires Spatialite installation
SQLite file:
providers:
    - type: feature
      name: SQLiteGPKG
      data: ./tests/data/ne_110m_admin_0_countries.sqlite
      id_field: ogc_fid
      table: ne_110m_admin_0_countries
GeoPackage file:
providers:
    - type: feature
      name: SQLiteGPKG
      data: ./tests/data/poi_portugal.gpkg
      id_field: osm_id
      table: poi_portugal
SensorThings API
The STA provider is capable of creating feature collections from OGC SensorThings API endpoints. Three of the STA entities are configurable: Things, Datastreams, and Observations. For a full description of the SensorThings entity model, see here.
For each entity of Things, pygeoapi will expand all entities directly related to the Thing, including its associated Location, from which the geometry for the feature collection is derived. Similarly, Datastreams are expanded to include the associated Thing, Sensor and ObservedProperty.
The default id_field is @iot.id. The STA provider adds one required field, entity, and an optional field, intralink. The entity field refers to which STA entity to use for the feature collection. The intralink field controls how the provider is acted upon by other STA providers and is False by default. If intralink is true for an adjacent STA provider collection within a pygeoapi instance, the expanded entity is instead represented by an intra-pygeoapi link to the other entity or its uri_field if declared.
providers:
    - type: feature
      name: SensorThings
      data: https://sensorthings-wq.brgm-rec.fr/FROST-Server/v1.0/
      uri_field: uri
      entity: Datastreams
      time_field: phenomenonTime
      intralink: true
If all three entities are configured, the STA provider will represent a complete STA endpoint as OGC-API feature collections. The Things features will include links to the associated features in the Datastreams feature collection, and the Observations features will include links to the associated features in the Datastreams feature collection. Examples with three entities configured are included in the docker examples for SensorThings.
Socrata
To publish a Socrata Open Data API (SODA) endpoint, pygeoapi heavily relies on sodapy.
data is the domain of the SODA endpoint.
resource_id is the 4x4 resource id pattern.
geom_field is required for bbox queries to work.
token is optional and can be included in the configuration to pass an app token to Socrata.
providers:
    - type: feature
      name: Socrata
      data: https://soda.demo.socrata.com/
      resource_id: emdb-u46w
      id_field: earthquake_id
      geom_field: location
      time_field: datetime  # Optional time_field for datetime queries
      token: my_token  # Optional app token
ERDDAP Tabledap Service
Note
Requires Python package requests
To publish from an ERDDAP Tabledap service, configure the provider as follows:
providers:
    - type: feature
      name: ERDDAPTabledap
      data: http://osmc.noaa.gov/erddap/tabledap/OSMC_Points
      id_field: PLATFORM_CODE
      time_field: time
      options:
          filters: "&parameter=\"SLP\"&platform!=\"C-MAN%20WEATHER%20STATIONS\"&platform!=\"TIDE GAUGE STATIONS (GENERIC)\""
          max_age_hours: 12
Note
If the datetime parameter is passed by the client, this overrides the options.max_age_hours setting.
Controlling the order of properties
For any supported feature provider, the properties key within a provider definition controls which properties are exposed and in what order. See the example below:
properties:
    - waterway
    - depth
    - name
Data access examples
list all collections: http://localhost:5000/collections
overview of dataset: http://localhost:5000/collections/foo
queryables: http://localhost:5000/collections/foo/queryables
browse features: http://localhost:5000/collections/foo/items
paging: http://localhost:5000/collections/foo/items?offset=10&limit=10
CSV outputs: http://localhost:5000/collections/foo/items?f=csv
query features (spatial): http://localhost:5000/collections/foo/items?bbox=-180,-90,180,90
query features (spatial with bbox-crs): http://localhost:5000/collections/foo/items?bbox=120000,450000,130000,460000&bbox-crs=http%3A%2F%2Fwww.opengis.net%2Fdef%2Fcrs%2FEPSG%2F0%2F28992
query features (attribute): http://localhost:5000/collections/foo/items?propertyname=foo
query features (temporal): http://localhost:5000/collections/foo/items?datetime=2020-04-10T14:11:00Z
query features (temporal) and sort ascending by a property (if no +/- indicated, + is assumed): http://localhost:5000/collections/foo/items?datetime=2020-04-10T14:11:00Z&sortby=+datetime
query features (temporal) and sort descending by a property: http://localhost:5000/collections/foo/items?datetime=2020-04-10T14:11:00Z&sortby=-datetime
query features in a given (and supported) CRS: http://localhost:5000/collections/foo/items?crs=http%3A%2F%2Fwww.opengis.net%2Fdef%2Fcrs%2FEPSG%2F0%2F32633
query features within a given bounding box and return them in a given CRS: http://localhost:5000/collections/foo/items?bbox=120000,450000,130000,460000&bbox-crs=http%3A%2F%2Fwww.opengis.net%2Fdef%2Fcrs%2FEPSG%2F0%2F28992&crs=http%3A%2F%2Fwww.opengis.net%2Fdef%2Fcrs%2FEPSG%2F0%2F32633
fetch a specific feature: http://localhost:5000/collections/foo/items/123
fetch a specific feature in a given (and supported) CRS: http://localhost:5000/collections/foo/items/123?crs=http%3A%2F%2Fwww.opengis.net%2Fdef%2Fcrs%2FEPSG%2F0%2F32633
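The spatial queries above can also be assembled programmatically. The following is a minimal sketch with the Python standard library that reproduces the bbox and bbox-crs example; the helper name build_items_url is hypothetical, and keeping commas unescaped is a readability choice rather than a requirement stated here.

```python
from urllib.parse import urlencode


def build_items_url(base_url, collection, params):
    """Build an items URL from a dict of query parameters."""
    # safe="," keeps the bbox commas readable; ':' and '/' are still encoded.
    query = urlencode(params, safe=",")
    return f"{base_url}/collections/{collection}/items?{query}"


url = build_items_url(
    "http://localhost:5000",
    "foo",
    {
        "bbox": "120000,450000,130000,460000",
        "bbox-crs": "http://www.opengis.net/def/crs/EPSG/0/28992",
    },
)
print(url)
```

Adding a crs entry to the same dict yields the combined bbox, bbox-crs and crs query from the list.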
Note
When no crs and/or bbox-crs is provided, the default CRS http://www.opengis.net/def/crs/OGC/1.3/CRS84 (WGS84 in lon, lat ordering) is assumed. pygeoapi may perform the necessary transformations if the storage_crs differs from this default. Features are then always returned in that default CRS (as per the GeoJSON Standard). In all cases, whether or not these query parameters are supplied, the HTTP header Content-Crs denotes the CRS of the Feature(s) in the response.
Note
.../items queries which return an alternative representation to GeoJSON (which prompt a download) will have the response filename matching the collection name and appropriate file extension (e.g. my-dataset.csv).
Note
Provider id_field values support slashes (e.g. my/cool/identifier). The client request is then responsible for encoding the identifier accordingly (e.g. http://localhost:5000/collections/foo/items/my%2Fcool%2Fidentifier).
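On the client side, an identifier containing slashes can be encoded with the Python standard library. This is a minimal sketch; the helper name feature_url is hypothetical.

```python
from urllib.parse import quote


def feature_url(base_url, collection, identifier):
    """Build a single-feature URL, percent-encoding the identifier."""
    # safe='' ensures '/' inside the identifier becomes %2F instead of
    # being read as a path separator.
    encoded = quote(str(identifier), safe="")
    return f"{base_url}/collections/{collection}/items/{encoded}"


print(feature_url("http://localhost:5000", "foo", "my/cool/identifier"))
```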