Occurrence API
https://api.gbif.org/v1/
This API works against the GBIF Occurrence Store, which handles occurrence records and makes them available through the web service and download files. In addition we also provide a Map API that offers spatial services.
Internally we use a Java web service client for the consumption of these HTTP-based, RESTful web services.
Occurrences
This API provides services related to the retrieval of single occurrence records.
Resource URL | Method | Response | Description |
---|---|---|---|
/occurrence/{key} | GET | Occurrence | Gets details for a single, interpreted occurrence |
/occurrence/{datasetKey}/{occurrenceId} | GET | Occurrence | Gets details for a single, interpreted occurrence by its dataset key and occurrenceId in that dataset. |
/occurrence/{key}/fragment | GET | Occurrence | Gets a single occurrence fragment in its raw form (xml or json) |
/occurrence/{datasetKey}/{occurrenceId}/fragment | GET | Occurrence | Gets a single occurrence fragment in its raw form (xml or json) by its dataset key and occurrenceId in that dataset. |
/occurrence/{key}/verbatim | GET | VerbatimOccurrence | Gets the verbatim occurrence record without any interpretation |
/occurrence/{datasetKey}/{occurrenceId}/verbatim | GET | Occurrence | Gets the verbatim occurrence record without any interpretation by its dataset key and occurrenceId in that dataset. |
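The six resources above share one URL pattern, which the following minimal sketch makes explicit. The function name `occurrence_url` and its `view` argument are hypothetical helpers introduced for illustration, not part of any GBIF client, and percent-encoding the occurrenceId is an assumption about how publisher-supplied identifiers are best passed in a path.

```python
from urllib.parse import quote

BASE_URL = "https://api.gbif.org/v1"  # base URL given at the top of this page


def occurrence_url(key=None, dataset_key=None, occurrence_id=None, view=None):
    """Build the URL for a single occurrence record.

    Pass either `key`, or both `dataset_key` and `occurrence_id`.
    `view` may be None (interpreted record), "fragment" or "verbatim".
    """
    if view not in (None, "fragment", "verbatim"):
        raise ValueError("view must be None, 'fragment' or 'verbatim'")
    if key is not None:
        path = f"/occurrence/{key}"
    elif dataset_key is not None and occurrence_id is not None:
        # Assumption: occurrenceId is free text, so percent-encode it fully.
        path = f"/occurrence/{dataset_key}/{quote(str(occurrence_id), safe='')}"
    else:
        raise ValueError("give key, or dataset_key together with occurrence_id")
    if view:
        path += f"/{view}"
    return BASE_URL + path
```

For example, `occurrence_url(key=42, view="verbatim")` yields the URL of the uninterpreted record for the (made-up) key 42.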
Searching
This API provides services for searching occurrence records that have been indexed by GBIF. In order to retrieve all results for a given search filter you need to issue individual requests for each page, which is limited to a maximum size of 300 records per page. Note that for technical reasons we also have a hard limit for any query of 100,000 records. You will get an error if the offset + limit exceeds 100,000. To retrieve all records beyond 100,000 you should use our asynchronous download service instead.
Please be aware that the following parameters are in an experimental phase and their definitions could change in the future: q, facet, facetOffset, facetLimit, facetMincount and facetMultiselect.
Resource URL | Method | Response | Description | Paging | Parameters |
---|---|---|---|---|---|
/occurrence/search | GET | Occurrence | Full search across all occurrences. Results are ordered by relevance. | true | q, basisOfRecord, catalogNumber, classKey, collectionCode, continent, coordinateUncertaintyInMeters, country, crawlId, datasetId, datasetKey, datasetName, decimalLatitude, decimalLongitude, depth, distanceFromCentroidInMeters, elevation, establishmentMeans, eventDate, eventId, familyKey, gadmGid, gadmLevel0Gid, gadmLevel1Gid, gadmLevel2Gid, gadmLevel3Gid, genusKey, geometry, hasCoordinate, hasGeospatialIssue, geoDistance, identifiedBy, identifiedByID, institutionCode, issue, kingdomKey, lastInterpreted, license, locality, mediaType, modified, month, networkKey, occurrenceId, occurrenceStatus, orderKey, organismId, organismQuantity, organismQuantityType, otherCatalogNumbers, phylumKey, preparations, programme, projectId, protocol, publishingCountry, publishingOrg, recordNumber, recordedBy, recordedByID, relativeOrganismQuantity, repatriated, sampleSizeUnit, sampleSizeValue, samplingProtocol, scientificName, speciesKey, stateProvince, subgenusKey, taxonKey, taxonId, typeStatus, verbatimScientificName, waterBody, year, facet, facetMincount, facetMultiselect, paging |
/occurrence/search/catalogNumber | GET | Occurrence | Search that returns matching catalog numbers. Results are ordered by relevance. | false | q, limit |
/occurrence/search/collectionCode | GET | Occurrence | Search that returns matching collection codes. Results are ordered by relevance. | false | q, limit |
/occurrence/search/occurrenceId | GET | Occurrence | Search that returns matching occurrence identifiers. Results are ordered by relevance. | false | q, limit |
/occurrence/search/recordedBy | GET | Occurrence | Search that returns matching collector names. Results are ordered by relevance. | false | q, limit |
/occurrence/search/recordNumber | GET | Occurrence | Search that returns matching record numbers. Results are ordered by relevance. | false | q, limit |
/occurrence/search/institutionCode | GET | Occurrence | Search that returns matching institution codes. Results are ordered by relevance. | false | q, limit |
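The paging limits described at the start of this section (at most 300 records per page, and offset + limit never above 100,000) can be sketched as a small generator. The name `page_windows` is hypothetical; the sketch only computes the (offset, limit) pairs to request and does not call the API.

```python
MAX_PAGE_SIZE = 300      # largest page the search API returns
HARD_LIMIT = 100_000     # offset + limit must not exceed this


def page_windows(total, page_size=MAX_PAGE_SIZE):
    """Yield (offset, limit) pairs covering up to `total` records.

    Records beyond HARD_LIMIT are not reachable through search paging,
    so the windows stop there.
    """
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    reachable = min(total, HARD_LIMIT)
    offset = 0
    while offset < reachable:
        # Shrink the final page so offset + limit never passes the cap.
        limit = min(page_size, reachable - offset)
        yield offset, limit
        offset += limit
```

Each pair would be sent as the `offset` and `limit` query parameters of one /occurrence/search request. When the total exceeds 100,000, the windows stop at the cap, which is the point at which the asynchronous download service becomes the appropriate tool.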
Occurrence Downloads
This API provides services to download occurrence records and retrieve information about those downloads.
Occurrence downloads are created asynchronously — the user requests a download and, once complete, is sent an email with a link to the resulting file.
To create a download request you must be registered as a user of the website and use HTTP authentication with your username (not your email address) and password.
Internally we use a Java web service client for the consumption of these HTTP-based, RESTful web services. It may be of interest to those coding against the API, and can be found in the occurrence-download-ws-client.
Resource URL | Method | Response | Description | Auth | Paging |
---|---|---|---|---|---|
/occurrence/download/request | POST | Download key | Starts the process of creating a download file. See the predicates section for the requests accepted by this service, and the limits section for how this service is limited per user. | true | false |
/occurrence/download/request/{key} | GET | Download file | Retrieves the download file if it is available. | false | false |
/occurrence/download/request/{key} | DELETE | | Cancels the download process. | true | false |
/occurrence/download | GET | Download Page | Lists all the downloads. This operation can be executed by the role ADMIN only. | true | true |
/occurrence/download/{key} | GET | Download | Retrieves the occurrence download metadata by its unique key. | false | false |
/occurrence/download/{key} | PUT | | Updates the status of an existing occurrence download. This operation can be executed by the role ADMIN only. | true | false |
/occurrence/download/{key} | POST | | Creates the metadata about an occurrence download. This operation can be executed by the role ADMIN only. | true | false |
/occurrence/download/{key}/datasets | GET | Datasets | Lists all the datasets of an occurrence download. | false | true |
/occurrence/download/{key}/datasets/export | GET | Datasets | Exports the dataset data of an occurrence download to a CSV or TSV file. | false | true |
/occurrence/download/user/{user} | GET | Download Page | Lists the downloads created by a user. Only role ADMIN can list downloads of other users. | true | true |
/occurrence/download/dataset/{datasetKey} | GET | Downloads list | Lists the download activity of a dataset. | true | true |
Occurrence Download Statistics
This API provides statistics about Occurrence Downloads.
Resource URL | Method | Response | Description | Paging | Parameters |
---|---|---|---|---|---|
/occurrence/download/statistics | GET | DownloadStatistics | Lists download statistics (datasetKey, number of records, number of downloads) grouped by year and month and filtered by: publishing organization, publishing country, a date range and a dataset key. | true | publishingCountry, fromDate, toDate, datasetKey, publishingOrgKey |
/occurrence/download/statistics/export | GET | TSV or CSV | Exports download statistics to a TSV or CSV file, data can be filtered by: publishing organization, publishing country, a date range and a dataset key. | false | publishingCountry, fromDate, toDate, datasetKey, publishingOrgKey, format |
/occurrence/download/statistics/downloadsByUserCountry | GET | YearMonthCounts | Lists download counts grouped by year and month and filtered by the country of the user who requested the download and a date range. | false | userCountry, fromDate, toDate |
/occurrence/download/statistics/downloadedRecordsByDataset | GET | YearMonthCounts | Lists downloaded occurrence records counts grouped by year and month and filtered by: publishing organization, publishing country, a date range and a dataset key. | false | publishingCountry, fromDate, toDate, datasetKey, publishingOrgKey |
/occurrence/download/statistics/downloadsByDataset | GET | YearMonthCounts | Lists download counts grouped by year and month and filtered by: publishing organization, publishing country, a date range and a dataset key. | false | publishingCountry, fromDate, toDate, datasetKey, publishingOrgKey |
/occurrence/download/statistics/downloadsBySource | GET | YearMonthCounts | Lists download counts grouped by year and month and filtered by source. | false | fromDate, toDate, source |
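As an illustration of the date-range filters used by the statistics services, here is a minimal sketch that validates the yyyy-MM format before building a /occurrence/download/statistics URL. The function name `statistics_url` is hypothetical, and checking the format on the client side is a convenience of this sketch rather than documented behaviour.

```python
import re
from urllib.parse import urlencode

BASE_URL = "https://api.gbif.org/v1"
YEAR_MONTH = re.compile(r"\d{4}-(0[1-9]|1[0-2])")  # partial date, e.g. 2015-11


def statistics_url(from_date=None, to_date=None, **filters):
    """Build a download statistics URL.

    `filters` may carry the other documented parameters, for example
    publishingCountry, datasetKey or publishingOrgKey.
    """
    for label, value in (("fromDate", from_date), ("toDate", to_date)):
        if value is not None and not YEAR_MONTH.fullmatch(value):
            raise ValueError(f"{label} must look like yyyy-MM, got {value!r}")
    params = {"fromDate": from_date, "toDate": to_date, **filters}
    # Leave out anything the caller did not set.
    params = {k: v for k, v in params.items() if v is not None}
    query = urlencode(params)
    return f"{BASE_URL}/occurrence/download/statistics" + (f"?{query}" if query else "")
```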
Occurrence Metrics
This API provides services to retrieve various counts and metrics about occurrence records. The kinds of counts currently supported are listed by the schema method; see below for details.
Resource URL | Method | Response | Description | Parameters |
---|---|---|---|---|
/occurrence/count | GET | Count | Returns occurrence counts for a predefined set of dimensions. The supported dimensions are enumerated in the /occurrence/count/schema service. An example for the count of georeferenced observations from Canada: /occurrence/count?country=CA&isGeoreferenced=true&basisOfRecord=OBSERVATION. | |
/occurrence/count/schema | GET | Count | Lists the metrics supported by the service. | |
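The count example in the table above can be reproduced with a small URL builder. This is a sketch only: `count_url` is a hypothetical helper, and it does not check the dimensions against /occurrence/count/schema.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.gbif.org/v1"


def count_url(**dimensions):
    """Build an /occurrence/count URL from dimension=value keyword arguments."""
    # Booleans are written in lowercase, as in the example on this page.
    params = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in dimensions.items()
    }
    query = urlencode(params)
    return f"{BASE_URL}/occurrence/count" + (f"?{query}" if query else "")


print(count_url(country="CA", isGeoreferenced=True, basisOfRecord="OBSERVATION"))
```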
Occurrence Inventories
This API provides services that list all distinct values together with their occurrence count for a given occurrence property. Only a few properties are supported, each with its own service to call.
Resource URL | Method | Response | Description | Parameters |
---|---|---|---|---|
/occurrence/counts/basisOfRecord | GET | Counts | Lists occurrence counts by basis of record. | |
/occurrence/counts/year | GET | Counts | Lists occurrence counts by year. | year |
/occurrence/counts/datasets | GET | Counts | Lists occurrence counts for datasets that cover a given taxon or country. | country, taxonKey |
/occurrence/counts/countries | GET | Counts | Lists occurrence counts for all countries covered by the data published by the given country. | publishingCountry |
/occurrence/counts/publishingCountry | GET | Counts | Lists occurrence counts for all countries that publish data about the given country. | country |
Query parameters explained
The following parameters are for use exclusively with the Occurrence API described above.
Parameter | Description |
---|---|
basisOfRecord | Basis of record, as defined in our BasisOfRecord enum |
catalogNumber | An identifier of any form assigned by the source within a physical collection or digital dataset for the record which may not be unique, but should be fairly unique in combination with the institution and collection code. |
classKey | Class classification key. |
collectionCode | An identifier of any form assigned by the source to identify the physical collection or digital dataset uniquely within the context of an institution. |
continent | Continent, as defined in our Continent enum |
coordinateUncertaintyInMeters | The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Supports range queries. |
country | The 2-letter country code (as per ISO-3166-1) of the country in which the occurrence was recorded. |
crawlId | Crawl attempt that harvested this record. |
datasetId | The ID of the dataset. |
datasetKey | The occurrence dataset key (a uuid). |
datasetName | The name of the dataset. |
decimalLatitude | Latitude in decimals between -90 and 90 based on WGS 84. Supports range queries. |
decimalLongitude | Longitude in decimals between -180 and 180 based on WGS 84. Supports range queries. |
depth | Depth in meters relative to altitude. For example 10 meters below a lake surface with given altitude. Supports range queries. |
distanceFromCentroidInMeters | Distance in metres to a known centroid of a country or area, if that distance is 5000m or less. Occurrences (especially specimens) near a country centroid may have a poor-quality georeference, especially if coordinateUncertaintyInMeters is blank or large. |
elevation | Elevation (altitude) in meters above sea level. Supports range queries. |
establishmentMeans | EstablishmentMeans, as defined in our EstablishmentMeans enum |
eventDate | Occurrence date in ISO 8601 format: yyyy, yyyy-MM, yyyy-MM-dd, or MM-dd. Supports range queries. |
eventId | An identifier for the information associated with a sampling event. |
facet | A facet name used to retrieve the most frequent values for a field. Facets are allowed for all the parameters except for: eventDate, geometry, lastInterpreted, locality, organismId, stateProvince, waterBody. This parameter may be repeated to request multiple facets, as in this example: /occurrence/search?facet=datasetKey&facet=basisOfRecord&limit=0 |
facetMincount | Used in combination with the facet parameter. Set facetMincount={#} to exclude facets with a count less than {#}, e.g. /search?facet=type&limit=0&facetMincount=10000 only shows the type value 'OCCURRENCE' because 'CHECKLIST' and 'METADATA' have counts less than 10000. |
facetMultiselect | Used in combination with the facet parameter. Set facetMultiselect=true to still return counts for values that are not currently filtered, e.g. /search?facet=type&limit=0&type=CHECKLIST&facetMultiselect=true still shows type values 'OCCURRENCE' and 'METADATA' even though type is being filtered by type=CHECKLIST |
facetOffset, facetLimit | Page through the values of a facet with facetOffset and facetLimit, each prefixed with the facet name, as in this example: /occurrence/search?facet=datasetKey&datasetKey.facetLimit=5&datasetKey.facetOffset=10&limit=0 |
familyKey | Family classification key. |
format | Export format; accepts TSV (default) and CSV. |
fromDate | Start partial date of a date range, accepts the format yyyy-MM, for example: 2015-11 |
gadmGid | A GADM geographic identifier at any level, for example AGO, AGO.1_1, AGO.1.1_1 or AGO.1.1.1_1 |
gadmLevel | A GADM region level, valid values range from 0 to 3 |
gadmLevel0Gid | A GADM geographic identifier at the zero level, for example AGO |
gadmLevel1Gid | A GADM geographic identifier at the first level, for example AGO.1_1 |
gadmLevel2Gid | A GADM geographic identifier at the second level, for example AFG.1.1_1 |
gadmLevel3Gid | A GADM geographic identifier at the third level, for example AFG.1.1.1_1 |
genusKey | Genus classification key. |
geoDistance | Matches occurrence records with coordinate values within a specified distance of a coordinate. Supported units: in (inch), yd (yards), ft (feet), km (kilometers), mmi (nautical miles), mm (millimeters), cm (centimeters), mi (miles), m (meters). For example: /occurrence/search?geoDistance=90,100,5km |
geometry | Searches for occurrences inside a polygon described in Well Known Text (WKT) format. Only POINT, LINESTRING, LINEARRING, POLYGON and MULTIPOLYGON are accepted WKT types. For example, a shape written as POLYGON ((30.1 10.1, 40 40, 20 40, 10 20, 30.1 10.1)) would be queried as is, i.e. /occurrence/search?geometry=POLYGON((30.1 10.1, 40 40, 20 40, 10 20, 30.1 10.1)). Polygons must have anticlockwise ordering of points, or will give unpredictable results. (A clockwise polygon represents the opposite area: the Earth's surface with a 'hole' in it. Such queries are not supported.) |
hasCoordinate | Limits searches to occurrence records which contain a value in both latitude and longitude (i.e. hasCoordinate=true limits to occurrence records with coordinate values and hasCoordinate=false limits to occurrence records without coordinate values). |
hasGeospatialIssue | Includes/excludes occurrence records which contain spatial issues (as determined in our record interpretation), i.e. hasGeospatialIssue=true returns only those records with spatial issues while hasGeospatialIssue=false includes only records without spatial issues. The absence of this parameter returns any record with or without spatial issues. |
hl | Set hl=true to highlight terms matching the query when in fulltext search fields. The highlight will be an emphasis tag of class 'gbifH1' e.g. /search?q=plant&hl=true. Fulltext search fields include: title, keyword, country, publishing country, publishing organization title, hosting organization title, and description. One additional full text field is searched which includes information from metadata documents, but the text of this field is not returned in the response. |
identifiedBy | The person who provided the taxonomic identification of the occurrence. |
identifiedByID | Identifier (e.g. ORCID) for the person who provided the taxonomic identification of the occurrence. |
institutionCode | An identifier of any form assigned by the source to identify the institution the record belongs to. Not guaranteed to be unique. |
issue | A specific interpretation issue as defined in our OccurrenceIssue enum |
kingdomKey | Kingdom classification key. |
lastInterpreted | The date the record was last modified in GBIF, in ISO 8601 format: yyyy, yyyy-MM, yyyy-MM-dd, or MM-dd. Supports range queries. Note that this is the date the record was last changed in GBIF, not necessarily the date the record was first/last changed by the publisher. Data is re-interpreted when we change the taxonomic backbone, geographic data sources, or interpretation processes. |
license | The type of license applied to the dataset or record. |
limit | The maximum number of results to return. This can't be greater than 300, any value greater is set to 300. |
locality | The specific description of the place. |
mediaType | The kind of multimedia associated with an occurrence as defined in our MediaType enum |
modified | The most recent date-time on which the resource was changed, according to the publisher |
month | The month of the year, starting with 1 for January. Supports range queries. |
networkKey | The GBIF Network to which the occurrence belongs. |
occurrenceId | A single globally unique identifier for the occurrence record as provided by the publisher. |
occurrenceStatus | Either 'ABSENT' or 'PRESENT'; the presence or absence of the occurrence. |
orderKey | Order classification key. |
organismId | An identifier for the Organism instance (as opposed to a particular digital record of the Organism). May be a globally unique identifier or an identifier specific to the data set. |
organismQuantity | A number or enumeration value for the quantity of organisms. |
organismQuantityType | The type of quantification system used for the quantity of organisms. |
otherCatalogNumbers | Previous or alternate fully qualified catalog numbers. |
phylumKey | Phylum classification key. |
preparations | Preparation or preservation method for a specimen. |
programme | A group of activities, often associated with a specific funding stream, such as the GBIF BID programme. |
projectId | The identifier for a project, which is often assigned by a funded programme. |
protocol | Protocol or mechanism used to provide the occurrence record. |
publishingCountry | The 2-letter country code (as per ISO-3166-1) of the owning organization's country. |
publishingOrg | The publishing organization key (a uuid). |
publishingOrgKey | The publishing organization key (a uuid). |
q | Simple search parameter. The value for this parameter can be a simple word or a phrase. |
recordedBy | The person who recorded the occurrence. |
recordedByID | Identifier (e.g. ORCID) for the person who recorded the occurrence. |
recordNumber | An identifier given to the record at the time it was recorded in the field. |
relativeOrganismQuantity | The relative measurement of the quantity of the organism (i.e. without absolute units). |
repatriated | Searches for records whose publishing country differs from the country in which the record was recorded. |
sampleSizeUnit | The unit of measurement of the size (time duration, length, area, or volume) of a sample in a sampling event. |
sampleSizeValue | A numeric value for a measurement of the size (time duration, length, area, or volume) of a sample in a sampling event. |
samplingProtocol | The name of, reference to, or description of the method or protocol used during a sampling event |
scientificName | A scientific name from the GBIF backbone. All included and synonym taxa are included in the search. Under the hood, a call to the species match service is made first to retrieve a taxonKey. Only unique scientific names return results; homonyms (many monomials) return nothing. Consider calling the species match service directly and using the taxonKey parameter instead. |
source | Source of a download, for example, GBIF-portal, python, etc. |
speciesKey | Species classification key. |
stateProvince | The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs. |
subgenusKey | Subgenus classification key. |
taxonId | The taxon identifier provided to GBIF by the data publisher. |
taxonKey | A taxon key from the GBIF backbone. All included and synonym taxa are included in the search, so a search for aves with taxonKey=212 (i.e. /occurrence/search?taxonKey=212) will match all birds, no matter which species. |
toDate | End partial date of a date range, accepts the format yyyy-MM, for example: 2019-12 |
typeStatus | Nomenclatural type (type status, typified scientific name, publication) applied to the subject. |
userCountry | Country of the user who requested the download. |
verbatimScientificName | The scientific name provided to GBIF by the data publisher, before interpretation and processing by GBIF. |
waterBody | The name of the water body in which the Location occurs. |
year | The 4 digit year. A year of 98 will be interpreted as AD 98. Supports range queries. |
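Two of the parameters above are easy to get wrong by hand: repeated facet parameters with per-facet paging, and the anticlockwise point order required by geometry. The sketch below illustrates both. The names `facet_search_url` and `is_anticlockwise` are hypothetical. The orientation test uses the planar shoelace formula on the (x, y) pairs exactly as they appear in the WKT, which is an approximation that assumes the ring is closed and does not cross the antimeridian.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.gbif.org/v1"


def facet_search_url(facets, facet_limit=None, facet_offset=None, **filters):
    """Build an /occurrence/search URL that asks only for facet counts."""
    params = list(filters.items())
    for name in facets:
        params.append(("facet", name))  # facet= is repeated once per field
        if facet_limit is not None:
            params.append((f"{name}.facetLimit", facet_limit))
        if facet_offset is not None:
            params.append((f"{name}.facetOffset", facet_offset))
    params.append(("limit", 0))  # limit=0: no records, facet counts only
    return f"{BASE_URL}/occurrence/search?{urlencode(params)}"


def is_anticlockwise(ring):
    """Return True if a closed ring of (x, y) points runs anticlockwise.

    The shoelace sum is twice the signed area; it is positive for
    anticlockwise rings in a planar coordinate system.
    """
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return twice_area > 0


# The polygon from the geometry description above.
ring = [(30.1, 10.1), (40, 40), (20, 40), (10, 20), (30.1, 10.1)]
print(is_anticlockwise(ring))
print(facet_search_url(["datasetKey"], facet_limit=5, facet_offset=10))
```

A ring that fails the check can be reversed with `ring[::-1]` before it is written out as WKT.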
Occurrence Download Predicates
A download predicate is a query expression that selects the occurrence records to download. A working example using curl follows. Put this in a file called query.json:
{
  "creator": "userName",
  "notificationAddresses": [
    "userEmail@example.org"
  ],
  "sendNotification": true,
  "format": "SIMPLE_CSV",
  "predicate": {
    "type": "and",
    "predicates": [
      {
        "type": "equals",
        "key": "BASIS_OF_RECORD",
        "value": "PRESERVED_SPECIMEN"
      },
      {
        "type": "in",
        "key": "COUNTRY",
        "values": [
          "KW",
          "IQ",
          "IR"
        ]
      }
    ]
  }
}
Where creator is your GBIF username, notificationAddresses is a list of email addresses to notify when the download is ready, and format is one of the following values: SIMPLE_CSV, DWCA, or SPECIES_LIST (information about download formats).
Then issue:
curl --include --user userName:PASSWORD --header "Content-Type: application/json" --data @query.json https://api.gbif.org/v1/occurrence/download/request
A download key is returned. The following request shows the download information, including the download link and DOI once the download is ready:
curl -Ss https://api.gbif.org/v1/occurrence/download/0001005-130906152512535 | jq .
(| jq . is optional, but formats the JSON nicely.)
Once the download is ready, the file itself can be retrieved with:
curl --location --remote-name https://api.gbif.org/v1/occurrence/download/request/0001005-130906152512535.zip
It's also possible to write the curl request without an external query file:
curl --include --user userName:PASSWORD --header "Content-Type: application/json" --data '{"creator": "userName","notificationAddresses": ["userEmail@example.org"],"format": "SIMPLE_CSV","predicate": {"type": "in","key": "COUNTRY","values": ["FJ","TO"]}}' https://api.gbif.org/v1/occurrence/download/request
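For those not using curl, the same request can be assembled with the Python standard library. This is a minimal sketch, not an official client: `build_download_request` is a hypothetical helper, it assumes that curl's --user option corresponds to HTTP Basic authentication, and setting sendNotification whenever addresses are given is a choice made for this example. The function only builds the request object and does not send it.

```python
import base64
import json
import urllib.request

DOWNLOAD_REQUEST_URL = "https://api.gbif.org/v1/occurrence/download/request"


def build_download_request(username, password, predicate,
                           fmt="SIMPLE_CSV", emails=()):
    """Return a urllib Request equivalent to the curl call above."""
    body = {
        "creator": username,
        "notificationAddresses": list(emails),
        "sendNotification": bool(emails),  # assumption: notify if any address
        "format": fmt,
        "predicate": predicate,
    }
    # HTTP Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        DOWNLOAD_REQUEST_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_download_request(
    "userName", "PASSWORD",
    {"type": "in", "key": "COUNTRY", "values": ["FJ", "TO"]},
    emails=["userEmail@example.org"],
)
```

Passing `request` to `urllib.request.urlopen` would submit it; with real credentials that starts an actual download, so the sketch stops short of doing so.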
The table below lists the supported predicates that can be combined to build download requests that can be POSTed to the download API.
Predicate | Description | Notes |
---|---|---|
equals | equality comparison | matchCase is optional, default is false. |
and | logical AND (conjunction) | |
or | logical OR (disjunction) | When requesting many values of the same field (for example, multiple taxa or countries) the "in" predicate (below) is more appropriate. |
lessThan | is less than | |
lessThanOrEquals | is less than or equals | |
greaterThan | is greater than | |
greaterThanOrEquals | is greater than or equals | |
in | specify multiple values to be compared | matchCase can be added if required. |
within | geospatial predicate that checks if the coordinates are inside a POLYGON | |
geoDistance | geoDistance predicate that checks if coordinates are within a specified distance of a geographical coordinate. Supported units: in (inch), yd (yards), ft (feet), km (kilometers), mmi (nautical miles), mm (millimeters), cm (centimeters), mi (miles) and m (meters) | |
not | logical negation | |
like | search for a pattern; ? matches one character, * matches zero or more characters | matchCase can be added if required. |
isNull | has an empty value | |
isNotNull | has a non-empty value | |
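Predicates nest, so it can be convenient to compose them programmatically instead of writing the JSON by hand. The sketch below covers only the three shapes that appear in the query.json example earlier in this section (equals, in and and). The helper names are hypothetical, and the shapes of the remaining predicates are deliberately not guessed here.

```python
import json


def equals(key, value):
    """Equality comparison on a single field."""
    return {"type": "equals", "key": key, "value": value}


def in_(key, values):
    """Match any of several values of the same field."""
    return {"type": "in", "key": key, "values": list(values)}


def and_(*predicates):
    """Logical AND of the given predicates."""
    return {"type": "and", "predicates": list(predicates)}


# Rebuilds the predicate from the query.json example.
predicate = and_(
    equals("BASIS_OF_RECORD", "PRESERVED_SPECIMEN"),
    in_("COUNTRY", ["KW", "IQ", "IR"]),
)
print(json.dumps(predicate, indent=2))
```

The resulting dictionary is what goes under the "predicate" property of the download request body.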
Occurrence Download Limits
Occurrence downloads are a resource-intensive service that needs to be monitored and limited according to the GBIF platform load. To prevent the downloads requested by a single user from consuming most of the resources, the following rules have been set:
- A download predicate may contain a maximum of 101,000 items (taxon keys, kingdom keys, phylum keys etc, catalogue numbers, occurrence ids etc).
- A download predicate may contain a maximum of 10,000 points in any "within" predicate geometries.
- If the total number of downloads is less than 100 any given user can have no more than 3 downloads simultaneously.
- If the total number of downloads is less than 1000 any given user can only have 1 download.
The number of user downloads currently in progress can be seen on the System Health page. Your own downloads can be seen on your My Downloads page.
GADM regions search and browsing
GADM, the Database of Global Administrative Areas, is a high-resolution database of country administrative areas.
This API provides services to search and browse regions and sub-regions down to three levels of sub-regions.
Resource URL | Method | Response | Description | Paging | Parameters |
---|---|---|---|---|---|
/geocode/gadm/search | GET | GadmRegion | Searches for GADM regions. When parameters are used, the results are narrowed to regions at the level given by gadmLevel that are sub-regions of gadmGid. | true | q, gadmLevel, gadmGid |
/geocode/gadm/{gid} | GET | GadmRegion | Gets details for a GADM region. | false | |
/geocode/gadm/{gid}/subdivisions | GET | GadmRegion | Gets the sub-regions or divisions of a region. | false | q |
/geocode/gadm/browse | GET | GadmRegion | Lists GADM regions at the highest level. | false | |
/geocode/gadm/browse/DNK/{gid} | GET | GadmRegion | Lists sub-regions of a region at the first level. | false | |
/geocode/gadm/browse/DNK/{gid0}/{gid1} | GET | GadmRegion | Lists sub-regions of a region at the second level. | false | q |
/geocode/gadm/browse/DNK/{gid0}/{gid1}/{gid2} | GET | GadmRegion | Lists sub-regions of a region at the third level. | false | q |
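A minimal sketch of building a GADM search URL from the parameters listed for /geocode/gadm/search. The function name `gadm_search_url` is hypothetical, and the client-side check that gadmLevel lies between 0 and 3 simply mirrors the range given in the parameter table; the search term and identifier in the usage line are example values only.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.gbif.org/v1"


def gadm_search_url(q=None, gadm_level=None, gadm_gid=None):
    """Build a /geocode/gadm/search URL, omitting parameters left as None."""
    if gadm_level is not None and gadm_level not in range(4):
        raise ValueError("gadmLevel must be 0, 1, 2 or 3")
    params = {"q": q, "gadmLevel": gadm_level, "gadmGid": gadm_gid}
    # Compare against None so that level 0 is still sent.
    params = {k: v for k, v in params.items() if v is not None}
    query = urlencode(params)
    return f"{BASE_URL}/geocode/gadm/search" + (f"?{query}" if query else "")


print(gadm_search_url(q="Aarhus", gadm_level=2, gadm_gid="DNK"))
```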