Derived dataset

About derived datasets

Derived datasets are citable records of GBIF-mediated occurrence data derived from one of the following:

  • a download that has been filtered/reduced significantly, or
  • data accessed through a cloud service, e.g. Microsoft AI for Earth (Azure), or
  • data obtained by any means for which no DOI was assigned, but one is required (e.g. third-party tools accessing the GBIF search API)

When created, a derived dataset is assigned a unique DOI that can be used to cite the data. To create a derived dataset you will need to authenticate using a GBIF.org account and provide:

  • a title for the dataset,
  • a list of the GBIF datasets (by DOI or datasetKey) from which the data originated, ideally with counts of how many records each dataset contributed,
  • a persistent URL of where the extracted dataset can be accessed,
  • a description of how the dataset was prepared,
  • (optional) the GBIF download DOI, if the dataset is derived from an existing download, and
  • (optional) a date for when the derived dataset should be registered, if not immediately.
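
The items listed above can be assembled programmatically. The following is a minimal sketch in Python, not an official client: it counts how many records each source dataset contributed and builds a JSON request body from the required and optional items. The function names are hypothetical, and the JSON field names (title, description, sourceUrl, relatedDatasets, originalDownloadDOI, registrationDate) and the date format are assumptions that should be checked against the GBIF API reference before use. The dataset keys and the URL are placeholders. No request is sent; submitting the body would additionally require authenticating with your account.

```python
import json
from collections import Counter
from datetime import date


def count_contributions(records):
    """Count how many records each source dataset contributed.

    `records` is any iterable of dicts carrying a 'datasetKey' entry,
    e.g. rows read from a filtered occurrence table.
    """
    return dict(Counter(r["datasetKey"] for r in records))


def build_derived_dataset_payload(title, related_datasets, source_url,
                                  description, download_doi=None,
                                  registration_date=None):
    """Build a request body for a derived dataset (hypothetical helper).

    Field names below are assumptions; verify them against the GBIF API
    reference before submitting anything.
    """
    required = {"title": title, "source_url": source_url,
                "description": description}
    for name, value in required.items():
        if not value or not value.strip():
            raise ValueError(f"{name} is required")
    if not related_datasets:
        raise ValueError("at least one contributing dataset is required")

    payload = {
        "title": title,
        "description": description,
        "sourceUrl": source_url,
        # Maps each datasetKey (or DOI) to the number of records it contributed.
        "relatedDatasets": dict(related_datasets),
    }
    # Optional items are included only when supplied.
    if download_doi:
        payload["originalDownloadDOI"] = download_doi
    if registration_date:
        payload["registrationDate"] = registration_date.isoformat()
    return payload


if __name__ == "__main__":
    # Placeholder records; the dataset keys are not real GBIF identifiers.
    records = [
        {"datasetKey": "00000000-0000-0000-0000-000000000001"},
        {"datasetKey": "00000000-0000-0000-0000-000000000001"},
        {"datasetKey": "00000000-0000-0000-0000-000000000002"},
    ]
    body = build_derived_dataset_payload(
        title="Filtered occurrences for an example study",
        related_datasets=count_contributions(records),
        source_url="https://example.org/my-derived-dataset",
        description="Records filtered by region and basis of record.",
        registration_date=date(2030, 1, 1),
    )
    print(json.dumps(body, indent=2))
```

Keeping the counting step separate from the payload step means the per-dataset counts can come from any source (a CSV, a data frame, a cloud query) as long as each record exposes its datasetKey.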