
Sharing information about scientific names in a common, interoperable way is the foundation of the Global Names Architecture. In its simplest form, a list of name strings can be shared, but it’s better to have more detailed nomenclatural and taxonomic information. The ability to share even more species information in a flexible way is key not only to GBIF, but especially to other projects that have their own specific needs.

The simplest and most efficient way to do this is the Darwin Core Archive (DwC-A), which holds an entire dataset, referred to here as a ‘checklist’. In addition, GBIF will support the use of the Taxonomic Concept Transfer Schema (TCS) developed by TDWG.

Darwin Core Checklist Archives

For general information about Darwin Core Archives and extensions, please read their introduction page. At the core of a checklist archive is a data file that holds scientific names and makes use of the taxonomic Darwin Core terms. The dwc:taxonID is required to uniquely identify each row. No other files are needed, but additional information can be shared in extension data files whose rows refer to a core record, i.e. a name or taxon, via its dwc:taxonID, in much the same way that foreign keys are used in a relational database.
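
The foreign-key relationship between core and extension files can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tab-separated layout, the sample rows, the vernacularName column and the helper names read_rows and attach_extension are all hypothetical and not taken from any released extension.

```python
import csv
import io
from collections import defaultdict

# Hypothetical core data file: one row per name, uniquely keyed by taxonID.
CORE_TSV = "taxonID\tscientificName\n1\tPuma concolor\n2\tFelis catus\n"

# Hypothetical extension file: each row points back to a core row via taxonID.
EXT_TSV = (
    "taxonID\tvernacularName\n"
    "1\tcougar\n"
    "1\tmountain lion\n"
    "3\torphan name\n"
)


def read_rows(text):
    """Parse tab-separated text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attach_extension(core_rows, ext_rows, key="taxonID"):
    """Group extension rows under the core row they reference.

    Returns the joined records plus any extension rows whose key matches
    no core row, which is the equivalent of a broken foreign key.
    """
    by_key = defaultdict(list)
    for row in ext_rows:
        by_key[row[key]].append(row)
    core_ids = {row[key] for row in core_rows}
    joined = {
        row[key]: {"core": row, "extension": by_key.get(row[key], [])}
        for row in core_rows
    }
    orphans = [row for row in ext_rows if row[key] not in core_ids]
    return joined, orphans


joined, orphans = attach_extension(read_rows(CORE_TSV), read_rows(EXT_TSV))
for taxon_id, record in joined.items():
    names = [r["vernacularName"] for r in record["extension"]]
    print(taxon_id, record["core"]["scientificName"], names)
```

A core row may have zero, one or many extension rows. Reporting the orphaned extension rows separately is a design choice of this sketch, not a documented requirement of the archive format.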

Taxonomic DwC terms

The new Darwin Core, which provides more taxonomic terms, is currently under review and therefore subject to change. At present, the GNA is using the fixed April version of terms that have also been used in creating IPT 1.0RC1.

The only essential terms are the dwc:taxonID and dwc:scientificName, and the most basic checklist could have just those two columns. The following Darwin Core terms are also relevant for checklists and should be shared if possible:


Some of the terms refer to other names and exist in two versions, for example higherTaxon and higherTaxonID. One uses an ID and the other the explicit full name string. If possible, the term using the ID is preferred; if the ID version is provided, it takes precedence over the explicit one even if they contradict each other.
The denormalised higher taxonomy terms (kingdom, phylum, class, order, family, genus) are deprecated in favour of the more explicit higherTaxonID and can be ignored if such an ID exists. The higherTaxonID also allows one to publish taxonomies including more than just the major ranks.
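
The precedence rule can be illustrated with a minimal Python sketch. It assumes records are held as dictionaries keyed by the term names used above. The sample data and the function names resolve_parent and lineage are hypothetical, and returning None for an ID that cannot be resolved is an assumption of this sketch rather than documented behaviour.

```python
# Hypothetical checklist rows. Row "12" carries a higherTaxon string that
# contradicts its higherTaxonID; the ID is expected to win.
RECORDS = [
    {"taxonID": "10", "scientificName": "Felidae", "higherTaxonID": "", "higherTaxon": ""},
    {"taxonID": "11", "scientificName": "Felinae", "higherTaxonID": "10", "higherTaxon": "Felidae"},
    {"taxonID": "12", "scientificName": "Puma", "higherTaxonID": "11", "higherTaxon": "Felidae"},
    {"taxonID": "13", "scientificName": "Puma concolor", "higherTaxonID": "", "higherTaxon": "Puma"},
]

NAMES_BY_ID = {row["taxonID"]: row for row in RECORDS}


def resolve_parent(record, names_by_id):
    """Return the parent's name, preferring higherTaxonID over higherTaxon."""
    parent_id = record.get("higherTaxonID")
    if parent_id:
        parent = names_by_id.get(parent_id)
        # Assumption: an unresolvable ID yields None instead of silently
        # falling back to the explicit name string.
        return parent["scientificName"] if parent else None
    return record.get("higherTaxon") or None


def lineage(taxon_id, names_by_id):
    """Follow higherTaxonID links upwards, from the taxon to its root."""
    chain = []
    seen = set()  # guards against cycles in malformed data
    current = names_by_id.get(taxon_id)
    while current is not None and current["taxonID"] not in seen:
        seen.add(current["taxonID"])
        chain.append(current["scientificName"])
        current = names_by_id.get(current.get("higherTaxonID"))
    return chain


print(resolve_parent(NAMES_BY_ID["12"], NAMES_BY_ID))  # ID wins over the string
print(resolve_parent(NAMES_BY_ID["13"], NAMES_BY_ID))  # no ID, string is used
print(lineage("12", NAMES_BY_ID))
```

Because the lineage is followed through IDs, an intermediate rank such as the subfamily Felinae appears in the chain even though it has no column among the denormalised major-rank terms.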

Vocabularies

Some terms have a controlled vocabulary, which should be favoured over free text. The TDWG community provides some of these vocabularies, e.g. for taxonomic ranks. For other terms, controlled vocabularies are still being developed and will be listed here as soon as they are agreed upon in the GNA.

Recommended Extensions

Extensions are under development and are not yet officially released. We are working on simple standards for sharing vernacular names, type specimens, literature references, species distributions, alternative identifiers, generic textual descriptions and a minimalistic species ‘profile’. GBIF maintains the latest version of these extensions as a Google Spreadsheet.

The Taxonomic Concept Transfer Schema (TCS) developed by TDWG provides a comprehensive standard for organism names and concepts. GBIF will support the indexing of TCS in either XML or RDF.