
Almost all information about any species recorded in the past 250 years is tied to its scientific name. Names link historic information about a taxon – such as the original description of the type that served as the basis of the species name – with contemporary information such as gene sequences and new scholarly publications. For this reason, names are essential to keyword searches in online information systems.
Species are grouped into taxonomic hierarchies that may have a global, geographic or thematically defined scope. Such lists are often referred to as ‘checklists’ and provide taxonomic ‘backbones’ around which species information can be organized, viewed and retrieved.
A species by any other name: special challenges
Biodiversity science lacks a complete list of scientific names of organisms. It also lacks a complete list of species, although efforts are underway to create one. The two lists are not the same, because a single species may be known by more than one scientific name. This is sometimes called the ‘synonym problem’ and is one of a set of challenges sometimes referred to as the ‘names problem’ in biology. Another is the ‘homonym problem’: the same name may refer to taxa as diverse as a plant and an animal. The synonym and homonym problems extend to common names, which occur in all languages. Latinised names are also notoriously difficult to spell, and misspellings create further groups of synonyms all based on the same name. There is also a more complex ‘taxon concept problem’, in which a single species may be defined differently by different experts.
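The synonym, homonym and spelling problems can be made concrete with a small sketch. The record layout, the function names and the similarity cutoff below are assumptions chosen for illustration, not part of any GBIF system; the sample records are a hand-picked handful (Felis concolor is an older name for Puma concolor, and Ficus is a genus name used for both figs and a group of sea snails).

```python
from collections import defaultdict
from difflib import get_close_matches

# Illustrative records only: (scientific name, kingdom, accepted name).
RECORDS = [
    ("Puma concolor", "Animalia", "Puma concolor"),
    ("Felis concolor", "Animalia", "Puma concolor"),  # synonym of Puma concolor
    ("Ficus", "Plantae", "Ficus"),                    # figs
    ("Ficus", "Animalia", "Ficus"),                   # fig shells (gastropods)
]


def synonyms_by_accepted(records):
    """Synonym problem: group every name under the accepted name it points to."""
    groups = defaultdict(set)
    for name, _kingdom, accepted in records:
        groups[accepted].add(name)
    # Keep only accepted names that are known by more than one name.
    return {acc: names for acc, names in groups.items() if len(names) > 1}


def homonyms(records):
    """Homonym problem: find names that occur in more than one kingdom."""
    kingdoms = defaultdict(set)
    for name, kingdom, _accepted in records:
        kingdoms[name].add(kingdom)
    return {name: ks for name, ks in kingdoms.items() if len(ks) > 1}


def closest_known_name(query, records, cutoff=0.85):
    """Spelling problem: map a possibly misspelled name to the closest known one.

    The cutoff is an arbitrary assumption; returns None if nothing is close enough.
    """
    known = sorted({name for name, _kingdom, _accepted in records})
    matches = get_close_matches(query, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(synonyms_by_accepted(RECORDS))
    print(homonyms(RECORDS))
    print(closest_known_name("Puma concolour", RECORDS))
```

Note that the sketch does not address the taxon concept problem: two experts may use the same accepted name for differently circumscribed species, which a name-to-name mapping cannot express.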
Tackling these challenges requires the means to discover, access and use information about names from a wide range of nomenclatural and taxonomic databases. This means being able to identify gaps and overlaps in taxonomic and nomenclatural coverage between these systems and the full array of names that exist in the vast corpus of literature, specimens, observations, reports, images, gene sequences and other data that collectively form the knowledge base of all information for all species.
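At its simplest, identifying gaps and overlaps between two sets of names reduces to set comparison. The following is a minimal sketch under that assumption; the function name, the result keys and the sample names are hypothetical, and real systems would first need to reconcile spelling variants and authorship strings before comparing.

```python
def coverage_report(content_names, reference_names):
    """Compare names found in content with names held by a reference database."""
    content = set(content_names)
    reference = set(reference_names)
    return {
        "overlap": content & reference,  # names both sides know about
        "gaps": content - reference,     # names in content the reference lacks
        "unused": reference - content,   # reference names not seen in content
    }


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    report = coverage_report(
        ["Puma concolor", "Felis concolor", "Quercus robur"],
        ["Puma concolor", "Quercus robur", "Panthera leo"],
    )
    for key, names in report.items():
        print(key, sorted(names))
```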
GBIF requires infrastructure that permits interoperability between those who mobilize and serve content tied to scientific names (the raw materials that define the complete list of scientific names) and those who can provide information about them (the syntactic and semantic information that enables names to be organized into a framework that has taxonomic integrity). At the moment, there is no collective mechanism for coordinating and collating the activities of these two domains in order to evaluate the degree of completeness and overlap both within and among them.
At the most basic level, this infrastructure needs a common discovery and access architecture that allows those who provide information about names to be visible to those who curate content containing them. This is the rationale for the Global Names Architecture, in whose development GBIF is playing a key role.



