Is taxonomic bias in GBIF explained by research focus or societal preferences?

Study of taxonomic bias in GBIF-mediated occurrences finds correlations between societal preferences and occurrence coverage

Data resources used via GBIF: 649,767,741 species occurrences
Polyrhachis armata by budak via iNaturalist licensed under CC BY-NC 4.0.

Describing and understanding taxonomic biases and their causes are undeniable priorities. Attempts to explain taxonomic bias point not only to obvious, intrinsic reasons (such as species occurring in remote locations) but also to extrinsic factors, such as research focus and societal pressures.

In this study, researchers quantify the relative impact of taxonomic research and societal preferences on taxonomic bias in data from GBIF, the world’s largest open access biodiversity database.

Using a range of measures, the researchers assessed both bias and precision across 626 million GBIF-mediated occurrences representing one million species in 24 taxonomic classes. They find that 94 per cent of occurrences are identified at least to species level. The highest precision is found among plants, fungi and birds, whereas the classes Maxillopoda (crustaceans) and Anthozoa (corals) lack species-level identification in a third of their occurrences. Within certain insect orders, however, taxonomic precision is as low as 0.5 per cent.
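As a rough illustration of the precision measure described above, the following minimal sketch computes, per class, the share of occurrence records identified at least to species level. The record layout (`class` and `rank` keys), the set of ranks treated as species level or finer, and the toy records are all assumptions made for this example; they are not the authors' code or data.

```python
from collections import defaultdict

# Ranks treated as "identified at least to species level".
# This set is an assumption for the sketch, not the study's definition.
SPECIES_LEVEL_RANKS = {"SPECIES", "SUBSPECIES", "VARIETY", "FORM"}


def taxonomic_precision(records):
    """Return, per class, the fraction of records identified to species level or finer."""
    totals = defaultdict(int)
    precise = defaultdict(int)
    for rec in records:
        totals[rec["class"]] += 1
        if rec["rank"] in SPECIES_LEVEL_RANKS:
            precise[rec["class"]] += 1
    return {cls: precise[cls] / totals[cls] for cls in totals}


# Made-up toy records, not GBIF data.
records = [
    {"class": "Aves", "rank": "SPECIES"},
    {"class": "Aves", "rank": "SUBSPECIES"},
    {"class": "Anthozoa", "rank": "GENUS"},
    {"class": "Anthozoa", "rank": "SPECIES"},
    {"class": "Anthozoa", "rank": "SPECIES"},
]
precision = taxonomic_precision(records)
```

In this toy input, every bird record counts as precise, while one of the three coral records is identified only to genus, mirroring the "a third lack species-level identification" pattern in miniature.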

Comparing the number of species with occurrences against the known species richness of each class, the researchers found birds and insects to be the most over- and under-represented classes, respectively. This bias is not new, but recent growth in data shows that the gap is not narrowing.

Only in three classes (birds, amphibians and ray-finned fishes) did more than half the species have more than 20 occurrences. In comparison, fewer than nine per cent of arthropod species have similar coverage.
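The coverage figure above (the share of species with more than 20 occurrences) can be sketched as follows. This is a hedged example: the function name `coverage_share`, the `(class, species)` pair layout and the toy counts are hypothetical, and the denominator used here is the number of species with at least one record, which may differ from the one used in the study.

```python
from collections import Counter, defaultdict


def coverage_share(occurrences, threshold=20):
    """Per class, the fraction of recorded species with more than `threshold` records.

    `occurrences` is an iterable of (class_name, species_name) pairs,
    one pair per occurrence record (an assumed layout for this sketch).
    """
    per_species = Counter(occurrences)
    species_total = defaultdict(int)
    species_covered = defaultdict(int)
    for (cls, _species), n_records in per_species.items():
        species_total[cls] += 1
        if n_records > threshold:
            species_covered[cls] += 1
    return {cls: species_covered[cls] / species_total[cls] for cls in species_total}


# Made-up toy counts, not GBIF data.
occurrences = (
    [("Aves", "Turdus merula")] * 25
    + [("Aves", "Parus major")] * 3
    + [("Insecta", "Polyrhachis armata")] * 1
)
shares = coverage_share(occurrences)
```

Here one of the two bird species passes the 20-record threshold and the single insect species does not.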

Relating these findings to the number of web results for different species, used as a proxy for societal preferences, the researchers' model uncovered several significant correlations showing that high societal preference is related to a high number of GBIF occurrences. This was, however, not the case for taxonomic research (proxied by the number of research papers published), where the model revealed fewer correlations, some of which were even negative. For instance, for the class Agaricomycetes, a large volume of research was related to a lower number of GBIF occurrences.

The authors conclude:

Many international projects have been developed since the Convention on Biological Diversity, illustrating an increased awareness of the astonishing diversity of functions and services that biodiversity supports. Nevertheless, while biodiversity declines at an unprecedented rate, taxonomic bias is still a burden on biodiversity studies. It is urgent that we get rid of this burden and that we start embracing the whole of biodiversity. (CC BY 4.0)


Troudet J, Grandcolas P, Blin A, Vignes-Lebbe R and Legendre F (2017) Taxonomic bias in biodiversity data and societal preferences. Scientific Reports. Springer Nature 7(1). Available at: