More than half the species in the IUCN Red List's Data Deficient category face risk of extinction

New GBIF-enabled study analysing machine learning-derived probabilities of extinction suggests that up to 85 per cent of amphibians categorized as Data Deficient could disappear

Oedipina capitalina Solis, Espinal, Valle, O'Reilly, Itgen & Townsend, 2016 observed in Comayagüela, Honduras by Josue Ramos Galdamez (CC BY-NC 4.0). This salamander was assessed in March 2019 and categorized as Data Deficient. According to the study, it is among the 25 species with the highest predicted extinction risk.

In a paper published today in Communications Biology, researchers from the Norwegian University of Science and Technology (NTNU) presented results from a novel machine learning classifier suggesting that more than half the species categorized as Data Deficient (DD) in the IUCN Red List of Threatened Species™ (Red List) are predicted to be threatened by extinction.

The threats are even more shocking for some groups, with the classifier predicting that 85 per cent of DD-listed amphibians and more than half of the marine invertebrates, insects, mammals and reptiles in the same category are threatened by extinction.

The Red List categorizes almost 15 per cent of its nearly 150,000 assessed species as Data Deficient, a label that often gives an impression of lower risk and leads to their exclusion from studies of biodiversity impact and change. In line with some previous studies, the present analysis suggests that a larger proportion of DD species than of data-sufficient species may actually be threatened.

While species listed as Data Deficient may be reasonably well understood, the available abundance and distribution data that Red List experts rely on to produce direct or indirect assessments of their risk of extinction remains inadequate.

"GBIF-mediated data was vital for training this algorithm," said Jan Borgelt, PhD candidate with the Department of Energy and Process Engineering at NTNU and lead author of the research. "The number of cells in which GBIF data was available for a species is amongst the most important features for accurately predicting the species’ extinction risk."
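To illustrate the kind of feature Borgelt describes, here is a minimal sketch of counting the grid cells in which a species has at least one occurrence record. The function name, the 0.5-degree cell size and the sample coordinates are assumptions made for illustration; the article does not state the grid the authors used.

```python
import math


def count_occupied_cells(records, cell_size_deg=0.5):
    """Count distinct grid cells containing at least one occurrence record.

    records: iterable of (latitude, longitude) pairs in decimal degrees.
    cell_size_deg: hypothetical grid resolution (an assumption, not the
    resolution used in the study).
    """
    cells = set()
    for lat, lon in records:
        # Shift coordinates so that indices start at zero, then bin them.
        row = math.floor((lat + 90.0) / cell_size_deg)
        col = math.floor((lon + 180.0) / cell_size_deg)
        cells.add((row, col))
    return len(cells)


# Made-up coordinates, for illustration only.
sample_records = [
    (14.08, -87.21),
    (14.09, -87.22),  # falls in the same 0.5-degree cell as the first record
    (14.60, -87.21),  # falls in a neighbouring cell
]

print(count_occupied_cells(sample_records))
```

A species with many records clustered in one place would still yield a low count, which is why a cell count captures spatial coverage rather than sampling effort.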

The classifier relied on more than 400 predictors. These included taxonomy, habitat preferences, expert range maps and GBIF-mediated occurrences, from which the authors derived variables on climate, land cover, human footprint, pesticide use and several other human pressures and environmental stressors.

The authors used a dataset of 28,363 data-sufficient (DS) species for training and testing the classifier, obtaining an overall accuracy of 85 per cent. When the classifier was applied to a dataset of 7,699 DD species, the predicted probability of extinction (PE) scores were higher on average for these than for DS species.
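As a rough illustration of the two quantities mentioned above, the sketch below computes an overall accuracy from predicted probabilities and known labels, then compares average scores between two groups. The 0.5 threshold, the function names and all values are assumptions chosen for the example; they are not taken from the paper.

```python
def accuracy(true_labels, predicted_probs, threshold=0.5):
    """Share of species whose thresholded prediction matches the known label.

    threshold is a hypothetical cut-off, not the one used by the authors.
    """
    predictions = [p >= threshold for p in predicted_probs]
    correct = sum(pred == truth for pred, truth in zip(predictions, true_labels))
    return correct / len(true_labels)


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Toy values for illustration only (True = threatened).
ds_true = [True, False, True, False]
ds_probs = [0.9, 0.2, 0.4, 0.1]   # scores for species with known status
dd_probs = [0.8, 0.7, 0.3]        # scores for species without known status

print(accuracy(ds_true, ds_probs))
print(mean(dd_probs) > mean(ds_probs))
```

Accuracy can only be measured on species with a known status, which is why the classifier is tested on DS species before being applied to DD species.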

The paper's authors have created an interactive website on which users can test the classifier's algorithm and explore the full dataset of DD species and their extinction risk estimates.

Earlier this year, GBIF and IUCN achieved a milestone in their collaboration with the release of a new feature that allows users to filter occurrences by global IUCN Red List Category, including Data Deficient.
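For readers who want to try the filter programmatically, the following sketch builds a query URL for the public GBIF occurrence search API. It assumes that the `iucnRedListCategory` and `scientificName` query parameters behave as described in the GBIF API documentation; the helper name is hypothetical, and the code only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_dd_occurrence_query(scientific_name=None, limit=20):
    """Build a search URL for occurrences of Data Deficient (DD) species.

    Parameter names are assumed from the GBIF API documentation; check
    them against the current documentation before relying on this.
    """
    params = {"iucnRedListCategory": "DD", "limit": limit}
    if scientific_name:
        params["scientificName"] = scientific_name
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


print(build_dd_occurrence_query("Oedipina capitalina"))
```

Passing the resulting URL to any HTTP client should return a paged JSON response, provided the parameters are accepted as assumed.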

Borgelt J, Dorber M, Høiberg MA and Verones F (2022) More than half of data deficient species predicted to be threatened by extinction. Communications Biology. Springer Science and Business Media LLC. Available at: https://doi.org/10.1038/s42003-022-03638-9