
    Data use 9 December 2016

    Using DNA sequences to improve taxonomic identifications

    Some species are easily misidentified because they closely resemble other species. Such errors affect large biodiversity repositories such as GBIF, so how can they be corrected without going through millions of records?

    GBIF-mediated data resources used: 4,672 species occurrences
    <p><em>Usnea longissima</em>&nbsp;(now known as&nbsp;<em>Dolichousnea longissima</em>) - or is it? <a href="https://www.flickr.com/photos/brewbooks/24288586394/">Photo</a> by J Brew via iNaturalist, licensed under <a href="https://creativecommons.org/licenses/by-sa/2.0/">CC BY-SA 2.0</a>.</p>

    Some species are easily misidentified because they closely resemble other species. This affects large biodiversity repositories such as GBIF, but how can such misidentifications be corrected without going through millions of records? In this study, researchers present a strategy that combines DNA sequence data with specimen occurrence data to find potentially misidentified specimens in such repositories. The researchers built ecological niche models for the lichen fungus Usnea longissima using georeferenced specimen records that DNA sequence data had confirmed to represent a single species. When GBIF-mediated occurrences were plotted against this verified distribution of the fungus, the outliers marked records that potentially needed taxonomic scrutiny and revision. Revision of these outliers revealed that most were, in fact, misidentified and belonged to similar species with different distributions. The study raises interesting questions about the potential of DNA sequence data to improve the quality of species information in GBIF.
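The outlier-screening idea described above can be illustrated with a minimal sketch. It assumes a simple rectangular climate envelope (the minimum and maximum of each environmental variable across the DNA-verified records) as a stand-in for a niche model; the summary does not say which modelling method the study used. The `Record` type, the function names, and all values are hypothetical and exist only to show the logic.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Record:
    record_id: str
    # Environmental values at the record's location (hypothetical):
    # here, mean annual temperature in deg C and annual precipitation in mm.
    env: Tuple[float, ...]


def fit_envelope(verified: Sequence[Record]) -> List[Tuple[float, float]]:
    """Return (min, max) per environmental variable across verified records."""
    n_vars = len(verified[0].env)
    return [
        (min(r.env[i] for r in verified), max(r.env[i] for r in verified))
        for i in range(n_vars)
    ]


def flag_outliers(
    envelope: Sequence[Tuple[float, float]], candidates: Sequence[Record]
) -> List[str]:
    """Return ids of candidates with any variable outside the envelope."""
    return [
        r.record_id
        for r in candidates
        if any(not (lo <= v <= hi) for v, (lo, hi) in zip(r.env, envelope))
    ]


# Made-up illustrative values, not data from the study.
verified = [
    Record("v1", (4.0, 1800.0)),
    Record("v2", (7.5, 2400.0)),
    Record("v3", (6.0, 2100.0)),
]
candidates = [
    Record("c1", (5.5, 2000.0)),
    Record("c2", (18.0, 600.0)),
    Record("c3", (7.0, 2500.0)),
]

envelope = fit_envelope(verified)
flagged = flag_outliers(envelope, candidates)
print(flagged)  # ids of records to send for taxonomic review
```

A flagged record is only a candidate for review, not a confirmed misidentification: as in the study, the final call still rests on re-examining the specimen.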

    Topic: Biodiversity science
    Topic: Data analysis, Data curation & quality