Tweets, chirps and hoots: collaborative project Xeno-canto adds 170,000 sound-based bird records

New Dutch dataset more than doubles the number of species occurrences with audio recordings

Great spotted woodpecker (Dendrocopos major) occurrences: Photo by katunchik (CC BY-NC 4.0) and audio (illustrated by sonogram by Xeno-canto) by Bram Vogels (CC BY-NC-SA 4.0).

Images and videos recorded by professional researchers and citizen scientists often provide evidence of species occurring in time and space. But sound recordings can be equally important for identifying species–especially those more easily heard than seen.

While more than 31 million species occurrences available in GBIF.org have images attached to them, only about 100,000 records have accompanying sound files. However, a new dataset added last month has more than doubled this number, bringing 170,041 more audio-enabled occurrence records to GBIF.org.

Xeno-canto (XC) is a long-term collaborative project dedicated to collecting and sharing the sounds of wild birds from across the world. Started in 2005, XC accepts contributions from anyone–professional researchers, dedicated amateur birders or aspiring citizen scientists–who is willing and able to record bird sounds. More than 4,000 XC contributors have recorded and uploaded the sounds of 10,063 avian species–data that has already been used in scientific studies (e.g. Avendaño et al., 2017). The dataset metrics can also give users a sense of the taxonomic and geographical scope of the XC contribution.

Identifications of recordings are subject to crowd-sourced validation by the Xeno-canto community, ensuring accurate, high-quality species identification. As data becomes discoverable through GBIF.org, XC contributors not only help popularize bird sound recordings worldwide, but also add knowledge about avian distributions for use in research and policy-making.

XC intends to update the dataset on GBIF.org regularly, as users add around 5,000 new recordings every month. The next update is expected to bring the total to more than 250,000 records, a spike prompted by current users adopting more open licences for the occurrence information (while keeping the different licences applied to the audio files).

As multimedia evidence in occurrence records has become more frequent, it's important to make it easy for users to view, watch and listen to it. The arrival of the XC dataset prompted improvements to the GBIF.org occurrence pages, making the interface for playback and viewing more accessible and intuitive. In addition to viewing images, users can now also watch videos and listen to audio recordings–directly on the occurrence page.

Members of the GBIF community in the Netherlands play a critical role in making the XC dataset available through GBIF.org. Naturalis Biodiversity Center has provided the project with long-term support and funding, and persistent, multi-year engagement by the staff of NLBIF, the Dutch national node, has led to them hosting this version of the project dataset.

While audio-enabled records available in GBIF.org are still dominated by the avian variety, users of GBIF.org may be surprised by the taxonomic breadth of other audio content–including sounds of spiders, bees and monkeys–and even an underwater recording of spawning cod.

The XC project is also featured in two Dutch articles–linked to below.