ShinyBIOMOD wins 2020 GBIF Ebbe Nielsen Challenge

International team led by Ian Ondo of Royal Botanic Gardens, Kew, develops platform that improves visualization, accounts for data biases and familiarizes users with best practices in species distribution modelling

Ian Ondo of Royal Botanic Gardens, Kew, led the team developing ShinyBIOMOD, the winning entry in the 2020 Ebbe Nielsen Challenge. Photo courtesy of Ian Ondo.

A multinational team led by Ian Ondo, a research assistant at the Royal Botanic Gardens, Kew, has won first prize in the 2020 GBIF Ebbe Nielsen Challenge. Their winning entry, ShinyBIOMOD, is a user-friendly open-source interface that extends the functionalities of biomod2, a well-established species ensemble modelling platform.

Ondo developed ShinyBIOMOD with RBG Kew Director of Science Alexandre Antonelli (also of the Gothenburg Global Biodiversity Centre in Sweden) and Samuel Pironon, alongside Wilfried Thuiller and Maya Gueguen of the Laboratoire d’Ecologie Alpine, a joint research unit of the CNRS at Université Grenoble-Alpes.

Their R-based toolbox aims to help users of all experience levels build species distribution models (SDMs) and ecological niche models (ENMs), providing them with step-by-step options for exploring, selecting and previewing additional data throughout the modelling process.

SDMs and ENMs are tools that allow biogeographers and conservation biologists to predict the potential distribution of species across space and time. They have become essential to research, both for understanding the drivers behind shifting spatial patterns and for detecting variation in species ranges, especially within the context of global environmental changes. Developing such models, however, is a complex process.

"The processes of building and accurately interpreting ecological niche models follows a series of potentially error-prone steps and requires data manipulation skills and a reasonable understanding of statistics—all of which makes it very challenging for non-GIS experts to produce," says Ondo. "ShinyBIOMOD gives people access to a full range of statistical tools that examine relationships between species and their environment while giving them an understanding of good practices and uncertainties associated with the predictions."

The expert jury selected five more winners in the Challenge, an annual incentive prize that honours the memory of Dr Ebbe Schmidt Nielsen, an inspirational leader in the fields of biosystematics and biodiversity informatics and one of the principal founders of GBIF.

"Knowing where species live and how they interact will help humanity address the major challenges we face in the context of the biodiversity crisis and global climate change," said Jurate de Prins of the Société royale belge d'Entomologie-Koninklijke Belgische Vereniging voor Entomologie (SRBE-KBVE), GBIF Science Committee member and the Challenge jury chair. "The ShinyBIOMOD team has applied the best practices of bioinformatics, linking their platform to key biodiversity and environmental data resources to enable others to explore, understand and reveal the factors that shape species distribution patterns across space and time."

Both second-place winners hail from Belgium's Meise Botanic Garden. With his entry for Linking nomenclature to type specimens, Maarten Trekels automates the process of bringing together the essential materials needed for botanical taxonomic revisions. His MeiseBG colleague Quentin Groom prepared InteractIAS, which combines and visualizes species interactions and occurrence data to support national expert risk assessments on invasive species—tasks reflecting his fellowship with the Centre for Invasion Biology at South Africa's Stellenbosch University.

A set of three entries by individuals—two of them previous prizewinners—earned the final places in this year's Ebbe Nielsen Challenge.

  • DNA barcode browser from 2018 Challenge co-winner Rod Page of the University of Glasgow, which adds sequence search and phylogenetic information and enriches the information display for the increasing amount of DNA-derived occurrence records available through GBIF
  • Voyager, a toolkit developed by Ivvet Abdullah-Modinou of the British Science Association and Ben Scott of the Natural History Museum, London, that fits occurrence data to historical nautical voyages and supports inferences that can correct, improve or even add to associated marine specimen records
  • Mass Georeferencing Tool, an open-source tool created by Luis J. Villanueva of the Smithsonian Institution (who shared second prize in 2018) that separates georeferencing from locality selection and treats it as a service with the goal of reducing errors and scaling up to meet the ongoing demands of digitizing a collection that grows by thousands of new records every month

As this year’s first-place winner, the ShinyBIOMOD team will receive €15,000 from a total prize pool of €34,000. The two second-prize entries will each receive €5,000, while the three third-prize winners will each receive €3,000.

2020 GBIF Ebbe Nielsen Challenge prize winners

First Prize

ShinyBIOMOD: An R application for modelling species distribution
This R application helps users of all experience levels build species distribution models (SDMs) and ecological niche models (ENMs) by guiding them step-by-step through the modelling process. ShinyBIOMOD provides a user-friendly interface for the well-known R package biomod2 that simplifies the selection and fine-tuning of data and gives access to a full range of statistical tools that examine relationships between species and their environments. Users also gain familiarity with good modelling practices in the course of an intuitive workflow that allows them to explore and visualize the effects of their choices, improving their understanding of the uncertainty behind modelled predictions.

Second Prize

Linking nomenclature to type specimens
Type specimens are essential for any taxonomic revision, but they can be frustratingly difficult to find, whether they are mislabelled, not labelled as types or scattered across the world's collections. Drawing on open taxonomic data, this Jupyter Notebook application assembles an overview of the botanical materials needed to revise a taxon, starting with a search of GBIF for type materials. Additional scripts link them to original taxonomic protologues from Plazi's Treatment Bank, collection identifiers from Index Herbariorum and synonyms from both International Plant Names Index (IPNI) and Plants of the World Online. Users can also enrich the available taxonomic information by identifying possible mistakes in specimen naming, flagging potentially unlabelled types and leveraging knowledge of related specimens to discover types hidden in the collections. video | GitHub
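As a rough illustration of the first step described above, searching GBIF for type materials, the following Python sketch builds a query against GBIF's public occurrence search API. It is not taken from the winning notebook: the helper names `build_type_search_url` and `fetch_type_records`, the choice of type statuses and the example taxon are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_type_search_url(scientific_name,
                          type_statuses=("HOLOTYPE", "ISOTYPE", "LECTOTYPE"),
                          limit=20):
    """Build a search URL for occurrences flagged as type material.

    The selection of type statuses is an illustrative assumption;
    repeating the typeStatus parameter asks for any of the listed values.
    """
    params = [("scientificName", scientific_name)]
    params += [("typeStatus", status) for status in type_statuses]
    params.append(("limit", limit))
    return GBIF_OCCURRENCE_SEARCH + "?" + urlencode(params)


def fetch_type_records(scientific_name, **kwargs):
    """Run the search (requires network access) and return the result list."""
    url = build_type_search_url(scientific_name, **kwargs)
    with urlopen(url, timeout=30) as response:
        return json.load(response)["results"]


if __name__ == "__main__":
    # Example taxon chosen only for illustration.
    print(build_type_search_url("Coffea arabica"))
```

Calling `fetch_type_records("Coffea arabica")` would return the matching occurrence records as dictionaries, which later steps could link to other sources by their identifiers.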

InteractIAS: A Jupyter notebook to support expert risk assessment on invasive species
Expert risk assessments of invasive and alien species play an important role in informing national environmental, trade and infrastructure policies. The interactions of invasives with other species offer the best means of assessing such risks, but in the absence of sophisticated models for ecosystems or complex species interaction networks, national experts often lack evidence for evaluating the possible risks from new invasions, particularly for invasive species from other continents. InteractIAS combines occurrence records from the GBIF network with species interaction data from Global Biotic Interactions (GloBI), enabling users to visualize the resulting network of interactions weighted by species' areas of occupancy. By highlighting where a country's existing species may face direct or indirect impacts from invasives, the application can help guide experts’ risk assessments and even trigger targeted research. video | GitHub
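The idea of direct and indirect impacts can be sketched in a few lines of Python. This is a simplified illustration, not InteractIAS itself: the function name, the treatment of interactions as undirected pairs, the two-step definition of "indirect" and the placeholder species names are all assumptions, and the weighting simply attaches each species' area of occupancy to it.

```python
from collections import defaultdict


def direct_and_indirect_impacts(interactions, invasive, occupancy_km2):
    """Group species by their distance from an invasive in an interaction network.

    interactions  -- iterable of (species_a, species_b) pairs, treated as undirected
    invasive      -- name of the invasive species of interest
    occupancy_km2 -- mapping of species name to area of occupancy (hypothetical units)
    """
    neighbours = defaultdict(set)
    for a, b in interactions:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Species that interact with the invasive directly.
    direct = neighbours[invasive] - {invasive}

    # Species one further step away, reached only through a direct partner.
    indirect = set()
    for species in direct:
        indirect |= neighbours[species]
    indirect -= direct | {invasive}

    return {
        "direct": {s: occupancy_km2.get(s, 0.0) for s in sorted(direct)},
        "indirect": {s: occupancy_km2.get(s, 0.0) for s in sorted(indirect)},
    }


if __name__ == "__main__":
    # Placeholder names and values, for illustration only.
    toy_interactions = [
        ("invader", "native_plant"),
        ("native_plant", "native_bee"),
        ("native_bee", "native_bird"),
    ]
    toy_occupancy = {"native_plant": 120.0, "native_bee": 40.0}
    print(direct_and_indirect_impacts(toy_interactions, "invader", toy_occupancy))
```

In practice the interaction pairs would come from GloBI and the occupancy values from GBIF-mediated occurrence records, as the description above indicates.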

Third Prize

DNA barcode browser
Genomic data from DNA barcoding and metagenomics is becoming an increasingly important contribution to GBIF. The taxonomic name, locality and date that comprise traditional occurrence records readily lend themselves to display in lists or as dots on a map, but such visualizations ignore the very thing that makes sequence data unique: the sequence. This proof of concept explores methods for integrating relevant tools for DNA-based occurrences, particularly the addition of a sequence search interface and dynamic "alignment-free" construction of phylogenetic trees. Future extensions could integrate measures of phylogenetic diversity and sample geographically comparable areas, using grids to construct a global map of phylogenetic diversity.
video | GitHub repo
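The description above does not say which alignment-free method the browser uses. One common approach, shown here purely as an assumed stand-in, compares sequences by their k-mer (short substring) counts and turns the comparison into a distance that a tree-building method could then consume. The function names and the choice of cosine distance are illustrative.

```python
from collections import Counter
from math import sqrt


def kmer_profile(sequence, k=3):
    """Count every overlapping substring of length k in a sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def cosine_distance(profile_a, profile_b):
    """Return 1 minus the cosine similarity of two k-mer count profiles."""
    shared = profile_a.keys() & profile_b.keys()
    dot = sum(profile_a[kmer] * profile_b[kmer] for kmer in shared)
    norm_a = sqrt(sum(count * count for count in profile_a.values()))
    norm_b = sqrt(sum(count * count for count in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        # A sequence shorter than k has no k-mers; treat it as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(sequences, k=3):
    """Compute pairwise distances for a mapping of name to sequence."""
    profiles = {name: kmer_profile(seq, k) for name, seq in sequences.items()}
    names = sorted(profiles)
    return {
        (a, b): cosine_distance(profiles[a], profiles[b])
        for a in names
        for b in names
    }


if __name__ == "__main__":
    # Short made-up sequences, for illustration only.
    toy = {"seq1": "ACGTACGTAC", "seq2": "ACGTACGTTC", "seq3": "GGGGCCCCGG"}
    for pair, distance in distance_matrix(toy).items():
        print(pair, round(distance, 3))
```

Because no alignment is computed, distances of this kind can be calculated on demand for whatever set of sequences a user selects, which is what makes dynamic tree construction feasible.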

Voyager
By fitting GBIF-mediated occurrence records to the tracks of historical maritime voyages of discovery, this web application visualizes the progress of those trips as collecting expeditions. The result reveals the work of the many naturalists who crewed these expeditions, gathering specimens that still form the backbone of many natural history collections around the world. Perhaps more importantly, combining GBIF-mediated records with open ship-log data from the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) enables the community to begin correcting, augmenting, making inferences about and even "rediscovering" marine data that might be absent from GBIF or lacking key metadata. GitHub
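One way to picture the kind of inference described above is to fill in missing coordinates from a ship's logged position on the day a specimen was collected. The Python sketch below does only that and makes several assumptions: the record fields (`event_date`, `lat`, `lon`), the one-position-per-day log and the `coordinates_inferred` flag are hypothetical and do not reflect the actual GBIF or ICOADS data formats or Voyager's own code.

```python
from datetime import date


def infer_positions(records, ship_log):
    """Fill missing coordinates from the ship's position on the collection date.

    records  -- list of dicts with 'event_date', 'lat' and 'lon' keys (hypothetical schema)
    ship_log -- mapping of date to a (lat, lon) tuple (hypothetical daily log)
    Returns new dicts; the input records are left unchanged.
    """
    enriched = []
    for record in records:
        updated = dict(record)
        if updated.get("lat") is None or updated.get("lon") is None:
            position = ship_log.get(updated["event_date"])
            if position is not None:
                updated["lat"], updated["lon"] = position
                # Mark the value as inferred so it is never mistaken for a recorded one.
                updated["coordinates_inferred"] = True
        enriched.append(updated)
    return enriched


if __name__ == "__main__":
    # Made-up log entries and records, for illustration only.
    toy_log = {date(1875, 3, 1): (-10.5, 142.2), date(1875, 3, 2): (-9.8, 143.0)}
    toy_records = [
        {"id": "a", "event_date": date(1875, 3, 1), "lat": None, "lon": None},
        {"id": "b", "event_date": date(1875, 3, 2), "lat": -9.7, "lon": 143.1},
        {"id": "c", "event_date": date(1875, 4, 1), "lat": None, "lon": None},
    ]
    for row in infer_positions(toy_records, toy_log):
        print(row)
```

Flagging inferred values separately keeps the original record intact, so a curator can review each suggestion before accepting it.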

Mass Georeferencing Tool
Georeferencing is an important but complex step in the process of preparing and digitizing museum specimen data. That challenge increases considerably with the 146 million specimens in the Smithsonian's global-scale collections, which add thousands of new records every month, both from new specimens and from newly digitized historical specimens collected over the past 150 years. This open-source tool is a critical component in new workflows that the Smithsonian is preparing to support georeferencing of records on a massive scale. By partitioning off technical, computing and software elements and treating georeferencing as a separate service, the tool lets collections staff direct their focus toward selecting the correct locality for each record and spend less time worrying about the software and data. The approach is also expected to reduce errors and empty data fields and fix other geospatial issues by automating the quality assurance and quality control processes.

Jury for 2020 Ebbe Nielsen Challenge