Not just a numbers game: Small collections with big impact

Study suggests that small collections contribute unique information with the power to improve species distribution models significantly

GBIF-mediated data resources used: 15,967 species occurrences

Great bulrush (*Schoenoplectus tabernaemontani*), one of the species modelled in the study. Photo by Arild Omberg via the Norwegian Species Observation Service, licensed under CC BY 4.0.

Natural history collections of many sizes contribute data to GBIF.org, where it is used extensively every week to model the distribution of species for a variety of purposes. Every record counts, but small collections may be more regional in scope than larger ones, with a specific taxonomic or ecological focus.

To quantify the impact of small collections, the authors of this study built distribution models for five test-case species using GBIF-mediated data partitioned by the size of the source collection. Despite having fewer records, the dataset based on small collections contributed unique information. Across all test species, combining it with the data from large collections led to more refined and robust predictions of habitat suitability than the large collections produced alone.
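The partitioning step can be illustrated with a minimal Python sketch. It is not the authors' workflow: the `collection` field, the toy records, and the record-count threshold are all hypothetical, and this summary does not state how the study defined a "small" collection.

```python
from collections import Counter


def partition_by_collection_size(records, threshold):
    """Split occurrence records into (small, large) by source-collection size.

    `records` is a list of dicts with a hypothetical "collection" key.
    A collection is treated as "small" if it contributes fewer than
    `threshold` records; this cut-off is an illustrative assumption,
    not the criterion used in the study.
    """
    counts = Counter(r["collection"] for r in records)
    small = [r for r in records if counts[r["collection"]] < threshold]
    large = [r for r in records if counts[r["collection"]] >= threshold]
    return small, large


# Made-up occurrence records for illustration only.
records = [
    {"collection": "A", "lat": 43.6, "lon": -84.8},
    {"collection": "A", "lat": 42.3, "lon": -83.7},
    {"collection": "A", "lat": 44.8, "lon": -85.6},
    {"collection": "B", "lat": 46.5, "lon": -87.4},
    {"collection": "C", "lat": 41.9, "lon": -86.2},
]

small, large = partition_by_collection_size(records, threshold=3)

# Two candidate model inputs: large collections alone, and both combined.
large_only = large
combined = large + small
```

Under these assumptions, `large_only` and `combined` would each be passed to the same modelling routine so that the resulting habitat-suitability predictions can be compared.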

While using large numbers of species occurrences as input for distribution models can improve performance and reliability, the present study suggests that the nature of the data source can be important too. This potential of small, regional collections should be considered when planning digitization and data publication efforts.

Link to original article

Glon HE, Heumann BW, Carter JR, Bartek JM and Monfils AK (2017) The contribution of small collections to species distribution modelling: A case study from Fuireneae (Cyperaceae). Ecological Informatics. Elsevier BV 42: 67–78. Available at: https://doi.org/10.1016/j.ecoinf.2017.09.009.

Countries of researcher: United States of America
Topics: Species distributions
Audiences: Data users, GBIF network
Purposes: Data analysis