Review reveals how biodiversity data supports research on human diseases

First systematic analysis of studies detailing links between wild biodiversity and human health outlines trends and contrasts in patterns of data use in support of the One Health approach

Lutzomyia evansi, a vector of the parasitic disease leishmaniasis, collected in Colombia. Photo 2021 Daniel J. Drew for Yale Peabody Museum, public domain under CC0.

A GBIF-commissioned study has outlined the contribution of open data on biodiversity to research on the complex links between wild organisms and human diseases, highlighting the possible benefits of better coordination on data sharing around global disease surveillance.

An international team led by veterinary ecologist Dr Francisca Astorga produced the first-ever systematic analysis of the patterns of use of biodiversity data for human health, published in the journal One Health.

The term "One Health" describes a collaborative, transdisciplinary approach ranging from local to global scales and recognizing the intimate connections between the health and well-being of humans, ecosystems and animals, both wild and domesticated. Although this approach was not envisioned as one of the primary uses of GBIF at its outset, the network has supported a growing body of peer-reviewed literature using its data that broadly seeks to improve understanding of the links between wild organisms and human health. The emergence of the COVID-19 pandemic provided a stark reminder of the urgent need to support and encourage further interdisciplinary and collaborative research along such lines.

Astorga and her co-authors, who include members of the GBIF Secretariat staff and the "vectors" task group, began by generating two lists of scientific studies related to human health: a "positive list" that used GBIF-mediated data on biodiversity and a "negative list" that did not. The positive list consisted of 107 studies published between 2015 and 2021 that relied on biodiversity data from the GBIF network to explore aspects of human infectious diseases. The negative list comprised a random but equally sized group of papers whose keyword terms (e.g. "zoonoses," "bat borne disease," "rodent borne") matched those found in the positive list.

The positive and negative lists both displayed consistent, similar annual growth rates of about 18 per cent, with papers from both groups appearing in four of the domain's most relevant journals. However, bibliometric analyses revealed clear distinctions, as the positive list concentrated on biological and ecological sciences and the negative list on medical, public health and veterinary topics. In total, the positive list examined 42 diseases compared to the 34 studied in the other group, although malaria and leishmaniasis were the most frequently studied diseases in both.

Fig. 4 from Astorga et al. 2023, showing the number of variables by taxon class (Y-axis), epidemiological level (bar colour) and the use (or not) of GBIF-mediated data.

Key contrasts emerged from the analyses. For example, the studies citing GBIF-mediated data considered a larger number of species—2,669 in total, for an average of 32.5 per study—than did those in the other group (1,136 species, or 12 per study). In addition to including more species, the studies on the positive list tended to include more variables as well as topics and terms related to more complex analyses, like theoretical models and ecological niche modelling, which often require robust data quality.

"The findings suggest that access to standardized, interoperable data provided by integrated repositories may facilitate more complex, broader scale analyses," said Sylvie Manguin, Full Research Professor/Director at Institut de Recherche pour le Développement (IRD) and a previous chair of the GBIF task group. "More broadly, the review marks an important step in acknowledging, strengthening and promoting the current and potential contributions of biodiversity data in understanding infectious disease dynamics."

Another notable trend was an apparent dearth of data on pathogen species. Almost half (52) of the positive studies lacked such data, but the same gap also existed for one third of the negative list. The reduced availability of pathogen data seemed not to obstruct research outputs, as 49 of the studies that used GBIF-mediated data explored viral diseases. This result indicates not only the possibility of developing disease research, modelling and risk assessments using occurrences only for vector and host species, as well as room for improvement, but also the value of occurrence data on hosts and vectors.

The issue of common data formats also arose, as the use of standardized data proved the exception, not the rule, in the negative list. In addition, most data journals and repositories for supplementary materials do not require standardized protocols.

"Our study confirms that GBIF and other biodiversity repositories play a key role in providing infectious disease research with non-pathogen data, since documenting all occurrence organisms involved in disease circulation is fundamental to the One Health approach," said Astorga, a researcher and professor at Universidad Mayor and Universidad Andrés Bello in Santiago de Chile. "The lack of centralized repositories for multi-species pathogens may begin to explain researchers' use of multiple and scattered sources, but by connecting data managers across ecological, veterinarian, biomedical and human health-related fields, we can improve the availability and interoperability of information all the species involved in infectious disease dynamics."

In addition to mapping a baseline of opportunities from crosslinking data across disciplines, diseases and taxa, the final appendix of the study's supplementary materials outlines gaps and challenges and proposes recommendations aimed at improving support of biodiversity-focused research into infectious diseases in humans and increasing data integration with the public health, medical and veterinary domains.


Astorga F, Groom Q, Shimabukuro PHF, Manguin S, Noesgaard D, Orrell T, Sinka M, Hirsch T & Schigel D (2023) Biodiversity data supports research on human infectious diseases: Global trends, challenges, and opportunities. One Health 16: 100484. https://doi.org/10.1016/j.onehlt.2023.100484