OpenBiodiv: a knowledge graph for linked open biodiversity data

Tool combines text mining, semantic workflows and graph technologies to integrate knowledge generated through hundreds of years of biodiversity research

Data resources used via GBIF: Backbone taxonomy
Coco yam (Colocasia esculenta) observed in Taiwan by jodyhsieh. Photo via iNaturalist (CC BY-NC 4.0)

Decades of biodiversity research have contributed immense knowledge about life on Earth. To fully exploit this knowledge, data must be openly and freely available, and data sources should be linkable through stable, unique identifiers.

OpenBiodiv is an Open Biodiversity Knowledge Management System that uses semantic publishing workflows, graph database technologies, and text and data mining to establish a robust infrastructure for managing biodiversity knowledge.

Presented as a Linked Open Dataset, OpenBiodiv builds on data extracted from more than 5,000 scientific articles from Pensoft Publishers and even more taxonomic treatments mediated by Plazi. All of this data is matched against an RDF version of the GBIF backbone taxonomy to ensure consistency.

The tool enables fast answers to complicated questions that would otherwise require querying numerous separate databases, for example "How many articles about taxon X has author Y published in the past 10 years?" or "Which taxon treatments mention both scientific names X and Y?" Users can perform simple searches in the web portal, while more complex queries are possible through a SPARQL endpoint.
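To give a rough sense of what a query against a SPARQL endpoint could look like, here is a minimal Python sketch for the second example question. It only builds the query text and the request URL; it does not contact any server. The endpoint address, the `ex:` prefix, and the class and property names (`ex:Treatment`, `ex:mentionsName`, `ex:label`) are hypothetical placeholders, not the actual OpenBiodiv endpoint or ontology. The only fixed convention used is that the SPARQL 1.1 Protocol accepts the query text in a `query` URL parameter.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint: a placeholder, not the real OpenBiodiv address.
ENDPOINT = "https://example.org/sparql"

# Hypothetical vocabulary: the ex: terms below are illustrative only and
# do not come from the OpenBiodiv ontology.
QUERY_TEMPLATE = """\
PREFIX ex: <http://example.org/vocab#>
SELECT DISTINCT ?treatment WHERE {{
  ?treatment a ex:Treatment ;
             ex:mentionsName ?first ;
             ex:mentionsName ?second .
  ?first  ex:label "{name_x}" .
  ?second ex:label "{name_y}" .
}}
"""


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for use in a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_query(name_x: str, name_y: str) -> str:
    """Fill the template with two scientific names."""
    return QUERY_TEMPLATE.format(
        name_x=escape_literal(name_x),
        name_y=escape_literal(name_y),
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Build a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # The second name is an arbitrary stand-in for "scientific name Y".
    query = build_query("Colocasia esculenta", "Name Y")
    print(build_request_url(ENDPOINT, query))
```

Adapting the sketch to the real service would mean replacing the placeholder endpoint and vocabulary with those documented by OpenBiodiv and sending the resulting URL with any HTTP client.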

Original article

Penev L, Dimitrova M, Senderov V, Zhelezov G, Georgiev T, Stoev P and Simov K (2019) OpenBiodiv: A Knowledge Graph for Literature-Extracted Linked Open Data in Biodiversity Science. Publications. MDPI AG 7(2): 38. Available at: https://doi.org/10.3390/publications7020038