The demand for technical support, reliability and resilience continues to grow, driven not just by the greater volumes of data available through GBIF (§3) but also by the increasing ranks of people and organizations who rely on GBIF for both data sharing and data access (§5).
Wider sharing of open biodiversity data and effective tracking of its reuse have made data publishing more important to institutions and stakeholders. The Integrated Publishing Toolkit (IPT) (https://www.gbif.org/ipt) plays perhaps the network’s most critical role in this area, with 1,350 organizations in 74 countries relying on it to manage more than 22,000 datasets. To support the expansion of BID and BIFA into areas where data-publishing institutions have been sparsely represented, GBIF introduced cloud-hosted IPTs for Africa, Asia, Latin America and the Caribbean, and Europe and Central Asia in 2019.
Interest in these shared repositories and the need to support investigations of new data models (see §4) helped prompt a steady stream of improvements. Following a TDWG workshop in 2020 focused on user needs, GBIF developers released 15 versions of the IPT between 2020 and 2022, while volunteers continued the work of maintaining translated IPT interfaces in seven languages.
A more reliable and robust system better equipped to manage increasing volumes of biodiversity data: that was the result of an extended collaboration to develop a data processing and indexing codebase shared by both GBIF and the Atlas of Living Australia (ALA), which hosts both its country’s GBIF delegation and node. ALA’s users benefited greatly from these core infrastructure improvements, which are responsible for delivering species occurrence data even as those data grow by more than one million records a year. The better performance and responsiveness of searches, maps and analytical tools also reduced ALA’s annual cost of infrastructure operations by 43 per cent, or nearly €50,000, leading other GBIF member countries, namely Germany, New Zealand and Sweden, to explore adoption of these same data pipelines.