Call for proposals to enhance cloud-based use of GBIF-mediated data

Selected contractor will implement protocols for using GBIF-mediated occurrence data to analyse phylogenetic diversity in the Microsoft Azure environment—deadline: 15 Nov 2021

Map and cluster analysis showing phylogenetic similarity relationships among centres of endemism for Australian Acacia. Figure from Mishler B, Knerr N, González-Orozco C et al. (2014) Phylogenetic measures of biodiversity and neo- and paleo-endemism in Australian Acacia. Nature Communications 5: 4473. DOI: 10.1038/ncomms5473

The GBIF Secretariat seeks applications from individuals or institutions that can improve the quality and usability of GBIF-mediated data in cloud-computing environments. With funding through a grant from the GEO-Microsoft Planetary Computer Programme, the selected contractor will implement protocols for analysing phylogenetic diversity in the Microsoft (MS) Azure environment using Biodiverse software, occurrence data from the GBIF network, and phylogenies from OpenTree of Life (OToL).

The GBIF Secretariat will lead the project with the support of Shawn Laffan (Biodiverse software | University of New South Wales) and Emily Jane McTavish (OpenTree of Life | University of California Merced), and in partnership with the Phylogenetic Diversity Task Force (PDTF) of the IUCN Species Survival Commission.

Background and scope of work

In May 2021, GBIF began placing monthly snapshots of GBIF occurrences in the Microsoft Azure Data Catalogue. The contracted individual or institution will extend this work, first by implementing Biodiverse software in MS Azure and then by developing data-filtering pipelines for the GBIF-mediated occurrence data in the Microsoft Azure Data Catalogue and assessing the quality of their output.
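As a rough illustration of what one step of such a filtering pipeline might look like, the sketch below applies a few common occurrence-quality checks to records held as Python dictionaries. The field names follow Darwin Core terms, but the thresholds, the accepted record types and the sample records are assumptions made for this example; the call itself does not prescribe any filter criteria.

```python
from typing import Iterable, List

# Hypothetical settings; the call does not specify any thresholds.
MAX_UNCERTAINTY_M = 10_000
ACCEPTED_BASES = {"HUMAN_OBSERVATION", "PRESERVED_SPECIMEN"}


def keep_record(rec: dict) -> bool:
    """Return True if an occurrence record passes the illustrative checks."""
    lat = rec.get("decimalLatitude")
    lon = rec.get("decimalLongitude")
    # Coordinates must be present and within valid ranges.
    if lat is None or lon is None:
        return False
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return False
    # (0, 0) is a frequent placeholder for missing coordinates.
    if lat == 0 and lon == 0:
        return False
    # Reject records whose stated uncertainty exceeds the threshold.
    unc = rec.get("coordinateUncertaintyInMeters")
    if unc is not None and unc > MAX_UNCERTAINTY_M:
        return False
    if rec.get("basisOfRecord") not in ACCEPTED_BASES:
        return False
    # A species-level name is needed for later matching to a phylogeny.
    return bool(rec.get("species"))


def filter_occurrences(records: Iterable[dict]) -> List[dict]:
    """Keep only the records that pass every check."""
    return [r for r in records if keep_record(r)]


# Made-up sample records for demonstration only.
sample = [
    {"species": "Acacia dealbata", "decimalLatitude": -35.3,
     "decimalLongitude": 149.1, "coordinateUncertaintyInMeters": 50,
     "basisOfRecord": "PRESERVED_SPECIMEN"},
    {"species": "Acacia dealbata", "decimalLatitude": 0,
     "decimalLongitude": 0, "basisOfRecord": "HUMAN_OBSERVATION"},
    {"species": "Acacia pycnantha", "decimalLatitude": -34.9,
     "decimalLongitude": 138.6, "coordinateUncertaintyInMeters": 50_000,
     "basisOfRecord": "HUMAN_OBSERVATION"},
    {"species": None, "decimalLatitude": -30.0,
     "decimalLongitude": 140.0, "basisOfRecord": "HUMAN_OBSERVATION"},
]
filtered = filter_occurrences(sample)
```

In a cloud setting the same checks would more likely be expressed as a query over the monthly snapshot than as a loop over dictionaries; the point here is only the kind of rule a pipeline might encode.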

The filtered subset of data will then be name-matched with the latest Open Tree of Life phylogeny, producing spatially explicit phylogenetic diversity products for analysing various clades and geographic areas. The PDTF will help assess the quality of the resulting data products.
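The name-matching step could, in its simplest form, be an exact match on normalised names. The sketch below assumes that tree tip labels look like `Genus_species_ott<digits>`; that label format, the identifiers and the example names (including the made-up "Acacia fictitia") are assumptions for illustration. A real workflow would probably rely on a taxonomic name resolution service rather than string comparison alone.

```python
import re
from typing import Dict, Iterable, List, Tuple


def normalise(name: str) -> str:
    """Lower-case a name, turn underscores into spaces, collapse whitespace."""
    return re.sub(r"\s+", " ", name.replace("_", " ")).strip().lower()


def tip_to_name(label: str) -> str:
    """Strip a trailing '_ott<digits>' suffix (assumed label format)."""
    return normalise(re.sub(r"_ott\d+$", "", label))


def match_names(
    gbif_names: Iterable[str], tip_labels: Iterable[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Map each occurrence name to a tip label; collect names with no match."""
    tips = {tip_to_name(t): t for t in tip_labels}
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    for name in gbif_names:
        key = normalise(name)
        if key in tips:
            matched[name] = tips[key]
        else:
            unmatched.append(name)
    return matched, unmatched


# Hypothetical inputs; the numeric identifiers are invented.
names = ["Acacia dealbata", "Acacia  Pycnantha", "Acacia fictitia"]
tips = ["Acacia_dealbata_ott111", "Acacia_pycnantha_ott222"]
matched, unmatched = match_names(names, tips)
```

Keeping the unmatched names as a separate list makes it easy to report the match rate, which is one simple measure of data quality for this step.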

The goal of the project is to generate automated monthly data products suitable for use in research on phylogenetic diversity by the end of the grant period.

Primary tasks include:

  • Preparing workflows for filtering GBIF-mediated data
  • Assessing the quality of filtered data
  • Matching names between OToL and GBIF-mediated data
  • Running and assessing metrics on phylogenetic diversity that integrate OToL and GBIF-mediated occurrence data using Biodiverse
  • Drafting a first-authored manuscript based on the work in either a methodological paper or an analysis of a large clade
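To make the phylogenetic diversity task more concrete, here is a minimal sketch of Faith's PD computed per grid cell: for each cell, the branch lengths on the paths from every taxon present to the root are summed, counting each branch once. The toy tree, its branch lengths, the one-degree grid and the sample coordinates are all assumptions, and this is not a description of how Biodiverse implements the metric.

```python
import math
from collections import defaultdict

# Toy rooted tree: node -> (parent, length of the branch above the node).
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 1.0),
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "C": ("root", 3.0),
}


def faith_pd(taxa, tree=TREE):
    """Sum branch lengths from each taxon to the root, each branch once."""
    visited = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # Walk towards the root until reaching a branch already counted.
        while node is not None and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


def cell_id(lat, lon, size=1.0):
    """Index of the square grid cell (in degrees) containing a point."""
    return (math.floor(lat / size), math.floor(lon / size))


def pd_by_cell(occurrences, size=1.0):
    """Group (taxon, lat, lon) records by cell and compute PD for each cell."""
    taxa_in_cell = defaultdict(set)
    for taxon, lat, lon in occurrences:
        if taxon in TREE:  # skip names that are not tips of the tree
            taxa_in_cell[cell_id(lat, lon, size)].add(taxon)
    return {cell: faith_pd(taxa) for cell, taxa in taxa_in_cell.items()}


# Made-up occurrences: A and B share a cell, C falls in another.
occurrences = [
    ("A", -35.2, 149.1),
    ("B", -35.7, 149.9),
    ("C", -20.5, 130.5),
]
result = pd_by_cell(occurrences)
```

With this toy input, the cell holding A and B gets the two tip branches plus their shared internal branch, while the cell holding only C gets C's single branch. Whether the path to the root is included is a convention; this sketch includes it.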

The selected candidate is expected to carry out the work remotely or at their host institution. Candidates must show the ability to work independently and meet virtually with project leads on three continents.

Preferred skills and experience

The incumbent should possess outstanding bioinformatics skills, good knowledge of GBIF-mediated data and an understanding of cloud computing and phylogenetic analyses.

Preferred skills include:

  • Experience in analysis of GBIF-mediated data
  • Knowledge of R, Perl or Python, and of working with APIs
  • Demonstrable experience developing open-source software and repeatable data-processing workflows
  • Knowledge of phylogenetic diversity, as shown by other phylogenetic and spatially explicit biodiversity analyses
  • Experience in cloud-based or distributed computing systems
  • Advanced degree in a field relevant to biodiversity or informatics or GBIF’s work, or equivalent experience
  • Full professional proficiency in English
  • Demonstrated experience in writing scientific publications
  • Ability to work remotely with limited supervision

This is an exciting opportunity for the right individual to strengthen cloud-based biodiversity informatics skills while working with a global data infrastructure. The contract is open to applicants at any professional stage, from graduate student or postdoc to sabbatical scientist. If your motivation, interests and experience match these requirements, we look forward to hearing from you.


Payment for the contract will be US$60,000. The length of the term will depend on the contractor's experience and on progress in meeting project deliverables, but the contract and all deliverables must be completed within one year.

Application procedure and deadline

The deadline for receipt of email applications is 15 November 2021.

Applications must be in English and must include a letter addressing the candidate's experience, qualifications and availability for the work, a curriculum vitae, and a sample scientific publication. Please indicate in the application where you saw this advertisement. Enquiries concerning the contract can be addressed to Executive Secretary Joe Miller.

If the work is to be carried out in parallel with another role, the Secretariat will require written confirmation from the candidate's employer indicating their awareness of the additional hours committed through this contract.

Candidate interviews are expected to take place starting in late November 2021. The successful candidate should be prepared to start work in January 2022 or soon thereafter.

GBIF—the Global Biodiversity Information Facility—is an international network and data infrastructure funded by the world’s governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.

GBIF is an equal opportunities employer and accepts applications without distinction on the grounds of gender, colour, racial, social or ethnic origin, genetic features, language, religion or belief, political or any other opinion, membership of a national minority, property, birth, disability, age or sexual orientation, marital status or family situation, or any other status. Staff are recruited on the broadest possible geographical basis.