GBIF seeks senior data-engineering contractor

Twelve-month contractor needed to build, maintain and expand the shared data infrastructure that supports the GBIF data publication and ingestion platform
DEADLINE: 10 January 2024

American beaver (Castor canadensis), observed in the United States of America. Photo © 2023 Amy Comerford via iNaturalist Research-grade Observations, licensed under CC BY-NC 4.0.

The GBIF Secretariat is looking for a Senior Data Engineer to work remotely with our informatics team. As a Data Engineer, you will be responsible for building, maintaining and expanding the shared data infrastructure that supports the GBIF data publication and ingestion platform. These systems range from data repositories and open data standards to scalable pipeline infrastructure and public-facing data services.

This will be a fully remote position, initially for twelve (12) months, with regular visits to Copenhagen and potential travel to workshops or meetings with GBIF partners. The position is open to candidates able to overlap at least half of the working day with that of the Secretariat informatics team (09:00–17:00 UTC+1).

Responsibilities

  • Building scalable data pipeline infrastructure, libraries and processes using Airflow, Spark, Beam, Hive, Elasticsearch, HBase and PostgreSQL
  • Implementing open data standards
  • Implementing data quality monitoring that alerts on possible data issues
  • Designing and evolving the shared data platform to support the work to expand the GBIF data model
  • Improving the operational excellence of the data platform

Skills and Experience

  • Seven (7) or more years of relevant industry experience
  • Experience building scalable data pipelines using tools such as Airflow, Spark, Yarn, etc.
  • Advanced working knowledge of Lucene-based search systems, ideally Elasticsearch
  • Experience deploying systems on Kubernetes
  • Advanced working knowledge of SQL, relational databases, query authoring, and implementation of custom functions
  • Excellent Java skills and experience with one or more programming languages such as Python, Scala, or R
  • Excellent English written and verbal communication skills
  • Strong interpersonal and collaboration skills
  • BS or MS degree, preferably in Computer Science, or equivalent work experience
  • Experience with Hadoop
  • Experience working with site reliability engineers
  • Experience working with the GBIF codebases and/or systems that build on the open TDWG Data Standards


Application procedure and deadline

Applications for the role must be submitted in English and should include a letter addressing your experience and qualifications for the job, your curriculum vitae and your GitHub username. Applications must be emailed to engineer-contractor@gbif.org by 10 January 2024. Please indicate in the application where you saw this advertisement.

Enquiries concerning the role can be addressed to Federico Mendez for technical matters or to the Head of Administration, Anne Mette Nielsen.

Interviews are expected to take place from mid-to-late January 2024.

GBIF—the Global Biodiversity Information Facility—is an international network and data infrastructure funded by the world’s governments and aimed at providing anyone, anywhere, free and open access to data about all types of life on Earth.

GBIF is an equal opportunities employer and accepts applications without distinction on the grounds of gender, colour, racial, social or ethnic origin, genetic features, language, religion or belief, political or any other opinion, membership of a national minority, property, birth, disability, age or sexual orientation, marital status or family situation, or any other status. Staff are recruited on the broadest possible geographical basis.

  • Topics: Infrastructure
  • Audiences: GBIF network