Developed in parallel with explorations into diversifying the GBIF data model, the first release candidate of version 3.0 of the Integrated Publishing Toolkit (IPT) will offer many users their first glimpse of some major evolutionary changes ahead for this widely used, free and open-source software.
As a release candidate, this software is intended only for testing and preview purposes, not production use. Users should set up their installations in test mode and register published datasets with GBIF's dedicated testing environment. Descriptions of the new features and how to use them will be added to the IPT user manual soon. Feedback is most welcome and greatly appreciated through the GitHub IPT repository.
The release maintains all the functionality of previous versions of the IPT while accommodating richer data for testing environments by removing the constraints of the star schema at the heart of the Darwin Core Standard. The greater flexibility this affords enables support for new and emerging Frictionless Data Packages, in particular:
- Camtrap DP, a community-developed data exchange format for camera trap data
- ColDP, a table-based exchange format used by the Catalogue of Life and ChecklistBank
Changes introduced in IPT v3.0 should have minimal impact on current users when its production release appears later this year. Users will upload spreadsheets or connect to databases, just as they currently do. The IPT will also continue to perform key validation checks, including the existence and uniqueness of the necessary IDs, and ensure the integrity of relationships when generating Darwin Core Archive (DwC-A) files.
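For readers curious what such checks involve, the following is a minimal Python sketch of the general idea, not the IPT's actual implementation. The function names, the sample rows and the choice of `eventID` as the linking field are all assumptions made for illustration.

```python
from collections import Counter


def check_ids(rows, id_field):
    """Return (row indexes with a missing ID, sorted list of duplicated IDs)."""
    missing = [i for i, row in enumerate(rows) if not row.get(id_field)]
    counts = Counter(row[id_field] for row in rows if row.get(id_field))
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return missing, duplicates


def check_references(core_rows, extension_rows, core_id, foreign_key):
    """Return indexes of extension rows whose foreign key matches no core ID."""
    known = {row[core_id] for row in core_rows if row.get(core_id)}
    return [
        i for i, row in enumerate(extension_rows)
        if row.get(foreign_key) not in known
    ]


# Hypothetical sample data: a core table of events and a linked table of
# occurrences, loosely modelled on a Darwin Core Archive layout.
events = [
    {"eventID": "e1"},
    {"eventID": "e2"},
    {"eventID": "e2"},  # duplicate ID
    {"eventID": ""},    # missing ID
]
occurrences = [
    {"occurrenceID": "o1", "eventID": "e1"},
    {"occurrenceID": "o2", "eventID": "e9"},  # points at no known event
]

missing, duplicates = check_ids(events, "eventID")
orphans = check_references(events, occurrences, "eventID", "eventID")
print(missing, duplicates, orphans)
```

Here the first check flags rows with absent or repeated identifiers, and the second flags linked rows that refer to an identifier not present in the core table.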
Meanwhile, the flexibility introduced to support new data packages should enable the quick addition of publishing models for other new formats, such as ecological and environmental monitoring data or plant-pollinator interactions (the latter is the subject of an upcoming webinar).
"GBIF has proven its agility once again with this update to the IPT," said John Wieczorek, the convener of the Darwin Core Maintenance Group currently consulting on GBIF's new data model. "Having such a useful tool that supports existing data publishing paradigms while allowing us to design and easily test new deeper and richer ones is critical to the goal of enabling a much broader range of scientific questions with GBIF-mediated data."
"This version of the IPT is instrumental to the adoption of Camtrap DP, and Frictionless Data Packages in general," said Peter Desmet, open data coordinator at the Research Institute for Nature and Forest and a member of the team that developed the Camtrap DP exchange format. "Thanks to the work of lead developer Mike Podolskiy, the IPT remains the easiest and most feature-rich way to publish data through GBIF."
"With these updates, the IPT has become the first tool to automate the process of publishing datasets in the ColDP format," said Markus Döring, the ChecklistBank technical lead for GBIF and Catalogue of Life. "The change will make it easier for taxonomic experts to share rich taxonomic data to ChecklistBank without having to know how to write scripts."
In addition to the richer publishing models, this release of the IPT supports testing of the outputs of the TDWG Material Sample Task Group: Darwin Core datasets can be organized around the Material Entity Core to standardize information relating to physical objects.