In connection with the CESP2024-016 project under the Capacity Enhancement Support Programme, a workshop was organized in Warsaw to continue work on an AI tool that supports the digitization of natural history collections. The event's guests included representatives of GBIF Norway, DiSSCo (Naturalis Biodiversity Center), the University of Trieste, the University of Tartu, the Swedish Museum of Natural History, the University of Gothenburg, GBIF Slovakia, the State Museum of Natural History of the NAS of Ukraine, and several Polish universities and scientific institutions, including the University of Warsaw, the University of Gdańsk, and the Museum and Institute of Zoology and the Institute of Botany of the Polish Academy of Sciences.
While the team of developers, Michal Torma (GBIF.NO, the main author of the tool), Soulaine Theocharides (DiSSCo), and Karolina Kuczkowska (GBIF.PL), built the connection between the tool and the DiSSCo sandbox infrastructure, the other participants tested the existing tool and refined the LLM prompts it uses. We also prepared a dataset for further testing and refinement of the tool. As a result of the event, we were able to publish the tool as a MAS (Machine Annotation Service) in the DiSSCo sandbox environment.
The tool was named SpLAT, which stands for Specimen Label Automated Transcriber. Further refinement and dissemination of the tool are planned for the ECA Nodes meeting in May.