Cleaning and digitizing plant specimen records from Heilongjiang Province

Prickly water lily (Euryale ferox) observed in Xiaoxingkai Lake, Heilongjiang, China by John Howes. Photo via iNaturalist (CC BY-NC 4.0)

Despite being a hotspot of botanical research for more than 100 years, the Chinese province of Heilongjiang is only represented by about 2,000 specimens in GBIF—out of an estimated 150,000 specimens available.

This project will develop a workflow for processing specimen metadata to promote digitization and data cleaning in Heilongjiang. The project team will demonstrate the efficiency and reliability of the workflow on about 50,000 specimen records available at the Northeast Forestry University.

Following this initial exercise, the workflow will be expanded to clean up all specimens located in Heilongjiang Province and complete the Heilongjiang Digital Herbarium and Online Flora.

Project progress

The project began by obtaining specimen data from the Chinese Virtual Herbarium (CVH), as well as gazetteer and scientist information from the National Statistics Bureau of China, specimen data, floras, and a Manchuria historical atlas. In October 2020, the team held a seminar with a group of 16 data experts to discuss the data obtained and tools for data cleaning, for which a set of workflows was developed.

During the first half of the project, the team undertook an inventory analysis of the data obtained, checking which types of information were missing and which types of errors were present. Throughout the project, the team proceeded to fill in the missing information and correct the errors.
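As a rough illustration of what such an inventory of missing information might look like, the following minimal Python sketch counts, for each field, how many records lack a usable value. The field names (loosely modeled on Darwin Core terms), the function name `missing_inventory`, and the sample records are all assumptions made for illustration; the text does not describe the project's actual schema or tooling.

```python
# Hypothetical field names, loosely modeled on Darwin Core terms.
# The project's actual schema is not given in the text.
FIELDS = ["scientificName", "recordedBy", "eventDate", "locality"]


def missing_inventory(records, fields=FIELDS):
    """Count, per field, how many records have no usable value."""
    counts = {field: 0 for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field)
            # Treat absent keys, None, and blank strings as missing.
            if value is None or not str(value).strip():
                counts[field] += 1
    return counts


# Made-up records for illustration only, not project data.
records = [
    {"scientificName": "Euryale ferox", "recordedBy": "",
     "eventDate": "2000-01-01", "locality": "Xiaoxingkai Lake"},
    {"scientificName": "Euryale ferox", "recordedBy": "A. Collector",
     "eventDate": None, "locality": "  "},
]

inventory = missing_inventory(records)
for field, count in inventory.items():
    print(f"{field}: {count} of {len(records)} records missing")
```

A per-field tally like this only covers the "missing information" half of the inventory; detecting the types of errors in values that are present would need separate, field-specific checks.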

The project started with approximately 2,000 specimen records from Northeastern China included in GBIF. Upon completion of the project, the total number of records in the dataset of plant specimen records of NEFI had reached nearly 50,000, an increase of more than 20 times.

By final reporting, other achievements of the project included the publication to GBIF of a gazetteer with place names of Heilongjiang and the first checklist of “Spermatophyta and invasive plants in Wanda Mountains, Heilongjiang Province, China”. A project member also received certification from the BIFA capacity enhancement workshop and was able to disseminate the knowledge gained to other members of the group.

An additional activity undertaken during implementation, to present the work of the project, was the publication of the data paper “Checklist of tracheophyte in Heilongjiang Province” in Biodiversity Science.

While project implementation and some activities were impacted by the COVID-19 pandemic, this did not change the project plans. After the project, with roughly 200,000 digital specimens still in the region and an estimated 400,000 undigitized specimens, the project team plans to continue to promote the digitization and cleaning of specimen records.

€ 7,399
€ 21,598
1 July 2020 - 31 August 2022
Project identifier
Contact details

Hongfang Wang
Northeast Forestry University
Hexing road 26#
Haerbin, Heilongjiang 150040