Peninsular Malaysia and Singapore are part of the Sundaland biodiversity hotspot, which has one of the highest numbers of species in the world. Butterflies (Lepidoptera) are arguably the best-studied megadiverse terrestrial invertebrates in this region, with a long taxonomic history across various localities and ecoregions. Yet, with an estimated 1,200 butterfly species in the region, the group is clearly under-represented on GBIF.org, which holds specimen records for less than a third of that number.
The Lee Kong Chian Natural History Museum (LKCNHM) recently joined the international Butterflies of the Southeast Asian Islands Project and aims to mobilize specimen data by digitizing ~10,000 butterfly specimens, prioritizing high coverage of all representative families and subfamilies available.
Publication of these specimen data will directly increase species representation for the targeted region to ~90%, with georeferenced records from a critical period of drastic landscape modification (1960-1990). Making these data available will bridge the gap between historical data and contemporary knowledge of butterfly species in the region, and at the same time promote further scientific investigation into this remarkable regional butterfly diversity.
To date, a total of 3,631 specimens have been transcribed, including specimen ID assignment (occurrence ID), original locality and identification data. Among the transcribed specimens, 1,893 have completed the image-vouchering process. The Doggett's collection of Hesperiidae (skippers) has the most complete set of specimens (725 records), and all of its localities have been georeferenced. This dataset was the first to be published on GBIF as part of this project.
As part of the project, a project member has obtained a biodiversity data mobilization advanced badge. The project has two interns and a student volunteer. One of the interns uses a machine-learning algorithm to rename image files automatically by recognizing the unique specimen ID label captured within the image of each specimen. The procedure is still being optimized to improve accuracy, with the ultimate aim of saving time.
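The renaming step described above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the label recognizer is left as a placeholder function (`read_label`) supplied by the caller, and the specimen ID format (`ID_PATTERN`), the list of image extensions and the function name `rename_by_specimen_id` are all assumptions made for illustration. The sketch shows only the bookkeeping around the recognizer: validating the recognized ID, avoiding overwrites, and leaving uncertain images untouched for manual review.

```python
import re
from pathlib import Path
from typing import Callable, Dict, Optional

# Assumed ID format for illustration only: capital letters followed by digits.
ID_PATTERN = re.compile(r"^[A-Z]+\d+$")
# Assumed set of image file extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def rename_by_specimen_id(
    image_dir: Path,
    read_label: Callable[[Path], Optional[str]],
    dry_run: bool = True,
) -> Dict[Path, Path]:
    """Plan (and optionally perform) renames of images to '<specimen ID><ext>'.

    `read_label` is a placeholder for whatever model reads the ID label
    in the image; it should return the recognized ID or None.
    """
    plan: Dict[Path, Path] = {}
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        specimen_id = read_label(path)
        # Skip unreadable or implausible IDs; these stay for manual review.
        if not specimen_id or not ID_PATTERN.match(specimen_id):
            continue
        target = path.with_name(specimen_id + path.suffix.lower())
        # Never overwrite an existing file or reuse a planned name.
        if target.exists() or target in plan.values():
            continue
        plan[path] = target
    if not dry_run:
        for source, target in plan.items():
            source.rename(target)
    return plan
```

Running with `dry_run=True` first returns the planned renames without touching any file, so the proposed names can be checked against the transcribed occurrence IDs before anything is changed.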