Centre for Biodiversity Genomics (BI…

Occurrence dataset published by University of Guelph

  • 1,005,358



Full Title

Centre for Biodiversity Genomics (BIOUG)


The Centre for Biodiversity Genomics (CBG), formerly the Biodiversity Institute of Ontario, is a biotechnology organization dedicated to the use of DNA barcoding for both the identification and discovery of species. Based at the University of Guelph, CBG coordinates the International Barcode of Life Project, the largest research program ever undertaken in biodiversity genomics. CBG plays a special role in leading Canadian contributions to this initiative, with emphasis on the development of efficient collection and preservation methods, high-throughput laboratory protocols, and the Barcode of Life Data (BOLD) Systems, an online workbench and database for the global barcoding community. CBG also houses a unique natural history collection composed of over 1.5 million digitized voucher specimens, most of which are associated with a DNA barcode record and specimen image.


The BIObus (www.biobus.ca) has been collecting material from National Parks since 2008 as part of the Centre for Biodiversity Genomics' Canadian National Park Malaise Program. The program is an ongoing study aimed at gathering a collection of voucher specimens and tissue material of animals occurring in Canadian National Parks for subsequent molecular analyses at the Canadian Centre for DNA Barcoding and for authoritative identification by taxonomists. The goal was to collect specimens from a broad range of taxa within these parks and subsequently to recover high-quality 658-base-pair sequences of the cytochrome c oxidase subunit I gene (COI, the standard animal DNA barcode). These specimens and their sequences then become part of the growing reference library of DNA barcodes for Canadian animals (www.boldsystems.org). Even while under construction, this reference library is intended for use by the broader scientific and amateur naturalist community. These reference DNA barcodes are linked to authoritatively identified voucher specimens deposited in major collections, where they are accessible for examination and in-depth analyses by all interested researchers. Until a species gains a formal taxonomic identification, each divergent barcode group (roughly >2% divergence from its nearest neighbour) is assigned a unique identifier called a Barcode Index Number or BIN. This permits the necessary subsequent taxonomic revisions to be completed at maximum efficiency.
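As a rough illustration of the ~2% divergence figure mentioned above, the sketch below computes the uncorrected pairwise distance (the proportion of differing sites) between two aligned barcode sequences. This is only an assumption-laden illustration: BINs on BOLD are produced by a clustering algorithm, not by a single pairwise comparison, and the function name `p_distance`, the 0.02 cut-off constant, and the toy sequences are hypothetical.

```python
# Illustrative only: uncorrected p-distance between two aligned sequences.
# ASSUMPTION: sequences are already aligned and of equal length; this is not
# the clustering algorithm that BOLD uses to delimit BINs.

DIVERGENCE_CUTOFF = 0.02  # the "roughly >2%" figure from the text


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        # Only compare positions where both bases are unambiguous.
        if base_a not in "ACGT" or base_b not in "ACGT":
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy 658 bp sequences: 14 differing sites is just over 2% of 658.
reference = "A" * 658
query = "C" * 14 + "A" * 644
print(p_distance(reference, query) > DIVERGENCE_CUTOFF)
```

For a full-length 658 bp barcode, 13 differing sites fall just under the 2% figure and 14 fall just over it.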

Additional Information

All data included in this release are publicly available on the Barcode of Life Data Systems (BOLD; www.boldsystems.org). Each specimen is also linked to a COI DNA barcode deposited in GenBank at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/genbank/).

Temporal coverages

Date range: 01-May-2008 - 01-Sep-2015

Language of Metadata


Language of Data


Jeremy deWaard
Director of Bio-Inventory and Collections
Centre for Biodiversity Genomics, University of Guelph, 50 Stone Road East, Guelph, Ontario N1G 2W1, Canada; 519-824-4120 x 52258
Metadata author
Jeremy deWaard
Director of Bio-Inventory and Collections
Centre for Biodiversity Genomics, University of Guelph, 50 Stone Road East, Guelph, Ontario N1G 2W1, Canada; 519-824-4120 x 52258
Administrative contact
Jeremy deWaard
Director of Bio-Inventory and Collections
Centre for Biodiversity Genomics, University of Guelph, 50 Stone Road East, Guelph, Ontario N1G 2W1, Canada; 519-824-4120 x 52258


Published by

University of Guelph

Publication Date


Registration Date


Hosted by

Université de Montréal Biodiversity Centre

Served by

Canadensys repository


Alternative Identifiers

External Data

Metadata Documents

1,005,261 Georeferenced data



The current resource includes records from 43 Canadian National Parks in which the Centre for Biodiversity Genomics has collected specimens during the years 2008-2015.



Other Contacts

Content provider
Jeremy deWaard
Director of Bio-Inventory and Collections
Centre for Biodiversity Genomics, University of Guelph, 50 Stone Road East, Guelph, Ontario N1G 2W1, Canada; 519-824-4120 x 52258
Angela Telfer
Data Management Lead - Bio-Inventory and Collections
Centre for Biodiversity Genomics, University of Guelph, 50 Stone Road East, Guelph, Ontario N1G 2W1, Canada; 519-824-4120 x 53600
Jeremy deWaard

Taxonomic Coverage

Specimens were identified to species where possible by taxonomic experts. As we operate a high-throughput facility with an influx of approximately 1 million specimens every year, we also rely on semi-automated and automated methods of assigning taxonomy to specimens. Using DNA barcode sequences from BOLD Systems (BOLD; www.boldsystems.org), the remaining specimens were assigned a taxonomic name based on sequence similarity to named specimens. Two different procedures were used to assign names based on DNA barcode sequences:

  1. CollectionsID: Taxonomic names were assigned to all specimens found in the same Barcode Index Number (BIN). An algorithm clusters sequences into BINs based on sequence similarity. BINs show high concordance with species (<2% sequence divergence within a BIN). CollectionsID scans the existing database of named specimens and assigns taxonomic information whenever a BIN match is found. Applies to: phylum- to species-level identification.

  2. BOLD ID Engine (manual): BOLD stores sequence information for every barcoded organism. The BOLD ID Engine compares the sequence of a chosen specimen against the other sequences in the database and returns the taxonomic information for the best matches. We then assign taxonomy based on the following rule set:

     a. Order to family identification: assigned when the class, order, or family name has greater than 80% sequence similarity to listed results.
     b. Genus identification: assigned when the genus name has greater than 95% sequence similarity to listed results.
     c. Species identification: assigned when the species name has greater than 98% sequence similarity to listed results (indicating that the specimens are from the same BIN).

If there are conflicts in the result (e.g. more than one family name with greater than 80% sequence similarity), we assign a name only when it is overwhelmingly clear that one name is more prevalent and is likely the correct taxonomic name (e.g. with 20 Tipulidae and 1 Limoniidae listed, we would assign Tipulidae). This is a judgment call made by trained senior staff of the Bio-Inventory and Collections Unit.
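The similarity thresholds of the BOLD ID Engine rule set can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used by staff: the `Match` record and `assign_name` function are hypothetical, only the family, genus, and species ranks are modelled, and the `DOMINANCE` cut-off of 0.9 is an invented stand-in for what the text describes as a judgment call by trained senior staff.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence

# Minimum percent similarity per rank, taken from the rule set in the text.
THRESHOLDS = {"family": 80.0, "genus": 95.0, "species": 98.0}

# ASSUMPTION: the source leaves conflicts to staff judgment. This cut-off is
# a hypothetical stand-in for "overwhelmingly clear", not a documented value.
DOMINANCE = 0.9


@dataclass(frozen=True)
class Match:
    """One row of an identification result (hypothetical structure)."""
    similarity: float  # percent similarity to the query sequence
    family: Optional[str] = None
    genus: Optional[str] = None
    species: Optional[str] = None


def assign_name(matches: Sequence[Match], rank: str) -> Optional[str]:
    """Return a name for `rank`, or None if no match qualifies or names conflict."""
    threshold = THRESHOLDS[rank]
    # Tally names among matches that exceed the threshold for this rank.
    names = Counter(
        getattr(m, rank)
        for m in matches
        if m.similarity > threshold and getattr(m, rank) is not None
    )
    if not names:
        return None
    name, count = names.most_common(1)[0]
    # Assign only when one name clearly dominates the qualifying matches.
    if count / sum(names.values()) >= DOMINANCE:
        return name
    return None


# The example from the text: 20 Tipulidae and 1 Limoniidae above 80%.
hits = [Match(92.0, family="Tipulidae")] * 20 + [Match(91.0, family="Limoniidae")]
print(assign_name(hits, "family"))  # Tipulidae
print(assign_name(hits, "genus"))   # None (no match exceeds 95%)
```

Returning `None` rather than a best guess keeps ambiguous cases available for manual review, which matches the role the text gives to senior staff.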

Arthropoda, Annelida, Mollusca, Cnidaria, Platyhelminthes

BIObus Inventory of Canada's National Parks 2008-2015


Study area description

The BIObus visited 43 of the National Parks and National Park Reserves of Canada in trips starting in 2008. Parks include:

  • Aulavik National Park
  • Auyuittuq National Park
  • Banff National Park
  • Bruce Peninsula National Park
  • Cape Breton Highlands National Park
  • Elk Island National Park
  • Forillon National Park
  • Fundy National Park
  • Georgian Bay Islands National Park
  • Glacier National Park
  • Grasslands National Park
  • Gros Morne National Park
  • Gulf Islands National Park Reserve
  • Gwaii Haanas National Park Reserve
  • Ivvavik National Park
  • Jasper National Park
  • Kejimkujik National Park
  • Kluane National Park/Kluane National Park Reserve
  • Kootenay National Park
  • Kouchibouguac National Park
  • La Mauricie National Park
  • Mingan Archipelago National Park Reserve
  • Mount Revelstoke National Park
  • Nahanni National Park Reserve
  • Pacific Rim National Park Reserve
  • Point Pelee National Park
  • Prince Albert National Park
  • Prince Edward Island National Park
  • Pukaskwa National Park
  • Quttinirpaaq National Park
  • Riding Mountain National Park
  • Rouge National Urban Park
  • Sable Island National Park Reserve
  • Sirmilik National Park
  • Terra Nova National Park
  • Thousand Islands National Park
  • Torngat Mountains National Park
  • Tuktut Nogait National Park
  • Vuntut National Park
  • Wapusk National Park
  • Waterton Lakes National Park
  • Wood Buffalo National Park
  • Yoho National Park

Design description

BIObus expeditions are aimed at gathering a synoptic collection of voucher specimens of non-endangered invertebrates that occur in protected areas across North America for subsequent molecular analyses at the Centre for Biodiversity Genomics and identification by taxonomic experts.


Natural Sciences and Engineering Research Council of Canada (NSERC), Genome Canada, Ontario Genomics Institute, Ontario Ministry of Research and Innovation, the Canada Foundation for Innovation, and the McCain/Evans Foundation.

Project Personnel

Jeremy deWaard



Study extent

Sampling frequency varies according to year and location of collection. Some national parks have been visited more than once during the 2008-2015 collection period that this dataset covers. Individual records contain complete collection details.

Sampling description

Canadian National Parks Malaise Program: Two Malaise traps are deployed at each of the National Parks and serviced by Parks Canada staff over the course of the field season, with a target of 20 sampling weeks. Deployment occurs in late spring. Trap locations are selected in habitats that represent a typical vegetation type for each park, and with regard to ease of access for park staff. The Malaise traps require servicing once a week, at which time the collection bottles are swapped out, labelled and stored. All parks are revisited in the early fall to gather all specimens and trapping materials, and otherwise wrap up the field work portion of the Malaise Program.

Standardized Sampling Program: Three sites were chosen at certain parks in the 2012-2014 National Parks collections. The sites varied in habitat, vegetation, and elevation, and were otherwise distinct from one another. Sites were selected on the first day of the first park visit and the same sites were revisited on the subsequent park visit, where applicable, to substantiate temporal differences in insect communities. Each site was set up as follows:

  a) 2 Malaise traps at least 10 meters apart
  b) 1 intercept trap with 2 yellow gardening pans filled halfway with soapy water
  c) 10 pan traps laid out at random, at least 2 meters apart, filled with soapy water
  d) 10 pitfall traps in a line transect, 5 paces apart, filled halfway with ethanol
  e) 1 detritus sample taken for Berlese funnel
  f) 3 × 5-minute sweeps with sweep nets conducted by each field technician (i.e. 4 sweepers) over the course of the week
  g) 1 soil core, 10 cm deep

Sites were visited on average every other day, unless weather warranted otherwise. During site visits, traps were assessed for disturbance and emptied: Malaise collection bottles were swapped out, specimens from intercept pans and pan traps were collected, pitfall traps were checked for disturbance by wildlife and for ethanol levels, and a 5-minute sweep was conducted. Any traps that needed resetting were reset.

General collection: When staff were not carrying out the standardized collection described above, and if time allowed, general collection was conducted opportunistically. The following techniques were employed for general collection:

  a) freshwater day sampling using kick nets, fish nets, overturning rocks, and general active searching
  b) freshwater night sampling with bottle traps (glow stick and used 2L pop bottles)
  c) sweep netting for a variety of insects including dragonflies, damselflies, wasps, hoppers, flies, hemipterans, beetles, etc.
  d) vegetation beating
  e) at minimum 1 night sheet per week per park
  f) at minimum 2 bucket traps per park per week: at least one for next-gen sequencing and at least one for regular sequencing
  g) active searching, particularly for spiders, by ripping up stumps and looking under bark and rocks
  h) road kill or otherwise dead mammals, collected opportunistically if small enough to fit in collection bottles

No baited traps were attempted. No soil washing was attempted.

Quality control

Traps were revisited for servicing according to a schedule, which varied according to the needs of the trap. All specimens are visible on BOLD (www.boldsystems.org). Through comparison with other specimens using their DNA barcode sequences, contaminated specimens and misidentifications were discovered and fixed where possible. All fields underwent a data cleansing process to ensure data were entered in a standardized manner.

Method Steps

  1. See Sampling Description

Collection name

BIObus Inventory of Canada's National Parks 2008-2015

Collection Identifier


Parent Collection Identifier

Centre for Biodiversity Genomics

Specimen Preservation method