Darwin Core Archive Validator

This tool, produced by the GBIF Secretariat, tests Darwin Core Archives against the Darwin Core Text Guidelines. It produces a short report of the errors it detects so that they can be fixed before the archive is published online.

NOTE: There’s a new version of the validator under development by the online community. Please check https://github.com/gbif/dwca-validator/wiki/Vision for more information.

Because Darwin Core Archives are simple, GBIF encourages publishers to create them using simple custom scripts. This creates the need for a testing framework that lets developers make sure GBIF and others can read the information as expected.
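
As an illustration of the kind of custom script meant here, the following Python sketch writes a tab-delimited data file and a matching meta.xml descriptor into a zip archive. The function names, the file name occurrence.txt, and the choice of terms are assumptions made for this example and are not part of the validator. An archive produced this way should still be run through the validator before it is published.

```python
import csv
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Namespace of the Darwin Core terms used for the example columns.
DWC_TERMS = "http://rs.tdwg.org/dwc/terms/"


def build_meta_xml(data_file, terms):
    """Return a meta.xml descriptor for one core data file (illustrative only).

    Column 0 doubles as the record id; every column is mapped to a term.
    """
    fields = "\n".join(
        '    <field index="%d" term=%s/>' % (i, quoteattr(DWC_TERMS + term))
        for i, term in enumerate(terms)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"\n'
        '        fieldsEnclosedBy="" ignoreHeaderLines="1"\n'
        '        rowType=' + quoteattr(DWC_TERMS + "Occurrence") + '>\n'
        '    <files><location>' + data_file + '</location></files>\n'
        '    <id index="0"/>\n'
        + fields + '\n'
        '  </core>\n'
        '</archive>\n'
    )


def build_archive(rows, terms, data_file="occurrence.txt"):
    """Return the bytes of a zip archive holding the data file and meta.xml."""
    table = io.StringIO()
    # QUOTE_NONE matches fieldsEnclosedBy=""; a value containing a tab
    # raises csv.Error instead of silently corrupting the row.
    writer = csv.writer(table, delimiter="\t", lineterminator="\n",
                        quoting=csv.QUOTE_NONE)
    writer.writerow(terms)  # header row, skipped via ignoreHeaderLines="1"
    writer.writerows(rows)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(data_file, table.getvalue().encode("utf-8"))
        archive.writestr("meta.xml",
                         build_meta_xml(data_file, terms).encode("utf-8"))
    return buffer.getvalue()
```

Calling build_archive([["occ-1", "Puma concolor", "HumanObservation"]], ["occurrenceID", "scientificName", "basisOfRecord"]) returns the archive as bytes, which can be written to a .zip file and submitted to the validator.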

The validator uses the official XML schema to validate the meta.xml descriptor. In addition, it uses the Darwin Core Archive Reader Java library to validate the content against the known extensions and terms registered within the GBIF network for sharing biodiversity data. GBIF runs a production registry and a development registry to keep track of extensions, and the validator uses both.
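
For developers who want a quick local sanity check before submitting an archive, the sketch below inspects the meta.xml descriptor with Python's standard library. It is a simplified stand-in and not how the validator works: it performs no XML schema validation and no lookup against the GBIF registries. The function name check_descriptor and the particular checks chosen are assumptions made for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the meta.xml descriptor, in ElementTree's {uri}tag notation.
NS = "{http://rs.tdwg.org/dwc/text/}"


def check_descriptor(archive_bytes):
    """Return a list of problems found in an archive's meta.xml (illustrative)."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = set(archive.namelist())
        if "meta.xml" not in names:
            return ["meta.xml is missing from the archive"]
        try:
            root = ET.fromstring(archive.read("meta.xml"))
        except ET.ParseError as err:
            return ["meta.xml is not well-formed XML: %s" % err]

    if root.tag != NS + "archive":
        return ["unexpected root element: %s" % root.tag]
    core = root.find(NS + "core")
    if core is None:
        return ["no <core> element found"]

    problems = []
    # The core table and every extension table get the same checks.
    for table in [core] + root.findall(NS + "extension"):
        row_type = table.get("rowType")
        if not row_type:
            problems.append("table without a rowType attribute")
        label = row_type or "(no rowType)"

        # Every referenced data file must be present in the zip.
        for location in table.findall(NS + "files/" + NS + "location"):
            path = (location.text or "").strip()
            if path not in names:
                problems.append("%s: data file not in archive: %r" % (label, path))

        for field in table.findall(NS + "field"):
            if not field.get("term"):
                problems.append("%s: <field> without a term attribute" % label)
            index = field.get("index")
            # index may be absent when a field only supplies a default value.
            if index is not None and not index.isdigit():
                problems.append("%s: invalid field index %r" % (label, index))
    return problems
```

An empty list means only that none of these basic checks failed. The schema validation and the term and extension checks described above are still left to the validator.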

GBIF recommends bundling an Ecological Metadata Language (EML) XML file with an archive. Because EML is a rather large and complex schema, GBIF has specified a GBIF profile that uses a subset of EML 2.1.0 and declares specific additions within the generic additionalMetadata section of EML. Every valid GBIF profile document should therefore also be valid according to the official EML schema. The EML validation is performed against both XML schemas: the GBIF profile schema and the official EML schema.

GBIF, 2010, “Darwin Core Archive Validator - Online tool”. Copenhagen: Global Biodiversity Information Facility.