This hands-on course explains the principles of data quality and applies them through practical examples of detecting and cleaning problematic data. Data quality is one of the most important factors determining how a dataset can be used. It depends on every step of the information production and processing chain, from the initial recording or capture of the data to its final use and interpretation. The quality of biodiversity data also spans several dimensions, from taxonomy and spatial information to metadata, storage and publication.
The course covers the following topics:
- Principles for data quality
- Tools and protocols for data quality
- OpenRefine
- Quality of taxonomic and spatial data
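
As a rough illustration of the kind of taxonomic and spatial checks listed above, here is a minimal Python sketch. It assumes occurrence records are plain dictionaries keyed by the Darwin Core terms `scientificName`, `decimalLatitude` and `decimalLongitude`; the function name `check_record` and the sample records are hypothetical and are not part of the course material.

```python
def check_record(record):
    """Return a list of basic quality issues found in one occurrence record.

    Assumes Darwin Core style keys; the checks are illustrative, not exhaustive.
    """
    issues = []

    # Taxonomic check: the name must be present and free of stray whitespace.
    name = record.get("scientificName")
    if not name or not name.strip():
        issues.append("missing scientificName")
    elif name != " ".join(name.split()):
        issues.append("irregular whitespace in scientificName")

    # Spatial checks: coordinates must be present and within valid ranges.
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    if lat is None or lon is None:
        issues.append("missing coordinates")
    else:
        if not -90 <= lat <= 90:
            issues.append("latitude out of range")
        if not -180 <= lon <= 180:
            issues.append("longitude out of range")
        if lat == 0 and lon == 0:
            # 0,0 is often a placeholder rather than a real locality.
            issues.append("coordinates at 0,0 (possible placeholder)")

    return issues


# Hypothetical sample records, made up for this example.
records = [
    {"scientificName": "Puma concolor", "decimalLatitude": 4.6, "decimalLongitude": -74.1},
    {"scientificName": "Puma  concolor", "decimalLatitude": 95.0, "decimalLongitude": -74.1},
    {"scientificName": "", "decimalLatitude": 0, "decimalLongitude": 0},
]

for rec in records:
    print(rec.get("scientificName"), check_record(rec))
```

Checks like these only flag records for review. Deciding whether a flagged value is actually wrong still requires looking at the source data.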