Clean global data may matter more than complex models

This study suggests that simple data cleaning and correction may improve modeled distribution predictions more than developing sophisticated mathematical models.

Data resources used via GBIF: 372,337 marine fish occurrences

To assess how sampling biases and data-quality problems affect global-scale models, the authors used all available GBIF-mediated data for fish species from marine-only orders to compare four common procedures. Their findings suggest that improving the quantity and quality of the data may matter more for accurately predicting distributions than developing sophisticated mathematical models, provided that researchers clean the original data, correct for autocorrelation and account for obvious underestimations in species richness.
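As a rough illustration of what basic cleaning followed by a gridded richness count can look like, the sketch below filters a list of occurrence records and counts distinct species per 1° cell. It is not the authors' procedure: the record layout `(species, latitude, longitude)`, the function names, and the specific cleaning rules (dropping missing or out-of-range coordinates, the 0,0 placeholder, and exact duplicates) are assumptions chosen for the example, and it does not attempt the autocorrelation correction or the modeling step.

```python
import math
from collections import defaultdict


def clean_occurrences(records):
    """Return unique (species, lat, lon) tuples with plausible coordinates.

    Assumed rules, for illustration only: drop records with missing fields,
    coordinates outside the valid range, the (0, 0) placeholder, and
    exact duplicates.
    """
    seen = set()
    cleaned = []
    for species, lat, lon in records:
        if species is None or lat is None or lon is None:
            continue  # missing field
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # impossible coordinate
        if lat == 0 and lon == 0:
            continue  # common placeholder for an unknown location
        key = (species, lat, lon)
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        cleaned.append(key)
    return cleaned


def richness_per_cell(records, cell_size=1.0):
    """Count distinct species in each grid cell of cell_size degrees."""
    cells = defaultdict(set)
    for species, lat, lon in records:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[cell].add(species)
    return {cell: len(species_set) for cell, species_set in cells.items()}


# Hypothetical records, made up for this example.
sample = [
    ("Gadus morhua", 60.2, 5.1),
    ("Gadus morhua", 60.2, 5.1),      # duplicate
    ("Clupea harengus", 60.7, 5.9),
    ("Thunnus thynnus", 0, 0),        # placeholder coordinate
    ("Thunnus thynnus", 95.0, 10.0),  # latitude out of range
    ("Thunnus thynnus", None, 10.0),  # missing latitude
]
richness = richness_per_cell(clean_occurrences(sample))
```

Raw counts like these typically underestimate richness in poorly sampled cells, which is why the study treats correcting for obvious underestimations as a separate step.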


From García-Roselló E, Guisande C, Manjarrés-Hernández A et al. (2015), Figure 2d: World variation in species richness of marine fish species according to GBIF-MaxEnt-restricted maps (α-shape = 6, threshold = 0.75) at 1° resolution.

Citations

García-Roselló E, Guisande C, Manjarrés-Hernández A et al. (2015) Can we derive macroecological patterns from primary Global Biodiversity Information Facility data? Global Ecology and Biogeography 24(3): 335-347. doi:10.1111/geb.12260
