Discrimination accuracy uninformative when evaluating species distribution models

Species distribution simulation study finds discrimination accuracy a very poor predictor of functional accuracy

Data resources used via GBIF: 5,969,252 species occurrences
Acacia complanata observed in Nathan, QLD, Australia by toohey-forest-wildlife. Photo via iNaturalist (CC BY-NC 4.0)

Species distribution models (SDMs) are popular tools: they are easy to build using freely available data and, in some cases, are the only tractable means of estimating habitat suitability. The ability to predict withheld data (discrimination accuracy) is often used to choose among models, methods and data.

Using a simulation approach, this study evaluates the relationship between discrimination accuracy and functional accuracy to assess commonly used methods for model selection. The authors simulated occurrence data across 11 levels of sampling bias, with the bias modelled using GBIF-mediated data for all plants in Australia. They performed 20 simulations at each level and used four experimental scenarios to build models with seven algorithms, resulting in a total of 6,160 models (11 × 20 × 4 × 7).
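As a quick check on the study design described above, the sketch below enumerates every combination of bias level, replicate simulation, scenario and algorithm. The source gives only the counts, so the integer labels used here are placeholders and not the authors' names for these factors.

```python
from itertools import product

# Placeholder labels: the summary reports only how many levels each factor has.
bias_levels = range(11)   # 11 levels of sampling bias
replicates = range(20)    # 20 simulations per bias level
scenarios = range(4)      # 4 experimental scenarios
algorithms = range(7)     # 7 modelling algorithms

# One tuple per fitted model in a fully crossed design.
design = list(product(bias_levels, replicates, scenarios, algorithms))

print(len(design))  # 6160
```

A fully crossed design is assumed here because it is the only arrangement of these four counts that reproduces the reported total.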

The results showed fairly good functional accuracy overall; that is, modelled suitability matched the true suitability of habitat. However, across nearly all algorithms and levels of complexity, discrimination accuracy was a very poor predictor of functional accuracy, indicating that selecting models on the basis of discrimination accuracy may be uninformative.
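To make the distinction between the two measures concrete, here is a minimal sketch of how each could be computed on a toy landscape where true suitability is known. It is not the authors' code. The choice of AUC for discrimination accuracy and Spearman rank correlation for functional accuracy, the toy landscape, and the noisy "model" are all assumptions made for illustration.

```python
import random

def auc(presence_scores, background_scores):
    """Share of presence/background pairs where the presence scores higher (ties count half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

random.seed(0)
n_cells = 500
# Toy landscape (assumption): each cell has a known true suitability in [0, 1].
true_suit = [random.random() for _ in range(n_cells)]
# Hypothetical model output: true suitability plus Gaussian noise.
predicted = [s + random.gauss(0, 0.2) for s in true_suit]
# Simulated presences: a cell is occupied with probability equal to its true suitability.
presence_idx = [i for i, s in enumerate(true_suit) if random.random() < s]
background_idx = random.sample(range(n_cells), 200)

# Discrimination accuracy: can the model separate presences from background points?
discrimination = auc([predicted[i] for i in presence_idx],
                     [predicted[i] for i in background_idx])
# Functional accuracy: does predicted suitability track true suitability?
functional = spearman(predicted, true_suit)

print(f"discrimination (AUC): {discrimination:.3f}")
print(f"functional (Spearman): {functional:.3f}")
```

The point of the sketch is that the two numbers answer different questions. Functional accuracy can only be computed when the true suitability is known, which is why a simulation is needed to compare it with discrimination accuracy.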

Original article

Warren DL, Matzke NJ and Iglesias TL (2019) Evaluating presence-only species distribution models with discrimination accuracy is uninformative for many applications. Journal of Biogeography. Wiley 47(1): 167–180. Available at: https://doi.org/10.1111/jbi.13705