Discrimination accuracy uninformative when evaluating species distribution models

Species distribution simulation study finds discrimination accuracy a very poor predictor of functional accuracy

Data resources used via GBIF: 5,969,252 species occurrences
Acacia complanata observed in Nathan, QLD, Australia by toohey-forest-wildlife. Photo via iNaturalist (CC BY-NC 4.0)

Species distribution models (SDMs) are popular tools that are easy to build from freely available data and, in some cases, the only tractable means of estimating habitat suitability. The ability to predict withheld data (also known as discrimination accuracy) is often used to choose among models, methods and data.

Using a simulation approach, this study evaluates the relationship between discrimination accuracy and functional accuracy to assess commonly used methods for model selection. The authors simulated occurrence data across 11 levels of sampling bias, modelled using GBIF-mediated data of all plants in Australia. They performed 20 simulations for each level and used four experimental scenarios to build models based on seven algorithms, resulting in a total of 6,160 models.
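The model count follows from the design if the four factors are fully crossed, which is an assumption here but one that matches the reported total. A minimal Python sketch, using placeholder integer labels rather than the study's own names for bias levels, scenarios or algorithms:

```python
from itertools import product

# Factor sizes come from the summary above. The integer labels are
# hypothetical placeholders, not identifiers used in the study.
bias_levels = range(11)  # 11 levels of sampling bias
replicates = range(20)   # 20 simulations per bias level
scenarios = range(4)     # 4 experimental scenarios
algorithms = range(7)    # 7 modelling algorithms

# Assumption: every combination of the four factors yields one model.
design = list(product(bias_levels, replicates, scenarios, algorithms))
n_models = len(design)
print(n_models)  # 6160
```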

Their results showed fairly good functional accuracy of models overall, that is, modelled suitability matched the true suitability of habitat. However, across nearly all algorithms and levels of complexity, discrimination capacity was a very poor predictor of functional accuracy, indicating that assessing models based on this approach might be less than ideal.
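To make the two quantities concrete, the sketch below computes one common measure of each in plain Python. It assumes that discrimination accuracy is measured as AUC on withheld presences against background points, and that functional accuracy is measured as the Spearman rank correlation between modelled and true suitability. The summary above does not name the metrics, so both choices are assumptions, and the toy numbers are invented for illustration and are not data from the study.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence scores higher than a random
    background point, with ties counted as one half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented toy values, one per grid cell (illustration only).
true_suitability = [0.1, 0.4, 0.5, 0.9, 0.7]
modelled_suitability = [0.2, 0.3, 0.6, 0.8, 0.9]
# Modelled scores at withheld presence cells and at background cells.
withheld_presences = [0.8, 0.9, 0.6]
background = [0.2, 0.3, 0.6, 0.8, 0.9]

discrimination = auc(withheld_presences, background)
functional = spearman(modelled_suitability, true_suitability)
print(discrimination, functional)
```

In a simulation, true suitability is known by construction, so both numbers can be computed for every model and compared. With real data only the first is usually available, which is why its weak link to the second matters.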

Original article

Warren DL, Matzke NJ and Iglesias TL (2019) Evaluating presence‐only species distribution models with discrimination accuracy is uninformative for many applications. Journal of Biogeography. Wiley 47(1): 167–180. Available at: https://doi.org/10.1111/jbi.13705