2020 ESA Annual Meeting (August 3 - 6)

COS 91 Abstract - Assessing habitat maps using presence-only point occurrence locations and their uncertainty

Alexa McKerrow (1), Matt Rubino (2), Nathan Tarr (2) and Steve Williams (2); (1) Core Science Systems, United States Geological Survey, Raleigh, NC; (2) Biology Department, NCSU, Biodiversity and Spatial Information Center, Raleigh, NC
Background/Question/Methods

Maps of suitable species’ habitat that aid biodiversity conservation are generated using a variety of modeling approaches. Any model output needs to be evaluated quantitatively, both to establish its utility to end users and to inform future modeling. Recently, species location data repositories holding enormous numbers of records (i.e., billions) have become widely available online, allowing rapid programmatic access through application programming interfaces (APIs). However, these occurrence datasets are usually gathered opportunistically (e.g., museum records, citizen science efforts) and therefore represent only species presence, not absence. Presence-only data are problematic because they preclude the use of confusion matrix-based metrics such as Kappa, which discriminate between where species or habitats are and are not located. Additionally, many of these occurrences are not georeferenced; that is, they have no spatial information such as latitude and longitude (x, y). When records do have x-y locations, they often either lack information about their spatial accuracy or have very low accuracy (e.g., positional uncertainty >5 km). Points with such high positional uncertainty are directly comparable only to model outputs of similar scale or resolution.
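The problem with Kappa under presence-only data can be illustrated with a small calculation. The sketch below is not part of the described workflow. The function names and the example counts are hypothetical and chosen only for illustration. It computes Cohen's Kappa from a 2x2 confusion matrix and shows that when there are no absence records (both absence cells are zero), chance agreement equals observed agreement, so Kappa is zero however well the map captures the presences. Sensitivity, by contrast, remains informative.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's Kappa from the four cells of a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the predicted (row) and observed (column) totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    if expected == 1:
        return float("nan")  # Kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def sensitivity(tp, fn):
    """Fraction of presence records that fall on mapped habitat."""
    return tp / (tp + fn)


# Hypothetical presence-only evaluations: no absences, so fp = tn = 0.
kappa_good_map = cohen_kappa(tp=90, fp=0, fn=10, tn=0)
kappa_poor_map = cohen_kappa(tp=40, fp=0, fn=60, tn=0)
sens_good_map = sensitivity(tp=90, fn=10)
sens_poor_map = sensitivity(tp=40, fn=60)

print(kappa_good_map, kappa_poor_map)  # Kappa cannot separate the two maps
print(sens_good_map, sens_poor_map)    # sensitivity can
```

Because Kappa collapses in this setting, presence-only evaluation has to rely on quantities such as sensitivity or on comparisons against map-wide prevalence.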

Results/Conclusions

The National Gap Analysis Project focuses on modeling species habitat distributions for all terrestrial vertebrates in the U.S. Model outputs are binary habitat maps (suitable/unsuitable) created at a relatively fine resolution (30 meters). We developed two evaluation methods that incorporate presence-only point occurrence records and their locational uncertainty to assess habitat maps at their native resolution. These metrics avoid both removing large numbers of occurrences because of low accuracy and coarsening (“enlarging”) the map resolution to accommodate greater numbers of records. The first, the buffer proportion, is based on the proportion of habitat in the range relative to the mean proportion of habitat around each occurrence record. It is a modification of Rondinini et al.’s (2011) model prevalence vs. point prevalence metric. The second measures true presence fraction (sensitivity) at varying distances from occurrences. We will describe the workflow used to compile and filter species occurrence records from online resources (e.g., the Global Biodiversity Information Facility and Biodiversity Information Serving Our Nation), including necessary taxonomic checks. As expected, the amount and quality of species occurrence data vary greatly across species. We will summarize model evaluations for species with sufficient data and characterize data quality for the 282 amphibians we have modeled in the conterminous U.S.
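The two metrics described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The habitat map is taken to be a boolean NumPy array of 30 m cells, each record is a hypothetical (row, col, uncertainty in meters) tuple, the buffer radius is assumed to equal the record's positional uncertainty, and a record is assumed to count as a true presence at distance d if any habitat cell lies within d of it. All function and variable names are invented for the example.

```python
import numpy as np

CELL_SIZE = 30.0  # meters; the map resolution stated in the abstract


def cells_within(habitat, row, col, radius_m):
    """Habitat values of cells whose centers lie within radius_m of (row, col)."""
    rows, cols = np.indices(habitat.shape)
    dist = np.hypot(rows - row, cols - col) * CELL_SIZE
    return habitat[dist <= radius_m]


def buffer_proportion(habitat, in_range, records):
    """Return (habitat proportion in range, mean habitat proportion in buffers).

    Assumption: each record's buffer radius is its positional uncertainty.
    """
    range_prop = float(habitat[in_range].mean())
    buffer_props = [cells_within(habitat, r, c, u).mean() for r, c, u in records]
    return range_prop, float(np.mean(buffer_props))


def sensitivity_by_distance(habitat, records, distances_m):
    """Fraction of records with any habitat cell within each distance."""
    return {
        d: float(np.mean([cells_within(habitat, r, c, d).any() for r, c, _ in records]))
        for d in distances_m
    }


# Toy 5 x 5 map: the two left-hand columns are habitat; the whole grid is range.
habitat = np.zeros((5, 5), dtype=bool)
habitat[:, :2] = True
in_range = np.ones((5, 5), dtype=bool)

# Hypothetical records: (row, col, positional uncertainty in meters).
records = [(2, 0, 0.0), (2, 1, 30.0), (2, 3, 30.0)]

range_prop, mean_buffer_prop = buffer_proportion(habitat, in_range, records)
sens = sensitivity_by_distance(habitat, records, [0.0, 30.0, 60.0])
print(range_prop, mean_buffer_prop)
print(sens)
```

In this toy case, a mean buffer proportion above the range-wide habitat proportion would suggest that records fall in mapped habitat more often than expected from prevalence alone, and sensitivity rises as the search distance grows to cover the third record's distance from habitat. Recomputing distances for the full grid per record is simple but slow; a real raster workflow would window the array around each point.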