COS 67-9 - Using occupancy-detection models to address biases in biodiversity data

Wednesday, August 14, 2019: 4:20 PM
L013, Kentucky International Convention Center
Kelley D. Erickson (1), David Henderson (1,2), Stephen J. Murphy (1), and Adam Smith (3); (1) Center for Conservation and Sustainable Development, Missouri Botanical Garden, Saint Louis, MO; (2) Department of Biology, Washington University in St. Louis, Saint Louis, MO; (3) Missouri Botanical Garden, Saint Louis, MO
Background/Question/Methods

Accurately determining species ranges is of fundamental interest, especially for species whose ranges are in flux due to global change. Available species occurrence data vary widely in geographic precision: many records in online biodiversity databases lack coordinates yet carry place names for regions smaller than a state or province (e.g., county, parish, or equivalent). When estimating species distributions, records without coordinates are commonly discarded, although for many species doing so risks underestimating the true range. Many of these occurrence records were collected for other purposes, and biases in the global collection record are well documented for many taxa. We evaluated collector bias for several plant species found in Florida, using records downloaded from GBIF. To address collection biases, we constructed county-level occupancy-detection models in which the probability of detection is a function of sampling effort.
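
The logic of an occupancy-detection model can be illustrated with a minimal sketch. This is not the fitted model from this study; it assumes a single constant occupancy probability (psi), a hypothetical per-event detection probability (q), and sampling effort measured as a count of independent collecting events per county. All function and parameter names are illustrative.

```python
import math

def detection_prob(effort, q):
    """P(detected | occupied), assuming `effort` independent collecting
    events that each detect the species with probability q (assumption)."""
    return 1.0 - (1.0 - q) ** effort

def occupancy_given_no_detection(psi, effort, q):
    """Probability that a county is occupied although the species was
    never recorded there, by Bayes' rule."""
    p = detection_prob(effort, q)
    return psi * (1.0 - p) / (1.0 - psi * p)

def log_likelihood(detected, efforts, psi, q):
    """Marginal log-likelihood over counties, summing out the latent
    occupancy state. Assumes any county with a detection has effort > 0."""
    total = 0.0
    for y, effort in zip(detected, efforts):
        prob_detected = psi * detection_prob(effort, q)
        total += math.log(prob_detected) if y else math.log(1.0 - prob_detected)
    return total

# Hypothetical counties without records: more effort with no detection
# lowers the probability that the species is present but overlooked.
for effort in (0, 5, 50):
    print(effort, occupancy_given_no_detection(psi=0.5, effort=effort, q=0.05))
```

Under these assumptions, a county with no records and little collecting effort keeps an occupancy probability close to psi, whereas a heavily sampled county with no records is more likely to be a true absence.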

Results/Conclusions

We found evidence of temporal, taxonomic, and geographic biases in collector behavior. On average, collectors were active for 5.54 years; 35% were active in only one year, although one collector made collections in 54 years. The taxonomic breadth of collectors also varied widely: an individual collector collected 42 families on average, 7% collected only one family, and 25% collected more than 67 families. In addition, 77% of collectors collected a species only once. The majority of records (63%) lacked coordinates and could be located only to the county level. The mean posterior probability of detection given occurrence across all counties ranged from 0 to 0.51, depending on species. Our models predicted a high probability of occupancy (>60%) for two (Schinus terebinthifolia) to 14 (Asclepias curtissii) additional counties beyond those that had records with coordinates. Our results indicate that using only records with coordinates can lead to underestimating the range limits of a species. Although county-level data are imprecise, in locations where well-georeferenced specimens are absent they provide unique information about the range and environmental tolerances of species, so long as bias in collector behavior is addressed in the modeling process.