2018 ESA Annual Meeting (August 5 -- 10)

PS 66-206 - A comparison of machine learning methods to classify chukar partridge (Alectoris chukar) establishment patterns in Washington state

Friday, August 10, 2018
ESA Exhibit Hall, New Orleans Ernest N. Morial Convention Center
Austin Matthew Smith, School of Natural Resources and Environment, University of Florida, Gainesville, FL, Wendell P. Cropper Jr., School of Forest Resources and Conservation, University of Florida, Gainesville, FL and Michael P. Moulton, Wildlife Ecology & Conservation, University of Florida
Background/Question/Methods

Better understanding and management of species populations require information on habitat requirements. Modeling species presence-absence as a function of environmental variables is one approach to obtaining this information. Game bird introductions are regularly implemented throughout the world, even when little is known about the species’ needs. The Chukar partridge (Alectoris chukar) was used as a test species for this study. We collected and analyzed site-level data on physiography, climate, land-cover type, and habitat range in order to understand Chukar habitat needs in Washington state and to determine which algorithms are best suited to these data types. Four modeling techniques were used: generalized linear models (GLM), support vector machines (SVM), random forests (RF), and artificial neural networks (ANN). Principal component analysis (PCA) was also applied to reduce variable dimensionality.
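
As an illustration of how PCA can reduce dimensionality, the sketch below shows two common rules for choosing how many principal components to keep: retaining components whose eigenvalue exceeds 1, and retaining the fewest leading components whose cumulative share of variance reaches a threshold. The function names, the example eigenvalues, and the 0.95 default threshold are hypothetical and are not taken from the study.

```python
def kaiser_count(eigenvalues):
    """Count components whose eigenvalue is strictly greater than 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def cumulative_variance_count(eigenvalues, threshold=0.95):
    """Return the smallest number of leading components whose cumulative
    proportion of variance reaches `threshold` (0.95 is an assumed default)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, ev in enumerate(ordered, start=1):
        running += ev
        if running / total >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues for six standardized predictors (illustration only).
example_eigenvalues = [2.5, 1.5, 1.0, 0.5, 0.3, 0.2]

n_kaiser = kaiser_count(example_eigenvalues)
n_cumulative = cumulative_variance_count(example_eigenvalues, threshold=0.95)
```

The two rules can retain very different numbers of components for the same data, so the choice of rule may affect downstream model accuracy.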

Results/Conclusions

Six measures were recorded to assess the classification performance of each model: predictive accuracy (ACC), sensitivity (SENS), specificity (SPEC), positive predictive value (PPV), negative predictive value (NPV), and Cohen’s kappa (Kappa). The RF models provided the most accurate predictions on all of these measures. To test the predictive potential of the models, we further analyzed an external validation data set, testing predictive accuracy for three counties in Oregon where Chukars are present. The random forest model correctly identified 79% of sites, compared with 51% for the ANN and SVM and 41% for the GLM. Two methods of PCA transformation were then applied to each model. The first retains eigenvectors with eigenvalues greater than 1; the second specifies a minimum cumulative variance as a threshold for inclusion. Both transformations reduced model accuracy, implying that reducing dimensionality is not always necessary. Variables related to climate and slope quality were found to be the most important predictors of Chukar distribution, and grassland was found to be the most suitable land-cover type. These results indicate that RF models provide more useful estimates of species habitat requirements.
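
For readers unfamiliar with the measures listed above, the sketch below shows how each can be derived from the four cells of a binary presence-absence confusion matrix. The function name and the example counts are hypothetical; they are not the study's data or code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute ACC, SENS, SPEC, PPV, NPV, and Cohen's kappa from the counts of
    true positives, false positives, true negatives, and false negatives.
    Assumes every denominator is non-zero (no empty row or column)."""
    n = tp + fp + tn + fn
    acc = (tp + tn) / n        # proportion of sites classified correctly
    sens = tp / (tp + fn)      # presences correctly predicted
    spec = tn / (tn + fp)      # absences correctly predicted
    ppv = tp / (tp + fp)       # predicted presences that are true presences
    npv = tn / (tn + fn)       # predicted absences that are true absences
    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (acc - p_chance) / (1 - p_chance)
    return {"ACC": acc, "SENS": sens, "SPEC": spec,
            "PPV": ppv, "NPV": npv, "Kappa": kappa}


# Hypothetical counts for 100 sites (illustration only).
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
```

Kappa corrects accuracy for chance agreement, which makes it a useful complement to ACC when presences and absences are unevenly represented.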