Friday, August 10, 2018
244, New Orleans Ernest N. Morial Convention Center
Extracting data from historical literature is a messy process that yields small, noisy, and biased data sets, yet many recent findings in disease macroecology rely on such data. By turning our predictive modeling tools on the data-harvesting process itself, we can capture larger, higher-quality data sets for testing macroecological hypotheses and make better use of previous work. I present a case study of identifying novel antimicrobial resistance mutations, in which we trained a neural network to identify such events in scientific abstracts, yielding greater returns from labor-intensive literature searches.