COS 103-2 - Using phylogenetic transformations to account for effects of sequencing error and intra-specific variation on operational taxonomic unit (OTU) assignment in microbial ecology

Friday, August 16, 2019: 8:20 AM
L011/012, Kentucky International Convention Center
Austin C. Koontz, Biology, Utah State University, Logan, UT; William D. Pearse, Department of Biology & Ecology Center, Utah State University, Logan, UT; and Bonnie Waring, Department of Biology, Utah State University, Logan, UT
Background/Question/Methods


A key challenge in microbial ecology is how to define units for analysis in the face of sequencing error and variation within species. Operational taxonomic units (OTUs) and amplicon sequence variants (ASVs) have emerged as solutions to this problem, but sequencing error and intraspecific differences still confound OTU boundaries. These inherent challenges in the OTU concept undermine the species definition it is meant to convey and limit the field’s ability to objectively compare the results of microbial community analyses. We argue there is a need for a theory-based approach that determines OTU boundaries and the range over which they are valid. Here we outline the use of Pagel’s transformations, a classic family of phylogenetic tree transformations, to bound uncertainty and error in species assignment within a range of values. By comparing metrics of community phylogenetic diversity across different degrees of branch-length transformation, we illustrate the presence of a threshold beyond which recent genetic changes inaccurately account for perceived species diversity. We applied this method to community simulations and empirical data to test our approach.
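
As an illustration only, the sketch below applies one member of Pagel’s family, the delta transformation, to a hypothetical four-taxon ultrametric tree and recomputes Faith’s phylogenetic diversity (PD) at several delta values. The tree, the taxon names, the height-normalized form of delta, and the choice of Faith’s PD as the diversity metric are all assumptions made for this example; the abstract does not specify which transformation or metric was used.

```python
# Hypothetical toy tree (an assumption for illustration, not data from the study).
# Each entry maps a node to (parent, branch_length); "root" has no entry.
TREE = {
    "n1": ("root", 2.0),
    "n2": ("root", 3.0),
    "A": ("n1", 2.0),
    "B": ("n1", 2.0),
    "C": ("n2", 1.0),
    "D": ("n2", 1.0),
}


def node_depths(tree):
    """Return the distance from the root to every node in the tree."""
    depths = {"root": 0.0}

    def depth(node):
        if node not in depths:
            parent, length = tree[node]
            depths[node] = depth(parent) + length
        return depths[node]

    for node in tree:
        depth(node)
    return depths


def delta_transform(tree, delta):
    """Apply a Pagel-style delta transformation to an ultrametric tree.

    Assumed convention: each node depth d becomes H * (d / H) ** delta,
    where H is the tree height, so the total height is preserved.
    delta < 1 shortens branches near the tips; delta > 1 lengthens them.
    """
    depths = node_depths(tree)
    height = max(depths.values())
    new_depth = {n: height * (d / height) ** delta for n, d in depths.items()}
    return {
        child: (parent, new_depth[child] - new_depth[parent])
        for child, (parent, _) in tree.items()
    }


def faith_pd(tree, taxa):
    """Sum the branch lengths on the paths linking the given taxa to the root."""
    visited = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # Walk towards the root, counting each branch only once.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


if __name__ == "__main__":
    community = ["A", "B"]  # hypothetical community of two close relatives
    for delta in (0.25, 0.5, 1.0, 2.0, 4.0):
        pd_value = faith_pd(delta_transform(TREE, delta), community)
        print(f"delta={delta:<5} PD={pd_value:.3f}")
```

Scanning PD across delta in this way shows how strongly a diversity estimate depends on the most recent branches: where the metric is stable across a wide range of delta, recent divergence (which may reflect sequencing error) has little influence on the result.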

Results/Conclusions

We simulated communities over a distribution of species richness values and with varying levels of sequencing error within those species. For datasets in which variation from sequencing error is less than genuine intraspecific diversity, our approach identified broad ranges of phylogenetic transformation values for which community structure was consistent. We also illustrate the use of the approach in an empirical dataset of macroinvertebrate samples from the Rio Laja of Mexico, demonstrating that this technique extends to macrofauna as well as microorganisms. Our results demonstrate a means of bounding potential error when species assignment is ambiguous. One limitation of this technique is that it requires a large phylogeny of the community being observed; we discuss ways to overcome this limitation.