Many ecological disciplines rely on testing combinations of compounds, proteins, or bacterial species to drive scientific discovery. Determining experimentally, by trial and error or random selection, which of the many possible combinations lead to desirable outcomes is time-consuming and expensive. Hence there is a pressing need for more rational and efficient experimental design approaches that reduce experimental effort.
We demonstrate the potential of machine learning methods for the in silico selection of promising bacterial co-culture combinations for bioaugmentation. As an example, we use pollutant removal in drinking water treatment plants, which can be achieved by co-culturing a specialized pollutant degrader with combinations of bacterial isolates. To reduce the experimental effort needed to discover high-performing combinations, we propose a data-driven experimental design.
Results/Conclusions
In this study, we used a dataset of mineralization performance for all pairs of 13 bacterial species co-cultured with the pollutant degrader MSH1. We built a Gaussian process regression model to predict the Gompertz mineralization parameters of the co-cultures of two and three species, based on the single-strain parameters. We subsequently used this model in a Bayesian optimization scheme to suggest potentially high-performing combinations of bacteria. We evaluated the model using specialized cross-validation schemes suitable for estimating the model's performance in learning interactions.
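One way to picture a cross-validation scheme aimed at interactions is a leave-one-species-out split over the pairwise data: every pair containing the held-out species goes to the test set, so the model is scored on partners it has never seen in any co-culture. The following is a minimal sketch under that assumption and is not necessarily the scheme used in this study; the species labels S1 to S13 and the function name are placeholders.

```python
from itertools import combinations

def leave_one_species_out(species):
    """Yield (held_out, train_pairs, test_pairs) for each species.

    Hypothetical helper: test pairs all contain the held-out species,
    training pairs never do, so no interaction with that species leaks.
    """
    pairs = list(combinations(species, 2))
    for held_out in species:
        train = [p for p in pairs if held_out not in p]
        test = [p for p in pairs if held_out in p]
        yield held_out, train, test

# Placeholder labels for the 13 isolates (not the real strain names).
species = [f"S{i}" for i in range(1, 14)]
folds = list(leave_one_species_out(species))

for held_out, train, test in folds[:2]:
    print(held_out, len(train), "training pairs,", len(test), "test pairs")
```

With 13 species there are 78 pairs in total, so each fold trains on 66 pairs and tests on the 12 pairs that involve the held-out species.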
We achieved excellent performance with this approach, both in predicting mineralization parameters and in selecting effective co-cultures, despite the limited dataset. Using Bayesian optimization schemes, we were able to dramatically reduce the number of co-culture experiments needed to find optimal mineralization parameters.
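The general shape of such a Bayesian optimization loop can be sketched as follows. This is an illustrative toy under stated assumptions, not the study's implementation: the features are random stand-ins for single-strain parameters, each pair is described by the mean of its two feature vectors, `measure` is a synthetic objective rather than real mineralization data, and the acquisition rule is an upper confidence bound chosen for brevity. The small Gaussian process is written out by hand to keep the sketch dependency-free; in practice a library implementation would be the natural choice.

```python
import math
import random
from itertools import combinations

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L x = b for lower-triangular L."""
    x = []
    for i, row in enumerate(L):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x

def back_sub_t(L, b):
    """Solve L^T x = b for lower-triangular L."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(X, y, noise=1e-3):
    """Fit a constant-mean GP; returns what is needed for prediction."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    mean = sum(y) / len(y)
    alpha = back_sub_t(L, forward_sub(L, [v - mean for v in y]))
    return X, L, alpha, mean

def gp_predict(model, x):
    """Posterior mean and standard deviation at feature vector x."""
    X, L, alpha, mean = model
    k = [rbf(a, x) for a in X]
    mu = mean + sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward_sub(L, k)
    var = max(rbf(x, x) - sum(vi * vi for vi in v), 0.0)
    return mu, math.sqrt(var)

def bayes_opt(candidates, features, measure, n_init=5, n_iter=10, kappa=2.0, seed=0):
    """Test n_init random candidates, then n_iter candidates chosen by UCB."""
    rng = random.Random(seed)
    results = {c: measure(c) for c in rng.sample(candidates, n_init)}
    for _ in range(n_iter):
        model = gp_fit([features[c] for c in results], list(results.values()))
        best_c, best_score = None, -math.inf
        for c in candidates:
            if c in results:
                continue  # never repeat an experiment
            mu, sd = gp_predict(model, features[c])
            if mu + kappa * sd > best_score:
                best_c, best_score = c, mu + kappa * sd
        results[best_c] = measure(best_c)  # "run" the suggested experiment
    return results

# Hypothetical setup: 13 placeholder species with random 2-D features.
rng = random.Random(1)
species = [f"S{i}" for i in range(1, 14)]
single = {s: (rng.random(), rng.random()) for s in species}
pairs = list(combinations(species, 2))
features = {p: tuple((a + b) / 2 for a, b in zip(single[p[0]], single[p[1]]))
            for p in pairs}

def measure(pair):
    """Synthetic objective standing in for a measured parameter."""
    f0, f1 = features[pair]
    return -((f0 - 0.7) ** 2 + (f1 - 0.3) ** 2)

results = bayes_opt(pairs, features, measure)
best_pair = max(results, key=results.get)
print(f"tested {len(results)} of {len(pairs)} pairs; best so far: {best_pair}")
```

In this toy run only 15 of the 78 candidate pairs are evaluated; whether the best of those matches the true optimum depends on the objective and the features, which here are invented.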
As a novel application of Bayesian optimization in bioremediation, this experimental design approach shows promise for prioritizing co-culture combinations for in vitro testing in various settings, lessening the experimental burden and enabling more targeted screenings.