COS 82-9 - Machine learns ecological networks: Automated reconstruction of community network based on a data-driven approach

Thursday, August 15, 2019: 10:50 AM
L010/014, Kentucky International Convention Center
Hongseok Ko (1,2), Ahyoung Amy Kim (3) and Hao Helen Zhang (3,4); (1) Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, (2) UMI 3157 iGLOBES, Centre National de la Recherche Scientifique, Paris, France, (3) Department of Statistics, University of Arizona, Tucson, AZ, (4) Department of Mathematics, University of Arizona, Tucson, AZ
Background/Question/Methods

The structure of ecological networks is commonly assembled from data collected through extensive field operations. However, such fieldwork is usually costly and time-consuming. Moreover, although a large amount of ecological data has been collected over many years, its use remains limited, mainly because of its high dimensionality. This limitation may prevent researchers from assembling community networks with a large number of species. There is therefore a need for an innovative way of acquiring information about species interactions. Existing statistical machine learning algorithms can be used to develop automated learning methodologies that reconstruct ecological networks from high-dimensional datasets covering more than 20 species and 20 years.

In this study, we apply five penalized regression methods and two graphical machine learning algorithms to reconstruct ecological networks from empirical data on 20 species whose trophic interactions are known. We compare the reconstructed networks to the known network structure to evaluate the algorithms. We then apply these methods to other empirical data, including Australian oceanic phytoplankton data, to further validate our models, and we compare the results to those of conventional models.
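As a rough illustration of how a penalized regression can turn abundance time series into a network, the sketch below fits one lasso regression per species and reads nonzero coefficients as candidate links. The abstract does not name the five penalized methods or the regression design, so the lasso, the one-step time lag, the penalty value, and all function names here are assumptions made for illustration only, not the authors' method.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the lasso proximal step)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_cd(columns, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1.

    `columns` holds the predictor columns of X; returns the coefficients b.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(v * v for v in col) / n for col in columns]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(c * r for c, r in zip(col, resid)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta


def reconstruct_network(series, lam):
    """series[t][i] is the abundance of species i at time t.

    Assumed design: regress each species at time t+1 on all species at time t.
    adj[i][j] = 1 when species i receives a nonzero coefficient for species j.
    """
    n_species = len(series[0])
    predictors = [center([row[i] for row in series[:-1]]) for i in range(n_species)]
    adj = [[0] * n_species for _ in range(n_species)]
    for j in range(n_species):
        target = center([row[j] for row in series[1:]])
        beta = lasso_cd(predictors, target, lam)
        for i, b in enumerate(beta):
            if i != j and b != 0.0:  # self-effects are ignored
                adj[i][j] = 1
    return adj


# Synthetic data (not from the study): species 1 follows species 0 with a lag,
# while species 0 and 2 are independent noise.
rng = random.Random(0)
series = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)]]
for _ in range(200):
    prev = series[-1]
    series.append([rng.gauss(0, 1),
                   0.9 * prev[0] + 0.1 * rng.gauss(0, 1),
                   rng.gauss(0, 1)])

adj = reconstruct_network(series, lam=0.2)
```

A larger penalty `lam` yields a sparser network, so in practice the penalty would need to be tuned (for example by cross-validation), and real abundance data would likely need transformation before fitting.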

Results/Conclusions

We found that the five penalized regression methods and two graphical machine learning algorithms provide a promising way to reconstruct ecological networks. Overall performance was much higher than that reported in other recent studies. Performance was measured by precision, recall, F1 score, and network density, and these metrics were highly consistent among the seven machine learning algorithms. The best precision, recall, and F1 score were 0.45, 0.98, and 0.62, respectively. About 58% of the known interactions were captured by all methods across 5,000 repeated learning processes. More importantly, because the average density of the reconstructed networks (0.82) was about double that of the known network (0.42), we concluded that the algorithms capture not only trophic interactions but also non-trophic interactions, which may indicate the discovery of new interactions among species. However, because the reconstructed networks were compared to a known network of trophic interactions only, we expect performance to be better against a reference network that reflects both trophic and non-trophic interactions. These findings show the potential of new methodologies to discover comprehensive interaction webs and may lead to further research on non-trophic or unknown trophic interactions between species.
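The four metrics named above can be sketched as follows for a reconstructed network compared against a reference network. This is a minimal illustration under stated assumptions: networks are directed 0/1 adjacency matrices, self-links are ignored, and density is the number of links divided by the number of possible links. The abstract does not state these conventions, and the function names and toy matrices are hypothetical.

```python
def edge_counts(true_adj, pred_adj):
    """Count true positives, false positives, and false negatives over links."""
    tp = fp = fn = 0
    n = len(true_adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: self-links are not evaluated
            t, p = true_adj[i][j], pred_adj[i][j]
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def density(adj):
    """Links present divided by links possible in a directed graph."""
    n = len(adj)
    edges = sum(1 for i in range(n) for j in range(n) if i != j and adj[i][j])
    return edges / (n * (n - 1))


def evaluate(true_adj, pred_adj):
    tp, fp, fn = edge_counts(true_adj, pred_adj)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "density": density(pred_adj),
    }


# Toy matrices (not from the study): the prediction recovers every known
# link but also proposes two extra ones.
true_adj = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
pred_adj = [[0, 1, 1],
            [0, 0, 1],
            [1, 0, 0]]

scores = evaluate(true_adj, pred_adj)
```

The toy case shows the same qualitative pattern as the reported results: recall is high, precision is lower, and the reconstructed network is about twice as dense as the reference. As a consistency check, the harmonic mean of the reported best precision (0.45) and recall (0.98) is about 0.62, although the abstract does not say whether those best values came from the same run.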