In the past few years, there has been a growing effort to collect ecological data spanning many spatial and temporal scales. The large number of available datasets, particularly time series of species abundances and environmental parameters, provides a novel opportunity to develop data-driven approaches for studying the interactions and temporal evolution of species abundances within natural ecological communities. However, unavoidable stochasticity, emerging from both underspecified biological processes and measurement noise, together with the intrinsic nonlinearity of ecological dynamics, poses a serious challenge to the analysis of ecological time series. In this talk, we show how to improve the performance of current learning algorithms developed for parameter inference and out-of-sample forecasting of species abundances. Specifically, we apply regularized loss minimization methods to a class of locally weighted linear fits known as S-maps. We validate our findings on three simulated data sets generated by three distinct models with different nonlinearities and levels of stochasticity due to process error.
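To make the idea of a regularized S-map concrete, the following is a minimal sketch, not the implementation used in this work. It assumes the common S-map formulation with an exponential distance kernel controlled by a parameter `theta`, and it assumes a ridge (L2) penalty of strength `lam` as one possible choice of regularizer; the abstract does not specify which penalty is used. The function name `smap_ridge_predict`, its parameters, and the noisy logistic map in `demo` are all hypothetical and chosen for illustration only.

```python
import numpy as np


def smap_ridge_predict(X, y, x_star, theta=2.0, lam=0.1):
    """Locally weighted linear fit around x_star with a ridge penalty.

    Assumptions (not from the abstract): exponential kernel weights
    w_i = exp(-theta * d_i / mean(d)) and an L2 penalty on the slopes.
    Returns (prediction, coef), where coef[0] is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_star = np.asarray(x_star, dtype=float)

    # Distance of every library point to the target point.
    d = np.linalg.norm(X - x_star, axis=1)
    d_mean = d.mean()
    # theta = 0 gives a global linear fit; larger theta is more local.
    w = np.exp(-theta * d / d_mean) if d_mean > 0 else np.ones_like(d)

    # Weighted design matrix with an intercept column.
    A = np.hstack([np.ones((X.shape[0], 1)), X]) * w[:, None]
    b = y * w

    # Ridge penalty on the slopes only; the intercept is left unpenalized.
    P = lam * np.eye(A.shape[1])
    P[0, 0] = 0.0

    coef = np.linalg.solve(A.T @ A + P, A.T @ b)
    pred = coef[0] + coef[1:] @ x_star
    return pred, coef


def demo():
    # Hypothetical test system: a logistic map with additive process noise.
    rng = np.random.default_rng(0)
    x = np.empty(200)
    x[0] = 0.4
    for t in range(199):
        step = 3.8 * x[t] * (1.0 - x[t]) + 0.01 * rng.normal()
        x[t + 1] = np.clip(step, 1e-3, 1.0 - 1e-3)

    # Two-lag embedding: predict x[t+1] from (x[t], x[t-1]).
    X = np.column_stack([x[1:-1], x[:-2]])
    y = x[2:]

    # Fit on the first 150 points, forecast the next one out of sample.
    pred, coef = smap_ridge_predict(X[:150], y[:150], X[150], theta=3.0, lam=0.05)
    print("forecast:", pred, "observed:", y[150])
    print("local coefficients (intercept, slopes):", coef)


if __name__ == "__main__":
    demo()
```

In this sketch the slopes `coef[1:]` play the role of the locally inferred interaction strengths, and `lam` trades in-sample fit against the stability of those coefficients. In practice both `theta` and `lam` would be chosen by cross-validation on out-of-sample forecast error.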
Results/Conclusions
We found that regularization systematically improves the quality of both the in-sample inference of ecological interactions and the out-of-sample forecasts of species abundances obtained with the S-map. The improvement is particularly marked when the time series are contaminated with process noise (such as demographic stochasticity). This result has important implications for ecological studies: accurate out-of-sample forecasts of species abundances are key to developing sustainability plans and risk assessment studies in fast-changing environments.
Our results also have important theoretical implications. Specifically, we found that, while many combinations of inferred parameters can explain the data equally well, only a subset of them provides reliable out-of-sample predictions. That is, ecological interactions cannot be uniquely inferred from noisy data. This result illustrates that the problem of structural identifiability, a common issue in parametric studies of ecological data, emerges in nonparametric studies as well. Our findings highlight the difference between explaining and predicting the behavior of ecological communities from nonlinear, noisy time series data.