2021 ESA Annual Meeting (August 2 - 6)

Prediction and causal inference in ecology

On Demand
Paul Ferraro, Carey Business School and Department of Environmental Health and Engineering, Johns Hopkins University
1) Background/Question/Methods

Empirical ecologists are interested in both prediction and causal inference, but these two objectives are different, and thus the approaches that ecologists ought to use to achieve them will differ. Conceptual confusion about the difference between predictive and causal models can lead to methodological confusion about the best ways to approach empirical analyses in ecosystems. For example, ecologists often use goodness-of-fit criteria (e.g., AIC, BIC, R²) to evaluate the quality of a causal model, but those criteria are more appropriate for predictive models (and even then, one can use better criteria to avoid overfitting predictive models). In a study focused on prediction, the goal is to develop a model that, given observed attributes of the system, yields the best prediction of the unobserved true value of a variable at a particular location and time. For example, we may want to use aboveground biomass data from sampled sites to predict aboveground biomass in unsampled sites. In contrast, in a study focused on causal inference, the goal is to develop a model that quantifies a causal relationship between two or more variables in the system. For example, how does aboveground biomass change when another attribute of the system is changed? A good predictive model need not include any variables that have direct causal relationships with the unobserved target variable.

2) Results/Conclusions

In this presentation, I cover two topics. First, I explore the key differences in criteria to assess the quality of predictive and causal studies. For example, in causal studies, omitted variable bias, measurement error, and reverse causality can be serious threats to inference, and one must take numerous steps to demonstrate that the correlations observed reflect true causal relationships rather than spurious ones. In predictive studies, however, omitted variables and measurement error may lead to models that are not as accurate as they could be, but they do not introduce bias; likewise, reverse causality poses no challenges because the aim is to predict the value of one variable given observations of other variables – regardless of the nature of the causal relationships between the variables. Second, using data on the relationship between biodiversity and ecosystem productivity, I explore the key differences in design and methods for implementing models focused on prediction and those focused on causal inference.
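
The contrast between omitted variables in causal and in predictive studies can be illustrated with a small simulation. This is only a sketch under invented assumptions and is not taken from the presentation or its data: the variable roles (x as something like species richness, y as productivity, z as an unmeasured driver such as soil fertility), the linear form, and the coefficients 1.0 and 2.0 are all hypothetical. Omitting z roughly doubles the estimated effect of x on y, yet the same misspecified model still predicts y reasonably well at new sites drawn from the same system.

```python
import numpy as np

def simulate(n, rng):
    """Draw n hypothetical sites; z confounds the x -> y relationship."""
    z = rng.normal(size=n)                      # unmeasured driver (assumed)
    x = z + rng.normal(size=n)                  # x depends partly on z
    y = 1.0 * x + 2.0 * z + rng.normal(size=n)  # assumed true effect of x is 1.0
    return x, y, z

def ols(columns, y):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
x_tr, y_tr, z_tr = simulate(20_000, rng)   # "sampled" sites
x_te, y_te, z_te = simulate(20_000, rng)   # "unsampled" sites, same system

naive = ols([x_tr], y_tr)            # omits z
adjusted = ols([x_tr, z_tr], y_tr)   # conditions on z

naive_slope = naive[1]        # near 2.0: biased as an estimate of the effect of x
adjusted_slope = adjusted[1]  # near 1.0: recovers the assumed effect

# Out-of-sample R^2 of the naive model: still useful for prediction.
pred = naive[0] + naive[1] * x_te
r2_holdout = 1 - np.sum((y_te - pred) ** 2) / np.sum((y_te - y_te.mean()) ** 2)

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
print(f"holdout R^2 (naive model): {r2_holdout:.2f}")
```

The predictive performance holds only because the new sites share the same relationship between x and z as the sampled sites. If x were changed by intervention, the naive slope would misstate the resulting change in y, which is the causal question.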