2021 ESA Annual Meeting (August 2 - 6)

Leveraging the data richness of the Long Term Ecological Research (LTER) Network to teach environmental data science

On Demand
Julien Brun, NCEAS, University of California
Background/Question/Methods

Datasets from the Long Term Ecological Research (LTER) Network are a treasure trove of interesting, well-documented, real-world environmental data. However, identifying, exploring, and cleaning complex scientific datasets for use in data science classes costs teachers considerable time and energy. As the success (>90k downloads) of the recent palmerpenguins R package (https://allisonhorst.github.io/palmerpenguins/) has shown, there is strong demand for real-world environmental datasets that are ready to be used “out-of-the-box” for teaching data science. In this project, our goal is to develop a sample dataset and an associated analytical example from each of the 30 LTER sites. We also provide information and code to access the original datasets used to develop those examples. All of these resources have been combined into an R package.

Results/Conclusions

R packages are an ideal vehicle for teaching datasets because R is widely used in environmental research communities and a package can be installed with a single command on any computer. In addition, the R Markdown ecosystem provides a suite of tools for building a website that exposes all the pedagogical content to non-R users as well. In this presentation, we will explain our process for discovering and designing examples that capture the richness of the LTER data, and how we developed a reproducible workflow to integrate the datasets into an R package and document them. We will also discuss how a team of undergraduate students was involved in this process and the broader pedagogical impacts of the project.