2021 ESA Annual Meeting (August 2 - 6)

Connecting disciplines and data in ecosystem sciences: Practices for efficient sample tracking, integration, and reuse

On Demand
Joan Ball-Damerow, Earth and Environmental Sciences Area, Lawrence Berkeley National Lab
Background/Question/Methods

The study of natural ecosystems requires multidisciplinary science teams to understand and model multi-scale processes. Research on these complex processes involves diverse collections of samples and associated field or laboratory measurements. For example, studies of organic matter cycling through plants and soil involve analysis of samples that represent soil biogeochemistry, microbial communities, plant structures, and ecophysiological traits of the specific organisms involved. When such multidisciplinary data are published, however, they are often disconnected and missing information needed for interpretation, integration, and reuse. Clear links between data help represent interacting processes across related ecosystem data and support future data discovery and usability. While there are widely adopted conventions within certain disciplinary communities to describe sample data, these have gaps when applied in a multidisciplinary context. In this study, we compared existing practices for identifying, characterizing, and linking related environmental samples. We then conducted a pilot test involving eight United States Department of Energy projects to assess the practicalities of assigning persistent identifiers to samples with standardized metadata. Participants collected a variety of sample types, with analyses conducted across multiple facilities. We addressed terminology gaps for multidisciplinary research and made recommendations for assigning identifiers and metadata that support tracking, integration, and reuse. Our goal was to provide a practical approach for sample documentation, geared towards ecosystem scientists who contribute and reuse sample data.

Results/Conclusions

Many multidisciplinary projects have complicated workflows and need an efficient system for tracking samples as they are sent to collaborators, labs, and user facilities, and as the resulting data are published online. Despite growing need and interest, there was previously no straightforward guidance on how to describe collections of multidisciplinary samples. We therefore recommend registering samples with Global Sample Numbers (IGSNs), using our modified metadata template for ecosystem sciences (IGSN-ESS). The downloadable template, terminology definitions, and instructions for IGSN registration using IGSN-ESS are detailed in an associated GitHub repository. Overcoming complex challenges that require communities to change behavior and provide standardized data will take a coordinated effort; only coalitions of key stakeholders can establish community consensus, enforce guidelines, and help solve problems. These stakeholders include diverse data contributors and users from different scientific domains, as well as laboratory facilities, repositories, funders, and publishers that take part in institutionalizing and rewarding good data management practices. Community coordination on sample reporting conventions and linked cyberinfrastructure will help solve data management problems, expand access pathways, and make our sample data more useful over time.