2020 ESA Annual Meeting (August 3 - 6)

PS 48 Abstract - Streamline QA/QC for observational data

Li Kui, Marine Science Institute, University of California, Santa Barbara, Santa Barbara, CA, Kristin Vanderbilt, University of New Mexico, Albuquerque, NM and John H. Porter, Environmental Sciences, University of Virginia, Charlottesville, VA
Background/Question/Methods

Observational data, i.e. data observed or measured by humans, have been widely used in a variety of disciplines to gain first-hand knowledge of target objects in their natural setting. The workflow for processing observational data typically involves data collection on a paper survey sheet, transfer to a computer, QA/QC, and production of a data product. Because humans handle the data and interact with software such as Excel, Google Sheets, Pages, or a database, there is a significant chance that errors will be introduced at both the observation stage (in the field) and the data transfer stage. This creates a need for an error checklist that describes the types of errors and how to find them, and for tools that assure high-quality data.

Results/Conclusions

To improve data quality, we describe best practices for QA/QC of observational data based on our experiences with datasets from the Long-Term Ecological Research (LTER) Network. We summarize potential errors and their causes at each step of data processing, and we provide corresponding recommendations, with examples, for each step: 1) Use software and tools that detect errors in the data transfer process from paper to computer; 2) Implement programs/scripts that streamline and automate the QA/QC process in a reproducible way; and 3) Format and organize data to increase the reusability of the data. We conclude that the QA/QC recommendations should be implemented in all observational data processing workflows, using one or more platforms. Development of community-wide data processing procedures is essential for QA/QC that would instill confidence in observational data and improve interoperability across scientific networks.
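As an illustration of recommendation 2, a scripted QA/QC pass might look like the minimal Python sketch below. It is not taken from any LTER workflow: the column names (site, date, species, count), the site codes, the species code, and the plausible-range limit are all hypothetical, chosen only to show how checks for missing values, invalid codes, impossible dates, out-of-range counts, and duplicate records can be rerun identically each time data are transferred from paper.

```python
import csv
import io
from datetime import datetime

# Hypothetical rules for an illustrative survey table. The column names,
# site codes, and count limit are assumptions, not drawn from a real dataset.
COLUMNS = ("site", "date", "species", "count")
VALID_SITES = {"A1", "A2", "B1"}
MAX_COUNT = 500


def check_rows(rows):
    """Return (line_number, column, message) for every problem found."""
    problems = []
    seen = set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        values = {col: (row.get(col) or "").strip() for col in COLUMNS}

        # Missing values in any expected column
        for col, value in values.items():
            if not value:
                problems.append((line_no, col, "missing value"))

        # Site code must come from the controlled list
        if values["site"] and values["site"] not in VALID_SITES:
            problems.append((line_no, "site", "unknown site code"))

        # Date must be a real calendar date in ISO format
        if values["date"]:
            try:
                datetime.strptime(values["date"], "%Y-%m-%d")
            except ValueError:
                problems.append((line_no, "date", "not a valid YYYY-MM-DD date"))

        # Count must be an integer within a plausible range
        if values["count"]:
            try:
                count = int(values["count"])
            except ValueError:
                problems.append((line_no, "count", "not an integer"))
            else:
                if not 0 <= count <= MAX_COUNT:
                    problems.append((line_no, "count", "outside plausible range"))

        # The same site/date/species should not be entered twice
        key = (values["site"], values["date"], values["species"])
        if key in seen:
            problems.append((line_no, "*", "duplicate site/date/species record"))
        seen.add(key)
    return problems


# Made-up sample with typical transcription errors (note the letter O in "4O").
SAMPLE = """site,date,species,count
A1,2019-07-01,MAPY,12
A1,2019-07-01,MAPY,12
C9,2019-02-30,MAPY,-3
B1,2019-07-02,,4O
"""

problems = check_rows(csv.DictReader(io.StringIO(SAMPLE)))
for line_no, col, message in problems:
    print(f"line {line_no}, {col}: {message}")
```

Because the rules live in a script rather than in manual spreadsheet edits, the same checks can be applied to every new batch of entered data, and the list of flagged lines can be compared against the original paper sheets.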