Lake and stream conditions respond to both natural and human-related landscape features. Characterizing these features within the contributing areas (i.e., delineated watersheds) of streams and lakes could improve our understanding of how biological conditions vary spatially and improve the use, management, and restoration of these aquatic resources. However, the specialized geospatial techniques required to define and characterize stream and lake watersheds have limited their widespread use in both scientific and management efforts at large spatial scales. We developed the StreamCat and LakeCat Datasets to model, predict, and map the probable biological conditions of streams and lakes across the conterminous US (CONUS). StreamCat and LakeCat contain watershed-level characterizations of several hundred natural (e.g., soils, geology, climate, and land cover) and anthropogenic (e.g., urbanization, agriculture, mining, and forest management) landscape features for ca. 2.6 million stream segments and 376,000 lakes, respectively, across the CONUS. These datasets can be paired with field samples to provide independent variables for modeling and other analyses. We paired 1,380 stream and 1,073 lake samples from the USEPA's National Aquatic Resource Surveys with StreamCat and LakeCat, respectively, and used random forest (RF) models to predict and then map an invertebrate condition index for streams and chlorophyll a concentration for lakes.
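The pairing step described above can be sketched as a simple keyed join. This is a minimal illustration only, not the authors' workflow: the segment identifiers, field names (`segment_id`, `pct_urban`, `pct_agriculture`), and all values are hypothetical and are not taken from StreamCat or LakeCat. The resulting predictor matrix `X` and response vector `y` are in the form that a random forest implementation would typically expect.

```python
# Hypothetical watershed metrics keyed by stream-segment ID (made-up values,
# standing in for landscape characterizations such as those in StreamCat).
watershed_metrics = {
    101: {"pct_urban": 2.0, "pct_agriculture": 10.0},
    102: {"pct_urban": 45.0, "pct_agriculture": 30.0},
    103: {"pct_urban": 5.0, "pct_agriculture": 60.0},
}

# Hypothetical field samples: a segment ID plus an observed condition class.
field_samples = [
    {"site": "A", "segment_id": 101, "condition": "good"},
    {"site": "B", "segment_id": 102, "condition": "poor"},
    {"site": "C", "segment_id": 999, "condition": "good"},  # no matching segment
]


def pair_samples(samples, metrics):
    """Attach watershed predictors to each sample; skip unmatched samples."""
    paired = []
    for sample in samples:
        predictors = metrics.get(sample["segment_id"])
        if predictors is None:
            continue  # sample falls on a segment with no watershed metrics
        paired.append({**sample, **predictors})
    return paired


model_table = pair_samples(field_samples, watershed_metrics)

# Independent variables (X) and response (y) for a subsequent model fit.
X = [[row["pct_urban"], row["pct_agriculture"]] for row in model_table]
y = [row["condition"] for row in model_table]
```

Samples without a matching segment are dropped here for simplicity; a real analysis would probably want to log or inspect them instead.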
Results/Conclusions
The invertebrate index model correctly predicted 75% of streams as being in good or poor condition. Furthermore, we were able to map the probability of being in good condition [Pr(good)] for 1.1 million perennial stream segments across the CONUS. This map provides insight into how conditions vary both nationally and locally and can be queried to identify and target candidate streams for conservation or restoration. The RF model for chlorophyll a, although preliminary, explained 20% of the variation in lake chlorophyll a concentration. Lower stream Pr(good) and higher lake chlorophyll a concentrations were both associated with higher percentages of watershed urbanization and agriculture in the RF models. These models provide important examples of how StreamCat and LakeCat can be used to understand both the distribution of conditions and the landscape features associated with those conditions. Both datasets greatly improve the accessibility of independent variables for scientists and managers working within aquatic ecosystems. StreamCat is currently available for download through the USEPA (https://www.epa.gov/national-aquatic-resource-surveys/streamcat), and we anticipate that LakeCat will be available for download this year.