Earth System Science Data The Data Publishing Journal
https://doi.org/10.5194/essd-2017-52
© Author(s) 2017. This work is distributed under
the Creative Commons Attribution 4.0 License.
Review article
18 Jul 2017
Review status
This discussion paper is a preprint. It is a manuscript under review for the journal Earth System Science Data (ESSD).
The National Eutrophication Survey: lake characteristics and historical nutrient concentrations
Joseph Stachelek1, Chanse Ford2, Dustin Kincaid3, Katelyn King1, Heather Miller4, and Ryan Nagelkirk5
1Department of Fisheries and Wildlife, Natural Resources Building, Michigan State University, 480 Wilson Rd, East Lansing, MI 48824, USA
2Department of Earth and Environmental Sciences, Natural Science Building, Michigan State University, 288 Farm Ln, East Lansing, MI 48824, USA
3Department of Integrative Biology, Natural Science Building, Michigan State University, 288 Farm Ln, East Lansing, MI 48824, USA
4Department of Microbiology and Molecular Genetics, Biomedical and Physical Science Building, Michigan State University, 567 Wilson Rd, East Lansing, MI 48824, USA
5Department of Geography, Environment, and Spatial Sciences, Geography Building, Michigan State University, 673 Auditorium Rd, East Lansing, MI 48824, USA
Abstract. Historical ecological surveys serve as a baseline and provide context for contemporary research, yet many of these records are not preserved in a way that ensures their long-term usability. The National Eutrophication Survey database is currently only available as scans of the original reports (PDF files) with no embedded character information. This limits its searchability, machine readability, and the ability of current and future scientists to systematically evaluate its contents. These data were collected by the United States Environmental Protection Agency between 1972 and 1975 as part of an effort to investigate eutrophication in freshwater lakes and reservoirs. Although several studies have manually transcribed small portions of the database in support of specific studies, there have been no systematic attempts to transcribe and preserve the database in its entirety. Here we use a combination of automated optical character recognition and manual quality assurance procedures to make these data available for analysis. The performance of the optical character recognition protocol was found to be linked to variation in the quality (clarity) of the original documents. For each of the four archival scanned reports, our quality assurance protocol found an error rate between 5.9 and 17 %. The goal of our approach was to strike a balance between efficiency and data quality by combining hand-entry of data with digital transcription technologies. The finished database contains information on the physical characteristics, hydrology, and water quality of about 800 lakes in the contiguous United States (https://doi.org/10.5063/F1KK98R5). Ultimately, this database could be combined with more recent studies to generate meta-analyses of water quality trends and spatial variation across the continental United States.
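As an illustration of what a per-report error rate like the 5.9 to 17 % figures could mean, the sketch below compares OCR output against hand-checked values and reports the fraction that disagree. This is an assumption made for illustration only: the abstract does not state how the error rate was defined, and the function name `error_rate` and the sample values are hypothetical, not taken from the authors' code or data.

```python
from typing import Sequence


def error_rate(ocr_values: Sequence[str], checked_values: Sequence[str]) -> float:
    """Return the fraction of OCR values that differ from hand-checked values.

    Hypothetical definition: one mismatch per differing cell, ignoring
    leading and trailing whitespace.
    """
    if len(ocr_values) != len(checked_values):
        raise ValueError("inputs must have the same length")
    if not ocr_values:
        raise ValueError("inputs must not be empty")
    mismatches = sum(
        1
        for ocr, truth in zip(ocr_values, checked_values)
        if ocr.strip() != truth.strip()
    )
    return mismatches / len(ocr_values)


# Made-up example: the second cell has a letter "O" misread for a zero.
ocr = ["12.4", "0.O3", "7.1", "150"]
checked = ["12.4", "0.03", "7.1", "150"]
print(f"{error_rate(ocr, checked):.1%}")
```

Comparing cells as strings rather than as numbers keeps character-level misreads visible, such as a letter substituted for a digit, which a numeric parse would either reject or hide.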

Citation: Stachelek, J., Ford, C., Kincaid, D., King, K., Miller, H., and Nagelkirk, R.: The National Eutrophication Survey: lake characteristics and historical nutrient concentrations, Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2017-52, in review, 2017.

Data sets

The National Eutrophication Survey: lake characteristics and historical nutrient concentrations.
J. Stachelek, C. Ford, D. Kincaid, K. King, H. Miller, and R. Nagelkirk
https://doi.org/10.5063/F1KK98R5

Model code and software

Scrape Data from the National Eutrophication Survey
J. Stachelek
https://doi.org/10.5281/zenodo.591266
R package to fetch, cache, and serve the National Eutrophication Survey
J. Stachelek
https://doi.org/10.5281/zenodo.830416
The National Eutrophication Survey: lake characteristics and historical nutrient concentrations
J. Stachelek, C. Ford, D. Kincaid, K. King, H. Miller, and R. Nagelkirk
https://doi.org/10.5281/zenodo.830468

Latest update: 19 Oct 2017
Short summary
Here we report the results of an effort to fully transcribe the National Eutrophication Survey database containing water quality data collected by the United States Environmental Protection Agency between 1972 and 1975 as part of an effort to investigate nutrient loading in freshwater lakes and reservoirs. The transcribed database contains information on the physical characteristics, hydrology, and water quality of about 800 lakes in the contiguous United States.