Toward automating post processing of aquatic sensor data

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: This is version 2 of this Preprint.


Amber S Jones, Tanner Lex Jones, Jeffery S Horsburgh


Sensors measuring environmental phenomena at high frequency commonly report anomalies related to fouling, sensor drift and calibration, and datalogging and transmission issues. Suitability of data for analyses and decision making often depends on manual review and adjustment of data. Machine learning techniques have potential to automate identification and correction of anomalies, streamlining the quality control process. We explored approaches for automating anomaly detection and correction of aquatic sensor data for implementation in a Python package (PyHydroQC). We applied both classical and deep learning time series regression models that estimate values, identify anomalies based on dynamic thresholds, and offer correction estimates. Techniques were developed and performance assessed using data reviewed, corrected, and labeled by technicians in an aquatic monitoring use case. Auto-Regressive Integrated Moving Average (ARIMA) consistently performed best, and aggregating results from multiple models improved detection. PyHydroQC includes custom functions and a workflow for anomaly detection and correction.
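The detect-and-correct idea described in the abstract (a model estimates each value, the residual is compared to a dynamic threshold, and flagged points are replaced with the estimate) can be illustrated with a minimal sketch. This is not the PyHydroQC API: every function name and parameter below is hypothetical, a rolling median of preceding observations stands in for the ARIMA and deep learning models, and combining models by taking the union of their flags is an assumption, since the abstract does not say how results were aggregated.

```python
from statistics import median, pstdev


def rolling_median_estimate(values, window=5):
    """Stand-in estimator (NOT ARIMA): median of the preceding `window` points."""
    estimates = []
    for i in range(len(values)):
        history = values[max(0, i - window):i]
        # With no history yet, fall back to the observation itself.
        estimates.append(median(history) if history else values[i])
    return estimates


def dynamic_thresholds(residuals, window=10, k=4.0, floor=0.5):
    """Per-point threshold: k times the spread of recent residuals, never below `floor`."""
    thresholds = []
    for i in range(len(residuals)):
        recent = residuals[max(0, i - window):i]
        spread = pstdev(recent) if len(recent) >= 2 else 0.0
        thresholds.append(max(floor, k * spread))
    return thresholds


def detect_and_correct(values, window=5, k=4.0, floor=0.5):
    """Flag points whose residual exceeds the threshold; replace them with the estimate."""
    estimates = rolling_median_estimate(values, window)
    residuals = [v - e for v, e in zip(values, estimates)]
    thresholds = dynamic_thresholds(residuals, k=k, floor=floor)
    flags = [abs(r) > t for r, t in zip(residuals, thresholds)]
    corrected = [e if f else v for v, e, f in zip(values, estimates, flags)]
    return flags, corrected


def aggregate_flags(*flag_lists):
    """Assumed aggregation rule: a point is anomalous if any model flags it."""
    return [any(point) for point in zip(*flag_lists)]


# Hypothetical sensor readings with one spike at index 7.
values = [10.0, 10.1, 10.0, 10.2, 10.1, 10.0, 10.1, 25.0, 10.2, 10.1, 10.0, 10.1]

flags, corrected = detect_and_correct(values, window=5)
flags_short, _ = detect_and_correct(values, window=3)
combined = aggregate_flags(flags, flags_short)

print([i for i, f in enumerate(combined) if f])
print(corrected[7])
```

In this simplified version the flagged observation still feeds into later estimates and thresholds, so the threshold widens for a while after a spike. A fuller treatment would likely exclude or replace flagged points before estimating subsequent values.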



Biogeochemistry, Civil Engineering, Environmental Engineering, Environmental Monitoring, Hydrology


data management, aquatic sensors, quality control, anomaly detection, Python


Published: 2021-07-23 08:13

Last Updated: 2022-03-26 07:18

CC BY Attribution 4.0 International