PuSH - Publication Server of Helmholtz Zentrum München

He, H.* ; Boehringer, T.* ; Schaefer, B.* ; Heppell, K.* ; Beck, C.*

Analyzing spatio-temporal dynamics of dissolved oxygen for the River Thames using superstatistical methods and machine learning.

Sci. Rep. 14:21288 (2024)
By employing superstatistical methods and machine learning, we analyze time series data of water quality indicators for the River Thames (UK). The indicators analyzed include dissolved oxygen, temperature, electrical conductivity, pH, ammonium, turbidity, and rainfall, with a specific focus on the dynamics of dissolved oxygen. After detrending, the probability density functions of dissolved oxygen fluctuations exhibit heavy tails that are effectively modeled using q-Gaussian distributions. Our findings indicate that the multiplicative Empirical Mode Decomposition method stands out as the most effective detrending technique, yielding the highest log-likelihood in nearly all fittings. We also observe that the optimally fitted width parameter of the q-Gaussian shows a negative correlation with the distance to the sea, highlighting the influence of geographical factors on water quality dynamics. In the context of same-time prediction of dissolved oxygen, regression analysis incorporating various water quality indicators and temporal features identifies the Light Gradient Boosting Machine as the best model. SHapley Additive exPlanations reveal that temperature, pH, and time of year play crucial roles in the predictions. Furthermore, we use the Transformer, a state-of-the-art machine learning model, to forecast dissolved oxygen concentrations. For long-term forecasting, the Informer model consistently delivers superior performance, achieving the lowest Mean Absolute Error (0.15) and Symmetric Mean Absolute Percentage Error (21.96%) with the 192 historical time steps that we used. This performance is attributed to the Informer's ProbSparse self-attention mechanism, which allows it to capture long-range dependencies in time-series data more effectively than other machine learning models. It effectively recognizes the half-life cycle of dissolved oxygen, with particular attention to critical periods such as morning to early afternoon, late evening to early morning, and key intervals between the 16th and 26th quarter-hours of the previous half-day. Our findings provide valuable insights for policymakers involved in ecological health assessments, aiding in accurate predictions of river water quality and the maintenance of healthy aquatic ecosystems.
Impact Factor 3.800
Scopus SNIP 0.000
Publication type Article: Journal article
Document type Scientific Article
Language English
Publication Year 2024
HGF-reported in Year 2024
ISSN (print) / ISBN 2045-2322
e-ISSN 2045-2322
Source details Volume: 14, Issue: 1, Article Number: 21288
Publisher Nature Publishing Group
Publishing Place London
Reviewing status Peer reviewed
Institute(s) Helmholtz AI - KIT (HAI - KIT)
Grants QMUL Research England impact fund
Record entry date 2024-10-21