Machine Learning Approaches for Predicting Health Risk of Cyanobacterial Blooms in Northern European Lakes
Cyanobacterial blooms are considered a major threat to global water security, with documented impacts on lake ecosystems and public health. Because cyanobacteria possess highly adaptive traits that allow them to prevail under different and often complex stressor regimes, predicting their abundance is challenging. A dataset from 822 Northern European lakes is used to determine which variables best explain the variation in cyanobacteria biomass (CBB) by means of stepwise multiple linear regression. Chlorophyll-a (Chl-a) and total nitrogen (TN) provided the best model structure for the entire dataset, while for the subsets of shallow and deep lakes, Chl-a, mean depth, TN and TN/TP explained part of the variance in CBB. Path analysis corroborated these findings. Finally, CBB was converted to a categorical variable according to the levels of human health risk associated with recreational use of lakes. Several machine learning methods, namely Decision Tree, K-Nearest Neighbors, Support Vector Machine and Random Forest, were applied and showed a strong ability to predict the risk. Random Forest, with tuned and optimized parameters, achieved 95.81% accuracy, exceeding the performance of all other machine learning methods tested. A confusion matrix analysis was performed for all machine learning methods to identify how well each method predicts CBB risk levels and to assess the extent of false alarms; Random Forest clearly outperformed the other methods.
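The last two steps of the abstract (turning CBB into risk categories and evaluating predictions with a confusion matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold values, the three level names, the function names and the example data are all hypothetical placeholders chosen for illustration, and no classifier is trained here. The sketch only shows how a continuous biomass value might be binned and how accuracy and false alarms can be read from a confusion matrix.

```python
from bisect import bisect_right
from collections import Counter

# ASSUMPTION: hypothetical risk levels and CBB thresholds (mg/L).
# These are placeholders, not the values used in the paper.
RISK_LEVELS = ["low", "moderate", "high"]
THRESHOLDS = [2.0, 10.0]  # cbb < 2.0 -> low, 2.0 <= cbb < 10.0 -> moderate, else high


def risk_level(cbb):
    """Map a continuous CBB value to a categorical risk level."""
    return RISK_LEVELS[bisect_right(THRESHOLDS, cbb)]


def confusion_matrix(actual, predicted, labels=RISK_LEVELS):
    """Rows are actual levels, columns are predicted levels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of samples on the diagonal (correctly predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def false_alarms(matrix, labels=RISK_LEVELS, level="high"):
    """Samples predicted as `level` whose actual level was different."""
    j = labels.index(level)
    return sum(matrix[i][j] for i in range(len(matrix)) if i != j)


# Made-up example data: observed biomass and some model's predicted levels.
observed_cbb = [0.5, 1.2, 3.0, 8.0, 12.0, 25.0]
actual = [risk_level(v) for v in observed_cbb]
predicted = ["low", "moderate", "moderate", "high", "high", "high"]

matrix = confusion_matrix(actual, predicted)
print(matrix)
print(accuracy(matrix), false_alarms(matrix))
```

In this layout an off-diagonal entry in the "high" column counts as a false alarm, while an off-diagonal entry in the "high" row is a missed high-risk case; the two error types have different consequences for recreational lake use, which is why the abstract reports a confusion matrix in addition to overall accuracy.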
Nikolaos Mellios (author) / S. Jannicke Moe (author) / Chrysi Laspidou (author)
2020
Article (journal)
Electronic resource
Unknown
Metadata by DOAJ is licensed under CC BY-SA 1.0
Hepatotoxic cyanobacterial blooms in the lakes of northern Poland
Online Contents | 2005
Cyanobacterial Blooms Enhance Nitrogen Removal in Lakes through Carbon/Nitrogen Coupling Metabolism
American Chemical Society | 2023
An integrated method for removal of harmful cyanobacterial blooms in eutrophic lakes
Online Contents | 2012