Prediction of 35 Target Per- and Polyfluoroalkyl Substances (PFASs) in California Groundwater Using Multilabel Semisupervised Machine Learning
Comprehensive monitoring of perfluoroalkyl and polyfluoroalkyl substances (PFASs) is challenging because of the high analytical cost and the increasing number of analytes. We developed a machine learning pipeline to understand the environmental features that influence PFAS profiles in groundwater. By examining 23 public data sets (2016–2022) in California, we built a state-wide groundwater database (25,000 observations across 4200 wells) encompassing contamination sources, weather, air quality, soil, hydrology, and groundwater quality (PFASs and cocontaminants). We used supervised learning to prescreen total PFAS concentrations above 70 ng/L and multilabel semisupervised learning to predict 35 individual PFAS concentrations above 2 ng/L. Random forest with ADASYN oversampling performed best for total PFASs (AUROC 99%). XGBoost with SMOTE oversampling achieved an AUROC of 73–100% for individual PFAS prediction. Contamination sources and soil variables contributed the most to accuracy. Individual PFASs were strongly correlated within each PFAS subfamily (i.e., short- vs long-chain PFCAs, sulfonamides). Classifier chains, which predict each PFAS based on previously predicted species, exploited these associations to improve prediction performance. We applied the model to reconstruct PFAS profiles in groundwater wells with missing data in previous years. Our approach can complement monitoring programs of environmental agencies to validate previous investigation results and prioritize sites for future PFAS sampling.
We leverage chemical correlations of individual PFASs, pollution sources, and soil parameters to achieve higher accuracy of machine learning prediction of groundwater occurrence for total PFASs and 35 individual PFASs in California.
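The classifier-chain idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the study reports XGBoost with SMOTE oversampling, whereas the base learner here is a hypothetical single-threshold decision stump, and the feature values and labels are made-up toy numbers chosen only to show the mechanism. Each link in the chain sees the original features plus the labels predicted by the earlier links.

```python
from typing import List, Tuple

Row = List[float]
Stump = Tuple[int, float]  # (feature index, threshold)


def fit_stump(X: List[Row], y: List[int]) -> Stump:
    """Pick the single (feature, threshold) split with the fewest training errors.

    The stump predicts 1 when the feature value is strictly above the threshold.
    """
    best: Stump = (0, 0.0)
    best_err = len(y) + 1
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            err = sum(int(row[j] > t) != label for row, label in zip(X, y))
            if err < best_err:
                best_err, best = err, (j, t)
    return best


def predict_stump(stump: Stump, row: Row) -> int:
    j, t = stump
    return int(row[j] > t)


def fit_chain(X: List[Row], Y: List[List[int]]) -> List[Stump]:
    """Train one stump per label; stump k also sees the true labels 0..k-1."""
    stumps = []
    for k in range(len(Y[0])):
        X_aug = [row + [float(v) for v in labels[:k]] for row, labels in zip(X, Y)]
        stumps.append(fit_stump(X_aug, [labels[k] for labels in Y]))
    return stumps


def predict_chain(stumps: List[Stump], row: Row) -> List[int]:
    """Predict labels in chain order, feeding earlier predictions forward."""
    preds: List[int] = []
    for stump in stumps:
        preds.append(predict_stump(stump, row + [float(p) for p in preds]))
    return preds


# Toy data (hypothetical): two features per well and two binary labels,
# e.g. "compound A above 2 ng/L" and "compound B above 2 ng/L".
# Label B always co-occurs with label A, mimicking a within-subfamily correlation.
X = [[0.9, 0.2], [0.8, 0.7], [0.4, 0.4], [0.5, 0.6], [0.1, 0.3], [0.3, 0.9]]
Y = [[1, 1], [1, 1], [1, 1], [0, 0], [0, 0], [0, 0]]

stumps = fit_chain(X, Y)
print(stumps)
print(predict_chain(stumps, [0.85, 0.5]))
print(predict_chain(stumps, [0.15, 0.5]))
```

In this toy setup no single threshold on the raw features separates the second label perfectly, so the second stump splits on the appended first label instead. That is the sense in which a chain can exploit correlations between co-occurring compounds.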
Dong, Jialin (Author) / Tsai, Gabriel (Author) / Olivares, Christopher I. (Author)
ACS ES&T Water ; 4 ; 969-981
8 March 2024
Article (Journal)
Electronic Resource
English
Perfluoroalkyl and polyfluoroalkyl substances (PFASs) in mineral water and tap water
Taylor & Francis Verlag | 2013