Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals
A number of labeling systems based on text have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of prominent SDG labeling systems using a variety of text sources and show that these differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools SDG labeling systems alleviates some of these limitations, exceeding the performance of the individual SDG labeling systems considered. We conclude that researchers and policymakers should care about the choice of the SDG labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.
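The two metrics named in the abstract, and the idea of pooling several labeling systems, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `sensitivity_specificity` and `majority_vote`, the toy labels, and the use of a simple majority vote, which is only one possible way to pool systems and is not necessarily the ensemble method used in the article.

```python
from typing import Sequence


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> tuple[float, float]:
    """Return (true-positive rate, true-negative rate) for one SDG label."""
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)          # correctly labeled
    fn = sum(t and not p for t, p in pairs)      # missed labels
    tn = sum(not t and not p for t, p in pairs)  # correctly left unlabeled
    fp = sum(not t and p for t, p in pairs)      # wrongly labeled
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def majority_vote(system_labels: Sequence[Sequence[bool]]) -> list[bool]:
    """Pool several systems: label a text if more than half of them do."""
    n_systems = len(system_labels)
    return [sum(votes) * 2 > n_systems for votes in zip(*system_labels)]


# Hypothetical toy data: six texts, one SDG, three labeling systems.
truth = [True, True, False, False, True, False]
system_a = [True, False, False, True, True, False]
system_b = [True, True, True, False, False, False]
system_c = [False, True, False, False, True, False]

ensemble = majority_vote([system_a, system_b, system_c])
print(sensitivity_specificity(truth, system_a))
print(sensitivity_specificity(truth, ensemble))
```

In this toy data each system makes errors on different texts, so the pooled vote recovers labels that any single system gets wrong; real systems may share errors, in which case pooling helps less.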
Sustain Sci
Wulff, Dirk U. (author) / Meier, Dominik S. (author) / Mata, Rui (author)
Sustainability Science ; 19 ; 1773-1787
1 September 2024
15 pages
Article (Journal)
Electronic Resource
English