Echo: A crowd-sourced Romanian speech dataset.
Romanian is the seventh most popular European language, with around 30 million speakers worldwide. Despite its popularity, the available speech resources are limited. As a result, few models transcribe Romanian well, and most of those are multilingual models that also cover less popular languages. Echo is a crowd-sourcing platform that has collected more than 300 hours of speech from various contributors. In this study, we document how a large speech dataset enables researchers to train automatic speech recognition, speaker verification, and diarization models to automatically process students’ notes. We release both the dataset and the Whisper-based baseline model as open source.
Remus-Dan Ungureanu (author) / Mihai Dascalu (author)
2024
Article (Journal)
Electronic Resource
Metadata by DOAJ is licensed under CC BY-SA 1.0