In addition to the Common Voice dataset, we’re also building an open source speech recognition engine called Deep Speech.
Source: tatoeba (8310495)
Ranked by relevance and common usage.
We’re crowdsourcing an open-source dataset of voices.
Source: tatoeba (8310507)
Donate your voice, validate the accuracy of other people’s clips, make the dataset better for everyone.
Source: tatoeba (8310508)
To make it into the Common Voice dataset, a voice clip must be validated by two separate users.
Source: tatoeba (8310521)
Showing 4 of 10 available sentences.
Data sourced from Wiktionary, WordNet, CMU, and other open linguistic databases. Updated March 2026.