For LING 575C Speech Technology for Endangered Languages
datasets/
- br/ (Breton)
  - letter_to_sound.txt (pronunciation dictionary)
  - transcribed/ (wav and eaf files for training)
  - untranscribed/ (wav files for the development set)
- cy/ (Welsh)
  - letter_to_sound.txt (pronunciation dictionary)
  - transcribed/ (wav and eaf files for training)
  - untranscribed/ (wav files for the development set)
- de/ (German)
  - letter_to_sound.txt (pronunciation dictionary)
  - transcribed/ (wav and eaf files for training)
  - untranscribed/ (wav files for the development set)
This project includes audio and transcriptions (in ELAN .eaf format) for three languages: German, Breton, and Welsh.
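
The layout above can be walked with a short script. The following is a minimal sketch, not part of the project itself: the function names `pair_transcribed` and `list_untranscribed` are hypothetical, and it assumes that each recording in `transcribed/` shares its file stem with its transcription (for example `foo.wav` and `foo.eaf`), which the listing does not confirm.

```python
from pathlib import Path

# Language codes taken from the directory listing: Breton, Welsh, German.
LANGUAGES = ("br", "cy", "de")


def pair_transcribed(root, lang):
    """Return sorted (wav, eaf) path pairs from <root>/<lang>/transcribed/.

    Assumption: a recording and its transcription share a file stem.
    Recordings without a matching .eaf file are skipped.
    """
    folder = Path(root) / lang / "transcribed"
    pairs = []
    for wav in sorted(folder.glob("*.wav")):
        eaf = wav.with_suffix(".eaf")
        if eaf.exists():
            pairs.append((wav, eaf))
    return pairs


def list_untranscribed(root, lang):
    """Return the sorted development-set wav files for one language."""
    return sorted((Path(root) / lang / "untranscribed").glob("*.wav"))


if __name__ == "__main__":
    for lang in LANGUAGES:
        pairs = pair_transcribed("datasets", lang)
        dev = list_untranscribed("datasets", lang)
        print(f"{lang}: {len(pairs)} transcribed pairs, {len(dev)} development files")
```

Run from the directory that contains `datasets/`, this prints one summary line per language. Skipping unmatched recordings rather than raising an error is a design choice for the sketch; a stricter check may be preferable when preparing training data.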