
This is the official repository for the training of Yodi V1, the first speech recognition system for Ewè.


umini_speech

The first version of Yodi is built on the umini_speech dataset, a mini speech dataset developed and currently being improved by the Umbaji team. It consists of more than 800 augmented recordings from various users in different settings, speaking the same words as the mini speech dataset but in Ewè.

The Yodi model was developed by applying the short-time Fourier transform (STFT) to the raw speech waveforms, yielding spectrograms that provide more meaningful and embeddable features for training. For the audio inputs, the set of Ewè words is:

$\alpha = \{\text{gome},\ \text{djo},\ \text{dusi},\ \text{mio},\ \text{edji},\ \text{note},\ \text{ao},\ \text{ee}\}$

For recognition, we consider the corresponding set of word labels (transcripts):

$\Omega = \{\text{down},\ \text{go},\ \text{left},\ \text{no},\ \text{right},\ \text{stop},\ \text{up},\ \text{yes}\}$
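As a rough illustration of the STFT step, the sketch below (assuming a TensorFlow pipeline; the actual Yodi training code may differ) converts one recording into an STFT magnitude spectrogram. The file path and the STFT frame parameters are illustrative assumptions, not values taken from the umini_speech training setup.

```python
import tensorflow as tf

def waveform_to_spectrogram(wav_path: str) -> tf.Tensor:
    # Decode a 16-bit PCM WAV file into a mono float waveform in [-1, 1].
    audio_bytes = tf.io.read_file(wav_path)
    waveform, _sample_rate = tf.audio.decode_wav(audio_bytes, desired_channels=1)
    waveform = tf.squeeze(waveform, axis=-1)

    # Short-time Fourier transform: windowed FFTs over the waveform
    # (frame_length and frame_step here are illustrative values).
    stft = tf.signal.stft(waveform, frame_length=255, frame_step=128)

    # Keep the magnitude and add a channel axis so the spectrogram can be
    # fed to a 2-D convolutional classifier over the eight labels in Omega.
    spectrogram = tf.abs(stft)
    return spectrogram[..., tf.newaxis]

# Hypothetical usage on one umini_speech recording:
# spec = waveform_to_spectrogram("recordings/ee_001.wav")
```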

Consult the Yodi package here: https://github.com/Umbaji/Yodi