An open-source Khmer word-to-speech model. It synthesizes single words only, not sentences.
pip install -r requirements.txt
wget https://huggingface.co/spaces/seanghay/KLEA/resolve/main/G_60000.pth
Place the checkpoint in the current directory.
python infer.py "មនុស្សខ្មែរ"
This writes a file named audio.wav to the current directory. The output audio has a sample rate of 22.05 kHz.
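
Since each call handles one word and always writes to audio.wav, synthesizing several words means running the script once per word and moving the result aside before the next run. Below is a minimal sketch of that loop, not part of this repository. It assumes only the behavior described above (`infer.py` takes the word as its single argument and writes audio.wav to the working directory). The helper name `synthesize_words`, its parameters, and the numbered output file names are hypothetical.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def synthesize_words(words, out_dir, cmd=(sys.executable, "infer.py"), cwd="."):
    """Run the inference command once per word and collect the outputs.

    Hypothetical helper: assumes `cmd WORD` writes `audio.wav` into `cwd`,
    as described for infer.py above.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for index, word in enumerate(words):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run([*cmd, word], cwd=cwd, check=True)
        # Move audio.wav away so the next run does not overwrite it.
        target = out_dir / f"{index:03d}.wav"
        shutil.move(str(Path(cwd) / "audio.wav"), str(target))
        outputs.append(target)
    return outputs
```

For example, `synthesize_words(["មនុស្ស", "ខ្មែរ"], "out")` would be expected to leave out/000.wav and out/001.wav. The model checkpoint is reloaded on every call, so this is slow for long word lists.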
python app.py
![image](https://private-user-images.githubusercontent.com/15277233/272982180-ac0da746-2a6c-439f-85da-a70e35efb85f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTg1MjUzOTcsIm5iZiI6MTcxODUyNTA5NywicGF0aCI6Ii8xNTI3NzIzMy8yNzI5ODIxODAtYWMwZGE3NDYtMmE2Yy00MzlmLTg1ZGEtYTcwZTM1ZWZiODVmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjE2VDA4MDQ1N1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWMwNmYxZGFkYjI5YzA4MmQ3OWE1M2JhMDQzODkwMmU3OGY3ODczZDU3M2U2ZWMyZmRkYTg2ZjQ2ZjJjYjUwODUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.tc-QGEBml4eZT4oxMx9gLbrApDhx87P_2ndIAGeZjSA)
This model was trained on the kheng.info dataset, which is available at http://kheng.info or at https://hf.co/datasets/seanghay/khmer_kheng_info_speech
- VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
- kheng.info is an online audio dictionary for the Khmer language with over 3000 recordings. Kheng.info is backed by multiple dictionaries and a large text corpus, and supports search in English and Khmer with search results ordered by word frequency.