Phoneme-level pronunciation control #94

danablend · 2024-03-14T15:32:02Z

Hey! I understand that the text tokens are currently encoded at the character level and that the model is trained on these tokens.

What would be the process for getting phoneme-level control over the output audio, to correct pronunciations of exotic words or to handle different accents at runtime? One could maybe fine-tune models for this, but having phoneme-level control on the input side would be great.

This would be an amazing add. Would be happy to contribute.

vatsalaggarwal · 2024-03-14T18:07:23Z

Where have you had these kinds of issues? Are you able to share examples?

StephennFernandes · 2024-04-28T21:33:19Z

@danablend

I get what you are trying to express. You should try Dict-TTS.

It uses a dictionary of pronunciations for exotic words, with a couple of sentences as context. It's a context-aware pronunciation model for exotic words.
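As a rough illustration of the dictionary idea at the input side: since the model takes character-level text, one simple runtime workaround is to respell known problem words before synthesis. This is not Dict-TTS (it is not context-aware) and not an API of this repository; the lexicon entries and the `apply_lexicon` helper below are hypothetical, and the respellings are only guesses at what a character-level model might read well.

```python
import re

# Hypothetical pronunciation lexicon: lowercase word -> plain-text respelling.
# Entries are illustrative only; tune them by listening to the output.
LEXICON = {
    "quinoa": "keen-wah",
    "gnocchi": "nyoh-kee",
}

# Matches runs of letters and apostrophes, so punctuation is left untouched.
_WORD = re.compile(r"[A-Za-z']+")


def apply_lexicon(text, lexicon=LEXICON):
    """Replace every word found in the lexicon (case-insensitive) with its respelling."""

    def swap(match):
        word = match.group(0)
        # Fall back to the original word when there is no lexicon entry.
        return lexicon.get(word.lower(), word)

    return _WORD.sub(swap, text)


if __name__ == "__main__":
    # The respelled string would then be passed to the TTS model as usual.
    print(apply_lexicon("I cooked quinoa and gnocchi."))
```

The respelling is applied per word with no sentence context, so it cannot disambiguate words whose pronunciation depends on meaning; that is the gap a context-aware approach is meant to cover.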

lucapericlp added the "feature request" label (New feature or request) on May 14, 2024
