General advice on training a custom dataset w/ GlowTTS #3416

T145 · 2023-12-12T19:54:08Z

So I have a custom dataset with audio clips that range from a few seconds to a couple of minutes. Its base sample rate is 48000 Hz, which I resample to 22050 Hz using the resample script:

from TTS.bin.resample import resample_files
resample_files(sounds, 22050, file_ext='wav', n_jobs=10)
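
Since the clips range from a few seconds to a couple of minutes, it may help to check the sample rates and durations actually present in the folder after resampling. The following is only a minimal sketch using the Python standard library: the helper names `wav_info` and `summarize_wavs`, the `"sounds"` folder path, and the 1 to 20 second window are assumptions for illustration, not values taken from the tutorial or from the TTS library. The `wave` module handles plain PCM WAV files only.

```python
import wave
from pathlib import Path


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        return rate, wf.getnframes() / rate


def summarize_wavs(folder, min_sec=1.0, max_sec=20.0):
    """Collect the sample rates found and list clips outside [min_sec, max_sec].

    The default window is an illustrative assumption, not a recommended value.
    """
    rates, outliers = set(), []
    for path in sorted(Path(folder).rglob("*.wav")):
        rate, seconds = wav_info(path)
        rates.add(rate)
        if not (min_sec <= seconds <= max_sec):
            outliers.append((path.name, round(seconds, 2)))
    return rates, outliers


if __name__ == "__main__":
    # "sounds" is a hypothetical folder holding the resampled clips.
    rates, outliers = summarize_wavs("sounds")
    print("sample rates found:", rates)
    print("clips outside the duration window:", outliers)
```

If more than one sample rate shows up, some files were probably missed by the resampling step; the list of outliers shows which clips are unusually short or long.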

This is the only data preparation I perform, as I assume lower-quality audio (at a lower sample rate) is easier for the trainer to work with. To train a model I'm using the script given in the official tutorial, except that I raised the epoch count from 100 to 300.

The speech I get when running text-to-speech with the best model and a source WAV sounds fairly robotic, so what should be done to bring it closer to the original quality? Are there any recommended parameters to add or remove in the config, or is there more that should be done to the audio itself?

EDIT: The model is English-only (en), not multilingual.

ianujkr · 2024-04-08T09:26:04Z

Hello,
Did you manage to train your model with the setup you mentioned, and how fast was the training?

1 reply

T145 Apr 8, 2024
Author

As mentioned, I trained it with the parameters given in the official tutorial. Everything was the same except the epoch count.
