pretrain loss #56

Open
MarsMeng1994 opened this issue Jul 7, 2023 · 4 comments

@MarsMeng1994

Excuse me, what value should my pre-training loss reach before I can start fine-tuning TTS?
[image]
I found that my fine-tuned TTS model can generate a mel-spectrogram, but it differs greatly from the original mel-spectrogram.
[image]
Is this because the BART loss is too high?

@mechanicalsea
Contributor

As mentioned in the SpeechT5 paper: "We pre-train the proposed SpeechT5 model on 32 V100 GPUs with a batch size of around 90s samples per GPU for speech and 12k tokens per GPU for text and set the update frequency to 2 for 500k steps."
Thus, keep pre-training.
For TTS fine-tuning, the model pre-trained without $\mathcal{L}_{mlm}^s$ is more suitable, because, as mentioned in the paper, "The proposed SpeechT5 trained without $\mathcal{L}_{mlm}^s$ is considered because the bidirectional masked prediction loss is proposed to help the encoder learn to encode the speech signal, and this variant achieves superior Naturalness, as shown in Table 13 (in Appendix D)."
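
For reference, here is a rough back-of-the-envelope view of the pre-training scale implied by that quote. The numbers (32 GPUs, ~90 s of speech per GPU, 12k text tokens per GPU, update frequency 2, 500k updates) come from the sentence above; the rest is just arithmetic, not configuration read from this repo:

```python
# Rough scale of SpeechT5 pre-training as described in the paper quote above.
# Illustrative numbers only, not values taken from this repo's configs.

num_gpus = 32              # V100 GPUs
speech_per_gpu_s = 90      # ~90 seconds of speech per GPU per batch
text_tokens_per_gpu = 12_000
update_freq = 2            # gradient accumulation factor
total_updates = 500_000

# Effective amount of data consumed per optimizer update
speech_per_update_s = num_gpus * speech_per_gpu_s * update_freq   # 5760 s ≈ 1.6 h of audio
text_per_update = num_gpus * text_tokens_per_gpu * update_freq    # 768k text tokens

print(f"speech per update: {speech_per_update_s / 3600:.1f} h")
print(f"text tokens per update: {text_per_update:,}")
print(f"total updates: {total_updates:,}")
```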

@MarsMeng1994
Author

Thanks for the reply.
Does num_updates in the log mean steps? If so, it takes about 2 hours for every 100 steps in the picture, so pre-training would take about 10,000 hours?
Can I use an English pre-trained model to fine-tune a model for another language? Would that work?

@mechanicalsea
Contributor

10,000 hours seems too long. Actually, pre-training on 32 V100 GPUs took around one week, so pre-training with multiple GPUs is recommended.
Fine-tuning on other languages is possible by replacing the English vocabulary with the vocabulary of the fine-tuning language, but this causes a language mismatch between pre-training and fine-tuning, which may reduce the benefit of the pre-training.
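
To put the two time estimates side by side, here is the arithmetic behind them. The 2 h per 100 steps rate is taken from the log screenshot above and the one-week figure from this comment, so this is only an illustration of the scaling, not a measured benchmark:

```python
# Time estimates discussed above (illustrative arithmetic only).

total_updates = 500_000

# Observed rate from the log screenshot: ~100 updates every 2 hours.
hours_per_100_updates = 2.0
observed_hours = total_updates / 100 * hours_per_100_updates
print(f"at the observed rate: {observed_hours:,.0f} h "
      f"(~{observed_hours / 24:.0f} days)")        # ~10,000 h

# Reported wall-clock for the paper's setup: 32 V100 GPUs, about one week.
paper_days = 7
print(f"paper setup (32 V100s): ~{paper_days} days")
```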

@MarsMeng1994
Author

MarsMeng1994 commented Jul 12, 2023

Thanks for the reply.
I will try to use more GPUs. There is another question: during pre-training, num_workers is 0. Why not set it to a higher number, as in TTS fine-tuning?
[image]
Can I set it to a higher number to accelerate pre-training?

When I set num_workers=1, there is an error like:
RuntimeError: unable to mmap 408 bytes from file </torch_2632095_3802486040_258611>: Cannot allocate
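
The mmap failure above is a shared-memory error that commonly appears when PyTorch DataLoader worker processes (num_workers > 0) exchange tensors through /dev/shm, e.g. when shared memory is small inside a container. A workaround that is often suggested for this error, though not confirmed in this thread, is to switch PyTorch's tensor sharing strategy to the file system (or to enlarge /dev/shm / raise system limits); a minimal sketch:

```python
import torch.multiprocessing as mp

# Workaround often suggested for "unable to mmap ... Cannot allocate" errors
# raised when DataLoader workers share tensors via shared memory: share
# tensors through the file system instead. This is a general PyTorch setting,
# not a change taken from the SpeechT5/fairseq code in this repo.
mp.set_sharing_strategy("file_system")
```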
