
about Mel_Post_Net #44

Open
liuxubo717 opened this issue Aug 6, 2021 · 1 comment

Comments

@liuxubo717

Many thanks for your great work.

I have a question about the Post-Net after the Mel Linear layer. In the inference stage, you use mel_pred as the output rather than postnet_pred, as shown below (in synthesis.py):

with t.no_grad():
    for i in pbar:
        pos_mel = t.arange(1, mel_input.size(1) + 1).unsqueeze(0).cuda()
        # mel_pred (not postnet_pred) is taken as the output here
        mel_pred, postnet_pred, attn, stop_token, _, attn_dec = m.forward(text, mel_input, pos_text, pos_mel)
        # ...and mel_pred is also what gets fed back as the next decoder input
        mel_input = t.cat([mel_input, mel_pred[:, -1:, :]], dim=1)

Also, when I run the code, I find that post_mel_loss is always larger than mel_loss, which suggests the Post-Net module doesn't work as expected, right? This seems to conflict with the Post-Net used in Tacotron and in the original TTS-Transformer paper. I am a bit confused; can you explain it to me? Many thanks!
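
For reference, my understanding of the Tacotron-2-style Post-Net is that it predicts a residual that is added back to the coarse mel prediction, and both outputs are trained against the target. Here is only a minimal PyTorch sketch under that assumption (illustrative names and sizes, not this repo's code):

import torch
import torch.nn as nn

# Minimal sketch of a Tacotron-2-style Post-Net (illustrative only,
# not this repo's module): a stack of Conv1d + BatchNorm layers that
# predicts a residual, which is added back to the coarse mel output.
class ResidualPostNet(nn.Module):
    def __init__(self, n_mels=80, hidden=512, kernel=5, n_layers=5):
        super().__init__()
        channels = [n_mels] + [hidden] * (n_layers - 1) + [n_mels]
        layers = []
        for i in range(n_layers):
            layers += [nn.Conv1d(channels[i], channels[i + 1], kernel,
                                 padding=kernel // 2),
                       nn.BatchNorm1d(channels[i + 1])]
            if i < n_layers - 1:
                layers.append(nn.Tanh())
        self.net = nn.Sequential(*layers)

    def forward(self, mel_pred):
        # mel_pred: (batch, time, n_mels); Conv1d expects (batch, n_mels, time)
        residual = self.net(mel_pred.transpose(1, 2)).transpose(1, 2)
        return mel_pred + residual

# Training then penalizes both the coarse and the refined output, e.g.:
#   mel_loss      = F.mse_loss(mel_pred, mel_target)
#   post_mel_loss = F.mse_loss(postnet(mel_pred), mel_target)
# so I would expect post_mel_loss to end up *below* mel_loss.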

@YanyuanAI

Thanks for the great work.
I have the same question as @liuxubo717. Comparing with Tacotron 2, I find that its Post-Net is a stack of torch.nn.Conv1d layers, so I don't understand the choice of a Bi-GRU in your Post-Net. Does it work better than the Post-Net of Tacotron 2? Can you explain? Thanks.
