
KL loss explodes during base-model training after adding Bert #145

Open
ericwudayi opened this issue Nov 2, 2023 · 2 comments

Comments


ericwudayi commented Nov 2, 2023

Thanks for providing such a great repo!

I'm training a base model, using the maintainer's pretrained prosody.pt to extract character embeddings, then aligning the extracted features with the phonemes produced by chinese_ipa. The resulting features have a per-frame norm of roughly 10.

Have you seen the KL loss explode after adding the BERT feature? I've tried many kinds of normalization on the BERT feature, but none of them solved the problem. Have you run into this before?

def forward(self, x, x_lengths, bert, bert_lengths):
    # Project the BERT feature to hidden_channels, squashing with tanh: [b, h, t]
    bert_emb = self.tanh(self.bert_proj(self.tanh(bert)).transpose(1, 2))
    # Earlier attempts at taming the feature scale (kept for reference):
    # bert_emb = bert_emb / (torch.norm(bert_emb, dim=-1).unsqueeze(-1) + 1e-5) * 0.0
    # bert_emb = self.bert_proj(bert).transpose(1, 2)  # [b, t, h]
    # Scale the phoneme embedding by sqrt(hidden_channels), then add the BERT feature
    x = self.emb(x) * math.sqrt(self.hidden_channels) + bert_emb  # [b, t, h]
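Since the reported per-frame norm (~10) is large relative to a sqrt-scaled phoneme embedding, one normalization worth trying is to force each time step of the BERT feature to a fixed L2 norm before adding it. The sketch below is only an illustration of that idea; `scale_bert` and `target_norm` are hypothetical names, not part of the repo.

```python
import torch

def scale_bert(bert_emb: torch.Tensor, target_norm: float = 1.0) -> torch.Tensor:
    # bert_emb: [batch, time, hidden]. Normalize each time step to unit L2 norm,
    # then rescale to target_norm so it cannot dominate the text embedding.
    norm = bert_emb.norm(dim=-1, keepdim=True).clamp_min(1e-5)
    return bert_emb / norm * target_norm

x = torch.randn(2, 7, 192) * 10.0          # per-frame norm around 10, as reported
y = scale_bert(x, target_norm=1.0)         # each frame now has norm target_norm
```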
MaxMax2016 (Collaborator) commented

I haven't studied IPA. Do you have an IPA scheme you'd recommend? I'll look into replacing pinyin with IPA.

ericwudayi (Author) commented

I implemented it with the chinese-to-ipa conversion from vits-fast-finetuning. The fiddly part is aligning the IPA with the characters. But I later found the problem had nothing to do with IPA: the issue was that I hadn't added bert.zero_... in data_util.py.
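The fix above points at batch padding: if the padded BERT tensor is allocated without zeroing, the padded frames carry uninitialized values that leak into the prior and can blow up the KL term. A minimal sketch of zero-initialized padding in a collate function, assuming BERT features shaped [hidden, time] (the function name and shapes are illustrative, not the repo's exact code):

```python
import torch

def collate_bert(bert_list):
    # bert_list: per-utterance BERT features, each [hidden, time_i]
    hidden = bert_list[0].size(0)
    max_len = max(b.size(1) for b in bert_list)
    # torch.zeros (rather than e.g. torch.empty / FloatTensor without fill)
    # guarantees padded frames contribute nothing to the loss.
    padded = torch.zeros(len(bert_list), hidden, max_len)
    for i, b in enumerate(bert_list):
        padded[i, :, : b.size(1)] = b
    return padded

batch = collate_bert([torch.randn(1024, 5), torch.randn(1024, 3)])
# batch.shape == (2, 1024, 5); frames 3: of the second item are all zero
```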
