
KL loss explodes during base-model training after adding Bert #145

Open
ericwudayi opened this issue Nov 2, 2023 · 2 comments

Comments


ericwudayi commented Nov 2, 2023

Thanks for providing such a great repo!

I'm training a base model, using the maintainer's pretrained prosody.pt to extract character embeddings, then aligning the extracted features with the phonemes produced by chinese_ipa. The resulting features have a per-frame norm of roughly 10.

Have you seen the KL loss explode after adding the BERT feature? I've tried many kinds of normalization on the BERT feature, but none of them solved the problem. Have you run into this before?

def forward(self, x, x_lengths, bert, bert_lengths):
    # Project the BERT feature to hidden_channels, squashing with tanh: [b, h, t]
    bert_emb = self.tanh(self.bert_proj(self.tanh(bert)).transpose(1, 2))
    # Earlier attempts at taming the feature scale (kept for reference):
    # bert_emb = bert_emb / (torch.norm(bert_emb, dim=-1).unsqueeze(-1) + 1e-5) * 0.0
    # bert_emb = self.bert_proj(bert).transpose(1, 2)  # [b, t, h]
    # Scale the phoneme embedding by sqrt(hidden_channels), then add the BERT feature
    x = self.emb(x) * math.sqrt(self.hidden_channels) + bert_emb  # [b, t, h]
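Since the reported per-frame norm (~10) is large relative to a sqrt-scaled phoneme embedding, one normalization worth trying is to force each time step of the BERT feature to a fixed L2 norm before adding it. The sketch below is only an illustration of that idea; `scale_bert` and `target_norm` are hypothetical names, not part of the repo.

```python
import torch

def scale_bert(bert_emb: torch.Tensor, target_norm: float = 1.0) -> torch.Tensor:
    # bert_emb: [batch, time, hidden]. Normalize each time step to unit L2 norm,
    # then rescale to target_norm so it cannot dominate the text embedding.
    norm = bert_emb.norm(dim=-1, keepdim=True).clamp_min(1e-5)
    return bert_emb / norm * target_norm

x = torch.randn(2, 7, 192) * 10.0          # per-frame norm around 10, as reported
y = scale_bert(x, target_norm=1.0)         # each frame now has norm target_norm
```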
MaxMax2016 (Collaborator) commented

I haven't studied IPA. Do you have an IPA scheme you'd recommend? I'll look into replacing pinyin with IPA.

ericwudayi (Author) commented

I implemented it with the chinese-to-ipa conversion from vits-fast-finetuning. The fiddly part is aligning the IPA with the characters. But I later found the problem had nothing to do with IPA: the issue was that I hadn't added bert.zero_... in data_util.py.
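The fix above points at batch padding: if the padded BERT tensor is allocated without zeroing, the padded frames carry uninitialized values that leak into the prior and can blow up the KL term. A minimal sketch of zero-initialized padding in a collate function, assuming BERT features shaped [hidden, time] (the function name and shapes are illustrative, not the repo's exact code):

```python
import torch

def collate_bert(bert_list):
    # bert_list: per-utterance BERT features, each [hidden, time_i]
    hidden = bert_list[0].size(0)
    max_len = max(b.size(1) for b in bert_list)
    # torch.zeros (rather than e.g. torch.empty / FloatTensor without fill)
    # guarantees padded frames contribute nothing to the loss.
    padded = torch.zeros(len(bert_list), hidden, max_len)
    for i, b in enumerate(bert_list):
        padded[i, :, : b.size(1)] = b
    return padded

batch = collate_bert([torch.randn(1024, 5), torch.randn(1024, 3)])
# batch.shape == (2, 1024, 5); frames 3: of the second item are all zero
```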
