Beginner question for the author: how was prosody_model.pt trained? #107

ShangkunTu · 2023-07-12T05:09:19Z

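Hello, could you tell me where the prosody_model.pt used at synthesis time is generated? Does the code you provide include anything that produces this prosody model file? I looked for it but could not find it. Or is the model file not generated by this code at all, and simply used as-is?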

MaxMax2016 · 2023-07-12T05:14:40Z

Reference For TTS
Microsoft's NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

https://github.com/Executedone/Chinese-FastSpeech2 bert prosody

https://github.com/wenet-e2e/WeTextProcessing

ShangkunTu · 2023-07-12T08:35:30Z

Reference For TTS Microsoft's NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

https://github.com/Executedone/Chinese-FastSpeech2 bert prosody

https://github.com/wenet-e2e/WeTextProcessing

https://github.com/jaywalnut310/vits

Thanks for the reply, I will look into it.

ShangkunTu · 2023-07-14T06:49:56Z

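Hello, I have trained your project to 310k steps. Compared with the 180k-step checkpoint, the pronunciation is clearer, and the emotion is also somewhat more uplifted, so the speech sounds livelier. I can understand why the pronunciation is clearer, but is the more uplifted, brighter emotional expression due to the prosody model from the Chinese FastSpeech2 project mentioned above? Looking forward to your reply.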

MaxMax2016 · 2023-07-14T06:57:27Z

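Yes. Without the prosody model, the output settles into an averaged prosody, with no rise and fall.

You can try modifying inference to disable the BERT prosody vector: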

    def forward(self, x, x_lengths, bert):
        x = self.emb(x) * math.sqrt(self.hidden_channels)  # [b, t, h]
        b = self.emb_bert(bert)  # BERT prosody embedding
        # x = x + b  # struck out: skip adding the prosody vector
        x = torch.transpose(x, 1, -1)  # [b, h, t]
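
For anyone who wants to run this comparison without editing the source each time, here is a minimal, self-contained sketch of the same ablation. It is not the project's actual encoder: the class name `TextEncoderSketch`, the `use_prosody` switch, the choice of `nn.Linear` for `emb_bert`, and all dimensions (`n_vocab=50`, `bert_dim=256`, `hidden_channels=192`) are assumptions made for illustration only.

```python
import math

import torch
from torch import nn


class TextEncoderSketch(nn.Module):
    """Toy stand-in for the encoder front end (hypothetical, not the project's class)."""

    def __init__(self, n_vocab=50, bert_dim=256, hidden_channels=192, use_prosody=True):
        super().__init__()
        self.hidden_channels = hidden_channels
        self.use_prosody = use_prosody  # hypothetical switch, not in the original code
        self.emb = nn.Embedding(n_vocab, hidden_channels)
        # Assumption: the BERT prosody features are projected with a linear layer.
        self.emb_bert = nn.Linear(bert_dim, hidden_channels)

    def forward(self, x, bert):
        x = self.emb(x) * math.sqrt(self.hidden_channels)  # [b, t, h]
        if self.use_prosody:
            x = x + self.emb_bert(bert)  # add the per-token prosody vector
        return torch.transpose(x, 1, -1)  # [b, h, t]


torch.manual_seed(0)
tokens = torch.randint(0, 50, (1, 7))  # [b, t] dummy phoneme ids
bert = torch.randn(1, 7, 256)          # [b, t, bert_dim] dummy prosody features

enc = TextEncoderSketch()
enc.eval()
with torch.no_grad():
    with_prosody = enc(tokens, bert)
    enc.use_prosody = False            # same effect as removing `x = x + b`
    without_prosody = enc(tokens, bert)
```

With the switch off, the encoder output depends only on the token ids, which matches the "averaged prosody" behaviour described above: nothing in the input distinguishes one rendering of a sentence from another.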
