Beginner question: how was prosody_model.pt trained? #107

Open
ShangkunTu opened this issue Jul 12, 2023 · 4 comments

Comments

@ShangkunTu

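Hello, where is the prosody_model.pt used at synthesis time generated? Can the code you provide produce this prosody model file? I looked for it but could not find it. Or is this model file not generated by this code at all and simply used as-is?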

@ShangkunTu
Author

Reference For TTS Microsoft's NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

https://github.com/Executedone/Chinese-FastSpeech2 bert prosody

https://github.com/wenet-e2e/WeTextProcessing

https://github.com/jaywalnut310/vits

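Hello, I have now trained your project to 310k steps. Compared with 180k steps, the pronunciation is clearer, and the emotion is also somewhat more uplifted, so the speech sounds livelier. I can understand the clearer pronunciation, but is the more uplifted, brighter emotional expression attributable to the prosody model from the Chinese FastSpeech2 project mentioned above? Looking forward to your reply.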

@MaxMax2016
Collaborator

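Yes. Without the prosody model, the output has an averaged prosody, with no prosodic rises and falls.

You can try modifying inference to mask out the BERT prosody vector: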

def forward(self, x, x_lengths, bert):
    x = self.emb(x) * math.sqrt(self.hidden_channels)  # [b, t, h]
    b = self.emb_bert(bert)  # embed the BERT prosody vectors
    x = x + b  # skip this addition to mask out prosody
    x = torch.transpose(x, 1, -1)  # [b, h, t]
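
A minimal, self-contained sketch of the masking experiment suggested above. The class name `TinyTextEncoder`, the `use_prosody` flag, and all dimensions are hypothetical and chosen only for illustration. It also assumes `emb_bert` is a linear projection, which the thread does not confirm.

```python
import math

import torch
from torch import nn


class TinyTextEncoder(nn.Module):
    """Hypothetical stand-in for the encoder front end, not the project's class."""

    def __init__(self, n_vocab=100, bert_dim=256, hidden_channels=192):
        super().__init__()
        self.hidden_channels = hidden_channels
        self.emb = nn.Embedding(n_vocab, hidden_channels)
        # Assumption: BERT prosody vectors are mapped to the hidden size by a linear layer.
        self.emb_bert = nn.Linear(bert_dim, hidden_channels)

    def forward(self, x, x_lengths, bert, use_prosody=True):
        # x_lengths is unused here; it is kept only to mirror the quoted signature.
        x = self.emb(x) * math.sqrt(self.hidden_channels)  # [b, t, h]
        if use_prosody:
            x = x + self.emb_bert(bert)  # add the prosody embedding
        return torch.transpose(x, 1, -1)  # [b, h, t]


torch.manual_seed(0)
enc = TinyTextEncoder().eval()
tokens = torch.randint(0, 100, (1, 7))  # one utterance of 7 tokens
lengths = torch.tensor([7])
bert = torch.randn(1, 7, 256)  # one prosody vector per token

with torch.no_grad():
    with_prosody = enc(tokens, lengths, bert, use_prosody=True)
    without_prosody = enc(tokens, lengths, bert, use_prosody=False)

print(with_prosody.shape, without_prosody.shape)
```

Skipping the addition, rather than feeding an all-zero `bert` tensor, avoids any constant offset that a bias term in the projection would still contribute. Synthesizing the same sentence with the flag on and off is one way to hear how much of the pitch movement comes from the prosody vectors.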
