[BUG]: NaturalSpeech2 training issue #115
Labels
bug
Comments
I'm experiencing the same issue that @KevinLee1993 described. Any information on a solution would be helpful.
Hi, you can use any G2P module to get the phone sequence, and "num_frames" is the number of frames of the mel spectrogram. (For example, if the hop size is 200, a 1 s, 16 kHz audio clip has 16000 / 200 = 80 frames.)
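To make the arithmetic concrete, here is a minimal sketch of how the two missing fields could be filled in. It is not Amphion's own preprocessing code. The helper names, the `Uid` key, and the list-of-strings format for the phones are assumptions made for illustration, so check what `ns2_dataset.py` actually expects. The frame count uses the simple `samples // hop_size` convention implied by the example above; a feature extractor that pads the signal may produce one extra frame.

```python
def num_mel_frames(num_samples: int, hop_size: int) -> int:
    """Frame count under the simple convention frames = samples // hop_size."""
    if hop_size <= 0:
        raise ValueError("hop_size must be positive")
    return num_samples // hop_size


def add_ns2_fields(entry: dict, phones: list, num_samples: int, hop_size: int) -> dict:
    """Return a copy of a metadata entry with "phones" and "num_frames" added."""
    updated = dict(entry)
    updated["phones"] = list(phones)
    updated["num_frames"] = num_mel_frames(num_samples, hop_size)
    return updated


# 1 s of 16 kHz audio (16000 samples) with hop size 200.
entry = add_ns2_fields(
    {"Uid": "utt_0001"},                # hypothetical metadata entry
    phones=["HH", "AH0", "L", "OW1"],   # stand-in for the output of your G2P module
    num_samples=16000,
    hop_size=200,
)
print(entry["num_frames"])  # 80
```

In practice, `num_samples` would come from the loaded waveform and `phones` from whichever G2P module you choose.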
Describe the bug
Thank you so much for sharing this wonderful project. However, I have a problem with the NaturalSpeech2 (ns2) TTS training.
./egs/tts/NaturalSpeech2/README.md suggests following the other Amphion TTS recipes for data processing. But after extracting the features needed by ns2 with the fs2 and valle data preprocessing scripts, I cannot run the ns2 training script successfully. In ./models/tts/naturalspeech2/ns2_dataset.py, some of the features seem to be obtained from the "phones" and "num_frames" fields of the metadata, which are NOT included in the train.txt file.
Is there anything else I can do to run ns2 training successfully? Or should I just wait for the official update of the ns2 preprocessing, as I have seen mentioned in another issue?
Can any of the authors tell me when the preprocessing script will be ready? Looking forward to your reply.