fuel datasets #2

Open
dribnet opened this issue Feb 22, 2017 · 14 comments

Comments

@dribnet

dribnet commented Feb 22, 2017

The referenced fuel datasets ['arctic', 'blizzard', 'dimex', 'librispeech', 'pavoque', 'vctk'] are not in the fuel distribution. Are there standard converters for any of these already in other projects?

@sotelo
Owner

sotelo commented Feb 22, 2017

Hey! Thanks for your interest.

Unfortunately, some of the datasets are not available publicly (like blizzard). For the others, we plan to release a preprocessed version so people can use them. We have a rough series of instructions to preprocess but it requires installing quite a few libraries so I'm not sure that you'd like to go that way. Let me know what you would prefer.

Right now, we are working on finishing our ICML submission. After this (probably this weekend), we will be freer to shape up the code and data. Also, we should release some pretrained models for everyone to explore.

@slbinilkumar

I want to know how to train with UTF-8 processing and how to train on the Arctic data.

@Zeta36

Zeta36 commented Feb 23, 2017

@sotelo, is this the way you preprocess the data for this project?: https://github.com/sotelo/world.py

Thank you for your work!!

@sotelo
Owner

sotelo commented Feb 23, 2017

@Zeta36 Hola! No, it's not like that. We will describe how we do it soon. With the ICML deadline coming, we are finishing the paper but should be ready to help others with replication afterwards.

@AdamMiltonBarker

Hi @sotelo, this is great. Where can I find your email? I would definitely like to keep up with your progress.

@Zeta36

Zeta36 commented Mar 5, 2017

Hello, @sotelo. Any news about your project? (I haven't seen any updates on your website for a while now.)

By the way, you said to @dribnet: "Unfortunately, some of the datasets are not available publicly (like blizzard). For the others, we plan to release a preprocessed version so people can use them."

Are you still going to release this, or at least explain how to do this preprocessing step?

Thanks a lot!!

@Zeta36

Zeta36 commented Mar 7, 2017

Hello, @sotelo.

"So, we're currently in the process of doing this. It's a bit messy because the data processing requires installing a few C libraries. Now, we're deliberating whether we should proceed with wrappers (basically updating my old world.py repo) or we just should point people to the instructions on how to do the processing themselves."

It would be wonderful to have either of the two possibilities. No hurry anyway; we will be waiting :).

Regards!!

@dp-aixball

Hello, @sotelo.
My data shapes printed by datasets.py:
features shape: (1001, 500, 67) features dtype: float64
features_mask shape: (1001, 500) features_mask dtype: float64
labels shape: (500, 1914) labels dtype: int32
labels_mask shape: (500, 1914) labels_mask dtype: float64

Is this correct? (seq_size is 1000, batch_size is 500, feature_dim is 67.)

Why is _transpose not applied to labels and labels_mask?

Regards!!
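
For readers following the shape question above, here is a minimal sketch of how a time-major layout like `(1001, 500, 67)` can arise. It assumes, without confirmation from the thread, that `_transpose` swaps the batch and time axes of `features` and `features_mask` and leaves the label arrays batch-major. The helper `to_time_major` and the toy sizes are hypothetical and are not taken from datasets.py.

```python
import numpy as np


def to_time_major(batch_major):
    """Swap the first two axes: (batch, time, ...) -> (time, batch, ...).

    Hypothetical stand-in for what a _transpose step might do; the real
    function in datasets.py may differ.
    """
    return np.swapaxes(batch_major, 0, 1)


# Toy sizes standing in for batch_size=500, 1001 frames, feature_dim=67.
batch_size, n_frames, feature_dim, label_len = 4, 6, 3, 5

features = np.zeros((batch_size, n_frames, feature_dim), dtype=np.float64)
features_mask = np.ones((batch_size, n_frames), dtype=np.float64)
labels = np.zeros((batch_size, label_len), dtype=np.int32)

features_tm = to_time_major(features)
features_mask_tm = to_time_major(features_mask)

print(features_tm.shape)       # (6, 4, 3): time, batch, feature_dim
print(features_mask_tm.shape)  # (6, 4): time, batch
print(labels.shape)            # (4, 5): batch, label_len (left untransposed)
```

Under that assumption, the printed `features` and `features_mask` shapes are time-major while `labels` and `labels_mask` stay as (batch, label length), which matches the pattern in the output above. Whether the model expects the labels in that layout is a question only the maintainers can answer.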

@dp-aixball

dp-aixball commented Mar 31, 2017

@sotelo The features consist of 60 mgc, 5 bap, 1 lf0 and 1 v/u, at 5 ms per frame. The labels are plain phone index sequences (label_type is set to unaligned_phoneme in the code). I am not using the raw_audio processing for now.
The audio files are a bit long: each contains 1 to 5 sentences. I will split them into single sentences later. Is that necessary?

Your seq_size=50 means 250 ms? That seems very short. Also, I can't find any SegmentSequence processing for unaligned_phonemes in your code. Why is that? I just want to train a mapping from unaligned_phonemes to vocoder features. What should I do?

Thanks!
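
The arithmetic behind the numbers in this comment can be checked with a short sketch. The feature counts and the 5 ms frame shift come from the comment itself; the names `FEATURE_LAYOUT`, `FRAME_SHIFT_MS` and `segment_duration_ms` are hypothetical and exist only for this illustration.

```python
# Frame shift and per-stream feature counts as stated in the comment above.
FRAME_SHIFT_MS = 5
FEATURE_LAYOUT = {"mgc": 60, "bap": 5, "lf0": 1, "vuv": 1}


def segment_duration_ms(seq_size, frame_shift_ms=FRAME_SHIFT_MS):
    """Duration in milliseconds covered by seq_size consecutive frames."""
    return seq_size * frame_shift_ms


feature_dim = sum(FEATURE_LAYOUT.values())

print(feature_dim)                # 67
print(segment_duration_ms(50))    # 250
print(segment_duration_ms(1000))  # 5000
```

So the four streams add up to the reported feature_dim of 67, a seq_size of 50 covers 250 ms, and a seq_size of 1000 covers 5 seconds.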

@dp-aixball

@sotelo

How should I understand these label types: 'full_labels', 'phonemes', 'unaligned_phonemes', 'text'?
Thanks!

@dp-aixball

Yes, I am using unaligned phonemes and have trained one model. The MSE went from 150 to 6.1, but the audio produced by the vocoder does not sound right. I'm checking...

@reuben

reuben commented Jun 21, 2017

Any updates on this? I wanted to do some experiments with VCTK, but couldn't figure out how to preprocess the data.

@caoba1

caoba1 commented Jun 23, 2017

Hey @sotelo, is there a way to use parrot for finding phoneme boundaries? I am working on concatenative synthesis and it would be a very nice feature to have.

@ystrehlow

"For the others, we plan to release a preprocessed version so people can use them. We have a rough series of instructions to preprocess but it requires installing quite a few libraries so I'm not sure that you'd like to go that way. Let me know what you would prefer."

Hey @sotelo, is there any new information regarding the preprocessed version or could you upload/point to the mentioned rough series of instructions? Would be very helpful.

Thanks in advance!
