Feature request - add multiple speakers to repository #83

BBC-Esq · 2024-02-05T19:47:41Z

The following website has a bunch of voices for Bark:

https://rsxdalv.github.io/bark-speaker-directory/

I was wondering if anyone had an interest in doing something similar for WhisperSpeech? Currently, to use anything except the default voice, one has to obtain an audio file and pass it in through a parameter in custom code so that the speaker embedding can be extracted from it. Only then is the voice used.

The pipeline.py script currently hardcodes the default voice here:

Perhaps we can obtain multiple tensors of high-quality voices (male, female, etc.) and offer them as options for people? I'm willing to contribute, but I still haven't been able to accurately extract speaker embeddings and get the tensors. I spent about 3 hours trying different ways.

Let's say we get a dozen high-quality voices (i.e. tensors). We could include them in a configuration file or constants.py and let people choose among them, without removing the ability to create your own, of course!

People could even post their voices in the tensor format in the "examples" folder, just brainstorming.


jpc · 2024-02-13T10:48:00Z

This is actually quite easy to add: one needs to run a voice sample through the speechbrain model (example code is in pipeline.py) and save the resulting embedding vector to a file.

If we want to add some voices by default, we could probably save all the vectors to huggingface in a single pth file (instead of pasting them into the source code). The tricky part is finding reference voices that are properly licensed. Maybe use a few samples from LibriTTS-R?
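
For anyone who wants to try this, here is a rough sketch of the two steps: run a sample through a speechbrain speaker encoder, then store several named vectors in one pth file. It is not the code from pipeline.py. The checkpoint name `speechbrain/spkrec-ecapa-voxceleb`, the 30-second cut-off, and the helper names `load_encoder`, `extract_speaker_embedding`, `save_voice_bank` and `load_voice` are assumptions made for illustration.

```python
from pathlib import Path


def load_encoder():
    """Load a speechbrain speaker encoder (the checkpoint name is an assumption)."""
    # Heavy dependencies are imported lazily so importing this module stays cheap.
    try:
        from speechbrain.inference.speaker import EncoderClassifier  # speechbrain >= 1.0
    except ImportError:
        from speechbrain.pretrained import EncoderClassifier  # older releases
    return EncoderClassifier.from_hparams(
        source="speechbrain/spkrec-ecapa-voxceleb",
        savedir=str(Path.home() / ".cache" / "speechbrain"),
    )


def extract_speaker_embedding(encoder, audio_path, seconds=30):
    """Return a 1-D speaker embedding for the first `seconds` of an audio file."""
    import torchaudio

    samples, sample_rate = torchaudio.load(audio_path)  # shape: (channels, time)
    samples = samples[:, : sample_rate * seconds]       # keep the first `seconds`
    # Use the first channel; the normalizer resamples it to the encoder's rate.
    mono = encoder.audio_normalizer(samples[0], sample_rate)
    embedding = encoder.encode_batch(mono.unsqueeze(0))  # shape: (1, 1, dim)
    return embedding[0, 0].cpu()


def save_voice_bank(embeddings, out_path):
    """Write a {voice_name: tensor} dict to a single .pth file."""
    import torch

    torch.save(dict(embeddings), out_path)


def load_voice(bank_path, name):
    """Read one named embedding back from the .pth file."""
    import torch

    bank = torch.load(bank_path, map_location="cpu")
    if name not in bank:
        raise KeyError(f"unknown voice {name!r}; available: {sorted(bank)}")
    return bank[name]
```

Typical use would be to call `load_encoder()` once, build a dict such as `{"voice_a": extract_speaker_embedding(encoder, "voice_a.wav")}` (the file name is a placeholder), and pass it to `save_voice_bank`. How the loaded tensor is then handed to the pipeline depends on the WhisperSpeech version, so that step is left out here.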

BBC-Esq · 2024-02-13T12:15:01Z

Yep, that was my only concern, the licensing issue. One idea would be to use a file named constants.py and just keep adding voices that we've verified as high quality and free of licensing issues?
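
To make the constants.py idea a bit more concrete, a registry along these lines could record where each voice came from and under which license, while still letting people point at their own audio. This is only a sketch: the `VoiceInfo` class, the `resolve_speaker` helper, and both entries are hypothetical placeholders, not voices that exist in the repository.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VoiceInfo:
    description: str
    source: str   # where the reference recording came from
    license: str  # license of that recording, checked before the voice is added


# Placeholder entries for illustration only; real entries would be added
# after the quality and licensing checks discussed above.
VERIFIED_VOICES = {
    "example_female_1": VoiceInfo("placeholder female voice", "LibriTTS-R", "CC BY 4.0"),
    "example_male_1": VoiceInfo("placeholder male voice", "LibriTTS-R", "CC BY 4.0"),
}
DEFAULT_VOICE = "example_female_1"


def resolve_speaker(choice: Optional[str] = None) -> Tuple[str, str]:
    """Map a user choice to ("builtin", voice_name) or ("custom", audio_path)."""
    if choice is None:
        return ("builtin", DEFAULT_VOICE)
    if choice in VERIFIED_VOICES:
        return ("builtin", choice)
    # Anything else is treated as a path to the user's own reference audio.
    return ("custom", choice)
```

With this layout, `resolve_speaker("example_male_1")` selects a bundled voice, while `resolve_speaker("my_voice.wav")` falls through to the existing create-your-own path, so that ability is not removed.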
