[CONTRIBUTION] Speech Dataset Generator for Metavoice #112

Open
davidmartinrius opened this issue Mar 26, 2024 · 5 comments

@davidmartinrius

davidmartinrius commented Mar 26, 2024

Hi everyone!

I have just published this project on GitHub: https://github.com/davidmartinrius/speech-dataset-generator/

Now you can create datasets automatically from any audio file or list of audio files.

This project creates Metavoice datasets. You can pass your own files, YouTube links, TED talks, or LibriVox audiobooks as input, and it will build a dataset from them.

I hope you can find it useful.

Here are the key functionalities of the project:

  1. Dataset Generation: The project allows for the creation of datasets with Mean Opinion Score (MOS).

  2. Silence Removal: It includes a feature to remove silences from audio files, enhancing the overall quality.

  3. Sound Quality Improvement: The project focuses on improving the quality of the audio.

  4. Audio Segmentation: It can segment audio files into clips whose duration falls within a specified range of seconds.

  5. Transcription: The project transcribes the segmented audio, providing a textual representation.

  6. Gender Identification: It identifies the gender of each speaker in the audio.

  7. Pyannote Embeddings: Utilizes pyannote embeddings for speaker detection across multiple audio files.

  8. Automatic Speaker Naming: Automatically assigns names to speakers detected in multiple audios.

  9. Multiple Speaker Detection: Capable of detecting multiple speakers within each audio file.
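To illustrate the segmentation item in the list above, here is a minimal sketch of cutting a recording into clips whose length stays within a given range of seconds. It is not taken from the project's code: the function name `segment_ranges` and its parameters are hypothetical, and a real pipeline would more likely cut on pauses or sentence boundaries than at fixed offsets.

```python
from typing import List, Tuple


def segment_ranges(total_seconds: float, min_len: float, max_len: float) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds, each between min_len and max_len long.

    Hypothetical helper for illustration only; not the project's API.
    """
    if not 0 < min_len <= max_len:
        raise ValueError("expected 0 < min_len <= max_len")

    total = float(total_seconds)
    ranges: List[Tuple[float, float]] = []
    start = 0.0
    # Keep cutting while the remaining audio is long enough for a valid clip.
    while total - start >= min_len:
        end = min(start + max_len, total)
        ranges.append((start, end))
        start = end
    # Any leftover tail shorter than min_len is dropped.
    return ranges


if __name__ == "__main__":
    for start, end in segment_ranges(25, min_len=4, max_len=10):
        print(f"{start:.1f}s to {end:.1f}s")
```

In this sketch a trailing remainder shorter than `min_len` is simply discarded; merging it into the previous clip would be another reasonable choice.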

Feel free to explore the project at https://github.com/davidmartinrius/speech-dataset-generator.

I was not originally planning to support the Metavoice dataset format in the speech dataset generator, but @platform-kit asked me to implement it, so I did:
speechbrain/speechbrain#2428 (comment)

@maepopi

maepopi commented Mar 27, 2024

Oh wow, thank you for this!! I made an audio dataset manager myself a month ago, but I think yours is much more complete! Here's mine, if you want to take a look and maybe merge the two together: it is mostly designed to work with this repo, and it notably has a feature to correct JSON transcriptions and manage your dataset from a UI.

Thank you again, can't wait to test your tool!

@davidmartinrius
Author

Hi @maepopi, pull requests are welcome :) I have little time to add new features, as I am developing this in my free time. In any case, new features like yours are welcome. Thanks for sharing your tools.

@maepopi

maepopi commented Mar 30, 2024

Hey there! Oh that’s great! I’ve never contributed to another repo before so that will be a first 😂 I’ll start by having a look and see if I can add my stuff, and I’ll keep you posted 😊

@davidmartinrius
Author

Great! How are you holding up? What key points do you think could be integrated into the project?

@maepopi

maepopi commented Apr 1, 2024

Hey! Sorry, I haven't had time to take a look yet. I'll try sometime this week or over the weekend.

From what I've read in your readme, I think you integrated most of what I did in my tool like transcription and audio segmentation. I'm very curious to test your quality improvement feature, for I have a couple of audiobooks whose sound is really not great. In the end I think what I would add is my part about checking and fixing the transcription, but I might have to add an option to deal with CSV inputs instead of JSON.

Anyway, this is only speculation: as I said, I haven't tested your tool yet. I'll try to do that soon! Sorry
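As a rough idea of what handling CSV instead of JSON could look like, here is a minimal sketch that converts a JSON list of transcriptions into CSV text using only the Python standard library. The field names `audio` and `text` and the function name `transcriptions_json_to_csv` are assumptions made for illustration; neither tool's actual file layout is described in this thread.

```python
import csv
import io
import json


def transcriptions_json_to_csv(json_text: str) -> str:
    """Convert a JSON list of {"audio": ..., "text": ...} entries to CSV text.

    The "audio" and "text" keys are assumed for this sketch, not a documented format.
    """
    entries = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["audio", "text"])  # header row
    for entry in entries:
        # csv.writer quotes fields that contain commas or quotes as needed.
        writer.writerow([entry["audio"], entry["text"]])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = '[{"audio": "clip_0001.wav", "text": "Hello, world"}]'
    print(transcriptions_json_to_csv(sample), end="")
```

Going through the `csv` module rather than joining strings by hand keeps transcriptions that contain commas or quotes intact.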
