Keeping the same speaker in different files #777

wallaceblaia · 2024-04-11T17:00:50Z

First off, thank you for your fantastic work here. I am working on a project that translates and dubs YouTube live streams in near real time. I've managed to get the delay down to 3 minutes, but I'd like to reduce it even further.

In my implementation, I capture the live stream and split it into segments of approximately 1 minute each, because I use a technique that makes cuts only between words. After processing each segment with Demucs, I send it to the WhisperX pipeline. However, the speaker labels vary from one audio file to the next. Is there a way to preserve the speaker embeddings across multiple audio files that contain the same speakers? I use the labels returned by diarization to dub into another language, but each file gives me different labels for the same speaker.
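One possible way to approach this, sketched under assumptions rather than taken from WhisperX itself: keep a small registry of speaker embeddings between files and remap each file's local labels to stable global ones by cosine similarity. The sketch assumes you can obtain one non-zero embedding vector per local speaker label for every file (how to extract these from the diarization model is not shown). `SpeakerRegistry`, `assign`, the `GLOBAL_xx` naming, and the `0.7` threshold are all hypothetical and would need tuning on real data.

```python
import math


def _unit(vec):
    """Return vec scaled to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


class SpeakerRegistry:
    """Hypothetical helper: maps per-file speaker labels to stable global IDs."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # assumed similarity cutoff, tune on real data
        self.centroids = []         # one unit-length vector per known speaker
        self.counts = []            # embeddings merged into each centroid so far

    def assign(self, local_embeddings):
        """Take {local_label: embedding} for one file, return {local_label: global_label}."""
        mapping = {}
        used = set()  # a global speaker is matched at most once per file
        for label, emb in local_embeddings.items():
            emb = _unit(emb)
            best, best_sim = None, self.threshold
            # Greedy search for the most similar known speaker above the threshold.
            for idx, centroid in enumerate(self.centroids):
                if idx in used:
                    continue
                sim = _dot(emb, centroid)
                if sim >= best_sim:
                    best, best_sim = idx, sim
            if best is None:
                # No sufficiently similar speaker: register a new one.
                self.centroids.append(emb)
                self.counts.append(1)
                best = len(self.centroids) - 1
            else:
                # Fold the new embedding into the centroid (weighted sum, renormalized).
                n = self.counts[best]
                merged = [c * n + e for c, e in zip(self.centroids[best], emb)]
                self.centroids[best] = _unit(merged)
                self.counts[best] = n + 1
            used.add(best)
            mapping[label] = f"GLOBAL_{best:02d}"
        return mapping


# Toy 3-dimensional embeddings; real speaker embeddings have far more dimensions.
registry = SpeakerRegistry(threshold=0.7)
file1 = registry.assign({"SPEAKER_00": [1.0, 0.0, 0.0], "SPEAKER_01": [0.0, 1.0, 0.0]})
# In the second file the diarizer happens to swap the local labels.
file2 = registry.assign({"SPEAKER_00": [0.1, 0.9, 0.0], "SPEAKER_01": [0.95, 0.05, 0.0]})
```

With these toy vectors, the swapped labels in the second file are mapped back to the same global IDs assigned in the first. The matching is greedy and order-dependent, so for files with many similar-sounding speakers an optimal assignment step may work better.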

SeeknnDestroy · 2024-05-15T16:22:33Z

Hey @wallaceblaia, did you find a solution for this?

wallaceblaia changed the title ~~Mantendo o mesmo falante em arquivos diferente~~ Keeping the same speaker in different files Apr 11, 2024

