Silero-VAD Meta Hallucinations #843

Open
TedTimbrell opened this issue May 16, 2024 · 1 comment

TedTimbrell commented May 16, 2024

I noticed while transcribing some of my own audio that near-silence doesn't get removed during VAD. In fact, running noisereduce made the problem dramatically worse, turning 10 seconds of falsely detected speech into a minute and a half.

Apologies if I'm referring to the wrong version of Silero, but it seems like this is a known issue / feature(tm): snakers4/silero-vad#396

Performing a volume filter along with VAD might solve a fair number of hallucinations, and might even remove the need to set condition_on_previous_text to False to prevent the hallucinations from ruining the rest (section) of the transcription.
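
As a rough sketch of the volume-filter idea (not existing faster-whisper or Silero code): after VAD runs, measure the RMS level of each detected segment and drop the ones that are close to silent. The function name `filter_quiet_segments`, the -45 dBFS default, and the segment format (dicts with `start`/`end` sample indices into a float waveform scaled to [-1, 1]) are all assumptions made for illustration; the threshold in particular would need tuning on real recordings.

```python
import numpy as np


def filter_quiet_segments(audio, segments, threshold_db=-45.0):
    """Keep only the VAD segments whose RMS level reaches threshold_db (dBFS).

    audio:     1-D float array, assumed to be scaled to [-1.0, 1.0].
    segments:  list of dicts with "start" and "end" sample indices
               (hypothetical format, chosen for this sketch).
    """
    kept = []
    for seg in segments:
        chunk = np.asarray(audio[seg["start"]:seg["end"]], dtype=np.float64)
        if chunk.size == 0:
            continue  # nothing to measure, treat as silence
        rms = np.sqrt(np.mean(chunk ** 2))
        # Clamp to avoid log10(0) on digitally silent chunks.
        level_db = 20.0 * np.log10(max(rms, 1e-10))
        if level_db >= threshold_db:
            kept.append(seg)
    return kept
```

Because segments are only dropped and never moved, the timestamps of the surviving segments stay valid without a second adjustment pass.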

I'm down to try it out and open a PR if you're all open to it. Before I do, though, I'm curious whether this came up when adding the hallucination detection logic.

It'd be really nice to have in this library so that I don't have to perform a second layer of timestamp adjustments.

@trungkienbkhn
Collaborator

@TedTimbrell, hello. Feel free to open a new PR. Could you also attach an example audio file?
