Hi
If `tokenizer.json` isn't available in the model directory, the faster-whisper loader automatically downloads the tokenizer from Hugging Face, which is a good thing. However, it always downloads the `openai/whisper-tiny` tokenizer. This can cause problems if the model used is, or is derived from, `whisper-large-v3`, because that model has a different tokenizer: it introduced a new language id, so the task token_ids are offset by 1.
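To make the offset concrete, here is a minimal sketch of the arithmetic. It assumes the Whisper multilingual vocabulary layout as I understand it: `<|startoftranscript|>` at id 50258, followed by one token per language, followed by the task tokens. The helper `task_token_ids` is hypothetical and is not part of faster-whisper.

```python
# Assumed layout: <|startoftranscript|>, then one token per language,
# then <|translate|> and <|transcribe|>.
SOT_ID = 50258  # assumed id of <|startoftranscript|> in multilingual models


def task_token_ids(num_languages: int) -> dict:
    """Return the task token ids implied by the number of language tokens.

    Hypothetical helper for illustration only.
    """
    first_task_id = SOT_ID + 1 + num_languages
    return {"translate": first_task_id, "transcribe": first_task_id + 1}


# Tokenizer shared by whisper-tiny and other pre-v3 multilingual models: 99 languages.
older = task_token_ids(99)

# whisper-large-v3 tokenizer: one extra language token.
large_v3 = task_token_ids(100)

for name in older:
    print(name, older[name], large_v3[name], large_v3[name] - older[name])
```

Under these assumptions, a prompt built with the `openai/whisper-tiny` tokenizer would send the id that a large-v3 model interprets as `<|translate|>` when `<|transcribe|>` was intended.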
Can we modify the code so that