Hi!
I am currently trying to train a SpeechT5ForSpeechToText model for an ASR task from scratch. Training goes quite well most of the time, but when I use the model for inference with model.generate(**input), it predicts the same output for different inputs. I'm using the Hugging Face implementation and followed every step on how to train the model, but I just can't find the error in my code that would explain why the model predicts the same output for every input.
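To make the symptom concrete, here is a minimal sketch of how one could check whether the generated token IDs really are identical across different inputs. The helper name `all_outputs_identical` is hypothetical, and the commented usage assumes a loaded `model` (SpeechT5ForSpeechToText), a `processor` (SpeechT5Processor), and a list `waveforms` of 16 kHz audio arrays, none of which come from my actual training script.

```python
from typing import Sequence


def all_outputs_identical(sequences: Sequence[Sequence[int]]) -> bool:
    """Return True if every generated token-ID sequence equals the first one."""
    if len(sequences) < 2:
        # With fewer than two outputs there is nothing to compare.
        return False
    first = list(sequences[0])
    return all(list(seq) == first for seq in sequences[1:])


# Hypothetical usage (assumes `model`, `processor`, and `waveforms` exist):
#
#   outputs = []
#   for waveform in waveforms:
#       inputs = processor(audio=waveform, sampling_rate=16000, return_tensors="pt")
#       outputs.append(model.generate(**inputs)[0].tolist())
#   print(all_outputs_identical(outputs))
```

If this returns True for clearly different audio clips, the decoder is likely ignoring the encoder output rather than the inputs merely being similar.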
Might this be a general error with the SpeechT5ForSpeechToText implementation in Hugging Face? Or am I doing something wrong?
Any fast help would be really appreciated!