ASR SpeechT5 training - model predicts same output for different inputs #62

L7uan · 2023-09-25T17:46:12Z

Hi!
I am currently trying to train a SpeechT5ForSpeechToText model for an ASR task from scratch. Training goes quite well most of the time; however, when I use the model for inference with model.generate(**input), it predicts the same output for different inputs. I'm using the Hugging Face implementation and followed every step on how to train the model, but I can't find the error in my code that makes the model predict the same output for every input.
Might this be a general error with the SpeechT5ForSpeechToText implementation on Hugging Face, or am I doing something wrong?
Any quick help would be really appreciated!
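
To make the symptom concrete, here is a minimal sketch of how it can be checked. It assumes the results of model.generate(**input) for several different audio clips have been collected as lists of token IDs. The helper name `all_outputs_identical` is hypothetical, and the token IDs are made-up placeholders, not real model output.

```python
from typing import Sequence


def all_outputs_identical(generated: Sequence[Sequence[int]]) -> bool:
    """Return True if every generated token-ID sequence is the same."""
    if len(generated) < 2:
        # With fewer than two sequences there is nothing to compare.
        return False
    first = list(generated[0])
    return all(list(seq) == first for seq in generated[1:])


# Made-up token IDs standing in for model.generate(**input) results
# on three different audio clips (placeholders, not real model output).
collapsed = [[2, 15, 8, 3], [2, 15, 8, 3], [2, 15, 8, 3]]
healthy = [[2, 15, 8, 3], [2, 9, 4, 11, 3], [2, 7, 3]]

print(all_outputs_identical(collapsed))
print(all_outputs_identical(healthy))
```

If the check returns True for clearly different clips, the decoder is probably ignoring the encoder output. In that case it may be worth comparing the input features themselves, to rule out a preprocessing step that feeds the same tensor each time.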
