Teacher forcing per timestep? #195
Yeah, I am working with RNNs and ran into the same problem. It also appears in the PyTorch example (https://github.com/pytorch/tutorials/blob/master/intermediate_source/seq2seq_translation_tutorial.py#L558). I think it is a mistake.
Hi,
I don't understand why teacher forcing is applied to the whole sequence at once. As I understand the definition, at each timestep either the predicted token or the ground-truth token from the previous timestep should be fed to the decoder. The implementation here instead decides once, up front, whether to use teacher forcing, and then decodes the entire sequence with teacher forcing set to True or False, which I believe is not correct.
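To make the difference concrete, here is a minimal sketch in plain Python. It is not the repository's code: `decode_inputs`, `predict`, and `sos` are hypothetical names, and `predict` stands in for one decoder step followed by an argmax. It only tracks which token would be fed to the decoder at each step under the two strategies.

```python
import random


def decode_inputs(targets, predict, ratio, per_timestep, rng, sos=0):
    """Return the token fed to the decoder at each step.

    targets:      ground-truth tokens for one sequence
    predict:      hypothetical stand-in for a decoder step (token -> token)
    ratio:        probability of feeding the ground-truth token
    per_timestep: if True, flip a coin at every step; otherwise flip once
    rng:          a random.Random instance, so runs are reproducible
    sos:          assumed start-of-sequence token
    """
    # Per-sequence variant: a single coin flip decides for every step.
    force_all = rng.random() < ratio

    inputs = []
    prev = sos
    for target in targets:
        inputs.append(prev)
        predicted = predict(prev)
        if per_timestep:
            # Per-timestep variant: a fresh coin flip at each step.
            force = rng.random() < ratio
        else:
            force = force_all
        # Next input is either the ground truth or the model's own output.
        prev = target if force else predicted
    return inputs


if __name__ == "__main__":
    rng = random.Random(0)
    targets = [1, 2, 3, 4]
    always_wrong = lambda token: -1  # dummy "model" so forced steps are visible
    print(decode_inputs(targets, always_wrong, 0.5, False, rng))
    print(decode_inputs(targets, always_wrong, 0.5, True, rng))
```

With `per_timestep=False` the sequence is either fully forced or fully free-running, which is the behaviour I am questioning. With `per_timestep=True` forced and free-running steps can be mixed within one sequence.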
I would really appreciate feedback on this issue. Thanks!