
Issues with seq2seq tutorial (batch training) #2840

Open
gavril0 opened this issue Apr 18, 2024 · 0 comments

gavril0 commented Apr 18, 2024

Add Link

Link to the tutorial:

https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

Describe the bug

The tutorial was markedly changed in June 2023; see commit 6c03bb3, which, among other things, aimed at fixing the implementation of attention (#2468). In the process, several other things were changed:

  • a dataloader was added that returns batches of zero-padded sequences to train the network
  • the forward() function of the decoder processes the input one word at a time, in parallel for all
    sentences in the batch, until MAX_LENGTH is reached (see the sketch right after this list)
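
For concreteness, this is roughly what the new dataloader does (a sketch only; the helper name `make_dataloader` and the `word2index` lookup are illustrative, not the tutorial's exact code):

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

MAX_LENGTH = 10
EOS_token = 1

def make_dataloader(pairs, input_lang, output_lang, batch_size=32):
    # Every sentence is written into a row of a zero-filled array,
    # so index 0 implicitly acts as the padding token.
    n = len(pairs)
    input_ids = np.zeros((n, MAX_LENGTH), dtype=np.int64)
    target_ids = np.zeros((n, MAX_LENGTH), dtype=np.int64)
    for i, (src, tgt) in enumerate(pairs):
        src_idx = [input_lang.word2index[w] for w in src.split()] + [EOS_token]
        tgt_idx = [output_lang.word2index[w] for w in tgt.split()] + [EOS_token]
        input_ids[i, :len(src_idx)] = src_idx
        target_ids[i, :len(tgt_idx)] = tgt_idx
    dataset = TensorDataset(torch.from_numpy(input_ids), torch.from_numpy(target_ids))
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)
```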

I am not a torch expert, but I think that the embedding layers in the encoder and decoder should have been modified to recognize the padding (padding_idx=0 is missing). Using zero-padded sequences as input might also have other implications during learning, but I am not sure. Can you confirm that the implementation is correct?
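
What I have in mind is sketched below (vocabulary and hidden sizes are placeholders, and the ignore_index suggestion at the end is an assumption on my part, not something the tutorial does):

```python
import torch.nn as nn

hidden_size = 128          # placeholder values, not the tutorial's exact ones
input_vocab_size = 5000
output_vocab_size = 5000

# padding_idx=0 keeps the embedding vector of the padding token at zero and
# excludes it from gradient updates.
encoder_embedding = nn.Embedding(input_vocab_size, hidden_size, padding_idx=0)
decoder_embedding = nn.Embedding(output_vocab_size, hidden_size, padding_idx=0)

# One of the "other implications" I am unsure about (an assumption on my part):
# the loss could also skip padded target positions, e.g.
criterion = nn.NLLLoss(ignore_index=0)
```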

As a result of these changes, the text no longer describes the code well. I think it would be nice to include a discussion of zero-padding and of the implications of using batches on the code in the tutorial. I am also curious whether there is really a gain from using batches, since most sentences are short.

Finally, I found a mention of teacher_forcing_ratio in the text, but it does not appear in the code. Either the tutorial text or the code needs to be adjusted.
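
For reference, this is how teacher_forcing_ratio is typically used in the decoding loop (a generic sketch with an illustrative decoder interface, not the tutorial's actual forward() signature):

```python
import random
import torch

teacher_forcing_ratio = 0.5

def decode_with_teacher_forcing(decoder, decoder_input, decoder_hidden,
                                encoder_outputs, target_tensor, max_length):
    # decoder_input: (batch, 1) tensor of token ids, initially SOS for every sentence
    outputs = []
    use_teacher_forcing = random.random() < teacher_forcing_ratio
    for t in range(max_length):
        decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden,
                                                 encoder_outputs)
        outputs.append(decoder_output)
        if use_teacher_forcing:
            # feed the ground-truth token as the next input
            decoder_input = target_tensor[:, t].unsqueeze(1)
        else:
            # feed the model's own (detached) prediction
            _, topi = decoder_output.topk(1)
            decoder_input = topi.squeeze(-1).detach()
    return torch.cat(outputs, dim=1), decoder_hidden
```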

If it is useful, I found another implementation of the same tutorial which seems to be a fork of a previous version (it was archived in 2021):

  • It does not use batches
  • It includes teacher_forcing_ratio to select the amount of teacher forcing
  • It implements both the Luong et al. and Bahdanau et al. attention models (sketched schematically after this list)
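
For context, these are the two attention score functions those models use, written schematically (simplified shapes; this is neither the tutorial's nor the fork's code):

```python
import torch
import torch.nn as nn

class BahdanauScore(nn.Module):
    """Additive attention: score(s, h) = v^T tanh(W_a s + U_a h)."""
    def __init__(self, hidden_size):
        super().__init__()
        self.Wa = nn.Linear(hidden_size, hidden_size)
        self.Ua = nn.Linear(hidden_size, hidden_size)
        self.Va = nn.Linear(hidden_size, 1)

    def forward(self, query, keys):
        # query: (batch, 1, hidden); keys: (batch, seq_len, hidden)
        return self.Va(torch.tanh(self.Wa(query) + self.Ua(keys)))  # (batch, seq_len, 1)

class LuongDotScore(nn.Module):
    """Multiplicative (dot-product) attention: score(s, h) = s^T h."""
    def forward(self, query, keys):
        return torch.bmm(keys, query.transpose(1, 2))  # (batch, seq_len, 1)
```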

Describe your environment

I appreciate this tutorial because it provides a simple introduction to seq2seq models with a small dataset. I am actually trying to port this tutorial to R with the torch package.

cc @albanD

@gavril0 added the bug label on Apr 18, 2024
@svekars added the core label on May 15, 2024