
Why use padding_side='right' during training? #6

Open
YuZhang10 opened this issue Mar 8, 2024 · 1 comment

Comments

@YuZhang10

Hi, I noticed you use padding_side='right' during training but 'left' during eval.
In my experience, padding_side is usually set to 'left' for generation models (as stated in this link).
Looking forward to your reply. Thanks in advance.

@zhengbw0324
Collaborator

@YuZhang10
Hello, LLaMA uses RoPE, a relative position encoding, so it makes no difference whether left or right padding is used during training. During autoregressive generation, however, each newly generated token is appended to the end of the sequence. With right padding, new tokens would be appended after the pad tokens, which is incorrect. Therefore, be sure to use left padding during inference.
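
As a minimal sketch of the two settings, assuming a HuggingFace tokenizer (the checkpoint name below is only illustrative, not the one used in this repo):

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; substitute your own model.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default

# Training: right padding works because RoPE encodes relative positions
# and the attention mask keeps pad positions out of the computation.
tokenizer.padding_side = "right"
train_batch = tokenizer(
    ["short example", "a somewhat longer training example"],
    padding=True,
    return_tensors="pt",
)

# Inference: left padding, so generated tokens are appended directly
# after the real prompt tokens rather than after pad tokens.
tokenizer.padding_side = "left"
gen_batch = tokenizer(
    ["short prompt", "a somewhat longer prompt"],
    padding=True,
    return_tensors="pt",
)
# model.generate(**gen_batch, max_new_tokens=32)
```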
