
Why use padding_side='right' during training? #6

Open
YuZhang10 opened this issue Mar 8, 2024 · 1 comment

Comments

@YuZhang10

Hi, I noticed you use padding_side='right' during training but 'left' during eval.
In my experience, padding_side is usually set to 'left' for generation models (as stated in this link).
Looking forward to your reply. Thanks in advance.

@zhengbw0324
Collaborator

@YuZhang10
Hello, LLaMA uses RoPE, a relative position encoding, so it makes no difference whether left or right padding is used during training. During autoregressive generation, however, each newly generated token is appended to the end of the sequence. With right padding, new tokens would be appended after the pad tokens, which is incorrect. Therefore, be sure to use left padding during inference.
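
As a minimal sketch of the two settings, assuming a HuggingFace tokenizer (the checkpoint name below is only illustrative, not the one used in this repo):

```python
from transformers import AutoTokenizer

# Illustrative checkpoint; substitute your own model.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default

# Training: right padding works because RoPE encodes relative positions
# and the attention mask keeps pad positions out of the computation.
tokenizer.padding_side = "right"
train_batch = tokenizer(
    ["short example", "a somewhat longer training example"],
    padding=True,
    return_tensors="pt",
)

# Inference: left padding, so generated tokens are appended directly
# after the real prompt tokens rather than after pad tokens.
tokenizer.padding_side = "left"
gen_batch = tokenizer(
    ["short prompt", "a somewhat longer prompt"],
    padding=True,
    return_tensors="pt",
)
# model.generate(**gen_batch, max_new_tokens=32)
```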
