
How does Ziya2 pretraining avoid cross-document attention over concatenated corpus documents via the attention mask? #444

linyubupa opened this issue Nov 16, 2023 · 5 comments

@linyubupa

[original question posted as an image attachment]
@zztMermory

Same question, +1

@qibao77

qibao77 commented Dec 4, 2023

Same question, +1

@ganzhiruyi
Contributor

ganzhiruyi commented Dec 6, 2023

You just need to set the attention_mask to 0 for any tokens before the current token that do not belong to the same document; different documents can be told apart by the eos token.
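
For illustration, a minimal PyTorch sketch of this idea (my own, not code from the repo; the helper name and signature are invented for the example). It assigns each position a document id by counting eos tokens, then intersects a same-document mask with a causal mask:

```python
import torch

def build_doc_attention_mask(input_ids: torch.Tensor, eos_id: int) -> torch.Tensor:
    """Per-document causal mask for a packed sequence: each token may
    attend only to earlier tokens from its own document.

    input_ids: (seq_len,) packed token ids, documents separated by eos
    returns:   (seq_len, seq_len) bool mask, True = may attend
    """
    seq_len = input_ids.size(0)
    # Document id per position: increments right after every eos token.
    doc_ids = torch.cumsum(
        torch.cat([
            torch.zeros(1, dtype=torch.long, device=input_ids.device),
            (input_ids[:-1] == eos_id).long(),
        ]),
        dim=0,
    )
    # same_doc[i, j] is True when positions i and j share a document.
    same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)
    causal = torch.tril(
        torch.ones(seq_len, seq_len, dtype=torch.bool, device=input_ids.device)
    )
    return same_doc & causal
```

The result is a block-diagonal lower-triangular mask: within each document attention stays causal, and positions in different documents never attend to each other.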

@qibao77

qibao77 commented Dec 11, 2023

You just need to set the attention_mask to 0 for any tokens before the current token that do not belong to the same document; different documents can be told apart by the eos token.
@ganzhiruyi
But can an attention_mask generated this way still be used with flash attention? Or would you have to modify flash attention yourself?

@ganzhiruyi
Contributor

You can use the Triton implementation of flash attention.
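
For context: the Triton kernel shipped in the flash-attn repo accepts a custom attention bias, which is one way to feed in such a block-diagonal mask. Another common route, sketched below assuming the flash_attn_varlen_func interface from flash-attn 2.x (the cu_seqlens helper is my own illustration), sidesteps the dense mask entirely by passing per-document cumulative sequence lengths:

```python
import torch
# flash-attn 2.x exposes a variable-length kernel that takes cumulative
# sequence lengths instead of a dense (L, L) mask.
from flash_attn import flash_attn_varlen_func

def eos_to_cu_seqlens(input_ids: torch.Tensor, eos_id: int) -> torch.Tensor:
    """cu_seqlens for a packed sequence: [0, end_of_doc1, ..., total_len]."""
    ends = ((input_ids == eos_id).nonzero(as_tuple=True)[0] + 1).to(torch.int32)
    cu = torch.cat([torch.zeros(1, dtype=torch.int32, device=input_ids.device), ends])
    if cu[-1] != input_ids.numel():  # last document may lack a trailing eos
        total = torch.tensor([input_ids.numel()], dtype=torch.int32,
                             device=input_ids.device)
        cu = torch.cat([cu, total])
    return cu

# Usage sketch: q, k, v have shape (total_tokens, n_heads, head_dim),
# flattened across all packed documents.
# cu = eos_to_cu_seqlens(input_ids, eos_id)
# max_len = (cu[1:] - cu[:-1]).max().item()
# out = flash_attn_varlen_func(
#     q, k, v,
#     cu_seqlens_q=cu, cu_seqlens_k=cu,
#     max_seqlen_q=max_len, max_seqlen_k=max_len,
#     causal=True,  # causal attention within each document only
# )
```

With cu_seqlens, attention is computed independently per document, which gives the same effect as zeroing out cross-document attention_mask entries without materializing the quadratic mask.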
