
[BUG/Help] Is GLM's bidirectional attention necessary? #1463

Open
1 task done
ssgg-code opened this issue Mar 5, 2024 · 0 comments
Comments

@ssgg-code

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I have been studying the theory behind GLM recently, and most explanations only say, roughly, that GLM's bidirectional attention helps the model understand the context better. But unlike BERT's MLM, GLM places the masked spans as Part B after Part A, and during pretraining the loss is computed only on the outputs at the Part B positions. So does the bidirectional attention over Part A actually contribute anything? Or is there some other pretraining task involved?
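For concreteness, below is a minimal sketch of how I understand the GLM-style attention mask to be built (the function name, shapes, and example lengths are just for illustration): Part A attends bidirectionally to itself, Part B attends to all of Part A plus causally to earlier Part B tokens, and the loss is taken only on Part B positions.

```python
import torch

def glm_attention_mask(a_len: int, b_len: int) -> torch.Tensor:
    """Build a GLM-style attention mask (1 = may attend, 0 = masked).

    Part A (the corrupted context) occupies the first `a_len` positions and is
    fully bidirectional; Part B (the spans to be generated) occupies the next
    `b_len` positions and is causal within itself, but every Part B token can
    also see all of Part A.
    """
    total = a_len + b_len
    mask = torch.zeros(total, total, dtype=torch.long)
    # Every token (Part A and Part B) can attend to all of Part A.
    mask[:, :a_len] = 1
    # Within Part B, attention is causal (lower-triangular).
    mask[a_len:, a_len:] = torch.tril(torch.ones(b_len, b_len, dtype=torch.long))
    return mask

# Example: 4 context tokens (Part A), 3 target tokens (Part B).
print(glm_attention_mask(4, 3))
```

Given this mask, my question is whether making the Part A block fully bidirectional (rather than causal) measurably helps, since gradients only flow from the Part B predictions.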

Expected Behavior

No response

Steps To Reproduce

none

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response
