question about RoPE code #227
Comments
You are correct that `self.cos_cached` and `self.sin_cached` have the same shape as `x`. And when it comes to the modification, that is also correct, because it would ensure that the rotary embeddings are applied only to the subset of features specified by `self.d`.
They have similar shapes. Truncating the cached sin/cos to `self.d` is what makes them line up with `x_rope`.
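For context, here is a minimal, hypothetical sketch of how such a cos/sin cache can be built so that its last dimension is exactly `d` (the number of rotated features). The function name and shapes are illustrative assumptions, not the repository's exact code; the point is that a cache built this way already broadcasts against `x[..., :d]` without extra slicing.

```python
import torch

# Hypothetical cache builder: the frequencies are computed over d (the
# number of rotated features), so the cache's last dimension is d rather
# than the full head dimension.
def build_rope_cache(seq_len: int, d: int, base: float = 10_000.0):
    # theta_i = base^(-2i/d) for i in [0, d/2)
    theta = 1.0 / (base ** (torch.arange(0, d, 2).float() / d))   # [d/2]
    pos = torch.arange(seq_len).float()                            # [seq_len]
    idx_theta = torch.einsum('n,d->nd', pos, theta)                # [seq_len, d/2]
    idx_theta2 = torch.cat([idx_theta, idx_theta], dim=1)          # [seq_len, d]
    # Insert broadcast dims for inputs shaped [seq_len, batch, heads, d].
    return idx_theta2.cos()[:, None, None, :], idx_theta2.sin()[:, None, None, :]

cos_cached, sin_cached = build_rope_cache(seq_len=10, d=32)
print(cos_cached.shape)  # torch.Size([10, 1, 1, 32])
```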
Thanks for the reply!! @vpj @nagamonish Didn't you have any problems running that code? The original code didn't work for me with a different input shape, and I thought it was a grammar issue.
annotated_deep_learning_paper_implementations/labml_nn/transformers/rope/__init__.py
Line 188 in f42c0e9
`self.cos_cached` and `self.sin_cached` have the same shape as `x`, don't they?? So if this line is intended to compute RoPE on only part of `x`, namely `x[..., :self.d]`, I think the line should be

`x_rope = (x_rope * self.cos_cached[..., :self.d]) + (neg_half_x * self.sin_cached[..., :self.d])`

Please let me know if I'm wrong.
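To make the shape question concrete, here is a small self-contained demo. All shapes and the random cache contents are made up for illustration, under the assumption the issue describes: the caches span the full `d_head` while only the first `d` features are rotated. The final lines apply the slicing suggested above.

```python
import torch

# Hypothetical shapes: x is [seq_len, batch, heads, d_head];
# only the first d features are rotated, with d < d_head.
seq_len, batch, heads, d_head, d = 10, 2, 4, 64, 32

x = torch.randn(seq_len, batch, heads, d_head)
# Caches built over the full feature dim, as the question describes.
# (Random values here; this demo is only about broadcasting.)
cos_cached = torch.randn(seq_len, 1, 1, d_head)
sin_cached = torch.randn(seq_len, 1, 1, d_head)

x_rope, x_pass = x[..., :d], x[..., d:]
neg_half_x = torch.cat([-x_rope[..., d // 2:], x_rope[..., :d // 2]], dim=-1)

# Without slicing, the caches have d_head features while x_rope has d,
# so the element-wise product fails to broadcast:
try:
    _ = x_rope * cos_cached
except RuntimeError as e:
    print('broadcast error:', e)

# The fix proposed in the question: truncate the caches to d features.
x_rope = (x_rope * cos_cached[..., :d]) + (neg_half_x * sin_cached[..., :d])
out = torch.cat((x_rope, x_pass), dim=-1)
print(out.shape)  # torch.Size([10, 2, 4, 64])
```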