Cross Attention maps #8
Hello,

Thank you so much for your great work and codebase!

I would appreciate your clarification on a few items. In `TextToVideoSDPipelineCall.py`, at this line, the attention maps from the temporal layers (first set) seem to be empty when probed approximately with this code block, while only the `.attentions` layers and the `transformer_in` layer in the second set have cross-attention maps:

- `transformer_in.transformer_blocks[0].attn2`, with size 64, 64, 24, 24, suggesting it is temporal (not spatial, as stated in the supplemental) with 24 frames.
- `mid_block.attentions[0].transformer_blocks[0].attn2`, with size 480, 8, 8, 77, suggesting it is the spatial attention map (not temporal) with 77 text tokens.

Your kind clarification would be very helpful. Thanks!
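To make the shape question concrete, here is a minimal, self-contained sketch of the two kinds of cross-attention map being compared. The `CrossAttention` class and all tensor shapes below are illustrative toys, not the repository's actual modules; only the module paths quoted above (`transformer_in.transformer_blocks[0].attn2`, `mid_block.attentions[0].transformer_blocks[0].attn2`) come from the real pipeline. The sketch stashes the softmaxed attention weights on the module (`self.last_attn`) so they can be inspected after a forward pass, which is one common way to probe such maps:

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Toy single-head cross-attention that keeps its last attention map."""
    def __init__(self, dim):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)

    def forward(self, x, context):
        q, k, v = self.to_q(x), self.to_k(context), self.to_v(context)
        # (B, Nq, D) @ (B, D, Nk) -> (B, Nq, Nk), softmax over keys
        attn = torch.softmax(
            q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1
        )
        self.last_attn = attn.detach()  # stash the map for inspection
        return attn @ v

dim = 32
spatial = CrossAttention(dim)   # queries = pixels, keys = 77 text tokens
temporal = CrossAttention(dim)  # queries = keys = 24 video frames

# Spatial cross-attention: pixels attend to text tokens.
x_sp = torch.randn(480, 8 * 8, dim)       # (batch*frames, h*w, dim)
text = torch.randn(480, 77, dim)          # text-token context
spatial(x_sp, text)

# Temporal attention: each pixel location attends across frames.
x_tm = torch.randn(64 * 64, 24, dim)      # (h*w, frames, dim)
temporal(x_tm, x_tm)

# A (…, Nq, 77) trailing shape indicates a spatial map over text tokens;
# a (…, 24, 24) trailing shape indicates a temporal frame-to-frame map.
print(spatial.last_attn.shape)   # torch.Size([480, 64, 77])
print(temporal.last_attn.shape)  # torch.Size([4096, 24, 24])
```

Under this reading, a trailing dimension of 77 marks a map over text tokens (spatial cross-attention), while a square 24×24 trailing shape marks attention across the 24 frames (temporal), which matches the shapes reported above.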