How was the pretraining dataset Laion-400M used? Does this actually refer to the use of the ‘open_clip_pytorch_model’ from OPENCLIP? #25

Open
BaiqiangGit opened this issue Aug 9, 2023 · 1 comment

Comments

@BaiqiangGit

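Hello, I have a question. The paper mentions pretraining on Laion-400M. Does this mean that VideoComposer received additional pretraining on Laion-400M? If so, could you explain how the pretraining inputs were organized and which modules took part in training? Thanks!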

PS: In the code I see two pretrained models related to Laion, but I could not find anything related to Laion-400M. Am I misunderstanding something?

  • "v2-1_512-ema-pruned.ckpt": pretrained on Laion5B
  • "open_clip_pytorch_model": pretrained on Laion2B ("ViT-H-14", pretrained="laion2b_s32b_b79k" in OPENCLIP)
@Steven-SWZhang
Collaborator

Hello, our model accepts both videos and single frames as input. For a single frame, the value of F is set to 1. As long as the dimensions within each batch are consistent, we train the model on images and videos simultaneously.
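To make the reply above concrete, here is a minimal sketch of one way to mix images and videos while keeping each batch's dimensions consistent. It is not VideoComposer's actual data pipeline: the helper names `as_clip_shape` and `group_into_batches`, the `(F, C, H, W)` layout, and the bucketing strategy are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Assumed layout for this sketch: (F, C, H, W), where F is the number of frames.
ClipShape = Tuple[int, ...]


def as_clip_shape(shape: Sequence[int]) -> ClipShape:
    """Treat an image shape (C, H, W) as a one-frame clip (1, C, H, W).

    Video shapes (F, C, H, W) are returned unchanged.
    """
    if len(shape) == 3:
        return (1, *shape)
    if len(shape) == 4:
        return tuple(shape)
    raise ValueError(f"expected 3 or 4 dimensions, got {len(shape)}")


def group_into_batches(
    shapes: Sequence[Sequence[int]], batch_size: int
) -> List[Tuple[ClipShape, List[int]]]:
    """Group sample indices so every batch shares one clip shape.

    Hypothetical bucketing: samples are binned by their (F, C, H, W) shape,
    then each bin is cut into chunks of at most `batch_size` indices.
    """
    buckets: Dict[ClipShape, List[int]] = defaultdict(list)
    for index, shape in enumerate(shapes):
        buckets[as_clip_shape(shape)].append(index)

    batches: List[Tuple[ClipShape, List[int]]] = []
    for clip_shape, indices in buckets.items():
        for start in range(0, len(indices), batch_size):
            batches.append((clip_shape, indices[start:start + batch_size]))
    return batches


# Three images and two 16-frame videos in one (made-up) dataset.
samples = [
    (3, 256, 256),
    (16, 3, 256, 256),
    (3, 256, 256),
    (16, 3, 256, 256),
    (3, 256, 256),
]
batches = group_into_batches(samples, batch_size=2)
for clip_shape, indices in batches:
    print(clip_shape, indices)
```

Under these assumptions an image is simply a clip with F = 1, so image batches and video batches can go through the same model as long as a single batch never mixes different values of F.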
