How was the pretraining dataset Laion-400M used? Does this actually refer to the use of the ‘open_clip_pytorch_model’ from OPENCLIP? #25

BaiqiangGit · 2023-08-09T08:47:59Z

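Hello, I have a question. The paper mentions pretraining on Laion-400M. Does that mean VideoComposer received additional pretraining on Laion-400M? If so, could you explain how the pretraining inputs were organized and which modules took part in training? Thanks!

PS: The code references two Laion-related pretrained models, but I could not find anything related to Laion-400M. Am I misunderstanding something?

"v2-1_512-ema-pruned.ckpt": pretrained on Laion5B
"open_clip_pytorch_model": pretrained on Laion2B (in OPENCLIP: "ViT-H-14", pretrained="laion2b_s32b_b79k")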

Steven-SWZhang · 2023-08-18T12:47:11Z

Hello, our model supports both videos and single frames as inputs. When the input is a single frame, the value of F is set to 1. As long as the dimensions within each batch are consistent, we train the model on images and videos simultaneously.
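
One way to picture the batching constraint described above is the minimal sketch below. It is not the VideoComposer data loader: the helper names `as_video_shape` and `bucket_by_shape`, the `(F, C, H, W)` layout, and the sample shapes are all assumptions made for illustration. It treats an image as a video with F = 1 and groups samples so that every batch has identical dimensions.

```python
from collections import defaultdict


def as_video_shape(shape):
    """Return an (F, C, H, W) shape; a (C, H, W) image becomes a 1-frame video.

    The (F, C, H, W) layout is an assumption for this sketch.
    """
    if len(shape) == 3:
        return (1, *shape)
    if len(shape) == 4:
        return tuple(shape)
    raise ValueError(f"expected 3 or 4 dimensions, got {len(shape)}")


def bucket_by_shape(samples, batch_size):
    """Group (name, shape) samples into batches that share one (F, C, H, W)."""
    buckets = defaultdict(list)
    for name, shape in samples:
        buckets[as_video_shape(shape)].append(name)

    batches = []
    for shape, names in buckets.items():
        # Split each bucket into chunks of at most batch_size samples.
        for start in range(0, len(names), batch_size):
            batches.append((shape, names[start:start + batch_size]))
    return batches


# Hypothetical mix of images (3-D shapes) and 16-frame videos (4-D shapes).
samples = [
    ("img_a", (3, 256, 256)),
    ("vid_a", (16, 3, 256, 256)),
    ("img_b", (3, 256, 256)),
    ("vid_b", (16, 3, 256, 256)),
    ("img_c", (3, 256, 256)),
]

batches = bucket_by_shape(samples, batch_size=2)
for shape, names in batches:
    print(shape, names)
```

Each resulting batch contains only images (F = 1) or only videos with the same frame count, so both kinds of data can pass through the same model during one training run.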
