After following all steps, still get a huge mismatch when loading the checkpoint #55

Open
Wonder1905 opened this issue Jan 6, 2024 · 0 comments


I'm running the following command:
python inference.py --seed 123 --mode 'i2v' --ckpt_path checkpoints/i2v_512_v1/model.ckpt --config configs/inference_i2v_512_v1.0.yaml --savedir results/i2v_512_test --n_samples 1 --bs 1 --height 320 --width 512 --unconditional_guidance_scale 12.0 --ddim_steps 50 --ddim_eta 1.0 --prompt_file prompts/i2v_prompts/test_prompts.txt --cond_input prompts/i2v_prompts --fps 8

RuntimeError: Error(s) in loading state_dict for LatentVisualDiffusion:
Missing key(s) in state_dict: "model.diffusion_model.input_blocks.1.0.temopral_conv.conv1.0.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv1.0.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv1.2.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv1.2.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv2.0.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv2.0.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv2.3.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv2.3.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv3.0.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv3.0.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv3.3.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv3.3.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv4.0.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv4.0.bias", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv4.3.weight", "model.diffusion_model.input_blocks.1.0.temopral_conv.conv4.3.bias", "model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k_ip.weight", "model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v_ip.weight", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv1.0.weight", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv1.0.bias", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv1.2.weight", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv1.2.bias", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv2.0.weight", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv2.0.bias", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv2.3.weight", "model.diffusion_model.input_blocks.2.0.temopral_conv.conv2.3.bias",
.....
"model.diffusion_model.input_blocks.11.0.temopral_conv.conv1.0.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv1.2.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv1.2.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv2.0.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv2.0.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv2.3.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv2.3.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv3.0.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv3.0.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv3.3.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv3.3.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv4.0.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv4.0.bias", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv4.3.weight", "model.diffusion_model.input_blocks.11.0.temopral_conv.conv4.3.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv1.0.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv1.0.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv1.2.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv1.2.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv2.0.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv2.0.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv2.3.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv2.3.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv3.0.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv3.0.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv3.3.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv3.3.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv4.0.weight", 
"model.diffusion_model.middle_block.0.temopral_conv.conv4.0.bias", "model.diffusion_model.middle_block.0.temopral_conv.conv4.3.weight", "model.diffusion_model.middle_block.0.temopral_conv.conv4.3.bias",
...
"model.diffusion_model.output_blocks.5.0.temopral_conv.conv2.0.bias", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv2.3.weight", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv2.3.bias", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv3.0.weight", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv3.0.bias", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv3.3.weight", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv3.3.bias", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv4.0.weight", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv4.0.bias", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv4.3.weight", "model.diffusion_model.output_blocks.5.0.temopral_conv.conv4.3.bias", "model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_k_ip.weight", "model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_v_ip.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv1.0.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv1.0.bias", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv1.2.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv1.2.bias", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv2.0.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv2.0.bias", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv2.3.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv2.3.bias", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv3.0.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv3.0.bias", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv3.3.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv3.3.bias", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv4.0.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv4.0.bias", 
"model.diffusion_model.output_blocks.6.0.temopral_conv.conv4.3.weight", "model.diffusion_model.output_blocks.6.0.temopral_conv.conv4.3.bias", "model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_k_ip.weight",
...
"embedder.model.visual.transformer.resblocks.1.ln_2.bias", "embedder.model.visual.transformer.resblocks.1.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.1.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.1.mlp.c_proj.weight", "embedder.model.visual.transformer.resblocks.1.mlp.c_proj.bias", "embedder.model.visual.transformer.resblocks.2.ln_1.weight", "embedder.model.visual.transformer.resblocks.2.ln_1.bias", "embedder.model.visual.transformer.resblocks.2.attn.in_proj_weight", "embedder.model.visual.transformer.resblocks.2.attn.in_proj_bias", "embedder.model.visual.transformer.resblocks.2.attn.out_proj.weight", "embedder.model.visual.transformer.resblocks.2.attn.out_proj.bias", "embedder.model.visual.transformer.resblocks.2.ln_2.weight", "embedder.model.visual.transformer.resblocks.2.ln_2.bias", "embedder.model.visual.transformer.resblocks.2.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.2.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.2.mlp.c_proj.weight", "embedder.model.visual.transformer.resblocks.2.mlp.c_proj.bias", "embedder.model.visual.transformer.resblocks.3.ln_1.weight", "embedder.model.visual.transformer.resblocks.3.ln_1.bias", "embedder.model.visual.transformer.resblocks.3.attn.in_proj_weight", "embedder.model.visual.transformer.resblocks.3.attn.in_proj_bias", "embedder.model.visual.transformer.resblocks.3.attn.out_proj.weight", "embedder.model.visual.transformer.resblocks.3.attn.out_proj.bias", "embedder.model.visual.transformer.resblocks.3.ln_2.weight", "embedder.model.visual.transformer.resblocks.3.ln_2.bias", "embedder.model.visual.transformer.resblocks.3.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.3.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.3.mlp.c_proj.weight", "embedder.model.visual.transformer.resblocks.3.mlp.c_proj.bias", "embedder.model.visual.transformer.resblocks.4.ln_1.weight", "embedder.model.visual.transformer.resblocks.4.ln_1.bias", 
"embedder.model.visual.transformer.resblocks.4.attn.in_proj_weight", "embedder.model.visual.transformer.resblocks.4.attn.in_proj_bias", "embedder.model.visual.transformer.resblocks.4.attn.out_proj.weight", "embedder.model.visual.transformer.resblocks.4.attn.out_proj.bias", "embedder.model.visual.transformer.resblocks.4.ln_2.weight", "embedder.model.visual.transformer.resblocks.4.ln_2.bias", "embedder.model.visual.transformer.resblocks.4.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.4.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.4.mlp.c_proj.weight", "embedder.model.visual.transformer.resblocks.4.mlp.c_proj.bias", "embedder.model.visual.transformer.resblocks.5.ln_1.weight", "embedder.model.visual.transformer.resblocks.5.ln_1.bias", "embedder.model.visual.transformer.resblocks.5.attn.in_proj_weight", "embedder.model.visual.transformer.resblocks.5.attn.in_proj_bias", "embedder.model.visual.transformer.resblocks.5.attn.out_proj.weight", "embedder.model.visual.transformer.resblocks.5.attn.out_proj.bias", "embedder.model.visual.transformer.resblocks.5.ln_2.weight", "embedder.model.visual.transformer.resblocks.5.ln_2.bias", "embedder.model.visual.transformer.resblocks.5.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.5.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.5.mlp.c_proj.weight",
....
"embedder.model.visual.transformer.resblocks.29.attn.out_proj.bias", "embedder.model.visual.transformer.resblocks.29.ln_2.weight", "embedder.model.visual.transformer.resblocks.29.ln_2.bias", "embedder.model.visual.transformer.resblocks.29.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.29.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.29.mlp.c_proj.weight", "embedder.model.visual.transformer.resblocks.29.mlp.c_proj.bias", "embedder.model.visual.transformer.resblocks.30.ln_1.weight", "embedder.model.visual.transformer.resblocks.30.ln_1.bias", "embedder.model.visual.transformer.resblocks.30.attn.in_proj_weight", "embedder.model.visual.transformer.resblocks.30.attn.in_proj_bias", "embedder.model.visual.transformer.resblocks.30.attn.out_proj.weight", "embedder.model.visual.transformer.resblocks.30.attn.out_proj.bias", "embedder.model.visual.transformer.resblocks.30.ln_2.weight", "embedder.model.visual.transformer.resblocks.30.ln_2.bias", "embedder.model.visual.transformer.resblocks.30.mlp.c_fc.weight", "embedder.model.visual.transformer.resblocks.30.mlp.c_fc.bias", "embedder.model.visual.transformer.resblocks.30.mlp.c_proj.weight", "embedder.model.visual.transformer.resblocks.30.mlp.c_proj.bias", "embedder.model.visual.transformer.resblocks.31.ln_1.weight", "embedder.model.visual.transformer.resblocks.31.ln_1.bias", "embedder.model.visual.transformer.resblocks.31.attn.in_proj_weight", "embedder.model.visual.transformer.resblocks.31.attn.in_proj_bias",
"image_proj_model.layers.1.1.1.weight", "image_proj_model.layers.1.1.3.weight", "image_proj_model.layers.2.0.norm1.weight", "image_proj_model.layers.2.0.norm1.bias", "image_proj_model.layers.2.0.norm2.weight", "image_proj_model.layers.2.0.norm2.bias", "image_proj_model.layers.2.0.to_q.weight", "image_proj_model.layers.2.0.to_kv.weight", "image_proj_model.layers.2.0.to_out.weight", "image_proj_model.layers.2.1.0.weight", "image_proj_model.layers.2.1.0.bias", "image_proj_model.layers.2.1.1.weight", "image_proj_model.layers.2.1.3.weight", "image_proj_model.layers.3.0.norm1.weight", "image_proj_model.layers.3.0.norm1.bias", "image_proj_model.layers.3.0.norm2.weight", "image_proj_model.layers.3.0.norm2.bias", "image_proj_model.layers.3.0.to_q.weight", "image_proj_model.layers.3.0.to_kv.weight", "image_proj_model.layers.3.0.to_out.weight", "image_proj_model.layers.3.1.0.weight", "image_proj_model.layers.3.1.0.bias", "image_proj_model.layers.3.1.1.weight", "image_proj_model.layers.3.1.3.weight".
Unexpected key(s) in state_dict: "model.diffusion_model.input_blocks.1.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.input_blocks.1.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.input_blocks.1.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.input_blocks.1.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.input_blocks.2.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.input_blocks.2.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.input_blocks.2.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table",
...
"model.diffusion_model.init_attn.0.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.middle_block.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.middle_block.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.middle_block.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.middle_block.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.3.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.3.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.3.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.3.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.4.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.4.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.4.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.4.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.5.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.5.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.5.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.5.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.6.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", 
"model.diffusion_model.output_blocks.6.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.6.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.6.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.7.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.7.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.7.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.7.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table",
...
"model.diffusion_model.output_blocks.9.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.9.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.10.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.10.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.10.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.10.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.11.2.transformer_blocks.0.attn1.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.11.2.transformer_blocks.0.attn1.relative_position_v.embeddings_table", "model.diffusion_model.output_blocks.11.2.transformer_blocks.0.attn2.relative_position_k.embeddings_table", "model.diffusion_model.output_blocks.11.2.transformer_blocks.0.attn2.relative_position_v.embeddings_table".
size mismatch for scale_arr: copying a param with shape torch.Size([1000]) from checkpoint, the shape in current model is torch.Size([1400]).
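
Since the key lists above are long, grouping them by name prefix can make the pattern easier to read. The following is only a minimal sketch: `summarize_key_mismatch` is a hypothetical helper, not part of the repository, and the toy key lists are copied from the error above. With real objects, the inputs would be the model's `state_dict().keys()` and the keys of the loaded checkpoint, assuming the weights can be reached as a plain dict.

```python
from collections import Counter


def summarize_key_mismatch(model_keys, ckpt_keys, depth=2):
    """Count missing and unexpected state_dict keys, grouped by the
    first `depth` dot-separated parts of each key name."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)

    def prefix(key):
        return ".".join(key.split(".")[:depth])

    # Missing: the model expects the key, but the checkpoint lacks it.
    missing = Counter(prefix(k) for k in model_keys - ckpt_keys)
    # Unexpected: the checkpoint has the key, but the model does not.
    unexpected = Counter(prefix(k) for k in ckpt_keys - model_keys)
    return dict(missing), dict(unexpected)


# Toy key lists modeled on the error message (not the full lists).
model_keys = [
    "model.diffusion_model.input_blocks.1.0.temopral_conv.conv1.0.weight",
    "image_proj_model.layers.1.1.1.weight",
    "scale_arr",
]
ckpt_keys = [
    "model.diffusion_model.input_blocks.1.2.transformer_blocks.0"
    ".attn1.relative_position_k.embeddings_table",
    "scale_arr",
]

missing, unexpected = summarize_key_mismatch(model_keys, ckpt_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If whole groups such as `image_proj_model` or `embedder` show up only on the missing side, that may indicate the checkpoint file and the config describe different model variants, though this sketch alone cannot confirm it.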

Any idea what I'm missing?
