
Inference time #29

Open
puckikk1202 opened this issue Mar 28, 2024 · 4 comments
@puckikk1202
Hi, I'm grateful for your excellent work! I've implemented the code as per the instructions, and it runs without errors. However, the inference time is slow, approximately 176 seconds per iteration. I tested it on an 80G A100 GPU, and it seems to be using around 71G of GPU memory. Is this normal?

@ShenhaoZhu
Contributor

Both the inference time and the GPU memory usage you report far exceed what we'd expect. Try terminating any unrelated processes on the GPU and running it again.

@G-force78

That's odd; with a Google A100 running motion-06, it peaks at 12.2 GB.

100% 20/20 [01:21<00:00, 4.05s/it]
100% 116/116 [00:04<00:00, 27.11it/s]

@chengzeyi

chengzeyi commented Mar 29, 2024

> That's odd; with a Google A100 running motion-06, it peaks at 12.2 GB.
>
> 100% 20/20 [01:21<00:00, 4.05s/it] 100% 116/116 [00:04<00:00, 27.11it/s]

How did you get that? On an RTX 4090 I see much higher VRAM usage than that number.
OK, that may be related to a bug in WSL2. But how did you achieve such a speed?

@G-force78

G-force78 commented Mar 30, 2024

You need a large amount of system RAM too. I just tried the T4 on the Colab free tier, but system RAM maxed out at 12 GB while loading the motion module. Maybe the weights could be loaded directly into VRAM instead if you have enough of it?
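A back-of-envelope estimate shows why 12 GB of system RAM can max out just materializing weights. The parameter count below is a made-up illustrative figure, not the actual size of this repo's UNet plus motion module; note also that many loaders briefly hold both the checkpoint state dict and the instantiated module, roughly doubling peak RAM.

```python
# Rough CPU RAM needed to hold model weights at a given precision.
# The 1.5e9 parameter count is an assumption for illustration only.

def weight_gib(num_params: int, bytes_per_param: int) -> float:
    """GiB required to hold num_params weights at the given precision."""
    return num_params * bytes_per_param / 2**30

params = 1_500_000_000  # hypothetical parameter count

fp32 = weight_gib(params, 4)  # float32: 4 bytes per weight
fp16 = weight_gib(params, 2)  # float16: 2 bytes per weight

print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB")
# A second in-RAM copy during loading would double either figure.
```

Under these assumptions fp32 weights alone are ~5.6 GiB, so a transient second copy plus the rest of the pipeline plausibly exhausts a 12 GB machine, while fp16 halves the footprint.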

Here is my config file for motion-06:

```yaml
num_inference_steps: 20
guidance_scale: 6
enable_zero_snr: true
weight_dtype: "fp16"

guidance_types:
  - 'depth'
  - 'normal'
  - 'semantic_map'
  - 'dwpose'

noise_scheduler_kwargs:
  num_train_timesteps: 1000
  beta_start: 0.00085
  beta_end: 0.012
  beta_schedule: "linear"
  steps_offset: 1
  clip_sample: false

unet_additional_kwargs:
  use_inflated_groupnorm: true
  unet_use_cross_frame_attention: false
  unet_use_temporal_attention: false
  use_motion_module: true
  motion_module_resolutions:
    - 1
    - 2
    - 4
    - 8
  motion_module_mid_block: true
  motion_module_decoder_only: false
  motion_module_type: Vanilla
  motion_module_kwargs:
    num_attention_heads: 8
    num_transformer_block: 1
    attention_block_types:
      - Temporal_Self
      - Temporal_Self
    temporal_position_encoding: true
    temporal_position_encoding_max_len: 32
    temporal_attention_dim_div: 1

guidance_encoder_kwargs:
  guidance_embedding_channels: 320
  guidance_input_channels: 3
  block_out_channels: [16, 32, 96, 256]

enable_xformers_memory_efficient_attention: true
```
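The `enable_xformers_memory_efficient_attention: true` line likely matters a lot for the VRAM gap discussed above: naive attention materializes a full (heads x L x L) score matrix per attention call, while memory-efficient kernels avoid it. A sketch of that matrix's size, with purely illustrative token counts (not measured from this repo):

```python
# Rough size of the attention score tensor that naive attention
# materializes; the sequence length and head count below are
# illustrative assumptions, not values taken from this config's model.

def naive_attn_scores_gib(seq_len: int, num_heads: int,
                          bytes_per_el: int = 2) -> float:
    """GiB for one (num_heads x seq_len x seq_len) fp16 score tensor."""
    return num_heads * seq_len * seq_len * bytes_per_el / 2**30

# e.g. a 64x64 latent -> 4096 spatial tokens, 8 heads, fp16
print(f"{naive_attn_scores_gib(4096, 8):.2f} GiB per attention call")  # 0.25 GiB
```

Because the cost grows quadratically in sequence length, doubling the token count quadruples this tensor; across many layers and denoising steps, skipping it is the difference between the 12 GB and 71 GB figures reported in this thread being plausible on the same model.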

@AricGamma self-assigned this Apr 12, 2024