
Error when running colab notebook #52

Open
ken2190 opened this issue Apr 16, 2023 · 3 comments

ken2190 commented Apr 16, 2023

I get the error below when I run the training cell in the Colab notebook FineTuning_colab.ipynb.
I also ran the "Training parameters" cell, and all parameters were parsed.

No LSB modules are available.
Description: Ubuntu 20.04.5 LTS
diffusers==0.11.1
lora-diffusion @ file:///content/lora
torchvision @ https://download.pytorch.org/whl/cu118/torchvision-0.15.1%2Bcu118-cp39-cp39-linux_x86_64.whl
transformers==4.25.1
xformers==0.0.16rc425
2023-04-16 09:29:59.351268: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-16 09:30:00.246985: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

Copy-and-paste the text below in your GitHub issue

  • Accelerate version: 0.15.0
  • Platform: Linux-5.10.147+-x86_64-with-glibc2.31
  • Python version: 3.9.16
  • Numpy version: 1.22.4
  • PyTorch version (GPU?): 1.13.1+cu117 (True)
  • Accelerate default config:
    Not found
    2023-04-16 09:30:04.094704: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
    To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2023-04-16 09:30:04.940115: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
    usage: accelerate [] launch
    [-h]
    [--config_file CONFIG_FILE]
    [--cpu]
    [--mps]
    [--multi_gpu]
    [--tpu]
    [--use_mps_device]
    [--dynamo_backend {no,eager,aot_eager,inductor,nvfuser,aot_nvfuser,aot_cudagraphs,ofi,fx2trt,onnxrt,ipex}]
    [--mixed_precision {no,fp16,bf16}]
    [--fp16]
    [--num_processes NUM_PROCESSES]
    [--num_machines NUM_MACHINES]
    [--num_cpu_threads_per_process NUM_CPU_THREADS_PER_PROCESS]
    [--use_deepspeed]
    [--use_fsdp]
    [--use_megatron_lm]
    [--gpu_ids GPU_IDS]
    [--same_network]
    [--machine_rank MACHINE_RANK]
    [--main_process_ip MAIN_PROCESS_IP]
    [--main_process_port MAIN_PROCESS_PORT]
    [--rdzv_conf RDZV_CONF]
    [--max_restarts MAX_RESTARTS]
    [--monitor_interval MONITOR_INTERVAL]
    [-m]
    [--no_python]
    [--main_training_function MAIN_TRAINING_FUNCTION]
    [--downcast_bf16]
    [--deepspeed_config_file DEEPSPEED_CONFIG_FILE]
    [--zero_stage ZERO_STAGE]
    [--offload_optimizer_device OFFLOAD_OPTIMIZER_DEVICE]
    [--offload_param_device OFFLOAD_PARAM_DEVICE]
    [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
    [--gradient_clipping GRADIENT_CLIPPING]
    [--zero3_init_flag ZERO3_INIT_FLAG]
    [--zero3_save_16bit_model ZERO3_SAVE_16BIT_MODEL]
    [--deepspeed_hostfile DEEPSPEED_HOSTFILE]
    [--deepspeed_exclusion_filter DEEPSPEED_EXCLUSION_FILTER]
    [--deepspeed_inclusion_filter DEEPSPEED_INCLUSION_FILTER]
    [--deepspeed_multinode_launcher DEEPSPEED_MULTINODE_LAUNCHER]
    [--fsdp_offload_params FSDP_OFFLOAD_PARAMS]
    [--fsdp_min_num_params FSDP_MIN_NUM_PARAMS]
    [--fsdp_sharding_strategy FSDP_SHARDING_STRATEGY]
    [--fsdp_auto_wrap_policy FSDP_AUTO_WRAP_POLICY]
    [--fsdp_transformer_layer_cls_to_wrap FSDP_TRANSFORMER_LAYER_CLS_TO_WRAP]
    [--fsdp_backward_prefetch_policy FSDP_BACKWARD_PREFETCH_POLICY]
    [--fsdp_state_dict_type FSDP_STATE_DICT_TYPE]
    [--megatron_lm_tp_degree MEGATRON_LM_TP_DEGREE]
    [--megatron_lm_pp_degree MEGATRON_LM_PP_DEGREE]
    [--megatron_lm_num_micro_batches MEGATRON_LM_NUM_MICRO_BATCHES]
    [--megatron_lm_sequence_parallelism MEGATRON_LM_SEQUENCE_PARALLELISM]
    [--megatron_lm_recompute_activations MEGATRON_LM_RECOMPUTE_ACTIVATIONS]
    [--megatron_lm_use_distributed_optimizer MEGATRON_LM_USE_DISTRIBUTED_OPTIMIZER]
    [--megatron_lm_gradient_clipping MEGATRON_LM_GRADIENT_CLIPPING]
    [--aws_access_key_id AWS_ACCESS_KEY_ID]
    [--aws_secret_access_key AWS_SECRET_ACCESS_KEY]
    [--debug]
    training_script
    ...
    accelerate [] launch: error: argument --mixed_precision: invalid choice: '' (choose from 'no', 'fp16', 'bf16')
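The final line shows argparse rejecting an empty string for --mixed_precision, i.e. the launch cell passed the flag with no value. A minimal sketch (hypothetical, not the notebook's or accelerate's actual code) reproduces the failure mode:

```python
# Sketch: why passing `--mixed_precision=` (empty value) makes the launcher
# exit with status 2. The parser below is an assumption for illustration,
# mirroring the `choices` constraint visible in the usage output above.
import argparse

parser = argparse.ArgumentParser(prog="accelerate launch")
parser.add_argument("--mixed_precision", choices=["no", "fp16", "bf16"])

exit_code = None
try:
    # `--mixed_precision=` supplies the empty string '', which is not one of
    # the allowed choices, so argparse prints the error and exits with 2.
    parser.parse_args(["--mixed_precision="])
except SystemExit as err:
    exit_code = err.code

print(f"argparse rejected '' with exit status {exit_code}")
```

This matches the "invalid choice: ''" message: the fix is to make sure the flag carries one of the allowed values (or is omitted entirely), not an empty string.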

ken2190 commented Apr 16, 2023

Setting --mixed_precision="fp16" gives another error:

CalledProcessError: Command '['/usr/bin/python3',
'/content/Dreambooth/train.py', '--lora_rank=',
'--pretrained_model_name_or_path=', '--pretrained_vae_name_or_path=',
'--instance_data_dir=', '--class_data_dir=', '--output_dir=', '--logging_dir=',
'--prior_loss_weight=', '--instance_prompt=', '--class_prompt=',
'--conditioning_dropout_prob=', '--unconditional_prompt=', '--seed=',
'--resolution=', '--train_batch_size=', '--gradient_accumulation_steps=',
'--mixed_precision=', '--adam_beta1=', '--adam_beta2=', '--adam_weight_decay=',
'--adam_epsilon=', '--learning_rate=', '--learning_rate_text=',
'--lr_scheduler=', '--lr_warmup_steps=', '--lr_cosine_num_cycles=',
'--ema_inv_gamma=', '--ema_power=', '--ema_min_value=', '--ema_max_value=',
'--max_train_steps=', '--num_class_images=', '--sample_batch_size=',
'--save_min_steps=', '--save_interval=', '--n_save_sample=',
'--save_sample_prompt=', '--save_sample_negative_prompt=']' returned non-zero
exit status 2.
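Note that every flag in the failing command carries an empty value ('--lora_rank=', '--instance_prompt=', and so on), which suggests the values from the "Training parameters" cell never reached the command the launch cell builds. A hedged sketch of one way a notebook could guard against this when assembling such a command (the dictionary, flag names, and values here are assumptions for illustration, not the notebook's actual code):

```python
# Sketch: drop empty-valued flags before launching, so the training script's
# argparse never sees tokens like '--train_batch_size='. The params dict is
# a stand-in for values collected from a parameters cell.
import shlex

params = {
    "--pretrained_model_name_or_path": "",  # empty: parameters cell not run
    "--mixed_precision": "fp16",
    "--train_batch_size": "1",
}

# Keep only flags whose value is non-empty; empty values would be rejected
# by the script's argparse with exit status 2, as in the traceback above.
args = [f"{flag}={value}" for flag, value in params.items() if value]
print(shlex.join(args))
```

A filter like this only masks the symptom, though: if required values are empty, the real fix is to run the earlier cells that populate them.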


xam74er1 commented Jul 9, 2023

I also had this issue. Be sure to run all the previous steps, including the experimental step.

@theodorhar

I had a very similar error, which also resolved once I did not skip past the Experimental steps.
