I used an RTX 2060 graphics card and got the error "Feature 'cvt with .f32.bf16' requires .target sm_80 or higher". #434
Comments
I get the same error when I run this on a V100. I thought setting bf16 to false should solve this.

import os
max_seq_length = 1024
# Load Llama3 model
model, tokenizer = FastLanguageModel.from_pretrained(
# Model patching and add fast LoRA weights and training
model = FastLanguageModel.get_peft_model(
trainer = SFTTrainer(
# Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
trainer_stats = trainer.train()
# Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
# Save the model
model.save_pretrained("llama3_lora_model")
# Save to 8bit Q8_0 and q4
model.save_pretrained_gguf("llama3_model_q8", tokenizer,)

Error:
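Since the ptxas messages in this issue say that bf16 instructions need sm_80 or higher, one option is to derive the precision flags from the GPU's compute capability instead of setting them by hand. The following is a minimal sketch, not Unsloth's own detection logic: `precision_flags` is a hypothetical helper, and the `fp16`/`bf16` keys are meant to be passed to `transformers.TrainingArguments`. Whether this alone avoids the error is not confirmed in this thread, because the log in the issue reports Bfloat16 = TRUE on a compute capability 7.5 card.

```python
def precision_flags(major: int, minor: int) -> dict:
    """Pick mixed-precision flags from a CUDA compute capability.

    Hypothetical helper: bf16 is chosen only for sm_80 or newer, which is the
    target the ptxas errors in this issue ask for; older GPUs fall back to fp16.
    """
    has_native_bf16 = (major, minor) >= (8, 0)
    return {"fp16": not has_native_bf16, "bf16": has_native_bf16}


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # Ask the driver for the capability of the first visible GPU.
        major, minor = torch.cuda.get_device_capability(0)
        print(f"sm_{major}{minor}:", precision_flags(major, minor))
    else:
        # No GPU visible: show the two cards mentioned in this thread instead.
        print("RTX 2060 (sm_75):", precision_flags(7, 5))
        print("V100 (sm_70):", precision_flags(7, 0))
```

The returned dictionary can be unpacked into the training arguments, for example `TrainingArguments(..., **precision_flags(major, minor))`.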
I had the exact same problem using torch 2.3.0. As you said, even setting the flag for bf16 to false did not help. I resolved the issue by downgrading to torch 2.2.0 and installing the
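To see quickly which side of this report an environment falls on, the installed torch version can be compared with the two versions named in this comment. This is only a sketch: `classify_torch_version` is a hypothetical helper, and it encodes nothing beyond what was reported here (2.3.0 failing, 2.2.0 working). Any other version is returned as unknown.

```python
import re

# Versions as reported in this thread only; nothing else is implied.
REPORTED = {(2, 3, 0): "reported failing", (2, 2, 0): "reported working"}


def classify_torch_version(version: str) -> str:
    """Map a torch version string such as '2.3.0+cu121' to the thread's report."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        return "unknown"
    key = tuple(int(part) for part in match.groups())
    return REPORTED.get(key, "unknown")


if __name__ == "__main__":
    try:
        import torch
        print(torch.__version__, "->", classify_torch_version(torch.__version__))
    except ImportError:
        # torch is not installed here: classify the version from the issue log.
        print("2.3.0 ->", classify_torch_version("2.3.0"))
```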
==((====))== Unsloth: Fast Llama patching release 2024.4
\ /| GPU: NVIDIA GeForce RTX 2060 SUPER. Max memory: 7.785 GB. Platform = Linux.
O^O/ _/ \ Pytorch: 2.3.0. CUDA = 7.5. CUDA Toolkit = 12.1.
\ / Bfloat16 = TRUE. Xformers = 0.0.26.post1. FA = False.
"-____-" Free Apache license: http://github.com/unslothai/unsloth
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 4/4 [00:09<00:00, 2.40s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/home/chuhaitong/yangce/Meta-Llama-3-8B-Instruct does not have a padding or unknown token!
Will use the EOS token of id 128001 as padding.
Unsloth 2024.4 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.
True
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
max_steps is given, it will override any value given in num_train_epochs
==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1
\ /| Num examples = 1 | Num Epochs = 60
O^O/ _/ \ Batch size per device = 2 | Gradient Accumulation steps = 4
\ / Total batch size = 8 | Total steps = 60
"-____-" Number of trainable parameters = 41,943,040
0%| | 0/60 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/chuhaitong/yangce/app.py", line 114, in <module>
trainer_stats = trainer.train()
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 361, in train
output = super().train(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 1859, in train
return inner_training_loop(
File "<string>", line 361, in _fast_inner_training_loop
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 3138, in training_step
loss = self.compute_loss(model, inputs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/transformers/trainer.py", line 3161, in compute_loss
outputs = model(**inputs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 822, in forward
return model_forward(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/utils/operations.py", line 810, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
return func(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 882, in PeftModelForCausalLM_fast_forward
return self.base_model(
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 161, in forward
return self.model.forward(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 813, in _CausalLM_fast_forward
outputs = self.model(
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 650, in LlamaModel_fast_forward
hidden_states = Unsloth_Offloaded_Gradient_Checkpointer.apply(
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/autograd/function.py", line 598, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 115, in decorate_fwd
return fwd(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/_utils.py", line 333, in forward
(output,) = forward_function(hidden_states, *args)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/models/llama.py", line 432, in LlamaDecoderLayer_fast_forward
hidden_states = fast_rms_layernorm(self.input_layernorm, hidden_states)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/rms_layernorm.py", line 190, in fast_rms_layernorm
out = Fast_RMS_Layernorm.apply(X, W, eps, gemma)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/torch/autograd/function.py", line 598, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/unsloth/kernels/rms_layernorm.py", line 144, in forward
fx[(n_rows,)](
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/runtime/jit.py", line 167, in <lambda>
return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/runtime/jit.py", line 416, in run
self.cache[device][key] = compile(
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/compiler/compiler.py", line 193, in compile
next_module = compile_ir(module, metadata)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/compiler/backends/cuda.py", line 201, in <lambda>
stages["cubin"] = lambda src, metadata: self.make_cubin(src, metadata, options, self.capability)
File "/home/chuhaitong/anaconda3/envs/unsloth_env/lib/python3.10/site-packages/triton/compiler/backends/cuda.py", line 194, in make_cubin
return compile_ptx_to_cubin(src, ptxas, capability, opt.enable_fp_fusion)
RuntimeError: Internal Triton PTX codegen error:
ptxas /tmp/compile-ptx-src-d2fe88, line 100; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 100; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 102; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 102; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 104; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 104; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 106; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 106; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 108; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 108; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 110; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 110; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 112; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 112; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 114; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 114; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 116; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 116; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 118; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 118; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 120; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 120; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 122; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 122; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 124; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 124; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 126; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 126; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 128; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 128; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 130; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 130; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 316; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 316; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 318; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 318; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 320; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 320; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 322; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 322; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 324; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 324; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 326; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 326; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 328; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 328; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 330; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 330; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 332; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 332; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 334; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 334; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 336; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 336; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 338; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 338; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 340; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 340; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 342; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 342; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 344; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 344; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 346; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 346; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 350; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 350; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 354; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 354; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 358; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 358; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 362; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 362; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 366; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 366; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 370; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 370; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 374; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 374; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 378; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 378; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 382; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 382; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 386; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 386; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 390; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 390; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 394; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 394; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 398; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 398; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 402; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 402; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 406; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 406; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 410; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 410; error : Feature '.bf16' requires .target sm_80 or higher
ptxas fatal : Ptx assembly aborted due to errors
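The dump above repeats the same few messages for many PTX lines, so it can help to collapse it into counts per feature before sharing it. The sketch below does that with a regular expression written against the message format shown in this log; it is not based on any documented ptxas output format, and `summarize_ptxas_errors` and `SAMPLE` are illustrative names.

```python
import re
from collections import Counter

# Matches lines like:
# ptxas /tmp/x, line 100; error : Feature '.bf16' requires .target sm_80 or higher
PTXAS_FEATURE_ERROR = re.compile(
    r"^ptxas\s+(?P<file>\S+), line (?P<line>\d+); error\s*: "
    r"Feature '(?P<feature>[^']+)' requires \.target (?P<target>sm_\d+) or higher"
)

# Three lines copied from the log in this issue, plus the final fatal line.
SAMPLE = """\
ptxas /tmp/compile-ptx-src-d2fe88, line 100; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 100; error : Feature 'cvt with .f32.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-d2fe88, line 316; error : Feature 'cvt.bf16.f32' requires .target sm_80 or higher
ptxas fatal : Ptx assembly aborted due to errors
"""


def summarize_ptxas_errors(log_text: str) -> Counter:
    """Count (feature, required target) pairs across a ptxas error dump."""
    counts = Counter()
    for raw_line in log_text.splitlines():
        match = PTXAS_FEATURE_ERROR.match(raw_line.strip())
        if match:
            counts[(match["feature"], match["target"])] += 1
    return counts


if __name__ == "__main__":
    for (feature, target), count in summarize_ptxas_errors(SAMPLE).items():
        print(f"{count} x Feature '{feature}' requires {target} or higher")
```

On the sample, three feature errors are counted and the final `ptxas fatal` line is ignored because it does not name a feature.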