Mixture-of-Depths Finetune, IndexError: too many indices for tensor of dimension 0 #3662
Labels: pending (this problem is yet to be addressed)
AlexYoung757 changed the title from "mod finetune, IndexError: too many indices for tensor of dimension 0" to "Mixture-of-Depths Finetune, IndexError: too many indices for tensor of dimension 0" on May 10, 2024.
Reminder
Reproduction
```bash
MASTER_PORT=$(shuf -n 1 -i 10000-65535)
DEEPSPEED_PATH=../config/ds_config_sft_z2_offload.json
MODEL_PATH="/your path/Meta-Llama-3-8B"
OUTPUT_PATH=../output/llama3-8b-mod-sft
LOG_PATH=../logs/result_mod_sft.log

nohup deepspeed --num_gpus=4 --master_port $MASTER_PORT ../src/train_bash.py \
    --deepspeed $DEEPSPEED_PATH \
    --stage sft \
    --do_train \
    --model_name_or_path "$MODEL_PATH" \
    --dataset_dir ../data \
    --dataset oaast_sft_zh \
    --template llama3 \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --mixture_of_depths convert \
    --output_dir $OUTPUT_PATH \
    --overwrite_cache \
    --overwrite_output_dir \
    --cutoff_len 1024 \
    --preprocessing_num_workers 16 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --optim paged_adamw_8bit \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --warmup_steps 20 \
    --save_steps 100 \
    --eval_steps 100 \
    --evaluation_strategy steps \
    --load_best_model_at_end \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --max_samples 3000 \
    --val_size 0.1 \
    --plot_loss \
    --fp16 \
    --flash_attn fa2
```
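For context, `--mixture_of_depths convert` presumably rebuilds the base model's decoder layers as Mixture-of-Depths style blocks, which route only a fixed capacity of tokens per sequence through each block. Below is a rough, self-contained PyTorch sketch of that routing pattern; it is not LLaMA-Factory's actual code, and the names (`router`, `capacity`) are made up for illustration.

```python
import torch
import torch.nn as nn

# Illustrative only: a simplified Mixture-of-Depths routing step.
batch, seq_len, hidden = 2, 8, 16
capacity = 4                                   # tokens per sequence that pass through the block

hidden_states = torch.randn(batch, seq_len, hidden)
router = nn.Linear(hidden, 1)                  # per-token routing score

scores = router(hidden_states).squeeze(-1)     # (batch, seq_len)
topk = torch.topk(scores, k=capacity, dim=-1)  # keep the capacity highest-scoring tokens
idx = topk.indices                             # (batch, capacity)

# Gather the selected tokens; unselected tokens would skip the block via the residual path.
selected = torch.gather(
    hidden_states, 1, idx.unsqueeze(-1).expand(-1, -1, hidden)
)                                              # (batch, capacity, hidden)
print(selected.shape)
```

In the MoD paper this selection happens per block, with non-selected tokens passed along the residual connection unchanged, so there is a fair amount of index-based gathering in such layers.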
Expected behavior
I get an error: IndexError: too many indices for tensor of dimension 0. What could be the reason for this?
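For reference, PyTorch raises this IndexError whenever a tensor is indexed with more index dimensions than it actually has. A minimal, generic reproduction, unrelated to the project's code, looks like this:

```python
import torch

# A 0-dim (scalar) tensor, e.g. a per-sample value that unexpectedly lost its batch/sequence dims.
scalar = torch.tensor(1.0)
indices = torch.tensor([0])

try:
    scalar[indices]            # indexing a 0-dim tensor with token/batch indices
except IndexError as err:
    print(err)                 # too many indices for tensor of dimension 0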
System Info
Others