
[BUG/Help] pytorch_model.bin weights are not saved when running P-tuning v2 on ChatGLM-6B #1447

Open
SKURA502 opened this issue Jan 13, 2024 · 1 comment

@SKURA502
Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

The training log reports that pytorch_model.bin was saved successfully:
[screenshot: training log showing the save step]
But the weight file cannot be found in the checkpoint directory:
[screenshot: checkpoint directory listing without pytorch_model.bin]
Running inference.py then raises an error:
[screenshot: inference.py error traceback]
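
For reference, the ptuning inference code typically loads the trained prefix encoder from pytorch_model.bin inside the checkpoint directory, which is why the missing file fails at load time. A minimal sketch of that loading pattern, assuming the standard ptuning README layout; the checkpoint step number below is hypothetical:

# Sketch of the usual P-tuning v2 inference loading pattern (paths are assumptions).
import os
import torch
from transformers import AutoConfig, AutoModel

MODEL_PATH = "/home/ns/chatbot/ChatGLM2-6B/chatglm2-6b"
CHECKPOINT_PATH = "output/adgen-chatglm2-6b-pt-128-2e-2/checkpoint-3000"  # hypothetical step

config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained(MODEL_PATH, config=config, trust_remote_code=True)

# This torch.load is the step that fails when pytorch_model.bin is absent:
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)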

Expected Behavior

No response

Steps To Reproduce

train.sh:
PRE_SEQ_LEN=128
LR=2e-2
NUM_GPUS=1

CUDA_VISIBLE_DEVICES=1 python main.py \
    --do_train \
    --train_file /home/ns/chatbot/ChatGLM2-6B/ptuning/Chinese-medical-dialogue-data-master/train.json \
    --validation_file /home/ns/chatbot/ChatGLM2-6B/ptuning/Chinese-medical-dialogue-data-master/dev.json \
    --preprocessing_num_workers 10 \
    --prompt_column context \
    --response_column target \
    --overwrite_cache \
    --model_name_or_path /home/ns/chatbot/ChatGLM2-6B/chatglm2-6b \
    --output_dir output/adgen-chatglm2-6b-pt-$PRE_SEQ_LEN-$LR \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 128 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 3000 \
    --logging_steps 10 \
    --save_steps 10 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4

Environment

- OS: Win11
- Python: 3.8.16
- Transformers: 4.36.2
- PyTorch: 2.1.2
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True

Anything else?

No response

@annpion

annpion commented Jan 18, 2024

transformers==4.30.2
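
For context: with transformers 4.36.x the Trainer saves checkpoints in safetensors format by default, so the save step likely wrote model.safetensors instead of pytorch_model.bin, which would explain why the log reports a successful save but the .bin file is missing. Downgrading to transformers==4.30.2 as suggested should restore the old file name. Alternatively, a hedged sketch of reading the safetensors file directly (checkpoint path is hypothetical; "model" refers to a ChatGLM2-6B model loaded with pre_seq_len set, as in the snippet in the issue body):

# Sketch: read the prefix encoder weights from model.safetensors instead of
# pytorch_model.bin (assumes the default safetensors checkpoint layout).
import os
from safetensors.torch import load_file

CHECKPOINT_PATH = "output/adgen-chatglm2-6b-pt-128-2e-2/checkpoint-3000"  # hypothetical path
prefix_state_dict = load_file(os.path.join(CHECKPOINT_PATH, "model.safetensors"))
new_prefix_state_dict = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)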
