Current Behavior
The training log reports that the parameters were saved successfully to pytorch_model.bin, but no model weight file can be found in the output folder, and running inference.py fails with an error.
Expected Behavior
No response
Steps To Reproduce
train.sh:
PRE_SEQ_LEN=128
LR=2e-2
NUM_GPUS=1
CUDA_VISIBLE_DEVICES=1 python main.py \
    --do_train \
    --train_file /home/ns/chatbot/ChatGLM2-6B/ptuning/Chinese-medical-dialogue-data-master/train.json \
    --validation_file /home/ns/chatbot/ChatGLM2-6B/ptuning/Chinese-medical-dialogue-data-master/dev.json \
    --preprocessing_num_workers 10 \
    --prompt_column context \
    --response_column target \
    --overwrite_cache \
    --model_name_or_path /home/ns/chatbot/ChatGLM2-6B/chatglm2-6b \
    --output_dir output/adgen-chatglm2-6b-pt-$PRE_SEQ_LEN-$LR \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 128 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 3000 \
    --logging_steps 10 \
    --save_steps 10 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4
Environment
- OS: win11
- Python: 3.8.16
- Transformers: 4.36.2
- PyTorch: 2.1.2
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True
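One way to narrow this down is to list what each checkpoint folder under `--output_dir` actually contains. The sketch below is only an illustration: the helper name `find_weight_files` is hypothetical, and the two file names it looks for are assumptions (newer Transformers releases may write `model.safetensors` instead of `pytorch_model.bin`, which has not been confirmed for this run).

```python
from pathlib import Path

# File names to look for. Both are assumptions about what the training run
# may have written; the report only mentions pytorch_model.bin.
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def find_weight_files(output_dir):
    """Map each checkpoint-* subdirectory to the weight files found in it."""
    root = Path(output_dir)
    if not root.is_dir():
        return {}
    found = {}
    for ckpt in sorted(root.glob("checkpoint-*")):
        if ckpt.is_dir():
            found[ckpt.name] = [n for n in WEIGHT_FILES if (ckpt / n).is_file()]
    return found


if __name__ == "__main__":
    # --output_dir from train.sh with PRE_SEQ_LEN=128 and LR=2e-2 substituted.
    results = find_weight_files("output/adgen-chatglm2-6b-pt-128-2e-2")
    for name, files in results.items():
        print(name, files or "no weight file found")
```

If a checkpoint folder lists only `model.safetensors`, the weights may have been saved under a different name than the one inference.py tries to load.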
transformers==4.30.2
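The reply above appears to suggest pinning Transformers to 4.30.2, whereas the environment in the report lists 4.36.2. A minimal sketch for checking which version is installed, using only the standard library (`installed_version` and `matches_pin` are hypothetical helper names):

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """True if the installed version equals the pinned version exactly."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    print("transformers installed:", installed_version("transformers"))
    print("matches 4.30.2:", matches_pin("transformers", "4.30.2"))
```

If the versions differ, `pip install transformers==4.30.2` installs the pinned release.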