
chatglm2 fails to load when my transformers version is 4.36.2 #651

Open
1 task done
Congcong-Song opened this issue Jan 5, 2024 · 2 comments


@Congcong-Song

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

The error is as follows:
Traceback (most recent call last):
File "/home/inspur/scc/gpt/LLaMA-Factory/src/train_bash.py", line 14, in
main()
File "/home/inspur/scc/gpt/LLaMA-Factory/src/train_bash.py", line 5, in main
run_exp()
File "/home/inspur/scc/gpt/LLaMA-Factory/src/llmtuner/train/tuner.py", line 26, in run_exp
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "/home/inspur/scc/gpt/LLaMA-Factory/src/llmtuner/train/sft/workflow.py", line 29, in run_sft
model, tokenizer = load_model_and_tokenizer(model_args, finetuning_args, training_args.do_train)
File "/home/inspur/scc/gpt/LLaMA-Factory/src/llmtuner/model/loader.py", line 49, in load_model_and_tokenizer
tokenizer = AutoTokenizer.from_pretrained(
File "/home/inspur/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 774, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/inspur/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2028, in from_pretrained
return cls._from_pretrained(
File "/home/inspur/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2260, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/inspur/.cache/huggingface/modules/transformers_modules/chatglm2-6b/tokenization_chatglm.py", line 69, in init
super().init(padding_side=padding_side, **kwargs)
File "/home/inspur/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 367, in init
self._add_tokens(
File "/home/inspur/anaconda3/envs/llama/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
current_vocab = self.get_vocab().copy()
File "/home/inspur/.cache/huggingface/modules/transformers_modules/chatglm2-6b/tokenization_chatglm.py", line 108, in get_vocab
vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
File "/home/inspur/.cache/huggingface/modules/transformers_modules/chatglm2-6b/tokenization_chatglm.py", line 104, in vocab_size
return self.tokenizer.n_words
AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'. Did you mean: 'tokenize'?

Loading only works after downgrading transformers, but downgrading then causes other problems.
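
For context: starting around transformers 4.34, `PreTrainedTokenizer.__init__` registers special tokens eagerly and calls `self.get_vocab()` before the subclass constructor finishes, which is what the traceback shows. A minimal sketch of the failure pattern, using a hypothetical toy class rather than the actual ChatGLM code:

```python
from transformers import PreTrainedTokenizer

# Hypothetical minimal reproduction of the ordering bug (not the real
# ChatGLM code): transformers >= 4.34 calls self._add_tokens() ->
# self.get_vocab() from inside the base-class __init__, before the
# subclass has assigned the attribute that get_vocab() depends on.
class ToyTokenizer(PreTrainedTokenizer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)     # get_vocab() is invoked in here ...
        self.tokenizer = {"<pad>": 0}  # ... before this assignment runs

    @property
    def vocab_size(self):
        return len(self.tokenizer)

    def get_vocab(self):
        return dict(self.tokenizer)

# Under transformers 4.36.2 this raises:
# AttributeError: 'ToyTokenizer' object has no attribute 'tokenizer'
ToyTokenizer()
```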

Expected Behavior

No response

Steps To Reproduce

My command:
CUDA_VISIBLE_DEVICES=5 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path /path/THUDM/chatglm2-6b \
    --dataset alpaca_gpt4_zh \
    --template chatglm2 \
    --finetuning_type lora \
    --lora_target query_key_value \
    --output_dir /path/chatglm2 \
    --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16

Environment

Environment: installed per requirements. Hardware: A100. Python: 3.10, Transformers: 4.36.2, PyTorch: 2.1.2

Anything else?

No response

@Gaojun123123

Download the latest model from Hugging Face.
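
If you'd rather not re-download the whole checkpoint, a sketch of fetching just the updated tokenizer files with huggingface_hub; the file list and local path here are assumptions, so adjust them to your setup (and note that the copy cached under ~/.cache/huggingface/modules/transformers_modules/ is what actually executes):

```python
from huggingface_hub import snapshot_download

# Pull only the tokenizer-related files from the latest revision.
# allow_patterns and local_dir are assumptions: point local_dir at the
# directory you pass to --model_name_or_path.
snapshot_download(
    repo_id="THUDM/chatglm2-6b",
    allow_patterns=["tokenization_chatglm.py", "tokenizer_config.json", "tokenizer.model"],
    local_dir="/path/THUDM/chatglm2-6b",
)
```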

@mawenju203

Just update tokenization_chatglm.py and it works.
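
For reference, the updated tokenization_chatglm.py reorders __init__ so the SentencePiece wrapper exists before the base constructor runs. A rough sketch of the relevant change (simplified from the ChatGLM2 repo; exact details may differ between revisions, and SPTokenizer is defined in the same file):

```python
class ChatGLMTokenizer(PreTrainedTokenizer):
    def __init__(self, vocab_file, padding_side="left",
                 clean_up_tokenization_spaces=False, **kwargs):
        self.name = "GLMTokenizer"
        self.vocab_file = vocab_file
        # Key change: create the SentencePiece wrapper BEFORE calling
        # super().__init__(), because newer transformers versions call
        # self.get_vocab() from inside the base constructor.
        self.tokenizer = SPTokenizer(vocab_file)
        self.special_tokens = {
            "<bos>": self.tokenizer.bos_id,
            "<eos>": self.tokenizer.eos_id,
            "<pad>": self.tokenizer.pad_id,
        }
        super().__init__(
            padding_side=padding_side,
            clean_up_tokenization_spaces=clean_up_tokenization_spaces,
            **kwargs,
        )
```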
