[BUG/Help] <title>How should this ptuning exception be handled? #1445

Open
1 task done
brianzhangrong opened this issue Jan 11, 2024 · 2 comments

brianzhangrong commented Jan 11, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

ChatGLMTokenizer(name_or_path='THUDM/chatglm-6b', vocab_size=64794, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='left', truncation_side='right', special_tokens={})

Traceback (most recent call last):
  File "main.py", line 431, in <module>
    main()
  File "main.py", line 249, in main
    train_dataset = train_dataset.map(
  File "/usr/local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 592, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3093, in map
    for rank, done, content in Dataset._map_single(**dataset_kwargs):
  File "/usr/local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3470, in _map_single
    batch = apply_function_on_filtered_inputs(
  File "/usr/local/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 3349, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "main.py", line 220, in preprocess_function_train
    context_length = input_ids.index(tokenizer.bos_token_id)
ValueError: None is not in list
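
A note on the failing line: the message "None is not in list" means that `tokenizer.bos_token_id` evaluated to `None` when `input_ids.index(...)` was called, which fits the tokenizer printout above reporting `special_tokens={}`. Why the tokenizer has no BOS token is not established in this thread; one possibility, offered only as a guess, is that the preprocessing script and the loaded checkpoint or tokenizer files do not belong to the same ChatGLM release. The sketch below reproduces the error with plain Python lists and shows a guard that fails with a clearer message. The helper name `find_context_length` and the token ids are hypothetical and do not come from `main.py`.

```python
from typing import List, Optional


def find_context_length(input_ids: List[int], bos_token_id: Optional[int]) -> int:
    """Return the position of the BOS token, failing early with a clear message.

    Hypothetical helper for illustration; it is not part of the ptuning script.
    """
    if bos_token_id is None:
        raise ValueError(
            "tokenizer.bos_token_id is None: the loaded tokenizer defines no BOS "
            "token, so the script and the tokenizer files may not match"
        )
    return input_ids.index(bos_token_id)


# Made-up token ids, used only to reproduce the behavior.
input_ids = [5, 6, 7, 130004, 8, 9]

# Reproduce the original failure: searching a list of ints for None.
try:
    input_ids.index(None)
except ValueError as err:
    original_message = str(err)

# With a real BOS id present in the list, the lookup succeeds.
context_length = find_context_length(input_ids, 130004)
```

Printing `tokenizer.bos_token_id` right after the tokenizer is loaded would confirm whether it is `None` before the dataset mapping step runs.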

Model config ChatGLMConfig {
  "_name_or_path": "THUDM/chatglm-6b",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMModel"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
  },
  "bias_dropout_fusion": true,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1e-05,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_layers": 28,
  "original_rope": true,
  "pad_token_id": 0,
  "padded_vocab_size": 65024,
  "post_layer_norm": true,
  "pre_seq_len": null,
  "prefix_projection": false,
  "quantization_bit": 0,
  "rmsnorm": true,
  "seq_length": 32768,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.27.1",
  "use_cache": true,
  "vocab_size": 65024
}

Expected Behavior

No response

Steps To Reproduce

1. bash train.sh

Environment

- OS: CentOS
- Python: 3.8
- Transformers: 4.27.1
- PyTorch: 2.1.2
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True

Anything else?

No response

@HarryFunn

Hello, I also ran into the "ValueError: None is not in list" error. Have you solved it? Thanks!

@amauryjulien

Hello, is there any progress on this issue? I ran into the same bug.
