
ValueError Reason #30570

Closed
2 of 4 tasks
ananegru opened this issue Apr 30, 2024 · 10 comments · Fixed by #30580

Comments

@ananegru

System Info

  • transformers version: 4.41.0.dev0
  • Platform: Linux-4.18.0-372.57.1.el8_6.x86_64-x86_64-with-glibc2.28
  • Python version: 3.9.18
  • Huggingface_hub version: 0.22.2
  • Safetensors version: 0.4.3
  • Accelerate version: 0.29.3
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.2+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I'm using the run_qa.py script, which uses the HF transformers Trainer, from the following repository to fine-tune the large language model Falcon on the SQuAD dataset:

https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering

The following parameters are being applied, also taken from the same repository linked above:

python run_qa.py
--model_name_or_path tiiuae/falcon-7b
--dataset_name squad
--do_train
--do_eval
--per_device_train_batch_size 12
--learning_rate 3e-5
--num_train_epochs 2
--max_seq_length 384
--doc_stride 128
--output_dir /home/anegru/Test_Folder/Unqover/unqover/fine_tuning_output

And I'm running into the following value error that I would like some help with solving:

Traceback (most recent call last):
File "/gpfs/home3/anegru/Test_Folder/Unqover/unqover/fine_tuning/fine_tune.py", line 725, in
main()
File "/gpfs/home3/anegru/Test_Folder/Unqover/unqover/fine_tuning/fine_tune.py", line 491, in main
train_dataset = train_dataset.map(
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 602, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 567, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3156, in map
for rank, done, content in Dataset._map_single(**dataset_kwargs):
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3547, in _map_single
batch = apply_function_on_filtered_inputs(
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3416, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "/gpfs/home3/anegru/Test_Folder/Unqover/unqover/fine_tuning/fine_tune.py", line 438, in prepare_train_features
cls_index = input_ids.index(tokenizer.cls_token_id)
ValueError: None is not in list

Expected behavior

Save the trained Falcon model on the SQuAD dataset to a folder

@amyeroberts
Collaborator

Hi @ananegru, thanks for opening this issue!

The reason this error is being raised is that the tiiuae/falcon-7b tokenizer doesn't have a cls_token set. The example uses this token to indicate when a question isn't answerable, e.g. when there is no answer or the answer falls outside the current span.

To get this example to work for this model, I'd suggest either setting the tokenizer's cls_token to an equivalent token that can represent this, or adapting the script to filter out these problematic examples.

cc @Rocketknight1
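The failure mode above can be reproduced without transformers at all. This minimal sketch (with hypothetical token ids) shows why `list.index` raises the error, and notes the tokenizer workaround suggested above as a comment:

```python
# Why the script fails: falcon-7b's tokenizer defines no CLS token, so
# tokenizer.cls_token_id is None, and list.index(None) raises ValueError.
cls_token_id = None            # what tokenizer.cls_token_id returns for falcon-7b
input_ids = [5, 17, 42, 9]     # hypothetical token ids for one training feature

try:
    cls_index = input_ids.index(cls_token_id)
except ValueError as err:
    print(err)  # -> None is not in list

# The remedy suggested above: reuse an existing special token as the CLS token
# before preprocessing, e.g. (assuming the checkpoint defines an EOS token):
#   tokenizer.cls_token = tokenizer.eos_token
```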

@Rocketknight1
Member

Yes, this script is rather outdated - with modern models, it is more common to do QA by just providing the text and asking an instruct model directly!

We can probably work around this by just setting the value to 0 (and emitting an 'empty' answer at the start of the sequence) when no CLS token is present, since the CLS token location is only used to create 'dummy' spans for impossible answers. I'll make a PR now!
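The workaround described above can be sketched as a guarded lookup. This is an illustration of the idea, not the exact diff from the PR:

```python
def cls_index_for(input_ids, cls_token_id):
    """Locate the CLS position used to anchor 'dummy' spans for impossible answers.

    Falls back to position 0 when the tokenizer defines no CLS token (or the
    token doesn't appear in this feature), mirroring the workaround above.
    """
    if cls_token_id is None or cls_token_id not in input_ids:
        return 0
    return input_ids.index(cls_token_id)

print(cls_index_for([5, 101, 42], 101))   # -> 1 (CLS token found)
print(cls_index_for([5, 17, 42], None))   # -> 0 (no CLS token: dummy span at start)
```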

@Rocketknight1
Member

Rocketknight1 commented Apr 30, 2024

@ananegru a PR is ready. Can you try it out and let us know if it works for you? You can install the PR branch with this command:

pip install --upgrade git+https://github.com/huggingface/transformers.git@fix_qa_example

Let us know if you can train Falcon after running it!

@ananegru
Author

ananegru commented May 1, 2024

Hi, thanks a lot for the help! However, when I try running the command you sent above I get the following error:

Collecting git+https://github.com/huggingface/transformers.git@fix_qa_example
Cloning https://github.com/huggingface/transformers.git (to revision fix_qa_example) to /gpfs/scratch1/nodespecific/int4/anegru/pip-req-build-ji9_wocz
Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /gpfs/scratch1/nodespecific/int4/anegru/pip-req-build-ji9_wocz
WARNING: Did not find branch or tag 'fix_qa_example', assuming revision or ref.
Running command git checkout -q fix_qa_example
error: pathspec 'fix_qa_example' did not match any file(s) known to git
error: subprocess-exited-with-error

× git checkout -q fix_qa_example did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

@amyeroberts
Collaborator

@ananegru I think this is because the branch was deleted after merge. The commit is now on main - you can get it by installing from source: pip install git+https://github.com/huggingface/transformers

@ananegru
Author

ananegru commented May 1, 2024

Unfortunately, the same ValueError for the CLS token is persisting, even after running the command you sent above. Is there anything else that might be done to fix it?

@amyeroberts
Collaborator

@ananegru Is this occurring when using the falcon checkpoint? Unfortunately, it's not possible to use this model here - the QA script is designed for MLM (encoder) models, whereas falcon is a CLM (decoder-only).

@ananegru
Author

ananegru commented May 3, 2024

Ah yes, it's occurring with the falcon checkpoint. Okay, so I just can't use this script at all for fine-tuning Falcon? Do you perhaps know of any other scripts to fine-tune a CLM for QA?

@amyeroberts
Collaborator

There isn't such a script currently in the library; however, as there's a FalconForQuestionAnswering head, this should probably be supported. cc @Rocketknight1

@Rocketknight1
Member

I'll investigate! But also @ananegru, is there a reason you specifically want a CLM for this kind of span-extraction task? The most common approaches for question answering in 2024 are:

  • Use a masked language model to extract spans from the input text
  • Use a chat language model, give it the text, and directly ask it questions

The second option is harder to fine-tune for, but the base accuracy will be very high if you use a state-of-the-art chat model like LLaMA-3, DBRX, Mixtral or Command-R.
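For the second option, the core of the approach is just prompt construction. This sketch uses a hypothetical plain-text prompt format; a real chat model would wrap it in its own chat template (e.g. via tokenizer.apply_chat_template in transformers):

```python
def build_qa_prompt(context: str, question: str) -> str:
    # Hypothetical prompt layout; each chat model family defines its own
    # template, so in practice use the tokenizer's chat-template machinery.
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_qa_prompt(
    "Falcon was released by TII in 2023.",
    "Who released Falcon?",
)
print(prompt)
```

The completion the chat model generates after "Answer:" is then the extracted answer, with no span-labeling or CLS-token bookkeeping involved.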
