
ValueError Reason #30570

Closed
2 of 4 tasks
ananegru opened this issue Apr 30, 2024 · 10 comments · Fixed by #30580

Comments

@ananegru

System Info

  • transformers version: 4.41.0.dev0
  • Platform: Linux-4.18.0-372.57.1.el8_6.x86_64-x86_64-with-glibc2.28
  • Python version: 3.9.18
  • Huggingface_hub version: 0.22.2
  • Safetensors version: 0.4.3
  • Accelerate version: 0.29.3
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.2+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I'm using the run_qa.py script, which uses the HF transformers Trainer, from the following repository to fine-tune the large language model Falcon on the SQuAD dataset:

https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering

The following parameters are being applied, also taken from the same repository linked above:

python run_qa.py
--model_name_or_path tiiuae/falcon-7b
--dataset_name squad
--do_train
--do_eval
--per_device_train_batch_size 12
--learning_rate 3e-5
--num_train_epochs 2
--max_seq_length 384
--doc_stride 128
--output_dir /home/anegru/Test_Folder/Unqover/unqover/fine_tuning_output

And I'm running into the following value error that I would like some help with solving:

Traceback (most recent call last):
File "/gpfs/home3/anegru/Test_Folder/Unqover/unqover/fine_tuning/fine_tune.py", line 725, in
main()
File "/gpfs/home3/anegru/Test_Folder/Unqover/unqover/fine_tuning/fine_tune.py", line 491, in main
train_dataset = train_dataset.map(
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 602, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 567, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3156, in map
for rank, done, content in Dataset._map_single(**dataset_kwargs):
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3547, in _map_single
batch = apply_function_on_filtered_inputs(
File "/home/anegru/anaconda3/envs/py39/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3416, in apply_function_on_filtered_inputs
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
File "/gpfs/home3/anegru/Test_Folder/Unqover/unqover/fine_tuning/fine_tune.py", line 438, in prepare_train_features
cls_index = input_ids.index(tokenizer.cls_token_id)
ValueError: None is not in list

Expected behavior

Save the trained Falcon model on the SQuAD dataset to a folder

@amyeroberts
Collaborator

Hi @ananegru, thanks for opening this issue!

The reason this error is being raised is that the tiiuae/falcon-7b tokenizer doesn't have a cls_token set. The example uses this token to indicate when a question isn't answerable, e.g. when there is no answer or the answer falls outside the current span.

To get this example to work for this model, I'd suggest either setting the tokenizer's cls_token to an equivalent token that can represent this, or adapting the script to filter out these problematic examples.

cc @Rocketknight1
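The failure mode above can be reproduced without transformers at all. This minimal sketch (with hypothetical token ids) shows why `list.index` raises the error, and notes the tokenizer workaround suggested above as a comment:

```python
# Why the script fails: falcon-7b's tokenizer defines no CLS token, so
# tokenizer.cls_token_id is None, and list.index(None) raises ValueError.
cls_token_id = None            # what tokenizer.cls_token_id returns for falcon-7b
input_ids = [5, 17, 42, 9]     # hypothetical token ids for one training feature

try:
    cls_index = input_ids.index(cls_token_id)
except ValueError as err:
    print(err)  # -> None is not in list

# The remedy suggested above: reuse an existing special token as the CLS token
# before preprocessing, e.g. (assuming the checkpoint defines an EOS token):
#   tokenizer.cls_token = tokenizer.eos_token
```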

@Rocketknight1
Member

Yes, this script is rather outdated - with modern models, it is more common to do QA by just providing the text and asking an instruct model directly!

We can probably work around this by just setting the value to 0 (and emitting an 'empty' answer at the start of the sequence) when no CLS token is present, since the CLS token location is only used to create 'dummy' spans for impossible answers. I'll make a PR now!
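The workaround described above can be sketched as a guarded lookup. This is an illustration of the idea, not the exact diff from the PR:

```python
def cls_index_for(input_ids, cls_token_id):
    """Locate the CLS position used to anchor 'dummy' spans for impossible answers.

    Falls back to position 0 when the tokenizer defines no CLS token (or the
    token doesn't appear in this feature), mirroring the workaround above.
    """
    if cls_token_id is None or cls_token_id not in input_ids:
        return 0
    return input_ids.index(cls_token_id)

print(cls_index_for([5, 101, 42], 101))   # -> 1 (CLS token found)
print(cls_index_for([5, 17, 42], None))   # -> 0 (no CLS token: dummy span at start)
```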

@Rocketknight1
Member

Rocketknight1 commented Apr 30, 2024

@ananegru a PR is ready. Can you try it out and let us know if it works for you? You can install the PR branch with this command:

pip install --upgrade git+https://github.com/huggingface/transformers.git@fix_qa_example

Let us know if you can train Falcon after running it!

@ananegru
Author

ananegru commented May 1, 2024

Hi, thanks a lot for the help! However, when I try running the command you sent above I get the following error:

Collecting git+https://github.com/huggingface/transformers.git@fix_qa_example
Cloning https://github.com/huggingface/transformers.git (to revision fix_qa_example) to /gpfs/scratch1/nodespecific/int4/anegru/pip-req-build-ji9_wocz
Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /gpfs/scratch1/nodespecific/int4/anegru/pip-req-build-ji9_wocz
WARNING: Did not find branch or tag 'fix_qa_example', assuming revision or ref.
Running command git checkout -q fix_qa_example
error: pathspec 'fix_qa_example' did not match any file(s) known to git
error: subprocess-exited-with-error

× git checkout -q fix_qa_example did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

@amyeroberts
Collaborator

@ananegru I think this is because the branch was deleted after merge. The commit is now on main - you can get it by installing from source: pip install git+https://github.com/huggingface/transformers

@ananegru
Author

ananegru commented May 1, 2024

Unfortunately, the same ValueError for the CLS token is persisting, even after running the command you sent above. Is there anything else that might be done to fix it?

@amyeroberts
Collaborator

@ananegru Is this occurring when using the falcon checkpoint? Unfortunately, it's not possible to use this model here - the QA script is designed for MLM (encoder) models, whereas falcon is a CLM (decoder-only).

@ananegru
Author

ananegru commented May 3, 2024

Ah yes, it's occurring with the falcon checkpoint. Okay, so I just can't use this script at all for fine-tuning Falcon? Do you perhaps know of any other scripts to fine-tune a CLM for QA?

@amyeroberts
Collaborator

There isn't such a script currently in the library; however, as there's a FalconForQuestionAnswering head, this should probably be supported. cc @Rocketknight1

@Rocketknight1
Member

I'll investigate! But also @ananegru, is there a reason you specifically want a CLM for this kind of span-extraction task? The most common approaches for question answering in 2024 are:

  • Use a masked language model to extract spans from the input text
  • Use a chat language model, give it the text, and directly ask it questions

The second option is harder to fine-tune for, but the base accuracy will be very high if you use a state-of-the-art chat model like LLaMA-3, DBRX, Mixtral or Command-R.
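For the second option, the core of the approach is just prompt construction. This sketch uses a hypothetical plain-text prompt format; a real chat model would wrap it in its own chat template (e.g. via tokenizer.apply_chat_template in transformers):

```python
def build_qa_prompt(context: str, question: str) -> str:
    # Hypothetical prompt layout; each chat model family defines its own
    # template, so in practice use the tokenizer's chat-template machinery.
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_qa_prompt(
    "Falcon was released by TII in 2023.",
    "Who released Falcon?",
)
print(prompt)
```

The completion the chat model generates after "Answer:" is then the extracted answer, with no span-labeling or CLS-token bookkeeping involved.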
