
LoRA adapter from local model leads to error #1893

Open · 2 of 4 tasks
philschmid opened this issue May 14, 2024 · 4 comments
Comments

@philschmid (Member)

System Info

ghcr.io/huggingface/text-generation-inference:2.0.2

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

  1. Download the LoRA model to local disk:
huggingface-cli download alignment-handbook/zephyr-7b-sft-qlora --exclude "*.bin" "*.pth" "*.gguf" --local-dir ./tmp
rm tmp/config.json
  2. Try to start TGI:
docker run --gpus all -ti -p 8080:8080 \
  -e MODEL_ID="/opt/ml/models" \
  -e HUGGING_FACE_HUB_TOKEN=$(cat ~/.cache/huggingface/token) \
  -v $(pwd)/tmp:/opt/ml/models \
   ghcr.io/huggingface/text-generation-inference:2.0.2

Expected behavior

TGI should load the PEFT model from the local disk and then the base model defined in adapter_config.json from the Hugging Face Hub.
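For reference, the intended flow in plain PEFT/Transformers code looks roughly like this (a minimal sketch, assuming a peft version without the bug below; the adapter path mirrors the mount from the repro):

from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

adapter_path = "/opt/ml/models"  # local LoRA adapter directory (the mounted volume)

# Read the adapter config from the local directory.
peft_config = PeftConfig.from_pretrained(adapter_path)

# Fetch the base model named in adapter_config.json from the Hub.
base_model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)

# Attach the local adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, adapter_path)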

Error

2024-05-14T15:13:38.940561Z ERROR download: text_generation_launcher: Download encountered an error: 
Traceback (most recent call last):

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/peft.py", line 15, in download_and_unload_peft
    model = AutoPeftModelForCausalLM.from_pretrained(

  File "/opt/conda/lib/python3.10/site-packages/peft/auto.py", line 128, in from_pretrained
    return cls._target_peft_class.from_pretrained(

  File "/opt/conda/lib/python3.10/site-packages/peft/peft_model.py", line 356, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)

  File "/opt/conda/lib/python3.10/site-packages/peft/peft_model.py", line 727, in load_adapter
    adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)

  File "/opt/conda/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 297, in load_peft_weights
    has_remote_safetensors_file = file_exists(

  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)

  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
    raise HFValidationError(

huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/opt/ml/models'. Use `repo_type` argument if needed.


During handling of the above exception, another exception occurred:
Traceback (most recent call last):

  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 226, in download_weights
    utils.download_and_unload_peft(

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/peft.py", line 23, in download_and_unload_peft
    model = AutoPeftModelForSeq2SeqLM.from_pretrained(

  File "/opt/conda/lib/python3.10/site-packages/peft/auto.py", line 88, in from_pretrained
    raise ValueError(

ValueError: Expected target PEFT class: PeftModelForCausalLM, but you have asked for: PeftModelForSeq2SeqLM make sure that you are loading the correct model for your task type.

Error: DownloadError
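The HFValidationError in the first traceback is the root cause: load_peft_weights falls through to a remote file_exists check and hands the local mount path to huggingface_hub's repo-id validator. The trailing ValueError is secondary; after AutoPeftModelForCausalLM fails, TGI's download_and_unload_peft retries with AutoPeftModelForSeq2SeqLM, which then rejects the task type. The validator rejection can be reproduced in isolation (a standalone snippet for illustration, not TGI code):

from huggingface_hub.utils import HFValidationError, validate_repo_id

try:
    validate_repo_id("/opt/ml/models")  # an absolute path is not a valid repo id
except HFValidationError as err:
    print(err)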
@Narsil (Collaborator) commented May 24, 2024

Issue is in PEFT, I believe?

@philschmid (Member, Author)

It works when using peft with version 0.10.0.

  1. Download the model:
huggingface-cli download alignment-handbook/zephyr-7b-sft-qlora --exclude "*.bin" "*.pth" "*.gguf" --local-dir ./tmp
rm tmp/config.json
  2. Load the PEFT model:
from peft import AutoPeftModelForCausalLM

m = AutoPeftModelForCausalLM.from_pretrained("./tmp")

This works correctly.

@Narsil (Collaborator) commented May 24, 2024

Your example doesn't showcase the issue since you're passing the HFValidation check; try using /data/test/tmp/xx/ or something.

@philschmid (Member, Author)

Thanks, this leads to the issue.

Steps:

  1. Download the model:
huggingface-cli download alignment-handbook/zephyr-7b-sft-qlora --exclude "*.bin" "*.pth" "*.gguf" --local-dir ./tmp
rm tmp/config.json
  2. Run the PyTorch container and mount the adapter under /opt/ml/model:
docker run --gpus all -it --rm \
-v $(pwd)/tmp/:/opt/ml/model \
-e HUGGING_FACE_HUB_TOKEN=$(cat ~/.cache/huggingface/token) \
-e HF_TOKEN=$(cat ~/.cache/huggingface/token) \
 --entrypoint /bin/bash nvcr.io/nvidia/pytorch:24.01-py3
  3. Install peft and start Python:
pip3 install peft && python3
  4. Load the PEFT model:
from peft import AutoPeftModelForCausalLM

m = AutoPeftModelForCausalLM.from_pretrained("/opt/ml/model")

Error

>>> m = AutoPeftModelForCausalLM.from_pretrained("/opt/ml/model")
config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 571/571 [00:00<00:00, 9.28MB/s]
model.safetensors.index.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25.1k/25.1k [00:00<00:00, 19.0MB/s]
model-00001-of-00002.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.94G/9.94G [00:18<00:00, 549MB/s]
model-00002-of-00002.safetensors: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.54G/4.54G [00:08<00:00, 553MB/s]
Downloading shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:26<00:00, 13.22s/it]
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.33it/s]
generation_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<00:00, 1.14MB/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/peft/auto.py", line 128, in from_pretrained
    return cls._target_peft_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/peft/peft_model.py", line 430, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/peft/peft_model.py", line 984, in load_adapter
    adapters_weights = load_peft_weights(model_id, device=torch_device, **hf_hub_download_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/peft/utils/save_and_load.py", line 415, in load_peft_weights
    has_remote_safetensors_file = file_exists(
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 106, in _inner_fn
    validate_repo_id(arg_value)
  File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 154, in validate_repo_id
    raise HFValidationError(
huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/opt/ml/model'. Use `repo_type` argument if needed.

Will open an issue in peft; once fixed, we should add a >= constraint for the version.
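For context, a fix in peft's load_peft_weights would presumably only query the Hub when model_id is not a local directory. A sketch of such a guard (for illustration only, not the actual peft patch; the filename is peft's adapter safetensors name):

import os

from huggingface_hub import file_exists

def adapter_safetensors_exists(model_id: str) -> bool:
    weights_name = "adapter_model.safetensors"  # peft's SAFETENSORS_WEIGHTS_NAME
    if os.path.isdir(model_id):
        # Local directory: check the filesystem only, so paths like
        # /opt/ml/model never reach validate_repo_id.
        return os.path.exists(os.path.join(model_id, weights_name))
    # Otherwise treat model_id as a Hub repo id and ask the Hub.
    return file_exists(model_id, weights_name)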
