
save and load base model with revision #1658

Merged: 14 commits, May 16, 2024

Conversation

mnoukhov
Contributor

addresses #1567

changed name of parameter from revision to base_model_revision for clarity

can add unit tests if someone gives pointers to similar ones to extend
may need to add a model to hf-internal-testing with a base model with revision for these unit tests

@mnoukhov mnoukhov mentioned this pull request Apr 16, 2024
@BenjaminBossan
Member

Thanks a lot for this PR.

can add unit tests if someone gives pointers to similar ones to extend

We don't really have a single place where we test this functionality; I think it's fine to add the tests to the tests/test_hub_features file even if it's not a perfect fit. When it comes to what to test, maybe you can proceed as follows (a rough sketch follows the list):

  1. Load the model with revision
  2. Add an adapter on top, like LoRA (use init_lora_weights=False)
  3. Save the adapter in a temp directory (use the tmp_path fixture from pytest)
  4. Load the base model using the revision from adapter_config
  5. Load the LoRA adapter
  6. Show that the outputs are the same
  7. As a sanity check, also load the adapter with the base model without revision and show that the outputs are different
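A minimal pytest sketch of these steps, assuming the peft-internal-testing/tiny-random-BertModel model and its v2.0.0 revision mentioned below. The real test should eventually take the revision from the saved adapter_config (the feature under review); the sketch passes it explicitly so that it only uses APIs that exist independently of this PR:

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, PeftModel, get_peft_model


def test_load_base_model_with_revision(tmp_path):
    model_id = "peft-internal-testing/tiny-random-BertModel"
    inputs = torch.arange(10).reshape(-1, 1)

    # 1. load the base model at a specific revision
    base = AutoModelForCausalLM.from_pretrained(model_id, revision="v2.0.0").eval()

    # 2. add a LoRA adapter with random (non-identity) init so the outputs change
    peft_model = get_peft_model(base, LoraConfig(init_lora_weights=False)).eval()
    with torch.inference_mode():
        expected = peft_model(inputs).logits

    # 3. save the adapter to a temporary directory
    peft_model.save_pretrained(tmp_path)

    # 4. + 5. reload the base model at the same revision and load the adapter on top
    #    (in the real test, the revision would come from the saved adapter_config)
    base_rev = AutoModelForCausalLM.from_pretrained(model_id, revision="v2.0.0").eval()
    loaded = PeftModel.from_pretrained(base_rev, tmp_path).eval()
    with torch.inference_mode():
        outputs_rev = loaded(inputs).logits

    # 6. same base revision => same outputs
    assert torch.allclose(expected, outputs_rev)

    # 7. sanity check: the default revision has different weights, so outputs differ
    base_main = AutoModelForCausalLM.from_pretrained(model_id).eval()
    loaded_main = PeftModel.from_pretrained(base_main, tmp_path).eval()
    with torch.inference_mode():
        outputs_main = loaded_main(inputs).logits
    assert not torch.allclose(expected, outputs_main)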

may need to add a model to hf-internal-testing with a base model with revision for these unit tests

I added a model here: peft-internal-testing/tiny-random-BertModel. There is a revision v2.0.0 with updated weights:

>>> import torch
>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("peft-internal-testing/tiny-random-BertModel").eval()
>>> model(torch.arange(10).reshape(-1, 1)).logits.sum()
tensor(27.5802, grad_fn=<SumBackward0>)
>>> model_rev = AutoModelForCausalLM.from_pretrained("peft-internal-testing/tiny-random-BertModel", revision="v2.0.0").eval()
>>> model_rev(torch.arange(10).reshape(-1, 1)).logits.sum()
tensor(-166.8932, grad_fn=<SumBackward0>)

@BenjaminBossan
Member

@mnoukhov Let me know when the PR is ready for review.

@mnoukhov mnoukhov marked this pull request as ready for review April 19, 2024 15:34
@mnoukhov
Contributor Author

mnoukhov commented Apr 19, 2024

I ended up changing base_model_revision back to revision to maintain backwards compatibility as some models were already uploaded with it, including some models for testing.

I can change it back and add a little code to account for having revision instead of base_model_revision, if you think that would be better.

Otherwise ready for review @BenjaminBossan

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for the updates. I agree that keeping revision at this point makes more sense for backwards compatibility.

I think we have to work a bit on the test, please check out my comment there.

tests/test_hub_features.py: 3 review comments (outdated, resolved)
@mnoukhov
Contributor Author

I changed the tests but, more importantly, realized there is a big problem.

For some reason, I assumed that revision was a property of AutoModel or PretrainedConfig, but it turns out it isn't.

name_or_path is stored on the config (https://github.com/huggingface/transformers/blob/8c12690cecbb97e187861e386f7a0ac790e4236c/src/transformers/configuration_utils.py#L351), but revision is only used to load the model from the hub.

This feature / fix therefore requires either

  1. the user passing in the base_model_revision to get_peft_config or PeftModel.from_pretrained or
  2. adding revision to PretrainedConfig in transformers

I think the second makes more sense, but that means this PR is blocked until revision is added to transformers. What do you think we should do?

@BenjaminBossan
Member

  • the user passing in the base_model_revision to get_peft_config or PeftModel.from_pretrained or
  • adding revision to PretrainedConfig in transformers

I don't think that the latter would make sense. The revision is not really part of the model config; the same config will typically apply to all revisions. So that only really leaves the first option.

@mnoukhov
Contributor Author

Added the revision argument to get_peft_model, and the tests now work as intended.

One small thing I've noticed: the peft_config passed to get_peft_model is modified in place and then stored on the model, so maybe we should do a copy.copy of the config in get_peft_model to avoid issues like passing the same config twice and overwriting the revision the second time?
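A minimal usage sketch of that argument, assuming the revision keyword on get_peft_model and the revision field on the adapter config added in this PR (exact names may differ from the final API), using the testing model mentioned above:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model at a pinned revision, then record that revision in the
# adapter config so it ends up in adapter_config.json when saving.
base = AutoModelForCausalLM.from_pretrained(
    "peft-internal-testing/tiny-random-BertModel", revision="v2.0.0"
)
peft_model = get_peft_model(base, LoraConfig(init_lora_weights=False), revision="v2.0.0")
peft_model.save_pretrained("tiny-bert-lora")  # adapter_config.json now records the base revision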

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for making the adjustments, the PR is almost good to go. I only have a few small comments left, please take a look.

One small thing I've noticed: the peft_config passed to get_peft_model is modified in place and then stored on the model, so maybe we should do a copy.copy of the config in get_peft_model to avoid issues like passing the same config twice and overwriting the revision the second time?

Yes, good point, this is not super clean. Maybe this could be addressed in a separate PR if you're interested.

src/peft/mapping.py: review comment (outdated, resolved)
src/peft/config.py: review comment (resolved)
@@ -101,7 +102,7 @@ def from_pretrained(
      "Cannot infer the auto class from the config, please make sure that you are loading the correct model for your task type."
  )

- base_model = target_class.from_pretrained(base_model_path, **kwargs)
+ base_model = target_class.from_pretrained(base_model_path, revision=base_model_revision, **kwargs)
Member

I have a small concern here: in PEFT configs, the revision parameter defaults to None, but in transformers it defaults to "main":

https://github.com/huggingface/transformers/blob/9fe3f585bb4ea29f209dc705d269fbe292e1128f/src/transformers/models/auto/auto_factory.py#L135

Honestly, the from_pretrained method is a bit inscrutable to me, so I don't know if this can cause any issues (or might in the future). WDYT?

Contributor Author

That's a good point. It shouldn't affect anything, but I'll change all revision defaults from None to "main" for consistency.

src/peft/mapping.py: review comment (resolved)
@mnoukhov
Contributor Author

mnoukhov commented May 6, 2024

In making the last changes, I discovered a problem when both the base model and the PEFT model have a revision. In that case, calling AutoPeftModel.from_pretrained(revision="foo") while also having peft_config.revision="bar" would lead to an error in base_model = target_class.from_pretrained(base_model_path, revision=base_model_revision, **kwargs), as revision is passed both explicitly as an argument and as part of kwargs.

I fixed it, but in order to test this, I would probably need a LoRA model on the hub with a revision different from that of its base model. An example would be an adapter with a main revision on top of the base model peft-internal-testing/tiny-random-BertModel with revision v2.0.0.
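For illustration only (not the PR's actual code), one way the clash can be avoided is to pop the user-supplied revision out of kwargs before forwarding them, so from_pretrained never receives revision twice; the helper name and warning text below are made up:

import warnings

def resolve_base_model_revision(config_revision, kwargs):
    # Hypothetical helper: reconcile the revision stored in the PEFT config with a
    # user-supplied revision kwarg, removing the latter from kwargs so that
    # target_class.from_pretrained(base_model_path, revision=..., **kwargs)
    # does not receive `revision` twice.
    user_revision = kwargs.pop("revision", None)
    if user_revision is None:
        return config_revision
    if config_revision is not None and user_revision != config_revision:
        warnings.warn(
            f"Overriding revision {config_revision!r} from the adapter config "
            f"with the explicitly passed revision {user_revision!r}."
        )
    return user_revision

The caller would then do base_model = target_class.from_pretrained(base_model_path, revision=resolve_base_model_revision(peft_config.revision, kwargs), **kwargs), so only a single revision reaches from_pretrained.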

@BenjaminBossan
Member

Thanks for discovering and fixing that issue. I created a LoRA adapter with and without revision:

from transformers import AutoModelForCausalLM
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-BertModel").eval()
# without revision
model = PeftModel.from_pretrained(model, "peft-internal-testing/tiny-random-BertModel-lora")
# with revision
model = PeftModel.from_pretrained(model, "peft-internal-testing/tiny-random-BertModel-lora", revision="v1.2.3")

I don't think we have any way of loading a PEFT model with revision directly using AutoModelForXxx.from_pretrained("peft-internal-testing/tiny-random-BertModel-lora", revision=...) as the revision would be interpreted as the base model revision, not the PEFT adapter revision.

Regarding what I said earlier:

I have a small concern here, in PEFT configs, the revision parameter defaults to None but in transformers, it defaults to main

I'm now doubting whether the proposed change is better. It is still true, but so far all adapters have revision: null by default in their adapter_config.json; with this change, it would be revision: main. My concern is that this could have some unforeseen consequences, and I would rather stick with null/None for consistency. WDYT?

@mnoukhov
Contributor Author

mnoukhov commented May 8, 2024

I've added the test with differing peft and base model revisions. The one caveat is that the peft config you've uploaded has a revision: null. If you decide to change this, we should update the test.

I'm now doubting whether the proposed change is better. It is still true, but so far all adapters have revision: null by default in their adapter_config.json; with this change, it would be revision: main. My concern is that this could have some unforeseen consequences, and I would rather stick with null/None for consistency. WDYT?

The only issue I can foresee is if the default branch name changes in the future, e.g. GitHub changing from master to main. That would be annoying to deal with.

On a more conceptual basis, I think I prefer null, since it wasn't specified by the user, but that is inconsistent with transformers, where the default in the methods is now revision=main, so either way works for me.

@BenjaminBossan
Member

The only issue I can foresee is if the default branch name changes in the future, e.g. GitHub changing from master to main. That would be annoying to deal with.

On a more conceptual basis, I think I prefer null, since it wasn't specified by the user, but that is inconsistent with transformers, where the default in the methods is now revision=main, so either way works for me.

I agree with your assessment. In the end, I think I'd also prefer null, even if the scenario you mention could get annoying. At least it will be consistently annoying for all PEFT models, instead of having two different situations.

@mnoukhov
Contributor Author

Default reverted to None, and I also caught an issue where we raised the overwrite warning even when revision and peft_config.revision were the same.

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for all the changes and your patience, this now looks good to be merged.

I'll hold off on merging until after the next PEFT release (likely very soon), just to be 100% sure, since these types of changes always have a small chance of missing some edge case.

@BenjaminBossan BenjaminBossan merged commit 2535036 into huggingface:main May 16, 2024
14 checks passed
@BenjaminBossan
Member

@mnoukhov PEFT release is out, merging this now. Again, thanks for the PR.
