Support merge_and_unload for IA3 Adapters with 4-bit and 8-bit Quantization Models #1704

Abdullah-kwl opened this issue on May 2, 2024
Feature request

Enable merge_and_unload for IA3 adapters on models loaded with 4-bit or 8-bit quantization. Currently, merging fails with the error: "Cannot merge ia3 layers when the model is loaded in 4-bit mode".
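
For context, a minimal reproduction sketch of the failure (the model id and target modules below are illustrative placeholders; any 4-bit quantized base model with an IA3 adapter hits the same error):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import IA3Config, get_peft_model

# Load the base model in 4-bit via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach an IA3 adapter; module names are placeholders and depend on the architecture.
ia3_config = IA3Config(
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj", "down_proj"],
    feedforward_modules=["down_proj"],
)
model = get_peft_model(base_model, ia3_config)

# Raises: ValueError("Cannot merge ia3 layers when the model is loaded in 4-bit mode")
merged = model.merge_and_unload()
```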

Motivation

Existing merge_and_unload support excludes 4-bit quantized models with IA3 adapters. Merging IA3 adapters into a 4-bit quantized base model combines the size reduction of quantization with the deployment simplicity of a single, smaller model. This feature aligns with the core advantages of IA3 (a very small number of added parameters) and 4-bit quantization (memory-efficiency gains), enabling users to fully exploit both optimizations.
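
This is not PEFT's implementation, but a rough sketch of how the merge could work, following the dequantize → merge → re-quantize approach PEFT already uses for merging LoRA into bitsandbytes 4-bit layers. The function name merge_ia3_into_4bit and the ia3_scaling argument are hypothetical; only the bitsandbytes calls are real API:

```python
import bitsandbytes as bnb
import torch

def merge_ia3_into_4bit(linear4bit: bnb.nn.Linear4bit, ia3_scaling: torch.Tensor):
    """Hypothetical sketch: fold an IA3 scaling vector into a 4-bit weight.

    ia3_scaling must broadcast against the weight matrix: shape
    (out_features, 1) for attention-style modules (output scaling) and
    (1, in_features) for feedforward modules (input scaling).
    """
    w4 = linear4bit.weight
    # Dequantize the packed 4-bit weight back to a dense tensor.
    w = bnb.functional.dequantize_4bit(w4.data, w4.quant_state)
    # The IA3 merge itself is just an element-wise rescaling.
    w = w * ia3_scaling.to(w.dtype)
    # Re-quantize the merged weight and swap it back in; .to(device)
    # triggers quantization of a Params4bit created on CPU.
    linear4bit.weight = bnb.nn.Params4bit(
        w.to("cpu"), requires_grad=False, quant_type=w4.quant_type
    ).to(w.device)
```

Since IA3 only rescales existing weights, the merge introduces no new matrices; the main open question is the quantization error incurred by the dequantize/re-quantize round trip, which the LoRA merge path already accepts.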

Your contribution

While I cannot currently submit a pull request, I'm happy to provide further details, test the functionality after implementation, and assist with documentation updates if needed.
