Support merge_and_unload for IA3 Adapters with 4-bit and 8bit Quantization models #1704

Abdullah-kwl · 2024-05-02T12:33:02Z

Feature request

Enable merge_and_unload for IA3 adapters loaded on top of 4-bit and 8-bit quantized models. Currently, merging fails with the error "Cannot merge ia3 layers when the model is loaded in 4-bit mode".
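
To make the request concrete, here is a minimal sketch of what merging into a quantized layer could involve: dequantize the weight, fold the IA3 scaling vector into it, then quantize again. It assumes the adapter rescales the layer output elementwise (y = l * (W x)), so merging amounts to scaling row i of W by l[i]. The quantizer below is a toy per-row absmax scheme written only for illustration; it is not the bitsandbytes format, and none of these helper names exist in PEFT.

```python
# Toy illustration only: hypothetical helpers, not PEFT or bitsandbytes APIs.

def quantize_absmax(row, bits=8):
    """Symmetric absmax quantization of one weight row (toy scheme)."""
    qmax = 2 ** (bits - 1) - 1
    # Fall back to 1.0 for an all-zero row to avoid dividing by zero.
    scale = max(abs(v) for v in row) / qmax or 1.0
    return [round(v / scale) for v in row], scale

def dequantize(q_row, scale):
    """Recover approximate float weights from integer codes."""
    return [q * scale for q in q_row]

def merge_ia3_into_quantized(q_rows, scales, ia3_l, bits=8):
    """Dequantize each row, scale it by its IA3 factor, and requantize."""
    merged_q, merged_scales = [], []
    for q_row, scale, l in zip(q_rows, scales, ia3_l):
        scaled = [l * w for w in dequantize(q_row, scale)]
        q, s = quantize_absmax(scaled, bits)
        merged_q.append(q)
        merged_scales.append(s)
    return merged_q, merged_scales

def matvec(rows, x):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

# Assumed example values: a 2x3 weight, one IA3 factor per output row.
W = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]]
ia3_l = [0.5, 2.0]
x = [1.0, 2.0, 3.0]

quantized = [quantize_absmax(row) for row in W]
q_rows = [q for q, _ in quantized]
scales = [s for _, s in quantized]

# Unmerged path: quantized base layer, then the adapter rescales the output.
base_rows = [dequantize(q, s) for q, s in zip(q_rows, scales)]
adapter_out = [l * y for l, y in zip(ia3_l, matvec(base_rows, x))]

# Merged path: the IA3 vector is folded into the quantized weight.
merged_q, merged_scales = merge_ia3_into_quantized(q_rows, scales, ia3_l)
merged_rows = [dequantize(q, s) for q, s in zip(merged_q, merged_scales)]
merged_out = matvec(merged_rows, x)

print(adapter_out)
print(merged_out)
```

In this toy setup the two outputs agree up to floating-point error, because the quantization scale is kept per row. With blockwise 4-bit formats the requantization step may introduce additional rounding error, which is one thing an implementation would need to check.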

Motivation

The existing merge_and_unload support excludes 4-bit quantized models with IA3 adapters. Merging IA3 adapters into a 4-bit quantized base model would keep the size reduction from quantization and simplify deployment by producing a single, smaller model. This feature aligns with the core advantages of IA3 (a very small number of added parameters) and 4-bit quantization (efficiency gains), enabling users to fully exploit both optimizations.

Your contribution

While I cannot currently submit a pull request, I'm happy to provide further details, test the functionality once it is implemented, and assist with documentation updates if needed.

Abdullah-kwl mentioned this issue May 2, 2024

Add Support for IA3 Adapters in add_weighted_adapter Method, Currently facing issue that 'IA3Model' object has no attribute 'add_weighted_adapter' #1688

Closed
