Can peft support ColumnParallelLinear? #1711

Open
2 of 4 tasks
wjn1996 opened this issue May 5, 2024 · 2 comments
Comments

wjn1996 commented May 5, 2024

System Info

I have a model whose architecture contains xxxParallel layers, which are used for tensor-parallel inference:

BaichuanForCausalLM(
  (model): BaiChuanModel(
    (embed_tokens): VocabParallelEmbedding()
    (layers): ModuleList(
      (0-31): 32 x BaiChuanDecoderLayer(
        (self_attn): BaiChuanAttention(
          (W_pack): ColumnParallelLinear()
          (o_proj): RowParallelLinear()
          (attn): PagedAttentionWithALiBi()
        )
        (mlp): BaiChuanMLP(
          (gate_up_proj): ColumnParallelLinear()
          (down_proj): RowParallelLinear()
          (act_fn): SiluAndMul()
        )
        (input_layernorm): RMSNorm()
        (post_attention_layernorm): RMSNorm()
      )
    )
    (norm): RMSNorm()
  )
  (lm_head): ColumnParallelLinear()
  (sampler): Sampler()
)

I want to load a LoRA adapter onto this model directly with PEFT, but it throws an error:

ValueError: Target module ColumnParallelLinear() is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.
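For reference, the standard PEFT attach path that hits this check looks like the sketch below (r/alpha values are arbitrary; the target module names are taken from the architecture printed above, and model stands for the loaded base model). It works when W_pack and o_proj are plain torch.nn.Linear layers, and raises the ValueError above when they are vLLM ColumnParallelLinear layers:

from peft import LoraConfig, get_peft_model

# Minimal sketch: attach LoRA to the attention projections.
# Fails with the ValueError above when these modules are
# ColumnParallelLinear / RowParallelLinear instead of nn.Linear.
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["W_pack", "o_proj"])
peft_model = get_peft_model(model, lora_config)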

So, how can I apply the LoRA adapter to this model without changing its architecture?
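For illustration, here is a minimal sketch of the kind of wrapper PEFT would need internally to support such layers: a module that adds a LoRA delta around a Linear-like layer. The class and argument names are hypothetical, and it assumes the wrapped layer stores an (out_features, in_features) weight for its local shard, as vLLM's parallel linear layers do; under real tensor parallelism (tp_size > 1) the lora_B weight would additionally have to be sharded to match the base layer's output shard:

import math
import torch
import torch.nn as nn

class LoRAParallelLinear(nn.Module):
    """Hypothetical sketch: LoRA delta around a Linear-like layer,
    e.g. vLLM's ColumnParallelLinear."""
    def __init__(self, base_layer: nn.Module, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base_layer = base_layer
        out_features, in_features = base_layer.weight.shape  # local shard shape
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        self.scaling = alpha / r
        nn.init.kaiming_uniform_(self.lora_A.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_B.weight)  # adapter starts as a no-op

    def forward(self, x: torch.Tensor):
        out = self.base_layer(x)
        delta = self.lora_B(self.lora_A(x)) * self.scaling
        if isinstance(out, tuple):  # vLLM layers may return (output, bias)
            return out[0] + delta, out[1]
        return out + delta

# Usage sketch: swap the wrapper into the model in place, e.g.
#   attn = model.model.layers[0].self_attn
#   attn.W_pack = LoRAParallelLinear(attn.W_pack)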

Who can help?

@pacman100 @younesbelkada @BenjaminBossan @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

# LLM and SamplingParams
# pip install vllm==0.2.1 (CUDA 11.8)
from vllm import LLM, SamplingParams
from peft import PeftModel

# Load the LoRA adapter onto the base model with PEFT
def load_peft_model(model, peft_model_path):
    peft_model = PeftModel.from_pretrained(model, peft_model_path)
    return peft_model

prompts = [
    "xxx",
]

sampling_params = SamplingParams(temperature=1.0, top_p=0.9)

model_name = "baichuan2-7b-base"
origin_model_path = "xxx/pre-trained-lm/{}".format(model_name)
saved_model_path = "xxx/v2/{}/checkpoint-8000".format(model_name) # lora path
save_answer_path = "xxx/{}".format(model_name)

llm = LLM(model=origin_model_path, trust_remote_code=True)

# Reach into vLLM internals: grab the underlying PyTorch model from the
# first worker, wrap it with the LoRA adapter, and put it back.
model = llm.llm_engine.workers[0].model
model = load_peft_model(model, saved_model_path)
llm.llm_engine.workers[0].model = model


outputs = llm.generate(
    prompts,
    sampling_params,
    # lora_request=LoRARequest("headline-lora", 1, saved_model_path)
)


for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Expected behavior

PEFT should support wrapping vLLM's tensor-parallel layers (ColumnParallelLinear, RowParallelLinear) with LoRA, so that the adapter loads onto this model without errors.

github-actions bot commented Jun 4, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
