Understanding the forward operation #8

Open

evertonaleixo opened this issue Oct 3, 2023 · 1 comment

@evertonaleixo

I noticed that you multiply LoraA and LoraB, and then you sum the result with the input.

[image: screenshot of the forward code in question]

I think the result of multiplying LoraA and LoraB should be summed with the original weights, or am I wrong?

Could you also explain the scaling factor?

Thanks.

@cccntu (Owner) commented Oct 4, 2023

> I noticed that you multiply LoraA and LoraB, and then you sum the result with the input.
> I think the result of multiplying LoraA and LoraB should be summed with the original weights, or am I wrong?

This is the mechanism of torch parametrizations: the parametrization's forward() receives the original weight and its return value is used in place of it, so the sum with the original weights happens there.
https://pytorch.org/tutorials/intermediate/parametrizations.html
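
For illustration, here is a minimal sketch of how a LoRA parametrization can work with torch.nn.utils.parametrize; the class name, shapes, and init below are assumptions for this example, not the repo's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

class LoRAParametrization(nn.Module):
    """Minimal LoRA parametrization sketch (names/shapes are assumptions)."""
    def __init__(self, fan_out, fan_in, rank=4, lora_alpha=1.0):
        super().__init__()
        # nn.Linear stores its weight as (fan_out, fan_in)
        self.lora_A = nn.Parameter(torch.randn(rank, fan_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(fan_out, rank))
        self.scaling = lora_alpha / rank

    def forward(self, W):
        # parametrize calls this with the ORIGINAL weight W and uses the
        # returned tensor in its place, so the sum with W happens here.
        return W + self.scaling * (self.lora_B @ self.lora_A)

layer = nn.Linear(16, 8)
parametrize.register_parametrization(
    layer, "weight", LoRAParametrization(fan_out=8, fan_in=16, rank=4)
)
y = layer(torch.randn(2, 16))  # forward uses W + scaling * (B @ A)
```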

> Could you also explain the scaling factor?

The scaling follows the original implementation: https://github.com/microsoft/LoRA
It's mentioned in the paper. From my understanding it's not important; it's only there to control for the change of rank.
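
As a quick illustration (alpha and r are the hyperparameters from the paper; the values here are arbitrary):

```python
# The low-rank update is scaled by alpha / r. If you double the rank r while
# keeping alpha fixed, each rank's contribution is halved, so the overall
# update magnitude stays roughly comparable across ranks.
lora_alpha, r = 8, 4
scaling = lora_alpha / r  # 2.0
# effective update applied to the weight: delta_W = scaling * (B @ A)
```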
