Description
#10578 fixed loading LoRAs into 4bit quantized models for Flux.
#10576 added a test to ensure Flux LoRAs can be loaded when 8bit bitsandbytes
quantization is applied.
We still need to support all of this for Flux Control LoRAs, since for those we do quite a bit of shape-expansion gymnastics as well as new layer assignments to make it all work.
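For context, the non-quantized expansion path boils down to something like the sketch below: create a wider `nn.Linear` and copy the original weights into a slice of it. This is a simplified illustration only; `expand_linear` is not the actual helper name, and the real code lives around the lines referenced further down.

```python
import torch.nn as nn


def expand_linear(module: nn.Linear, new_in_features: int) -> nn.Linear:
    # Illustrative sketch of the shape expansion (plain nn.Linear case):
    # build a wider layer, zero-init the extra input columns, and copy
    # the original weights/bias into the matching slice.
    expanded = nn.Linear(
        new_in_features,
        module.out_features,
        bias=module.bias is not None,
        device=module.weight.device,
        dtype=module.weight.dtype,
    )
    expanded.weight.data.zero_()
    expanded.weight.data[:, : module.in_features].copy_(module.weight.data)
    if module.bias is not None:
        expanded.bias.data.copy_(module.bias.data)
    return expanded
```

This is exactly the step that gets tricky once the model is quantized, because both the module class and the weight storage are no longer plain `nn.Linear`/`torch.Tensor`.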
Some stuff I wanted to discuss before attempting a PR:
(Whenever I say "quantization" in this thread, assume it means quantization from bitsandbytes.)
In `diffusers/src/diffusers/loaders/lora_pipeline.py` (line 2020 at commit `c944f06`), the new module is hardcoded as `nn.Linear`. It needs to be configured based on what quantization scheme we're using (4bit/8bit).

Same goes for line 1917 of the same file at that commit.
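A rough sketch of the direction I have in mind: infer the Linear class (and the kwargs needed to replicate its quantization) from the module being replaced, instead of hardcoding `nn.Linear`. Everything below is illustrative, not existing diffusers API, and the quantization parameters could equally be read from the pipeline's quantization config rather than from the module.

```python
import torch.nn as nn

try:
    import bitsandbytes as bnb
except ImportError:
    bnb = None


def _get_linear_cls_and_kwargs(module: nn.Module):
    """Illustrative helper: pick the Linear class (plus ctor kwargs to
    replicate its quantization) for the module we are about to expand.

    Assumes bitsandbytes is the only quantization backend in play.
    """
    if bnb is not None and isinstance(module, bnb.nn.Linear4bit):
        return bnb.nn.Linear4bit, {
            "compute_dtype": module.compute_dtype,
            "quant_type": module.weight.quant_type,
            "compress_statistics": module.weight.compress_statistics,
        }
    if bnb is not None and isinstance(module, bnb.nn.Linear8bitLt):
        return bnb.nn.Linear8bitLt, {
            "has_fp16_weights": module.state.has_fp16_weights,
            "threshold": module.state.threshold,
        }
    return nn.Linear, {}


# Hypothetical usage when building the expanded module:
# linear_cls, quant_kwargs = _get_linear_cls_and_kwargs(old_module)
# new_module = linear_cls(
#     new_in_features,
#     old_module.out_features,
#     bias=old_module.bias is not None,
#     **quant_kwargs,
# )
```

The open question is whether we do this by inspecting the existing module (as above) or by threading the pipeline's quantization config through to these call sites.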
@BenjaminBossan I wanted to pick your brain here so we can land on a robust design for the solution. Suggestions?