Hello,
I am facing an issue when generating images with FLUX.1 [dev] plus a LoRA that I trained with SimpleTuner. I need to load LoRAs dynamically, so I want to quantize the base FLUX model once and load each LoRA into the already quantized transformer. With optimum-quanto version 0.2.4 and lower I got the following error: `KeyError: 'time_text_embed.timestep_embedder.linear_1.weight._data'`. After bumping the version to 0.2.5 or 0.2.6, no error is thrown, but the generated images come out garbled (see the attached example):
My code:
```python
import torch
from diffusers import DiffusionPipeline
from IPython.display import display
from optimum.quanto import freeze, qfloat8, quantize

model_id = 'black-forest-labs/FLUX.1-dev'
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'

pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Quantize and freeze the transformer first, then load the LoRA into it.
quantize(pipeline.transformer, weights=qfloat8)
freeze(pipeline.transformer)
pipeline.to(device)

lora_path = '<path_to_lora>'  # placeholder
pipeline.load_lora_weights(lora_path)

prompts = {
    "candy": (
        "Candy bar surrounded by playful, abstract shapes resembling candy sprinkles "
        "and whimsical clouds of cream. The atmosphere is vibrant and joyful, filled "
        "with bright colors that evoke childhood memories of sweetness and fun. This "
        "imagery invites viewers to imagine the delight of savoring a piece of "
        "chocolate that brings happiness to any moment."
    )
}
seed = 19640904

for prompt_key, prompt_value in prompts.items():
    print(prompt_key, prompt_value)
    images = pipeline(
        prompt=prompt_value,
        num_inference_steps=10,
        num_images_per_prompt=1,
        generator=torch.Generator(device=device).manual_seed(seed),
        width=1024,
        height=1024,
    ).images
    # Show each generated image inline (notebook).
    for idx, image in enumerate(images):
        display(image)
```
Is there a way to solve this? A possible workaround would be to merge the LoRA into the model before quantization and save the quantized merged model (something like the sketch below), but then I lose the benefit of working with the LoRA alone, which is much faster and less memory-hungry.
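For reference, here is a minimal, untested sketch of that fuse-then-quantize workaround. It assumes diffusers' `fuse_lora()` / `unload_lora_weights()` and the safetensors + `quantization_map()` serialization flow from the optimum-quanto README; the file names `transformer_fp8.safetensors` and `transformer_fp8_qmap.json` are placeholders I made up:

```python
import json

import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, qfloat8, quantization_map, quantize
from safetensors.torch import save_file

model_id = 'black-forest-labs/FLUX.1-dev'
lora_path = '<path_to_lora>'  # placeholder

pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Merge the LoRA into the bf16 base weights, then drop the adapter
# so only the merged weights remain.
pipeline.load_lora_weights(lora_path)
pipeline.fuse_lora()
pipeline.unload_lora_weights()

# Quantize the merged transformer and freeze the quantized weights.
quantize(pipeline.transformer, weights=qfloat8)
freeze(pipeline.transformer)

# Serialize the quantized transformer: tensors into a safetensors file,
# quantization layout into a JSON map for requantize() later.
save_file(pipeline.transformer.state_dict(), 'transformer_fp8.safetensors')
with open('transformer_fp8_qmap.json', 'w') as f:
    json.dump(quantization_map(pipeline.transformer), f)
```

Reloading would then go through optimum-quanto's `requantize()` into a freshly instantiated transformer (again a sketch; I am assuming the generic diffusers `load_config()` / `from_config()` helpers work here for `FluxTransformer2DModel`):

```python
import json

import torch
from diffusers import DiffusionPipeline, FluxTransformer2DModel
from optimum.quanto import requantize
from safetensors.torch import load_file

model_id = 'black-forest-labs/FLUX.1-dev'

state_dict = load_file('transformer_fp8.safetensors')
with open('transformer_fp8_qmap.json') as f:
    qmap = json.load(f)

# Build an empty transformer on the meta device, then fill it with the
# saved quantized weights.
with torch.device('meta'):
    transformer = FluxTransformer2DModel.from_config(
        FluxTransformer2DModel.load_config(model_id, subfolder='transformer')
    )
requantize(transformer, state_dict, qmap, device=torch.device('cuda'))

# Use the requantized transformer in a normal pipeline.
pipeline = DiffusionPipeline.from_pretrained(
    model_id, transformer=transformer, torch_dtype=torch.bfloat16
)
```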
Thanks!