@sayakpaul
Member

@sayakpaul sayakpaul commented Aug 6, 2025

What does this PR do?

  • Adds support for modular I2I for Flux.
  • Refactors set_timesteps() to remove its dependency on latents (hope that's okay). I checked whether this affects the existing T2I blocks, and it doesn't: the output image is unchanged with this code.
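
To illustrate the second point, here is a minimal sketch of how the timestep shift can be computed without touching latents. It is not taken from the PR diff. It assumes a VAE scale factor of 8 and 2x2 latent packing, and the default values (256, 4096, 0.5, 1.15) are the usual Flux scheduler settings, assumed here rather than read from a config. The helper names are hypothetical.

```python
def image_seq_len_from_size(height: int, width: int, vae_scale_factor: int = 8) -> int:
    """Number of packed latent tokens for an image of the given pixel size.

    Assumption: the VAE downsamples by `vae_scale_factor` and latents are
    packed into 2x2 patches, so each token covers a
    (2 * vae_scale_factor) x (2 * vae_scale_factor) pixel area.
    """
    patch = vae_scale_factor * 2
    return (height // patch) * (width // patch)


def shift_for_seq_len(
    image_seq_len: int,
    base_seq_len: int = 256,
    max_seq_len: int = 4096,
    base_shift: float = 0.5,
    max_shift: float = 1.15,
) -> float:
    """Linearly interpolate the shift (mu) between base_shift and max_shift."""
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    intercept = base_shift - slope * base_seq_len
    return image_seq_len * slope + intercept


if __name__ == "__main__":
    seq_len = image_seq_len_from_size(1024, 1024)
    mu = shift_for_seq_len(seq_len)
    print(seq_len, mu)
```

Because the sequence length depends only on height and width, a timestep block written this way can run before any latents exist, which is what the I2I flow needs.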

Code to test:

import torch
from diffusers.modular_pipelines import SequentialPipelineBlocks
from diffusers.modular_pipelines.flux.modular_blocks import IMAGE2IMAGE_BLOCKS
from diffusers.utils.logging import set_verbosity_debug
from diffusers.utils import load_image

# set_verbosity_debug()

model_id = "black-forest-labs/FLUX.1-dev"

blocks = SequentialPipelineBlocks.from_blocks_dict(IMAGE2IMAGE_BLOCKS)

pipeline = blocks.init_pipeline()
pipeline.load_components(["text_encoder"], repo=model_id, subfolder="text_encoder", torch_dtype=torch.bfloat16)
pipeline.load_components(["tokenizer"], repo=model_id, subfolder="tokenizer")
pipeline.load_components(["text_encoder_2"], repo=model_id, subfolder="text_encoder_2", torch_dtype=torch.bfloat16)
pipeline.load_components(["tokenizer_2"], repo=model_id, subfolder="tokenizer_2")
pipeline.load_components(["scheduler"], repo=model_id, subfolder="scheduler")
pipeline.load_components(["transformer"], repo=model_id, subfolder="transformer", torch_dtype=torch.bfloat16)
pipeline.load_components(["vae"], repo=model_id, subfolder="vae", torch_dtype=torch.bfloat16)
pipeline.to("cuda")

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = load_image(url).resize((1024, 1024))

prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
output = pipeline(
    prompt=prompt, 
    image=init_image,
    num_inference_steps=28, 
    guidance_scale=3.5, 
    strength=0.95, 
    generator=torch.manual_seed(0)
)
output.get_intermediate("images")[0].save("modular_i2i_flux.png")
Output: image

This is about the same output one would get from the non-modular diffusers pipeline:

Code
import torch
from diffusers import FluxImg2ImgPipeline
from diffusers.utils import load_image

# set_verbosity_debug()

model_id = "black-forest-labs/FLUX.1-dev"

pipeline = FluxImg2ImgPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")
url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
init_image = load_image(url).resize((1024, 1024))

prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
output = pipeline(
    prompt=prompt, 
    image=init_image,
    num_inference_steps=28, 
    guidance_scale=3.5, 
    strength=0.95, 
    generator=torch.manual_seed(0)
)
output.images[0].save("i2i_flux.png")
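
Both snippets pass strength=0.95 with num_inference_steps=28. As a rough sketch of what that means for the denoising schedule, the function below mirrors the truncation logic used by the non-modular img2img pipeline as I understand it. Whether the modular block does exactly the same is an assumption, and the function name is hypothetical.

```python
def img2img_schedule(num_inference_steps: int, strength: float) -> tuple[int, int]:
    """Return (t_start, steps_actually_run) for an img2img schedule.

    Assumption: the first `t_start` timesteps are skipped, so a lower
    strength keeps more of the input image and runs fewer denoising steps.
    """
    init_timestep = min(num_inference_steps * strength, num_inference_steps)
    t_start = int(max(num_inference_steps - init_timestep, 0))
    return t_start, num_inference_steps - t_start


if __name__ == "__main__":
    # Settings used in the test code above
    print(img2img_schedule(28, 0.95))
```

With these settings one step is skipped and 27 are run, so the result is driven mostly by the prompt while still being seeded by the input sketch.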

@sayakpaul sayakpaul requested a review from yiyixuxu August 6, 2025 13:42
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu yiyixuxu left a comment

thanks @sayakpaul
looks good for now, I think we can merge this in first, but I'm doing some refactoring on SDXL modular. We should review & discuss there once that PR is ready, and we'll probably need to revisit and refactor Flux modular too afterwards

let's wait for this PR to be in first #11969
I think you'll have conflicts

@sayakpaul
Member Author

@yiyixuxu sounds good! Please LMK once we can review.

@DN6 could you please give me a heads-up once the custom code PR is merged so that I can rebase and merge this one?

@sayakpaul
Member Author

As discussed, merging this!

@sayakpaul sayakpaul merged commit ff9a387 into main Aug 11, 2025
18 checks passed
@sayakpaul sayakpaul deleted the modular-flux-i2i branch August 11, 2025 01:53

4 participants