* Update the Wan Animate docs to reflect the most recent code
* Further explain input preprocessing and link to original Wan Animate preprocessing scripts
docs/source/en/api/pipelines/wan.md (+18 -30: 18 additions & 30 deletions)
@@ -405,7 +405,7 @@ For replacement mode, you additionally need:
 - **Mask video**: A mask indicating where to generate content (white) vs. preserve original (black)
 
 > [!NOTE]
-> The preprocessing tools are available in the original Wan-Animate repository. Integration of these preprocessing steps into Diffusers is planned for a future release.
+> Raw videos should not be used for inputs such as `pose_video`, which the pipeline expects to be preprocessed to extract the proper information. Preprocessing scripts to prepare these inputs are available in the [original Wan-Animate repository](https://github.com/Wan-Video/Wan2.2?tab=readme-ov-file#1-preprocessing). Integration of these preprocessing steps into Diffusers is planned for a future release.
 
 The example below demonstrates how to use the Wan-Animate pipeline:
 
@@ -417,13 +417,10 @@ import numpy as np
 import torch
 from diffusers import AutoencoderKLWan, WanAnimatePipeline
 from diffusers.utils import export_to_video, load_image, load_video
-- **mode**: Choose between `"animation"` (default) or `"replacement"`
-- **num_frames_for_temporal_guidance**: Number of frames for temporal guidance (1 or 5 recommended). Using 5 provides better temporal consistency but requires more memory
-- **guidance_scale**: Controls how closely the output follows the text prompt. Higher values (5-7) produce results more aligned with the prompt
-- **num_frames**: Total number of frames to generate. Should be divisible by `vae_scale_factor_temporal` (default: 4)
+- **mode**: Choose between `"animate"` (default) or `"replace"`
+- **prev_segment_conditioning_frames**: Number of frames for temporal guidance (1 or 5 recommended). Using 5 provides better temporal consistency but requires more memory
+- **guidance_scale**: Controls how closely the output follows the text prompt. Higher values (5-7) produce results more aligned with the prompt. For Wan-Animate, CFG is disabled by default (`guidance_scale=1.0`) but can be enabled to support negative prompts and finer control over facial expressions. (Note that CFG will only target the text prompt and face conditioning.)
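
The renamed parameters above can be sanity-checked before a pipeline call. The following is a minimal sketch and not part of Diffusers: `build_animate_kwargs` and `cfg_enabled` are hypothetical helpers, and the `guidance_scale > 1.0` rule for enabling CFG is an assumption inferred from the statement that `guidance_scale=1.0` disables it. Only the parameter names and the values `"animate"`, `"replace"`, 1, and 5 come from the updated docs.

```python
import warnings
from typing import Any, Dict

# Mode names as documented after this change (previously "animation"/"replacement").
VALID_MODES = ("animate", "replace")


def cfg_enabled(guidance_scale: float) -> bool:
    """Assumption: CFG is active only when guidance_scale is above 1.0."""
    return guidance_scale > 1.0


def build_animate_kwargs(
    mode: str = "animate",
    prev_segment_conditioning_frames: int = 1,
    guidance_scale: float = 1.0,
) -> Dict[str, Any]:
    """Hypothetical helper: collect the documented arguments into a dict."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if prev_segment_conditioning_frames not in (1, 5):
        # The docs only recommend 1 or 5, so this sketch warns instead of failing.
        warnings.warn("prev_segment_conditioning_frames of 1 or 5 is recommended")
    return {
        "mode": mode,
        "prev_segment_conditioning_frames": prev_segment_conditioning_frames,
        "guidance_scale": guidance_scale,
    }


if __name__ == "__main__":
    kwargs = build_animate_kwargs(mode="replace", guidance_scale=5.0)
    print(kwargs, "CFG enabled:", cfg_enabled(kwargs["guidance_scale"]))
```

The returned dict is meant to be unpacked into a pipeline call alongside the preprocessed inputs. A call using the old mode name `"animation"` raises a `ValueError` here, which makes stale code written against the previous docs easy to spot.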