
Wuerstchen v2

Wuerstchen v2 uses a three-stage cascade architecture with 42x latent compression. A CLIP-G text encoder feeds the Prior → Decoder → VQ-GAN stages.

Variants

Model               Steps  Size    Notes
wuerstchen-v2:fp16  60     5.6 GB  Full cascade pipeline

Defaults

  • Resolution: 1024x1024
  • Guidance: 4.0
  • Steps: 60

Width  Height  Aspect Ratio
1024   1024    1:1 (native)

Using non-recommended dimensions will trigger a warning. Width and height must both be multiples of 16.
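
The dimension rules above can be sketched as a small check. This is only an illustration, not the tool's actual validation code: `check_dimensions`, `RECOMMENDED`, and `MULTIPLE` are hypothetical names, the recommended set holds just the 1024x1024 entry listed on this page, and raising an error for values that are not multiples of 16 is an assumption about how the rule is enforced.

```python
import warnings

# Recommended (width, height) pairs; only the native size is listed on this page.
RECOMMENDED = {(1024, 1024)}
MULTIPLE = 16


def check_dimensions(width: int, height: int) -> None:
    """Raise on invalid sizes, warn on valid but non-recommended ones."""
    for name, value in (("width", width), ("height", height)):
        # Assumption: a non-multiple of 16 is rejected outright.
        if value <= 0 or value % MULTIPLE != 0:
            raise ValueError(f"{name}={value} must be a positive multiple of {MULTIPLE}")
    if (width, height) not in RECOMMENDED:
        warnings.warn(f"{width}x{height} is not a recommended size for this model")


check_dimensions(1024, 1024)  # native size: passes silently
check_dimensions(1280, 768)   # multiples of 16, but not recommended: emits a warning
```

Keeping the hard rule (multiples of 16) separate from the soft rule (recommended sizes) mirrors the distinction the page draws between a requirement and a warning.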

Notes

Wuerstchen includes a default negative prompt. Because of the 42x latent compression, diffusion operates in a very compact latent space, which keeps generation efficient despite the multi-stage pipeline.
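
To get a feel for how compact that space is, the arithmetic can be sketched as follows. This assumes the 42x figure is a per-side spatial factor; `latent_side` and `COMPRESSION` are hypothetical names, and the exact rounding a real pipeline applies is implementation-specific, so the result is only an estimate.

```python
COMPRESSION = 42  # assumed per-side spatial factor, taken from the "42x" figure


def latent_side(pixels: int, factor: int = COMPRESSION) -> float:
    """Approximate latent side length for a given image side length in pixels."""
    return pixels / factor


side = latent_side(1024)
print(f"latent grid is roughly {side:.1f} x {side:.1f}")
print(f"about {COMPRESSION ** 2}x fewer spatial positions than pixel space")
```

Under that assumption, a 1024x1024 image maps to a grid of roughly 24 positions per side, which is why each diffusion step stays cheap even with three stages.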

Negative prompts are supported and effective with this model.