Z-Image

Z-Image pairs a Qwen3 text encoder with a flow-matching transformer that uses 3D RoPE positional encoding. It produces excellent quality in just 9 steps.

Variants

| Model | Steps | Size | Notes |
| --- | --- | --- | --- |
| z-image-turbo:q8 | 9 | 6.6 GB | Fast, great |
| z-image-turbo:q6 | 9 | 5.3 GB | Best quality/size |
| z-image-turbo:q4 | 9 | 3.8 GB | Lighter |
| z-image-turbo:bf16 | 9 | 12.2 GB | Full precision |

Defaults

  • Resolution: 1024x1024
  • Guidance: 0.0
  • Steps: 9

Recommended dimensions:

| Width | Height | Aspect Ratio |
| --- | --- | --- |
| 1024 | 1024 | 1:1 (native) |
| 1024 | 768 | 4:3 |
| 768 | 1024 | 3:4 |

Non-recommended dimensions trigger a warning. Width and height must both be multiples of 16.
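
The two rules above can be checked before a run with a small shell helper. This is a minimal sketch, not part of mold: `check_dims` is a hypothetical function that only mirrors the documented rules (multiples of 16, plus the recommended sizes listed above) and says nothing about how mold performs its own validation.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of mold. It mirrors the documented rules only.

# Prints "ok" (recommended size), "warn" (valid but not recommended),
# or "error" (not a positive multiple of 16). Returns 1 on "error".
check_dims() {
  local width=$1 height=$2 value

  for value in "$width" "$height"; do
    # Reject empty or non-numeric input.
    case $value in
      ''|*[!0-9]*) echo "error"; return 1 ;;
    esac
    # 10# forces base 10 so a leading zero is not read as octal.
    if [ "$((10#$value))" -le 0 ] || [ "$((10#$value % 16))" -ne 0 ]; then
      echo "error"
      return 1
    fi
  done

  # Recommended sizes taken from the list above.
  case "${width}x${height}" in
    1024x1024|1024x768|768x1024) echo "ok" ;;
    *) echo "warn" ;;
  esac
}

check_dims 1024 768          # prints "ok"
check_dims 1280 720          # prints "warn": multiples of 16, not recommended
check_dims 1000 1000 || true # prints "error": 1000 is not a multiple of 16
```

The three-way result keeps the hard requirement (multiples of 16) separate from the soft one (recommended sizes), which matches the distinction the text draws between an invalid value and a warning.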

Example

Z-Image Turbo — 9 steps, seed 777:

```bash
mold run z-image-turbo:q8 "An astronaut floating through a bioluminescent underwater cave, reflections on the helmet visor, science fiction art" --seed 777
```

Astronaut — Z-Image Turbo
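
To compare the variants listed earlier, the same prompt and seed can be reused across all four. The sketch below is a dry run: it only prints commands in the `mold run MODEL "PROMPT" --seed N` form shown above and does not execute anything. The `build_commands` function and the shortened prompt are illustrative assumptions, and keeping the seed fixed is meant to make quantization the main difference between outputs, not to guarantee identical images.

```bash
#!/usr/bin/env bash
# Dry run: prints one command per variant instead of executing it.
# Only the `mold run MODEL "PROMPT" --seed N` form from the example is used.

prompt="An astronaut floating through a bioluminescent underwater cave"
seed=777

# Hypothetical helper: emits one mold command line per Z-Image Turbo variant.
build_commands() {
  local variant
  for variant in q4 q6 q8 bf16; do
    # %q shell-quotes the prompt so each printed line is safe to paste.
    printf 'mold run z-image-turbo:%s %q --seed %s\n' "$variant" "$prompt" "$seed"
  done
}

build_commands
```

Once the printed lines look right, they can be pasted into a shell or piped to `bash`, assuming mold is installed and the models are available.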

Notes

Z-Image uses a Qwen3 text encoder (BF16 or GGUF with auto-fallback). The quantized transformer is implemented directly in mold (not upstream candle) due to GGUF tensor naming differences.