Qwen-Image

Qwen-Image combines a Qwen2.5-VL text encoder with a 3D causal VAE (used as a 2D temporal slice) and generates images by flow matching with classifier-free guidance.
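To make "flow matching with classifier-free guidance" concrete, here is a minimal, generic sketch of the idea. It is not Qwen-Image's actual sampler. The names `cfg_velocity`, `euler_sample`, and `velocity_fn` are hypothetical, and the time convention (integrating from t = 1 at noise to t = 0 at the image with uniform Euler steps) is an assumption made for illustration.

```python
from typing import Callable, List

Vector = List[float]


def cfg_velocity(v_cond: Vector, v_uncond: Vector, guidance: float) -> Vector:
    """Standard classifier-free guidance blend of two velocity predictions.

    guidance = 1.0 returns the conditional prediction unchanged; larger
    values push the result further away from the unconditional prediction.
    """
    return [u + guidance * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(
    x: Vector,
    velocity_fn: Callable[[Vector, float, bool], Vector],
    steps: int = 50,      # default step count from this page
    guidance: float = 3.0,  # default guidance from this page
) -> Vector:
    """Integrate dx/dt = v from t=1 down to t=0 with uniform Euler steps.

    velocity_fn(x, t, conditional) is a hypothetical stand-in for the
    denoising model; it is called twice per step (with and without the prompt).
    """
    dt = 1.0 / steps
    t = 1.0
    for _ in range(steps):
        v_c = velocity_fn(x, t, True)    # prompt-conditioned prediction
        v_u = velocity_fn(x, t, False)   # unconditional prediction
        v = cfg_velocity(v_c, v_u, guidance)
        x = [xi - dt * vi for xi, vi in zip(x, v)]
        t -= dt
    return x


def toy_velocity(x: Vector, t: float, conditional: bool) -> Vector:
    """Toy model for demonstration only: a constant velocity field."""
    return [1.0, 2.0] if conditional else [0.0, 0.0]


if __name__ == "__main__":
    print(euler_sample([3.0, 6.0], toy_velocity))
```

Because the model is evaluated twice per step, guidance roughly doubles the compute per step compared with unguided sampling in this formulation.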

Variants

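| Model           | Steps | Size    | Notes                           |
|-----------------|-------|---------|---------------------------------|
| qwen-image:bf16 | 50    | 44+ GB  | Full precision, maximum quality |
| qwen-image:q8   | 50    | 21.8 GB | Best quality                    |
| qwen-image:q6   | 50    | 16.8 GB | Best quality/size trade-off     |
| qwen-image:q4   | 50    | 12.3 GB | Smallest practical footprint    |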
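As a rough aid for choosing among these variants, the sketch below picks the largest one that fits a given budget. It treats the listed Size as a lower bound on the space needed, which is an assumption, since this page does not say whether Size refers to disk or memory. The function name `pick_variant` and the `headroom_gb` parameter are hypothetical.

```python
from typing import Optional

# (model tag, listed size in GB), largest first, copied from the Variants table.
VARIANTS = [
    ("qwen-image:bf16", 44.0),  # listed as "44+ GB"; 44.0 is only a lower bound
    ("qwen-image:q8", 21.8),
    ("qwen-image:q6", 16.8),
    ("qwen-image:q4", 12.3),
]


def pick_variant(budget_gb: float, headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest variant whose size plus headroom fits the budget.

    headroom_gb is an illustrative safety margin, not a documented figure.
    Returns None when even the smallest variant does not fit.
    """
    for tag, size_gb in VARIANTS:
        if size_gb + headroom_gb <= budget_gb:
            return tag
    return None


if __name__ == "__main__":
    for budget in (64.0, 24.0, 16.0, 8.0):
        print(budget, "->", pick_variant(budget))
```

Real memory use also depends on resolution and on the runtime, so the result is a starting point rather than a guarantee.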

Defaults

  • Resolution: 1328x1328
  • Guidance: 3.0
  • Steps: 50
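
Recommended dimensions:

| Width | Height | Aspect Ratio |
|-------|--------|--------------|
| 1328  | 1328   | 1:1 (native) |
| 1024  | 1024   | 1:1          |
| 1152  | 896    | 9:7          |
| 896   | 1152   | 7:9          |
| 1216  | 832    | 19:13        |
| 832   | 1216   | 13:19        |
| 1344  | 768    | 7:4          |
| 768   | 1344   | 4:7          |
| 1664  | 928    | ~16:9        |
| 928   | 1664   | ~9:16        |
| 768   | 768    | 1:1 (small)  |
| 512   | 512    | 1:1 (small)  |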

Dimensions outside this list trigger a warning. Width and height must both be multiples of 16.
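The two rules above can be checked before submitting a request. The following is a minimal sketch, not the tool's own validation code. The names `check_dimensions` and `snap_to_16` are hypothetical, and raising `ValueError` for a non-multiple of 16 is an assumption about how a hard failure might be reported.

```python
import warnings

# Recommended (width, height) pairs, copied from the dimensions table.
RECOMMENDED = {
    (1328, 1328), (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
    (1664, 928), (928, 1664),
    (768, 768), (512, 512),
}


def check_dimensions(width: int, height: int) -> bool:
    """Validate a requested size against the two documented rules.

    Raises ValueError if either value is not a positive multiple of 16.
    Emits a warning and returns False if the pair is valid but not in the
    recommended list; returns True for a recommended pair.
    """
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % 16 != 0:
            raise ValueError(f"{name}={value} must be a positive multiple of 16")
    if (width, height) not in RECOMMENDED:
        warnings.warn(f"{width}x{height} is not a recommended size")
        return False
    return True


def snap_to_16(value: int) -> int:
    """Round to the nearest multiple of 16 (ties round up), minimum 16."""
    return max(16, (value + 8) // 16 * 16)


if __name__ == "__main__":
    print(check_dimensions(1328, 1328))
    print(snap_to_16(1000), snap_to_16(1080))
```

Snapping an arbitrary size to a multiple of 16 satisfies the hard requirement but can still produce a non-recommended pair, so the warning may still apply.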