FLUX.1

The highest-quality model family: a flow-matching transformer with T5-XXL + CLIP-L text encoding.

Variants

| Model | Steps | Size | Notes |
| --- | --- | --- | --- |
| flux-schnell:q8 | 4 | 12 GB | Fast, general purpose |
| flux-schnell:q6 | 4 | 9.8 GB | Best quality/size trade-off |
| flux-schnell:bf16 | 4 | 23.8 GB | Full precision (>24 GB VRAM) |
| flux-schnell:q4 | 4 | 7.5 GB | Lighter |
| flux-dev:q8 | 25 | 12 GB | Full quality |
| flux-dev:q6 | 25 | 9.9 GB | Best quality/size trade-off |
| flux-dev:bf16 | 25 | 23.8 GB | Full precision (>24 GB VRAM) |
| flux-dev:q4 | 25 | 7 GB | Full quality, less VRAM |

Fine-Tunes

| Model | Steps | Size | Style |
| --- | --- | --- | --- |
| flux-krea:q8 | 25 | 12.7 GB | Aesthetic photography |
| flux-krea:q6 | 25 | 9.8 GB | Aesthetic photography |
| flux-krea:q4 | 25 | 7.5 GB | Aesthetic photography |
| flux-krea:fp8 | 25 | 11.9 GB | Aesthetic photography |
| jibmix-flux:fp8 | 25 | 11.9 GB | Photorealistic |
| jibmix-flux:q5 | 25 | 8.4 GB | Photorealistic |
| jibmix-flux:q4 | 25 | 6.9 GB | Photorealistic |
| jibmix-flux:q3 | 25 | 5.4 GB | Photorealistic, lighter |
| ultrareal-v4:q8 | 25 | 12.6 GB | Photorealistic (latest) |
| ultrareal-v4:q5 | 25 | 8.0 GB | Photorealistic |
| ultrareal-v4:q4 | 25 | 6.7 GB | Photorealistic, lighter |
| ultrareal-v3:q8 | 25 | 12.7 GB | Photorealistic |
| ultrareal-v3:q6 | 25 | 9.8 GB | Photorealistic |
| ultrareal-v3:q4 | 25 | 7.5 GB | Photorealistic, lighter |
| ultrareal-v2:bf16 | 25 | 23.8 GB | Full precision |
| iniverse-mix:fp8 | 25 | 11.9 GB | Realistic SFW/NSFW mix |
Dimensions

| Width | Height | Aspect Ratio |
| --- | --- | --- |
| 1024 | 1024 | 1:1 (native) |
| 1024 | 768 | 4:3 |
| 768 | 1024 | 3:4 |
| 1024 | 576 | 16:9 |
| 576 | 1024 | 9:16 |
| 768 | 768 | 1:1 |

Using non-recommended dimensions will trigger a warning. All values must be multiples of 16.
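The multiple-of-16 rule can be checked before launching a run. A minimal pre-flight sketch in plain shell — the helper name `valid_dim` is ours, not part of mold:

```bash
# Hypothetical pre-flight check: succeed only if both width and height
# are multiples of 16 (mold's requirement for all dimension values).
valid_dim() { [ $(( $1 % 16 )) -eq 0 ] && [ $(( $2 % 16 )) -eq 0 ]; }

valid_dim 1024 576 && echo "1024x576 ok"
valid_dim 1000 576 || echo "1000x576 rejected"   # 1000 is not a multiple of 16
```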

Examples

FLUX Schnell Q8 — 4 steps, seed 42:

```bash
mold run flux-schnell:q8 "A majestic snow leopard perched on a Himalayan cliff at golden hour, cinematic lighting, photorealistic" --seed 42
```

Snow leopard — FLUX Schnell

FLUX Dev Q4 — 25 steps, seed 1337:

```bash
mold run flux-dev:q4 "A cozy Japanese tea house interior with warm lantern light, steam rising from ceramic cups, watercolor style" --seed 1337
```

Tea house — FLUX Dev

LoRA Support

FLUX models support LoRA adapters in both BF16 and GGUF quantized modes:

```bash
mold run flux-dev:bf16 "a portrait" --lora style.safetensors --lora-scale 0.8
mold run flux-dev:q4 "a portrait" --lora style.safetensors --lora-scale 0.8
```

VRAM Notes

- Full BF16 (23.8 GB) auto-offloads on 24 GB cards; transformer blocks stream between CPU and GPU
- GGUF quantized (Q4/Q8) fits without offloading
- Use `--eager` to keep encoders loaded between generations (faster, but uses more VRAM)
- The T5-XXL encoder auto-selects a quantized variant when VRAM is tight
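Putting the size table and these notes together, picking a flux-dev variant reduces to a VRAM-threshold decision. A sketch using the flux-dev sizes listed above — the helper name `pick_flux_dev_variant` is ours, not a mold command:

```bash
# Hypothetical helper: choose a flux-dev quant for a given VRAM budget in GB.
# Thresholds mirror the Variants table: bf16 23.8 GB, q8 12 GB, q6 9.9 GB, q4 7 GB.
pick_flux_dev_variant() {
  local vram_gb=$1
  if   [ "$vram_gb" -ge 24 ]; then echo "flux-dev:bf16"  # full precision, no offload headroom below 24 GB
  elif [ "$vram_gb" -ge 12 ]; then echo "flux-dev:q8"
  elif [ "$vram_gb" -ge 10 ]; then echo "flux-dev:q6"
  else                             echo "flux-dev:q4"
  fi
}

pick_flux_dev_variant 16   # → flux-dev:q8
```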