SD 3.5

Triple-encoder architecture (CLIP-L + CLIP-G + T5-XXL) with a quantized MMDiT transformer and NaN-safe inference.

Variants

| Model | Steps | Size | Notes |
| --- | --- | --- | --- |
| sd3.5-large:q8 | 28 | 8.5 GB | 8.1B params |
| sd3.5-large:q4 | 28 | 5.0 GB | Smaller footprint |
| sd3.5-large-turbo:q8 | 4 | 8.5 GB | Fast 4-step |
| sd3.5-medium:q8 | 28 | 2.7 GB | 2.5B params, efficient |

Defaults

  • Resolution: 1024x1024
  • Guidance: 4.0
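  • Steps: 28 (4 for turbo)

Recommended dimensions:

| Width | Height | Aspect Ratio |
| --- | --- | --- |
| 1024 | 1024 | 1:1 (native) |
| 1152 | 896 | 9:7 |
| 896 | 1152 | 7:9 |
| 1216 | 832 | 19:13 |
| 832 | 1216 | 13:19 |
| 1344 | 768 | 7:4 |
| 768 | 1344 | 4:7 |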

Using non-recommended dimensions will trigger a warning. All values must be multiples of 16.
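
The two rules above can be sketched as a small pre-flight check. This is an illustrative helper only: `check_sd35_dims` is a hypothetical name and is not part of mold, and how mold itself handles a value that is not a multiple of 16 is not stated here, so the sketch simply labels it `invalid`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of mold): classify a width/height pair
# against the SD 3.5 dimension rules described above.
check_sd35_dims() {
  local width=$1 height=$2

  # Both values must be multiples of 16.
  if (( width % 16 != 0 || height % 16 != 0 )); then
    echo "invalid"
    return 0
  fi

  # Recommended pairs, copied from the table above.
  case "${width}x${height}" in
    1024x1024|1152x896|896x1152|1216x832|832x1216|1344x768|768x1344)
      echo "recommended" ;;
    *)
      # Valid, but would be expected to trigger the warning.
      echo "warning" ;;
  esac
}
```

For instance, `check_sd35_dims 1280 720` passes the multiple-of-16 rule but is not in the recommended list, so it falls into the warning case.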

Example

SD 3.5 Large Q8 — 28 steps, seed 2024:

```bash
mold run sd3.5-large:q8 "A steampunk clocktower in a Victorian city at sunset, gears and cogs visible through glass walls, dramatic clouds" --seed 2024
```

[Image: Clocktower, SD 3.5]

Notes

SD 3.5 uses classifier-free guidance, so negative prompts are supported. The quantized MMDiT includes NaN checks during inference for numerical stability.
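
For readers unfamiliar with the term, classifier-free guidance is commonly written as the combination below. This is the textbook form, not a statement of how mold implements it. The symbols are assumptions introduced for illustration: $f_\theta$ is the model's prediction at latent $x_t$, $c$ is the prompt conditioning, $c_{\text{neg}}$ is the negative-prompt conditioning (an empty prompt when none is given), and $s$ is the guidance scale (4.0 under Defaults).

```latex
\hat{f}_\theta(x_t) \;=\; f_\theta(x_t, c_{\text{neg}}) \;+\; s \,\bigl( f_\theta(x_t, c) - f_\theta(x_t, c_{\text{neg}}) \bigr)
```

With $s = 1$ this reduces to the plain conditional prediction. Larger $s$ pushes the result further from the negative-prompt prediction, which is why a negative prompt only has an effect when guidance is in use.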