SD 3.5

Triple-encoder architecture (CLIP-L + CLIP-G + T5-XXL) with a quantized MMDiT transformer and NaN-safe inference.

Developer: Stability AI
License: Stability AI Community License
HuggingFace: stabilityai/stable-diffusion-3.5-large, stabilityai/stable-diffusion-3.5-large-turbo, stabilityai/stable-diffusion-3.5-medium

Variants

Model	Steps	Size	Notes
`sd3.5-large:q8`	28	8.5 GB	8.1B params
`sd3.5-large:q4`	28	5.0 GB	Smaller footprint
`sd3.5-large-turbo:q8`	4	8.5 GB	Fast 4-step
`sd3.5-medium:q8`	28	2.7 GB	2.5B params, efficient

Defaults

Resolution: 1024x1024
Guidance: 4.0
Steps: 28 (4 for turbo)
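
As a rough sketch of the step defaults above, a wrapper script could derive the step count from the model tag. The `default_steps` function is a hypothetical helper for illustration, not part of mold, and it assumes that turbo variants can be recognized by the substring `turbo` in the tag, as in the Variants table.

```bash
# Hypothetical helper (not part of mold): default step count for a model tag.
# Assumes turbo variants contain "turbo" in their tag, as in the Variants table.
default_steps() {
  case "$1" in
    *turbo*) echo 4 ;;   # turbo variants default to 4 steps
    *)       echo 28 ;;  # all other SD 3.5 variants default to 28 steps
  esac
}

default_steps sd3.5-large-turbo:q8   # prints 4
default_steps sd3.5-medium:q8        # prints 28
```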

Recommended Dimensions

Width	Height	Aspect Ratio
1024	1024	1:1 (native)
1152	896	9:7
896	1152	7:9
1216	832	19:13
832	1216	13:19
1344	768	7:4
768	1344	4:7

Using non-recommended dimensions will trigger a warning. All values must be multiples of 16.
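
The two rules above can be checked before starting a run. The following is a minimal sketch, not mold's own validation: `check_dims` is a hypothetical helper that reports `invalid` when either side is not a multiple of 16, `recommended` for a pair from the table, and `warning` for any other pair.

```bash
# Hypothetical helper (not part of mold): classify a width/height pair
# against the recommended SD 3.5 dimensions listed in the table.
check_dims() {
  w=$1
  h=$2
  # Both sides must be multiples of 16.
  if [ $((w % 16)) -ne 0 ] || [ $((h % 16)) -ne 0 ]; then
    echo "invalid"
    return 1
  fi
  case "${w}x${h}" in
    1024x1024|1152x896|896x1152|1216x832|832x1216|1344x768|768x1344)
      echo "recommended" ;;
    *)
      # A multiple of 16, but not in the table: expect a warning.
      echo "warning" ;;
  esac
}

check_dims 1024 1024          # prints recommended
check_dims 1280 720           # prints warning
check_dims 1000 1000 || true  # prints invalid
```

The pairs are matched as literal strings, so the list has to be kept in sync with the table by hand.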

Example

SD 3.5 Large Q8 — 28 steps, seed 2024:

```bash
mold run sd3.5-large:q8 \
  "A steampunk clocktower in a Victorian city at sunset, \
  gears and cogs visible through glass walls, dramatic clouds" \
  --seed 2024
```

Clocktower — SD 3.5

Notes

SD 3.5 uses classifier-free guidance, so negative prompts are supported. The quantized MMDiT includes NaN-safe checks for numerical stability. When VRAM is tight, use `--offload` with BF16 weights; offload requests for GGUF and LoRA are rejected.
