Skip to content

Server API

When running mold serve, you get a REST API for remote image generation.

Endpoints

MethodPathDescription
POST/api/generateGenerate images from prompt
POST/api/generate/streamGenerate with SSE progress streaming
POST/api/expandExpand a prompt using LLM
GET/api/modelsList available models
POST/api/models/loadLoad/swap the active model
POST/api/models/pullPull/download a model
DELETE/api/models/unloadUnload model to free GPU memory
GET/api/statusServer health + status
GET/healthSimple 200 OK health check
GET/api/openapi.jsonOpenAPI spec
GET/api/docsInteractive API docs (Scalar)

Quick Examples

bash
# Generate an image
curl -X POST http://localhost:7680/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a glowing robot"}' \
  -o robot.png

# Check status
curl http://localhost:7680/api/status

# List models
curl http://localhost:7680/api/models

# Load a specific model
curl -X POST http://localhost:7680/api/models/load \
  -H "Content-Type: application/json" \
  -d '{"model": "flux-dev:q4"}'

# Interactive docs
open http://localhost:7680/api/docs

/api/generate

POST /api/generate returns raw image bytes, not a JSON envelope. The response Content-Type matches the requested format, and the server includes an x-mold-seed-used header with the effective seed.

bash
curl -i -X POST http://localhost:7680/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a glowing robot in a rainy alley",
    "model": "flux-schnell:q8",
    "width": 1024,
    "height": 1024,
    "steps": 4,
    "guidance": 0.0,
    "output_format": "png"
  }' \
  -o robot.png

Representative headers:

http
HTTP/1.1 200 OK
content-type: image/png
x-mold-seed-used: 42

Generate Request Shape

json
{
  "prompt": "a cat on a skateboard",
  "model": "flux-schnell:q8",
  "width": 1024,
  "height": 1024,
  "steps": 4,
  "seed": 42,
  "guidance": 0.0,
  "batch": 1,
  "negative_prompt": "",
  "image": "<base64>",
  "strength": 0.75,
  "mask": "<base64>",
  "lora": "/path/to/adapter.safetensors",
  "lora_scale": 1.0,
  "expand": false
}

Only prompt is required. All other fields have defaults.

/api/generate/stream

The /api/generate/stream endpoint sends Server-Sent Events for progress:

text
event: progress
data: {"type":"queued","position":1}

event: progress
data: {"type":"stage_start","name":"Loading model weights"}

event: progress
data: {"type":"denoise_step","step":1,"total":25,"elapsed_ms":640}

event: complete
data: {"images":[{"data":[137,80,78,71],"format":"png","width":1024,"height":1024,"index":0}],"generation_time_ms":12345,"model":"flux-dev:q4","seed_used":42}

Typical terminal usage:

bash
curl -N http://localhost:7680/api/generate/stream \
  -H "Content-Type: application/json" \
  -d '{"prompt":"a glowing robot","model":"flux-dev:q4","steps":25,"width":1024,"height":1024}'

The final complete event matches the GenerateResponse JSON shape used by the server internally.

RunPod Note

RunPod's proxy has a 100-second timeout. Use the SSE streaming endpoint for long generations to keep the connection alive.

/api/status

Example response:

json
{
  "version": "0.3.1",
  "git_sha": "da039e1",
  "build_date": "2026-03-25",
  "models_loaded": ["flux-schnell:q8"],
  "busy": false,
  "gpu_info": {
    "name": "NVIDIA GeForce RTX 4090",
    "vram_total_mb": 24564,
    "vram_used_mb": 8192
  },
  "uptime_secs": 3600
}

/api/models/pull

Plain blocking response:

bash
curl -X POST http://localhost:7680/api/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model":"flux-schnell:q8"}'

Example text response:

text
model 'flux-schnell:q8' pulled successfully

SSE streaming response:

bash
curl -N http://localhost:7680/api/models/pull \
  -H "Accept: text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"model":"flux-schnell:q8"}'

Representative events:

text
event: progress
data: {"type":"download_progress","filename":"flux1-schnell-Q8_0.gguf","file_index":1,"total_files":6,"bytes_downloaded":1048576,"bytes_total":12714452256}

event: progress
data: {"type":"pull_complete","model":"flux-schnell:q8"}

Image Output

Generated images are saved to ~/.mold/output/ by default. Override with a custom path:

bash
MOLD_OUTPUT_DIR=/srv/mold/output mold serve

To disable image persistence (TUI gallery will not function):

bash
MOLD_OUTPUT_DIR="" mold serve