Upscaling

mold supports image upscaling using Real-ESRGAN super-resolution models. Upscale generated images or existing photos to 2x or 4x their original resolution with AI-enhanced detail.

Quick Start

```bash
# Upscale an image with the default model (Real-ESRGAN x4+)
mold upscale photo.png

# Specify a model
mold upscale photo.png -m real-esrgan-anime-v3:fp32

# Upscale and save to a specific path
mold upscale photo.png -o photo_hires.png

# Pipe from generation to upscale
mold run "a cat" | mold upscale -
```

Available Models

| Model | Architecture | Scale | Size | Speed | Quality |
| --- | --- | --- | --- | --- | --- |
| real-esrgan-x4plus:fp16 | RRDBNet (23 blocks) | 4x | 32 MB | Medium | Best |
| real-esrgan-x4plus:fp32 | RRDBNet (23 blocks) | 4x | 64 MB | Medium | Best |
| real-esrgan-x2plus:fp16 | RRDBNet (23 blocks) | 2x | 32 MB | Medium | Best |
| real-esrgan-x4plus-anime:fp16 | RRDBNet (6 blocks) | 4x | 8.5 MB | Fast | Great (anime) |
| real-esrgan-anime-v3:fp32 | SRVGGNetCompact | 4x | 2.4 MB | Fastest | Good (anime) |

Choosing a Model

  • Photos and realistic images: real-esrgan-x4plus:fp16 — the full 23-block RRDBNet produces the sharpest detail recovery on photographs, textures, and AI-generated photorealistic output. Use fp32 only if you see precision artifacts on Metal.
  • Anime, illustrations, and flat art: real-esrgan-x4plus-anime:fp16 — trained on anime data, preserves clean lines and flat color regions without adding unwanted texture. Lighter than x4plus (6 blocks vs 23) so it's faster too.
  • Batch processing or quick previews: real-esrgan-anime-v3:fp32 — the SRVGGNetCompact architecture is ~3x faster than RRDBNet and only 2.4 MB. Quality is lower but adequate for bulk upscaling or when speed matters more than fine detail.
  • Subtle 2x enhancement: real-esrgan-x2plus:fp16 — when 4x magnification is too aggressive. Good for upscaling already high-res images (e.g. 1024px → 2048px) where you want sharpening without extreme enlargement.
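
The recommendations above can be captured in a small shell helper for scripts that handle mixed content. This is only a sketch: `pick_upscaler` and its category names (`photo`, `anime`, `preview`, `subtle`) are hypothetical and not part of mold; only the model names and the `-m` flag come from this page.

```bash
# Hypothetical helper (not part of mold): map a content category
# to one of the model names listed in the table above.
pick_upscaler() {
  case "$1" in
    photo)   echo "real-esrgan-x4plus:fp16" ;;        # photos, realistic output
    anime)   echo "real-esrgan-x4plus-anime:fp16" ;;  # illustrations, flat art
    preview) echo "real-esrgan-anime-v3:fp32" ;;      # bulk jobs, quick previews
    subtle)  echo "real-esrgan-x2plus:fp16" ;;        # gentle 2x enhancement
    *)       echo "unknown content type: $1" >&2; return 1 ;;
  esac
}

# Usage (only the documented -m flag is used):
#   mold upscale drawing.png -m "$(pick_upscaler anime)"
```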

CLI Reference

```
mold upscale <IMAGE> [OPTIONS]

Arguments:
  <IMAGE>  Input image file path (or - for stdin)

Options:
  -m, --model <MODEL>      Upscaler model [default: real-esrgan-x4plus:fp16]
  -o, --output <PATH>      Output file path [default: <input>_upscaled.<ext>]
      --format <FORMAT>    Output format: png or jpeg [default: png]
      --tile-size <N>      Tile size for tiled inference (0 to disable) [default: 512]
      --host <URL>         Server URL override
      --local              Skip server, run inference locally
```

Tiled Inference

Large images are automatically split into overlapping tiles for memory-efficient processing. The default tile size is 512 pixels with 32 pixels of overlap. Tiles are blended using linear gradient weights to eliminate visible seams.

```bash
# Custom tile size (smaller = less VRAM, slower)
mold upscale large_photo.png --tile-size 256

# Disable tiling (process entire image at once -- needs more VRAM)
mold upscale small_image.png --tile-size 0
```
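
To get a feel for how much work tiling involves, the arithmetic can be sketched in shell. It rests on assumptions that this page does not confirm: tiles advance by a stride of `tile - overlap`, the last tile is clamped to the image edge, and the blend weight ramps linearly from 0 to 100 percent across the overlap band. The function names are illustrative, and mold's actual layout may differ.

```bash
# Number of tiles needed along one axis (width or height).
# Assumes stride = tile - overlap, with the final tile clamped to the edge.
tiles_per_axis() {
  local dim=$1 tile=$2 overlap=$3
  # Tiling disabled (0) or image smaller than one tile: a single pass.
  if [ "$tile" -eq 0 ] || [ "$dim" -le "$tile" ]; then
    echo 1
    return
  fi
  local stride=$((tile - overlap))
  # Ceiling division of the pixels beyond the first tile, plus that first tile.
  echo $(( (dim - tile + stride - 1) / stride + 1 ))
}

# Weight (in percent) given to the incoming tile at a pixel offset inside
# the overlap band; the outgoing tile gets 100 minus this value.
blend_weight_pct() {
  local offset=$1 overlap=$2
  echo $(( offset * 100 / (overlap - 1) ))
}

per_axis=$(tiles_per_axis 2048 512 32)
echo "2048x2048 input: $((per_axis * per_axis)) tiles"   # 2048x2048 input: 25 tiles
```

Under these assumptions a 2048x2048 input needs 5 tiles per axis at the default settings. A smaller tile size raises the tile count, which is why it trades speed for lower VRAM use.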

Memory Requirements

Upscaler models are lightweight compared to diffusion models:

  • RRDBNet (x4plus): ~32-64 MB model + ~200 MB activations per 512x512 tile
  • SRVGGNetCompact: ~2-5 MB model + ~50 MB activations per 512x512 tile

With the default 512px tiling, peak VRAM is governed by the tile size rather than the image dimensions, so any GPU with 1 GB+ VRAM can upscale images of any size.
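
As a rough guide to how `--tile-size` affects these figures, the estimate below adds the model size to the per-tile activation cost. It assumes activation memory scales linearly with tile area relative to the 512x512 figures quoted above. That is a simplification, not a measured property of mold, and `estimate_vram_mb` is a hypothetical helper.

```bash
# Rough peak-VRAM estimate in MB for a single tile.
# Assumption: activations scale with tile area relative to a 512x512 tile.
estimate_vram_mb() {
  local model_mb=$1 act_mb_at_512=$2 tile=$3
  echo $(( model_mb + act_mb_at_512 * tile * tile / (512 * 512) ))
}

estimate_vram_mb 64 200 512   # RRDBNet x4plus fp32, default tile  -> 264
estimate_vram_mb 64 200 256   # same model, --tile-size 256        -> 114
estimate_vram_mb 5 50 512     # SRVGGNetCompact, upper model size  -> 55
```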

Post-Generation Upscaling

Coming Soon

The --upscale flag is defined but not yet wired into the generation pipeline. Use the pipe workflow below instead.

Once implemented, the --upscale flag on mold run will upscale images immediately after generation:

```bash
mold run "a cat" \
  --upscale real-esrgan-x4plus:fp16
```

This will generate at the model's native resolution (e.g. 1024x1024) and then upscale the result (to 4096x4096 at 4x). Until the flag is wired in, pipe mold run into mold upscale as described in the Piping section.

Piping

mold upscale is fully pipe-compatible:

```bash
# Generate and upscale in a pipeline
mold run "a sunset" | mold upscale - | viu -

# Read from stdin, write to file
cat photo.png | mold upscale - -o upscaled.png

# Chain with other tools
mold upscale photo.png | convert - -resize 50% final.png
```

Server API

When a mold server is running, the upscale command uses the server for inference:

```bash
# Server handles the upscaling
MOLD_HOST=http://gpu-server:7680 mold upscale photo.png

# Direct API call
curl -X POST http://localhost:7680/api/upscale \
  -H "Content-Type: application/json" \
  -d '{"model": "real-esrgan-x4plus:fp16", "image": "<base64>"}'
```
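
The `<base64>` placeholder in the direct API call has to be filled with the encoded image. One way to build the request body is sketched below. `build_upscale_payload` is a hypothetical helper; the request fields are the two shown above, and the response format is not documented on this page, so the sketch stops at sending the request.

```bash
# Hypothetical helper: print the JSON body for POST /api/upscale.
build_upscale_payload() {
  local model=$1 image_path=$2 image_b64
  # Strip newlines so the base64 text forms a single JSON string
  # regardless of whether base64 wraps its output (GNU) or not (BSD).
  image_b64=$(base64 < "$image_path" | tr -d '\n')
  printf '{"model": "%s", "image": "%s"}\n' "$model" "$image_b64"
}

# Usage: stream the body on stdin so that large images do not hit
# the shell's argument-length limit.
#   build_upscale_payload real-esrgan-x4plus:fp16 photo.png |
#     curl -X POST http://localhost:7680/api/upscale \
#       -H "Content-Type: application/json" \
#       --data-binary @-
```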

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| MOLD_UPSCALE_MODEL | real-esrgan-x4plus:fp16 | Default upscaler model |
| MOLD_UPSCALE_TILE_SIZE | 512 | Default tile size |
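
A typical use is to set these once per shell session or in a shell profile. The sketch below assumes the variables act as defaults whenever the corresponding flag is omitted, as the table implies. How they interact with an explicit `-m` or `--tile-size` is not stated on this page.

```bash
# Assumed usage: set session-wide defaults for anime-style work on a small GPU.
export MOLD_UPSCALE_MODEL="real-esrgan-x4plus-anime:fp16"
export MOLD_UPSCALE_TILE_SIZE=256

# Later calls can then omit -m and --tile-size, e.g.:
#   mold upscale drawing.png
```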