Emmett Fear

How to Run Automatic1111 (Stable Diffusion Web UI) on Runpod

Automatic1111 (A1111) is the most widely used web interface for Stable Diffusion, offering a comprehensive set of tools for image generation, img2img, inpainting, ControlNet, LoRA, and model training workflows. This guide walks through how to run Automatic1111 on Runpod's GPU cloud, from first deployment to loading custom models and extensions.

What is Automatic1111?

Automatic1111 is an open-source web UI for Stable Diffusion, maintained at github.com/AUTOMATIC1111/stable-diffusion-webui. It wraps the core Stable Diffusion model in a browser-based interface, exposing generation parameters, sampling methods, model switching, and extension support without requiring any command-line interaction.

Key capabilities include:

  • txt2img: Generate images from text prompts with control over sampler, steps, CFG scale, resolution, and seed
  • img2img: Transform existing images using a prompt and denoising strength
  • Inpainting: Edit specific regions of an image using a mask
  • ControlNet: Condition image generation on depth maps, pose skeletons, edge maps, and other structural inputs
  • LoRA and Textual Inversion: Load community-trained style and concept adapters alongside the base model
  • Model switching: Load any checkpoint (SD 1.5, SD 2.1, SDXL, Pony, Flux-compatible) from the interface without restarting
  • Extensions: A large library of community extensions including upscalers, regional prompting, and automation scripts

Automatic1111 is typically run locally and requires a GPU. For users without a dedicated GPU, or who want more VRAM than their local hardware provides, running it on a cloud GPU through Runpod removes the hardware dependency entirely.

Automatic1111 vs Stable Diffusion Forge

Stable Diffusion Forge is a community fork of Automatic1111, maintained at github.com/lllyasviel/stable-diffusion-webui-forge. It is largely compatible with A1111 (uses the same extension ecosystem and checkpoint format) but introduces backend optimizations that reduce VRAM usage and improve generation speed, particularly for SDXL and newer model architectures.

The practical differences:

  • VRAM efficiency: Forge uses less VRAM for equivalent models, making it more practical on GPUs with 8-12 GB. On cloud GPUs where VRAM is less constrained, the difference is smaller.
  • Speed: Forge generates images faster than A1111 on most model configurations, particularly SDXL.
  • Model compatibility: Forge added support for Flux models earlier and more completely than A1111.
  • Extensions: Most A1111 extensions work in Forge, but some are incompatible. The A1111 ecosystem remains larger and more stable.
  • Stability: A1111 has a longer track record and broader community support. Forge updates more frequently, which can introduce breaking changes.

For most workflows on Runpod, either A1111 or Forge works well. A1111 is the better starting point if you have existing A1111 workflows, extensions, or presets. Forge is worth trying if you need maximum speed or are working with Flux models. Runpod maintains an official Forge template; A1111 templates on Runpod are community-maintained.

How to Run Automatic1111 on Runpod

1. Create a Runpod account

Go to runpod.io and sign up. Add credits to your account before deploying a pod.

2. Find an Automatic1111 template

Browse available templates at console.runpod.io/hub. Search for "A1111" or "Stable Diffusion" to see current options.

A few things to know about the template landscape:

  • A1111 templates are community-maintained. Runpod does not currently maintain an official A1111 template. The available options (including several variants by different maintainers) are contributed by the community. Well-used community templates are generally reliable, but check the image version and last-updated date before deploying.
  • Forge has an official Runpod template. If you want to use Stable Diffusion Forge instead of A1111, Runpod maintains an official template (runpod/forge:latest) that is kept current. This is the lower-maintenance option for new deployments.
  • Template options change over time. The Runpod Hub is a live catalog. New templates are added by the community regularly, and older versions may be deprecated. Check the hub directly for the most current options rather than relying on a specific image tag referenced in any guide.

3. Choose a GPU

Recommended GPUs for Automatic1111 on Runpod:

  • SD 1.5 / SD 2.1 (standard generation): RTX 3080 (10 GB) or RTX 4000 Ada (20 GB) provide good speed at a low per-hour cost
  • SDXL / Pony: RTX 3090 (24 GB) or RTX 4090 (24 GB) recommended for comfortable SDXL inference at 1024x1024
  • Flux: RTX 4090 (24 GB) minimum for Flux.1-dev at reasonable speed; H100 (80 GB) for fastest inference
  • DreamBooth / LoRA training: RTX 3090 (24 GB) or A100 (40-80 GB) for faster training runs

Runpod shows live GPU availability and per-hour pricing before you deploy. You can also choose between Secure Cloud (datacenter-grade reliability) and Community Cloud (lower cost, slightly less guaranteed availability).

4. Configure storage

Before deploying, add a Network Volume if you plan to use multiple sessions. Network Volumes persist between pod deployments, meaning your downloaded models, LoRAs, embeddings, and generated images are available every time you spin up a new pod. Without a volume, all data is lost when the pod is terminated.

A volume of 50-100 GB is sufficient for a few checkpoints and a library of LoRAs. Mount it to /workspace so A1111 can reference it automatically.
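With the volume mounted at /workspace, the folder layout can be prepared in a few lines. This is a minimal sketch, assuming the template clones the web UI into /workspace/stable-diffusion-webui (the same path used for checkpoints later in this guide); the subdirectory names follow the standard A1111 layout:

```python
from pathlib import Path

def model_dirs(webui_root):
    """Return the standard A1111 asset directories under the web UI root."""
    root = Path(webui_root)
    return {
        "checkpoints": root / "models" / "Stable-diffusion",
        "loras":       root / "models" / "Lora",
        "embeddings":  root / "embeddings",
        "controlnet":  root / "models" / "ControlNet",
        "outputs":     root / "outputs",
    }

def ensure_model_dirs(webui_root="/workspace/stable-diffusion-webui"):
    """Create any missing directories so downloads have a place to land."""
    dirs = model_dirs(webui_root)
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```

Running `ensure_model_dirs()` once after the pod starts guarantees every download target exists before you begin pulling models.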

5. Launch and connect

After deploying, wait for the pod status to show Running. In the pod dashboard, click Connect and open the HTTP service on port 7860. This opens the Automatic1111 web interface in your browser.

The first load may take a minute or two as the model is initialized. Once the interface appears, you can enter a prompt and click Generate to produce your first image.
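Beyond the browser interface, A1111 also exposes a REST API when it is launched with the --api flag; whether that flag is set depends on the template, so check its launch arguments. A hedged sketch of calling txt2img programmatically, assuming the flag is enabled and using the proxied pod URL shown under Connect (typically of the form https://&lt;pod-id&gt;-7860.proxy.runpod.net):

```python
import base64
import json
import urllib.request

def txt2img_payload(prompt, steps=25, width=512, height=512,
                    cfg_scale=7.0, seed=-1):
    """Minimal request body for A1111's /sdapi/v1/txt2img endpoint."""
    return {"prompt": prompt, "steps": steps, "width": width,
            "height": height, "cfg_scale": cfg_scale, "seed": seed}

def generate(base_url, prompt, out_path="output.png", **kwargs):
    """POST a prompt to the web UI API and save the first returned image."""
    body = json.dumps(txt2img_payload(prompt, **kwargs)).encode()
    req = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img", data=body,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        images = json.load(resp)["images"]  # base64-encoded PNGs
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(images[0]))
    return out_path
```

This is useful for scripted batch runs against a pod you already have open in the browser.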

Loading Models in Automatic1111

The default template ships with SD 1.5. To use other models:

  • SDXL: Download the SDXL base checkpoint (and optionally the refiner) from Stability AI or Hugging Face. Place the .safetensors file in /workspace/stable-diffusion-webui/models/Stable-diffusion/. Click the refresh icon next to the checkpoint dropdown in the UI and select the model.
  • Community models: Models from Civitai (Pony Diffusion, Realistic Vision, anime checkpoints, etc.) follow the same process. Download the file and place it in the checkpoints folder.
  • Flux: Flux requires the GGUF or diffusers format. Some A1111 versions and Forge handle this via extension. Check the relevant extension repository for current installation instructions as this workflow changes frequently.
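The download step can be scripted instead of done by hand in the terminal. A sketch, assuming you have a direct download URL from Hugging Face or Civitai (gated Hugging Face repos additionally require an Authorization header with your access token, not shown here):

```python
import urllib.request
from pathlib import Path

CHECKPOINT_DIR = Path("/workspace/stable-diffusion-webui/models/Stable-diffusion")

def download_checkpoint(url, filename, dest=CHECKPOINT_DIR):
    """Stream a .safetensors checkpoint into the A1111 checkpoint folder."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / filename
    with urllib.request.urlopen(url) as resp, open(target, "wb") as out:
        # Read in 1 MiB chunks so multi-GB files don't sit in memory.
        while chunk := resp.read(1 << 20):
            out.write(chunk)
    return target
```

After the download finishes, click the refresh icon next to the checkpoint dropdown and the new file appears in the list.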

Installing Extensions

Automatic1111 extensions are installed through the Extensions tab in the interface. You can install from the available list or directly from a GitHub URL. Widely used extensions include:

  • ControlNet: Adds structural control inputs. Install from the Extensions tab and download the ControlNet models separately into models/ControlNet/.
  • ADetailer: Automatic face and hand detail improvement on generated images.
  • sd-webui-regional-prompter: Enables region-specific prompting in a single generation.
  • LoRA training scripts: For fine-tuning on custom subjects or styles.
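Installing from a URL amounts to a git clone into the extensions folder, so it can also be scripted when you are provisioning a fresh pod. A sketch, using the real sd-webui-controlnet repository URL as the example; restart the web UI (or use Reload UI) after cloning:

```python
import subprocess
from pathlib import Path

EXTENSIONS_DIR = Path("/workspace/stable-diffusion-webui/extensions")

def clone_command(repo_url, extensions_dir=EXTENSIONS_DIR):
    """Build the git command that installs an extension from a GitHub URL."""
    name = repo_url.rstrip("/").removesuffix(".git").rsplit("/", 1)[-1]
    return ["git", "clone", "--depth", "1",
            repo_url, str(Path(extensions_dir) / name)]

def install_extension(repo_url, extensions_dir=EXTENSIONS_DIR):
    """Clone the extension into the extensions folder."""
    subprocess.run(clone_command(repo_url, extensions_dir), check=True)

# Example:
# install_extension("https://github.com/Mikubill/sd-webui-controlnet")
```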

Extensions installed during a session will be lost if you are not using a persistent Network Volume. With a volume mounted, extensions installed to the /workspace path persist between sessions.

GPU Recommendations by Use Case

  • Casual generation (SD 1.5, SD 2.1): RTX 3080 or RTX 4000 Ada. Fast generation at low cost.
  • SDXL and high-resolution: RTX 3090 or RTX 4090 (24 GB). SDXL at 1024x1024 runs comfortably without memory pressure.
  • Flux models: RTX 4090 minimum. Flux.1-dev at full precision requires close to 24 GB.
  • LoRA and DreamBooth training: RTX 3090, A100 40GB, or A100 80GB. Training is significantly faster on higher-VRAM GPUs and training time directly reduces cost on pay-per-hour billing.
  • Batch generation and upscaling: A100 80GB or H100 for maximum throughput on large batches or Extras upscaling pipelines.

Serverless Stable Diffusion on Runpod

For developers who want to serve Stable Diffusion through an API rather than interact with the web UI directly, Runpod Serverless provides an alternative deployment model. Instead of a running pod, you deploy a worker container that scales on demand and returns to zero when idle.

This is useful for integrating Stable Diffusion into applications, automating generation pipelines, or reducing cost for irregular workloads where a persistent pod would accumulate idle GPU hours. Runpod maintains a serverless Stable Diffusion worker at github.com/runpod-workers/worker-a1111.
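A minimal client sketch for the synchronous serverless endpoint, assuming an endpoint created from worker-a1111 and that the worker accepts a prompt field in its input (the exact input schema is defined by the worker, so check its README):

```python
import json
import urllib.request

def runsync_request(endpoint_id, api_key, prompt):
    """Build the request for a synchronous Runpod serverless call."""
    body = json.dumps({"input": {"prompt": prompt}}).encode()
    return urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"})

def generate_serverless(endpoint_id, api_key, prompt):
    """Send a prompt and return the worker's JSON output."""
    with urllib.request.urlopen(runsync_request(endpoint_id, api_key, prompt)) as resp:
        return json.load(resp)  # output schema is set by the worker image
```

Because billing is per second of worker execution, this pattern only costs money while a generation is actually running.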

FAQs

What is Automatic1111?

Automatic1111 is an open-source web interface for Stable Diffusion, providing browser-based access to image generation, img2img, inpainting, ControlNet, LoRA, and model training without requiring command-line usage. It is the most widely used Stable Diffusion web UI in the community.

How do I install Automatic1111 without a GPU?

You don't need a local GPU. Deploying via Runpod gives you browser access to Automatic1111 running on a cloud GPU. Browse available templates at console.runpod.io/hub, search for A1111, select a community template, choose a GPU, and connect via port 7860. No local installation is needed.

What is the difference between Automatic1111 and Forge?

Stable Diffusion Forge is a fork of Automatic1111 with backend optimizations for better VRAM efficiency and generation speed, particularly for SDXL and Flux models. Most A1111 extensions work in Forge. A1111 has a larger extension ecosystem and longer stability track record. Runpod maintains an official Forge template; A1111 templates are community-maintained.

Which GPU should I use for SDXL on Runpod?

An RTX 3090 or RTX 4090 (24 GB VRAM) is the recommended choice for SDXL generation. SDXL at 1024x1024 requires approximately 8-12 GB for the base model, leaving headroom for ControlNet and refiners on a 24 GB card.

Can I use my own models and LoRAs on Runpod?

Yes. You can download any model or LoRA from Civitai, Hugging Face, or other sources and place them in the appropriate directory within your pod. Using a persistent Network Volume ensures your model library is available across sessions without re-downloading.

How much does it cost to run Automatic1111 on Runpod?

Cost depends on the GPU you select and how long the pod runs. Typical rates start around $0.20-$0.50/hr for mid-tier GPUs, with RTX 4090 instances at roughly $0.44/hr. You are only billed while the pod is active, so a three-hour SDXL session on an RTX 4090 costs about $1.32. Stopping the pod when not in use keeps costs low. See current rates on the Runpod pricing page.

Do my models and outputs persist between sessions?

Not by default. Pod storage is ephemeral and lost when a pod is terminated. To persist models, LoRAs, embeddings, and generated images across sessions, attach a Runpod Network Volume before deploying and mount it to /workspace.
