GPU Cloud Pricing

Simple pricing plans for teams of all sizes,
designed to scale with you.

Pods

Thousands of GPUs across 30+ regions.

GPU

>80GB VRAM

H200: 141 GB VRAM, 276 GB RAM, 24 vCPUs, $3.59/hr
B200: 180 GB VRAM, 283 GB RAM, 28 vCPUs, $5.98/hr
RTX Pro 6000: 96 GB VRAM, 188 GB RAM, 16 vCPUs, $1.69/hr
H100 NVL: 94 GB VRAM, 94 GB RAM, 16 vCPUs, $2.59/hr

80GB VRAM

H100 PCIe: 80 GB VRAM, 188 GB RAM, 16 vCPUs, $1.99/hr
H100 SXM: 80 GB VRAM, 125 GB RAM, 20 vCPUs, $2.69/hr
A100 PCIe: 80 GB VRAM, 117 GB RAM, 8 vCPUs, $1.19/hr
A100 SXM: 80 GB VRAM, 125 GB RAM, 16 vCPUs, $1.39/hr

48GB VRAM

L40S: 48 GB VRAM, 94 GB RAM, 16 vCPUs, $0.79/hr
RTX 6000 Ada: 48 GB VRAM, 167 GB RAM, 10 vCPUs, $0.74/hr
A40: 48 GB VRAM, 50 GB RAM, 9 vCPUs, $0.35/hr
L40: 48 GB VRAM, 94 GB RAM, 8 vCPUs, $0.69/hr
RTX A6000: 48 GB VRAM, 50 GB RAM, 9 vCPUs, $0.33/hr

32GB VRAM

RTX 5090: 32 GB VRAM, 35 GB RAM, 9 vCPUs, $0.69/hr

24GB VRAM

L4: 24 GB VRAM, 50 GB RAM, 12 vCPUs, $0.44/hr
RTX 3090: 24 GB VRAM, 125 GB RAM, 16 vCPUs, $0.22/hr
RTX 4090: 24 GB VRAM, 41 GB RAM, 6 vCPUs, $0.34/hr
RTX A5000: 24 GB VRAM, 25 GB RAM, 9 vCPUs, $0.16/hr
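
As a rough illustration of how the hourly Pod rates translate into a bill, the sketch below multiplies a listed rate by the number of hours a Pod runs. The function name `pod_cost` and the dictionary are hypothetical, only a few GPUs are included, and the estimate covers the GPU rate alone; storage and any other charges are left out.

```python
# Hourly Pod rates in USD, copied from the pricing list above.
# Only a handful of GPUs are included to keep the sketch short.
POD_RATES_PER_HOUR = {
    "H200": 3.59,
    "B200": 5.98,
    "H100 SXM": 2.69,
    "A100 SXM": 1.39,
    "RTX 4090": 0.34,
}


def pod_cost(gpu: str, hours: float) -> float:
    """Estimated GPU charge in USD: hourly rate x hours (storage excluded)."""
    return round(POD_RATES_PER_HOUR[gpu] * hours, 2)


print(pod_cost("H100 SXM", hours=8))        # 21.52
print(pod_cost("RTX 4090", hours=24 * 30))  # 244.8
```

The second call estimates a 4090 Pod left running for a 30-day month.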

Serverless

Cost effective for every inference workload. Save 25% over other Serverless cloud providers on flex workers alone.

GPU

Workers

B200 (180 GB): $8.64/hr. Maximum throughput for big models.
H200 (141 GB): $5.58/hr. Extreme throughput for big models.
RTX Pro 6000 (96 GB, PRO): $4.00/hr. High throughput for large model inference workloads.
H100 (80 GB, PRO): $4.18/hr. Extreme throughput for big models.
A100 (80 GB): $2.72/hr. High throughput GPU, yet still very cost-effective.
L40, L40S, 6000 Ada (48 GB, PRO): $1.90/hr. Extreme inference throughput on LLMs like Llama 3 7B.
A6000, A40 (48 GB): $1.22/hr. A cost-effective option for running big models.
5090 (32 GB, PRO): $1.58/hr. Extreme throughput for small-to-medium models.
4090 (24 GB, PRO): $1.10/hr. Extreme throughput for small-to-medium models.
L4, A5000, 3090 (24 GB): $0.69/hr. Great for small-to-medium sized inference workloads.
A4000, A4500, RTX 4000, RTX 2000 (16 GB): $0.58/hr. The most cost-effective for small models.
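
To get a feel for what the hourly Serverless rates mean for a request-driven workload, the sketch below converts a rate to a per-second figure and multiplies by the time workers spend busy. It assumes compute time is charged pro rata per second, which this page does not state; `serverless_cost` and the dictionary are hypothetical names, and cold-start or idle time is ignored.

```python
SECONDS_PER_HOUR = 3600

# Hourly Serverless rates in USD, copied from the list above (subset only).
SERVERLESS_RATES_PER_HOUR = {
    "B200": 8.64,
    "H100": 4.18,
    "A100": 2.72,
    "4090": 1.10,
}


def serverless_cost(gpu: str, seconds_per_request: float, requests: int) -> float:
    """Estimated cost in USD, ASSUMING per-second pro rata billing."""
    per_second = SERVERLESS_RATES_PER_HOUR[gpu] / SECONDS_PER_HOUR
    return per_second * seconds_per_request * requests


# 100,000 requests that each keep a 4090 worker busy for 2 seconds.
print(f"${serverless_cost('4090', 2.0, 100_000):.2f}")  # $61.11
```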

Clusters

Launch multi-GPU clusters in minutes with no commitments—scale up to 64 GPUs, attach shared storage, and pay only for what you use.

GPU

H200 SXM: $4.31/hr
A100 SXM: $1.79/hr
L40S: Contact sales
H100 SXM: Contact sales
B200: Contact sales

Reserved Clusters

Dedicated GPU clusters with guaranteed availability, custom configurations, SLA-backed uptime, and discounted rates for enterprises scaling to 10,000+ GPUs.

GPU

H200 SXM: Contact sales
A100 SXM: Contact sales
L40S: Contact sales
H100 SXM: Contact sales
B200: Contact sales

Storage

Flexible and persistent storage options starting at $0.05/GB/mo with standard and high-performance tiers.

Storage Type

Container Disk: $0.10/GB/mo
Volume Disk: $0.10/GB/mo running, $0.20/GB/mo idle
Network Storage (Standard): $0.07/GB/mo under 1TB, $0.05/GB/mo over 1TB
Network Storage (High-Performance): $0.14/GB/mo
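
The two-rate Network Storage (Standard) pricing can be read in more than one way, so the sketch below picks one reading and labels it. It assumes the rate is chosen by total volume size and applied to the whole volume, that 1 TB means 1000 GB, and that a volume of exactly 1 TB gets the lower rate. None of these details are stated on this page, and `network_storage_monthly_cost` is a hypothetical name.

```python
GB_PER_TB = 1000  # ASSUMPTION: decimal terabytes

RATE_UNDER_1TB = 0.07  # USD per GB per month, from the list above
RATE_OVER_1TB = 0.05   # USD per GB per month, from the list above


def network_storage_monthly_cost(size_gb: float) -> float:
    """Monthly cost in USD for Network Storage (Standard).

    ASSUMPTION: one rate applies to the whole volume, chosen by its size.
    """
    rate = RATE_UNDER_1TB if size_gb < GB_PER_TB else RATE_OVER_1TB
    return round(size_gb * rate, 2)


print(f"${network_storage_monthly_cost(500):.2f}")   # $35.00
print(f"${network_storage_monthly_cost(2000):.2f}")  # $100.00
```

Under this reading a 999 GB volume costs more per month than a 1000 GB one, so it is worth confirming how the boundary is actually billed before relying on the estimate.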

Public Endpoints

Instant access to pre-deployed AI models via API—no infrastructure setup required.

Model Name

Audio
Pruna / Whisper V3 Large
$0.05 per 1000 characters
resembleai / Chatterbox Turbo
$0.00 per 1000 characters
minimax / Minimax Speech 02 HD
$0.05 per 1000 characters
Image
bytedance / Seedream 4.0 Edit
$0.0270 per request
bytedance / Seedream 4.0 T2I
$0.0270 per request
google / Nano Banana Edit
$0.0380 per request
google / Nano Banana Pro Edit
$0.14 per request
pruna / Pruna Image T2I
$0.0050 per request
pruna / Pruna Image Edit
$0.01 per request
alibaba / WAN 2.6 T2I
$0.03 per request
qwen / Qwen Image Edit 2511
$0.02 per request
qwen / Qwen Image Edit 2511 LoRA
$0.025 per request
Tongyi-MAI / Z Image Turbo
$0.0050 per request
black-forest-labs / FLUX.1 [dev]
$0.02 per megapixel
black-forest-labs / FLUX.1 Kontext [dev]
$0.0250 per request
black-forest-labs / FLUX.1 Schnell
$0.0024 per megapixel
Bytedance / Seedream 3.0
$0.0300 per request
qwen / Qwen Image Edit
$0.0200 per request
qwen / Qwen Image
$0.0200 per request
qwen / Qwen Image LoRA
$0.0250 per request
Language
deep-cogito / Deep Cogito v2 Llama 70B
$0.00001 per 1m tokens
qwen / Qwen3 32B AWQ
$10.00 per 1m tokens
ibm / IBM Granite 4.0 H Small
$1.00 per 1m tokens
Video
Bytedance / Seedance 1.0 pro
5s: $0.12 per request (480p)
Alibaba / Wan 2.2 I2V 720p
5s: $0.30 per request
Alibaba / Wan 2.2 T2V 720p
5s: $0.30 per request
Alibaba / Wan 2.1 I2V 720p
$0.30 per request
Alibaba / Wan 2.1 T2V 720p
$0.30 per request
kwaivgi / Kling v2.6 Standard Motion Control
1-3s: $0.21 per request
Alibaba / WAN 2.6 T2V
5s: $0.50 per request
bytedance / Seedance V1.5 Pro I2V
$0.024 per second
kwaivgi / Kling Video O1 R2V
$0.112 per second
Alibaba / Wan 2.6 I2V
5s: $0.50 per request
OpenAI / SORA 2 Pro I2V
4s: $1.20 per request
MeiGen-AI / InfiniteTalk
$0.25 per request (720p)
OpenAI / SORA 2 I2V
4s: $0.40 per request
Alibaba / Wan 2.5 I2V
5s: $0.25 per request
kwaivgi / Kling v2.1 I2V Pro
5s: $0.45 per request
Alibaba / Wan 2.2 I2V 720p LoRA
5s: $0.35 per request
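
Some endpoints above are priced per megapixel or per second rather than per request. The sketch below shows the arithmetic for both, assuming a megapixel is 1,000,000 output pixels and that fractional amounts are billed without rounding. Neither detail is stated on this page, and both function names are hypothetical.

```python
def per_megapixel_cost(width: int, height: int, rate_per_megapixel: float) -> float:
    """Cost in USD of one image, ASSUMING 1 megapixel = 1,000,000 pixels."""
    megapixels = width * height / 1_000_000
    return megapixels * rate_per_megapixel


def per_second_cost(seconds: float, rate_per_second: float) -> float:
    """Cost in USD of one video, ASSUMING fractional seconds are not rounded."""
    return seconds * rate_per_second


# FLUX.1 [dev] at $0.02 per megapixel, 1024x1024 output.
print(f"${per_megapixel_cost(1024, 1024, 0.02):.4f}")    # $0.0210
# FLUX.1 Schnell at $0.0024 per megapixel, same size.
print(f"${per_megapixel_cost(1024, 1024, 0.0024):.4f}")  # $0.0025
# Seedance V1.5 Pro I2V at $0.024 per second, 5-second clip.
print(f"${per_second_cost(5, 0.024):.2f}")               # $0.12
```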

Gain additional savings with reservations.

Save more with long-term commitments. Speak with our team to reserve discounted active and flex workers.

Build what’s next.

The most cost-effective platform for building, training, and scaling machine learning models—ready when you are.