GPU Cloud Pricing
Runpod pricing depends on the GPU workload you run: Pods for dedicated GPU instances, Serverless for API inference, and Clusters for multi-node jobs. For enterprise capacity, review compliance resources or request enterprise support.

How Runpod GPU pricing works
Each product bills differently. Pods are dedicated GPU instances for development and long-running jobs, Serverless bills inference workers based on usage, and Clusters support multi-node workloads and reserved capacity.
Storage and deployment choices affect total cost, so teams should choose the model that matches workload duration, traffic pattern, and control needs.
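To make the trade-off between workload duration and traffic pattern concrete, here is a minimal Python sketch comparing an always-on dedicated instance with a usage-billed worker. The function names and both hourly rates are hypothetical placeholders, not Runpod prices, and the sketch assumes usage billing is proportional to active time.

```python
HOURS_PER_MONTH = 730.0  # average hours in a month (8760 / 12)


def dedicated_monthly_cost(hourly_rate: float) -> float:
    """Cost of an instance that runs, and is billed, for the whole month."""
    return hourly_rate * HOURS_PER_MONTH


def usage_monthly_cost(active_hourly_rate: float, utilization: float) -> float:
    """Cost of a worker billed only while active.

    utilization is the fraction of the month the worker is busy (0.0 to 1.0).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return active_hourly_rate * HOURS_PER_MONTH * utilization


def break_even_utilization(dedicated_hourly_rate: float,
                           active_hourly_rate: float) -> float:
    """Utilization above which the dedicated instance becomes cheaper."""
    if active_hourly_rate <= 0:
        raise ValueError("active_hourly_rate must be positive")
    return dedicated_hourly_rate / active_hourly_rate


if __name__ == "__main__":
    # Hypothetical placeholder rates in USD per hour, NOT taken from any price list.
    pod_rate = 2.00
    worker_rate = 4.00

    threshold = break_even_utilization(pod_rate, worker_rate)
    print(f"Break-even utilization: {threshold:.0%}")
    for utilization in (0.1, 0.5, 0.9):
        usage = usage_monthly_cost(worker_rate, utilization)
        dedicated = dedicated_monthly_cost(pod_rate)
        print(f"{utilization:.0%} busy: usage ${usage:,.2f} vs dedicated ${dedicated:,.2f}")
```

Under these assumptions, bursty or low-traffic workloads fall below the break-even point and favor usage billing, while steady long-running jobs favor a dedicated instance. Substitute current published rates before drawing conclusions.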
Pods
Thousands of GPUs across 30+ regions. Simple pricing plans for teams of all sizes, designed to scale with you.
Serverless
Cost-effective for every inference workload. Save 25% over other serverless cloud providers on flex workers alone.
Clusters
Launch multi-GPU clusters in minutes with no commitments. Scale up to 64 GPUs, attach shared storage, and pay only for what you use.
Reserved Clusters
Dedicated GPU clusters with guaranteed availability, custom configurations, SLA-backed uptime, and discounted rates for enterprises scaling to 10,000+ GPUs.
Storage
Flexible and persistent storage options starting at $0.05/GB/mo with standard and high-performance tiers.
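As a rough illustration of how the starting storage rate turns into a monthly bill, the Python sketch below computes a lower-bound estimate. It uses only the $0.05/GB/mo starting price quoted above. The function name is hypothetical, and because high-performance tiers may cost more, the result is a floor rather than a quote.

```python
STARTING_RATE_PER_GB_MONTH = 0.05  # USD, the "starting at" price quoted above


def storage_cost_floor(gigabytes: float, months: float = 1.0,
                       rate: float = STARTING_RATE_PER_GB_MONTH) -> float:
    """Lower-bound storage cost in USD for a volume held for `months` months."""
    if gigabytes < 0 or months < 0:
        raise ValueError("gigabytes and months must be non-negative")
    return gigabytes * rate * months


if __name__ == "__main__":
    for size_gb in (100, 500, 2000):
        cost = storage_cost_floor(size_gb)
        print(f"{size_gb} GB for one month: at least ${cost:.2f}")
```

Pass a different `rate` to model a higher tier once its price is known.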
