GPU Cloud Pricing
Thousands of GPUs across 30+ regions. Simple pricing plans for teams of all sizes,
designed to scale with you.
Serverless Pricing
Cost-effective for every inference workload. Save 15% over other serverless cloud providers on flex workers alone.
GPU                       VRAM    Flex ($/hr)   Active ($/hr)   Notes
H100                      80GB    $4.18         $3.35           Extreme throughput for big models.
A100                      80GB    $2.72         $2.17           High-throughput GPU, yet still very cost-effective.
L40, L40S, 6000 Ada       48GB    $1.90         $1.33           Extreme inference throughput on LLMs like Llama 3 7B.
A6000, A40                48GB    $1.22         $0.85           A cost-effective option for running big models.
4090                      24GB    $1.10         $0.77           Extreme throughput for small-to-medium models.
L4, A5000, 3090           24GB    $0.69         $0.48           Great for small-to-medium inference workloads.
A4000, A4500, RTX 4000    16GB    $0.58         $0.40           The most cost-effective for small models.
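To get a feel for what the hourly rates above mean per request, here is a minimal sketch. The `request_cost` helper is hypothetical (not part of any provider SDK), and prorating the hourly rate to the second is an assumption for illustration, not a documented billing term; the example uses the A100 flex rate from the table.

```python
# Hypothetical helper: converts an hourly GPU rate into an approximate
# per-request cost. Per-second proration is an assumption here.

def request_cost(hourly_rate_usd: float, seconds: float, workers: int = 1) -> float:
    """Approximate cost of running `workers` GPUs for `seconds` at a $/hr rate."""
    return hourly_rate_usd / 3600 * seconds * workers

# Example: a 2-second inference on a single A100 flex worker at $2.72/hr.
print(f"${request_cost(2.72, 2.0):.6f}")  # prints $0.001511
```

At these rates, even a multi-second request on the largest GPUs costs a fraction of a cent, which is why per-worker hourly pricing is compared rather than per-request pricing.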
Are you an early-stage startup or ML researcher?
Get up to $25K in free compute credits to spend on on-demand GPUs and serverless endpoints.
