Pricing

Cost-effective GPUs
for AI & ML teams.

Pay per second for GPUs, from $0.00011 per second,
or choose a predictable monthly subscription.

GPU Cloud Pricing

Thousands of GPUs across 30+ regions. Simple pricing plans for teams of all sizes,
designed to scale with you.

Serverless Pricing

Cost-effective for every inference workload. Save 15% over other serverless cloud
providers on flex workers alone.

Prices per hour, in USD.

GPU                      VRAM   Flex    Active   Notes
H100                     80GB   $4.18   $3.35    Extreme throughput for big models.
A100                     80GB   $2.72   $2.17    High throughput GPU, yet still very cost-effective.
L40, L40S, 6000 Ada      48GB   $1.90   $1.33    Extreme inference throughput on LLMs like Llama 3 7B.
A6000, A40               48GB   $1.22   $0.85    A cost-effective option for running big models.
4090                     24GB   $1.10   $0.77    Extreme throughput for small-to-medium models.
L4, A5000, 3090          24GB   $0.69   $0.48    Great for small-to-medium sized inference workloads.
A4000, A4500, RTX 4000   16GB   $0.58   $0.40    The most cost-effective for small models.
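
To make the per-second billing concrete, here is a minimal sketch that converts the listed rates to per-second prices and estimates the cost of a single job. It assumes the listed figures are hourly USD rates and that billing is strictly linear per second, with no minimum charge or rounding; neither assumption is confirmed on this page. The names HOURLY_RATES, per_second_rate, and estimate_cost are hypothetical and are not part of any official SDK.

```python
# Hourly rates in USD as listed on this page: (flex, active).
# Assumption: the listed figures are per-hour prices.
HOURLY_RATES = {
    "H100": (4.18, 3.35),
    "A100": (2.72, 2.17),
    "L40, L40S, 6000 Ada": (1.90, 1.33),
    "A6000, A40": (1.22, 0.85),
    "4090": (1.10, 0.77),
    "L4, A5000, 3090": (0.69, 0.48),
    "A4000, A4500, RTX 4000": (0.58, 0.40),
}

SECONDS_PER_HOUR = 3600


def per_second_rate(hourly_rate: float) -> float:
    """Convert an hourly price to a per-second price."""
    return hourly_rate / SECONDS_PER_HOUR


def estimate_cost(gpu: str, seconds: float, worker: str = "flex") -> float:
    """Estimate the cost of running `seconds` of work on one worker.

    Assumption: billing is linear per second, with no minimum
    charge and no rounding.
    """
    if worker not in ("flex", "active"):
        raise ValueError("worker must be 'flex' or 'active'")
    flex, active = HOURLY_RATES[gpu]
    hourly = flex if worker == "flex" else active
    return per_second_rate(hourly) * seconds


if __name__ == "__main__":
    # Example: 90 seconds of inference on an A100 flex worker.
    print(f"${estimate_cost('A100', 90):.4f}")
```

Under these assumptions, the cheapest listed rate ($0.40 per hour, active) works out to roughly $0.00011 per second, which matches the "from" price quoted at the top of the page.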

Gain additional savings
with reservations.

Save more with long-term commitments. Speak with our team to reserve discounted active and flex workers.

Build what’s next.

The most cost-effective platform for building, training, and scaling machine learning models, ready when you are.