Blog

Deploy When Available is now GA

Queue for any GPU spec, even one that's fully rented out, and we'll deploy it the moment capacity opens up. No more refreshing the console or running a sniping tool.

The industry-wide GPU supply crunch has introduced an enormous amount of friction into what should be a simple process. What used to be a one-click deploy flow has turned into refreshing the deploy page or leaning on a third-party sniping tool to catch the moment a card frees up. Today we're putting an end to that. Deploy When Available is now generally available, and it does the waiting for you.

The idea is simple. Instead of deploying only what's free right this second, you can queue for any spec that isn't immediately available, whether it's a configuration that's completely rented out at the moment, or you just need more than we can hand you on the spot. Runpod watches for capacity and deploys your pod automatically as soon as it can.

Getting started

If you're already on the new deploy flow, you're ready to go. If you're still on the old one, you'll need to switch over first:

Head to Early Access features at console.runpod.io/user/early-access and enable the New Pod deploy page.
Select the pod spec you want.
If it's currently out of capacity, the deploy button at the bottom becomes Deploy When Available. Click it.
Review the confirmation prompt, set a subscription window if you'd like, and confirm.

That's it. You can close the tab and get on with your day.

How it works

When the spec you want is out of capacity, the deploy button changes to Deploy When Available. Click it, confirm the prompt, and you're in the queue. The moment that capacity becomes available, your pod deploys and starts running.

One important note: your pod begins billing as soon as it deploys. That's the whole point of the feature, but it means a card could come free at 3 a.m. while you're asleep and start charging you for time you can't use.

If that's a concern, use the subscription window to set the valid times for deployment. You can mark hours as off-limits so you're never billed for a pod you can't actually get to.

Notifications

When your pod goes live, we'll let you know through the console or by email, depending on your preference. SMS notifications are on the way and will ship shortly.

A note on feasible configurations

For a deployment to succeed, the configuration you queue for has to be something that might realistically become available at some point, even if only briefly. If you request a spec that could never feasibly free up, it won't deploy. As always, we're continuously working on supply to make larger deploys possible over time.

Try it out

Deploy When Available should make reserving the GPU you want a lot less of a chore. Give it a try on your next deploy, and let us know what you think.

Blog Posts

View All

How to run Kimi K3 on Runpod's Public Endpoint

The weights for Kimi K3 have been released - and we've got a Public Endpoint where you can get started with it right away with all of the security, privacy, and compliance that Runpod offers.

Runpod's REST API v2 is here: One API for your entire GPU stack

We've just released the next iteration of our REST API - find out what you can build with it.

What to get right before your next AI deployment on Runpod

A good AI deployment usually comes down to a few decisions you want to make early: which GPU fits the job, where your data should live, how to keep storage persistent, and which region makes sense for the workload. We pulled together the mistakes we see most often and the fixes that can save teams time, budget, and a few rebuilds.

Build what’s next.

Build, train, and scale AI workloads on Runpod with cloud GPUs, Serverless, and Clusters.

Get started