We just served our 10 billionth serverless request.
That's 10 billion images generated.
10 billion videos created.
10 billion training steps.
10 billion moments where someone had an idea and our infrastructure helped make it real.
But we didn't build this.
You did.
Every one of those requests represents a developer who trusted us with their workload. A startup that bet on us to scale with them. A creator who chose RunPod when they could have gone anywhere else.
Three years ago, serverless was an experiment. Today, it's powering production workloads for teams building the future of AI. From solo developers training their first model to infrastructure teams at companies processing millions of requests per day, serverless has become the way modern AI gets built.
We've watched this evolution happen in real time. The first serverless requests were tentative—developers testing the waters, seeing if this whole "pay per second" thing actually worked. Then came the hockey stick. Suddenly we were seeing endpoints that processed thousands of images per hour, video generation pipelines handling viral traffic spikes, and code generation tools serving entire development teams.
Why does this matter? Serverless slices work into request-sized units, so the GPU spends its cycles on the job itself rather than on the scaffolding around what you're actually after.
Traditional GPU infrastructure makes you think about the wrong things. How many instances do I need? What if traffic spikes? What about idle time? You end up spending more time being a cloud architect than building your actual product.
Serverless flips that model. No idle costs. No infrastructure headaches. No guessing at capacity. Just your code, running exactly when it needs to, scaling from zero to hundreds of workers in seconds.
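In practice, a serverless endpoint is little more than a handler function. Here's a minimal sketch of what a worker can look like with the runpod Python SDK; the "prompt" field and the processing step are illustrative placeholders, not a required schema:

```python
import runpod

def handler(job):
    # Each request arrives as a job dict; "prompt" is an illustrative
    # input field, not part of any required schema.
    prompt = job["input"].get("prompt", "")
    # ... load your model once at startup and run inference here ...
    return {"output": f"processed: {prompt}"}

# Hand the function to the serverless worker. The platform takes care
# of queuing, autoscaling, and scaling back down to zero when idle.
runpod.serverless.start({"handler": handler})
```

Everything outside the handler—capacity planning, queueing, idle teardown—stops being your code's problem.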
The math is simple: if your workload is bursty, unpredictable, or event-driven—which most AI workloads are—you shouldn't be paying for GPUs sitting idle. You should be paying for compute only when you're actually computing.
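A quick back-of-the-envelope comparison makes the point. The rates below are hypothetical placeholders, not actual pricing:

```python
# Hypothetical numbers for illustration only.
hourly_gpu_rate = 2.00           # $/hr for an always-on GPU instance
per_second_rate = 0.0006         # $/s for a serverless worker

busy_seconds_per_day = 2 * 3600  # workload actually computes ~2 h/day

always_on_daily = hourly_gpu_rate * 24
serverless_daily = per_second_rate * busy_seconds_per_day

print(f"always-on:  ${always_on_daily:.2f}/day")   # $48.00/day
print(f"serverless: ${serverless_daily:.2f}/day")  # $4.32/day
```

Even with rough numbers, the always-on instance bills for all 24 hours while serverless bills only for the two hours of actual compute.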
That's what 10 billion requests looks like when infrastructure gets out of your way.
Thank you. For building with us, for pushing us to be better, and for showing us what's possible when great tools meet great builders.
We're not stopping here. We're working on faster cold starts, more flexible scaling policies, and deeper integrations with the tools you're already using. Because every one of those 10 billion requests taught us something about what you need.
Here's to the next 10 billion.
Want to learn more about serverless? Check out our docs or our YouTube channel.
