February 6th, 2026

10 billion Serverless requests and counting

Brendan McKeag


We just served our 10 billionth serverless request.

That's 10 billion images generated.

10 billion videos created.

10 billion training steps.

10 billion moments where someone had an idea and our infrastructure helped make it real.

But we didn't build this.

You did.

Built by Builders

Every one of those requests represents a developer who trusted us with their workload. A startup that bet on us to scale with them. A creator who chose RunPod when they could have gone anywhere else.

Three years ago, serverless was an experiment. Today, it's powering production workloads for teams building the future of AI. From solo developers training their first model to infrastructure teams at companies processing millions of requests per day, serverless has become the way modern AI gets built.

We've watched this evolution happen in real-time. The first serverless requests were tentative—developers testing the waters, seeing if this whole "pay per second" thing actually worked. Then came the hockey stick. Suddenly we were seeing endpoints that processed thousands of images per hour, video generation pipelines handling viral traffic spikes, and code generation tools serving entire development teams.

Why Serverless Matters

Why does this matter? Serverless breaks workloads into bite-size units, so the GPU spends its time on the task you actually care about rather than on the scaffolding around it.

Traditional GPU infrastructure makes you think about the wrong things. How many instances do I need? What if traffic spikes? What about idle time? You end up spending more time being a cloud architect than building your actual product.

Serverless flips that model. No idle costs. No infrastructure headaches. No guessing at capacity. Just your code, running exactly when it needs to, scaling from zero to hundreds of workers in seconds.
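
The "just your code" part is nearly literal: a serverless worker is little more than a handler function that runs once per request. Here's a minimal sketch assuming RunPod's Python SDK; the input shape and the placeholder inference call are illustrative, not a real pipeline:

```python
def handler(job):
    """Runs once per request; the worker can scale to zero between calls."""
    # job["input"] carries the JSON payload sent to the endpoint.
    prompt = job["input"].get("prompt", "")
    # Swap in your actual inference call here (placeholder below).
    return {"output": f"generated result for: {prompt}"}

# To run this as a RunPod serverless worker (requires `pip install runpod`):
#     import runpod
#     runpod.serverless.start({"handler": handler})
```

Everything about scaling, queuing, and idle shutdown happens outside the handler, which is the point: the function is all you write.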

The math is simple: if your workload is bursty, unpredictable, or event-driven—which most AI workloads are—you shouldn't be paying for GPUs sitting idle. You should be paying for compute only when you're actually computing.
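
To make that math concrete, here's a back-of-envelope comparison. The hourly rate, request volume, and per-request GPU time below are illustrative assumptions, not actual RunPod pricing:

```python
# Back-of-envelope: always-on GPU vs. per-second billing (illustrative numbers).
HOURLY_RATE = 2.00            # $/hr for a dedicated GPU instance (assumed)
PER_SECOND_RATE = 2.00 / 3600 # same nominal rate, billed per second of use

requests_per_day = 10_000
seconds_per_request = 3       # average GPU time per request (assumed)

dedicated_daily = HOURLY_RATE * 24
busy_seconds = requests_per_day * seconds_per_request
serverless_daily = busy_seconds * PER_SECOND_RATE
utilization = busy_seconds / 86_400

print(f"dedicated:  ${dedicated_daily:.2f}/day")
print(f"serverless: ${serverless_daily:.2f}/day at {utilization:.0%} utilization")
```

At roughly 35% utilization the bursty workload already costs about a third as much per-second as always-on; the gap widens the spikier the traffic gets.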

That's what 10 billion requests looks like when infrastructure gets out of your way.

What's Next

Thank you. For building with us, for pushing us to be better, and for showing us what's possible when great tools meet great builders.

We're not stopping here. We're working on faster cold starts, more flexible scaling policies, and deeper integrations with the tools you're already using. Because every one of those 10 billion requests taught us something about what you need.

Here's to the next 10 billion.

Want to learn more about serverless? Check out our docs or our YouTube channel.
