January 23rd, 2026

A Note to the Developers who Built Runpod with Us

Zhen Lu

I've been putting off writing this post for weeks. Not because there isn't anything to say, but because I've never been great at the reflective, milestone-celebration thing. I'd rather ship features.

But… $120M in ARR feels like a moment worth pausing on. Not because of the number itself, but because of what it represents: over 500,000 developers who trusted us with their workloads, their ideas, and their businesses.

I still find that part slightly surreal.

So here it is. Some thoughts on how we got here, what I've learned, and where we're going.

The basement years

In late 2021, Pardeep and I had about $50,000 worth of specialized GPU rigs in our New Jersey basements, mining Ethereum. The hobby had stopped being fun and we were chomping at the bit to rip into something more. 

We had a choice: sell the hardware, or accept that we now owned a problem we couldn’t unsee.

We’d both been doing machine learning on the side and knew firsthand how painful it was. The actual experience of developing software on top of GPUs was, frankly, hot garbage. Environment setup was brittle. Networking was opaque. Working on a single box was painful enough - forget scaling anything beyond that.

So we decided to fix it.

A few months later, we had something we were willing to show other people. Which immediately surfaced the next problem: as first-time founders, we didn’t really know how to market, or honestly, how to do much of anything outside of building.

So I did the only thing that felt natural. I posted on Reddit.

We posted in a couple of AI-oriented subreddits with a simple offer: free access to our product in exchange for feedback. That was the entire go-to-market strategy. Those beta testers became our first paying customers. Within nine months, we'd blown through $1 million in revenue.

Here's what we didn't do: take venture capital or debt. For almost two years, we bootstrapped. We never offered a free tier. The business had to at least pay for itself, even if it wasn't throwing off profit. When we needed to scale beyond our basements, we formed partnerships with data centers instead of raising money.

By May 2024, we'd grown to 100,000 developers. That's when Radhika Malik from Dell Technologies Capital reached out. She'd found us through those Reddit posts. Around the same time, Julien Chaumond, the co-founder of Hugging Face, messaged us through our support chat. He'd been using the product and wanted to invest. We raised $20 million from Dell and Intel, with Julien and Nat Friedman participating.

We haven't raised since. 

What makes an AI-first cloud different

I’ve been asked recently what distinguishes an "AI-first cloud" from a traditional cloud. The honest answer is that we figured it out by getting it wrong first.

Traditional clouds were built for Web 2.0. Small amounts of data shuttling between services. IO-bound workloads. AI is fundamentally different. It's compute-bound. You're moving model weights, training data, media files. Orders of magnitude more data. The architecture has to start from first principles.

An AI-first cloud means hardware and software co-designed from the ground up. Multi-level caching beyond just exposed storage layers. Caching at shared memory levels and sharded storage across different storage types globally. Model-aware placement. High-throughput networking optimized for AI workloads. You can't get this level of optimization by cobbling together existing services from a traditional cloud provider.

But the bigger lesson was this: most companies that bought thousands of GPUs had no idea how to actually use them. The hardware wasn't the problem. The software layer was. How do you manage networking? How do you divide GPU resources among developers without creating a security and management nightmare? How do you handle the inevitable hardware failures?

The reality of GPU reliability

Here’s something most people don’t expect: GPUs are surprisingly fragile.

Failure rates in the low single-digit percentages are common. Much higher than traditional server hardware. You have to build systems that assume failure and recover automatically.

The hardest issues aren’t the obvious failures. They’re what I think of as gray outages. Partial failures where compute appears to be running, but will never complete. A workload stuck in a non-productive state due to subtle networking inconsistencies, kernel issues, or hardware edge cases.

The instance keeps consuming expensive compute, but the job will never finish.

Catching these requires real monitoring hooks, utilization-level KPIs, and automated remediation. This is the unglamorous work that actually makes AI infrastructure usable in production.
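
To make that concrete, here's a rough sketch of the shape that work takes. This is illustrative only, not our production system: it polls per-GPU utilization through NVML (the pynvml / nvidia-ml-py bindings) and flags a worker that has looked idle for too long while a job is still assigned. The thresholds and the remediation hook are placeholders.

    # Illustrative gray-outage watchdog: flag GPUs that look alive but do no work.
    import time
    import pynvml  # pip install nvidia-ml-py

    IDLE_THRESHOLD_PCT = 5       # utilization below this counts as "not working"
    IDLE_WINDOW_SECONDS = 900    # how long it must stay idle before we act
    POLL_INTERVAL_SECONDS = 30

    def watch_gpus(remediate):
        """Call remediate(gpu_index) for any GPU idle past the window."""
        pynvml.nvmlInit()
        idle_since = {}
        try:
            while True:
                for i in range(pynvml.nvmlDeviceGetCount()):
                    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
                    util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
                    if util < IDLE_THRESHOLD_PCT:
                        idle_since.setdefault(i, time.monotonic())
                        if time.monotonic() - idle_since[i] > IDLE_WINDOW_SECONDS:
                            remediate(i)            # alert, restart, or reschedule the job
                            idle_since.pop(i, None)
                    else:
                        idle_since.pop(i, None)     # GPU is doing work again; reset the clock
                time.sleep(POLL_INTERVAL_SECONDS)
        finally:
            pynvml.nvmlShutdown()

The real thing layers in job metadata, network health, and kernel-level signals, but the principle is the same: never treat "the instance is up" as proof that work is happening.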

I'm proud to say that our efforts here have earned us industry-leading reliability, with uptime above 99.9%, but our customers deserve better. We're committed to investing time and effort in this difficult problem until we're satisfied we have a rock-solid foundation.

Who our customers became

We started with creatives playing with Disco Diffusion. Then developers building commercial products. Then startups. Now we serve companies like Cursor, Replit, Perplexity, Wix, and some of the top AI research labs across 31 regions worldwide.

What's interesting is how the use cases evolved. Today, three patterns dominate:

Generative media remains huge. Fashion virtual try-on, real estate staging with AI-generated video walkthroughs, digital avatars with voice cloning. Computationally expensive, bandwidth-hungry. We're now delivering over 8 exabytes of global network traffic annually.

Small language model agents. Companies running sub-70B parameter models for customer support and internal workflows. They prototype with API providers but migrate to running their own fine-tuned models. The reasons: control, predictability, cost, and avoiding the risk of model deprecation breaking their carefully engineered prompts. A fine-tuned smaller model can outperform much larger general-purpose models for narrow use cases.

High-accuracy transcription. Companies whose core business depends on transcription run their own models to fine-tune for specific audio environments. A smaller, specialized model outperforms the generic options. Doing it in a single pass on the same compute worker is more efficient than a multi-step pipeline.

The common thread: teams that need control over their AI stack, not just API access.
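
To make the small language model pattern above concrete: the migration is often less dramatic than it sounds, because most teams keep the client code they prototyped with and simply point it at their own endpoint. A rough sketch, assuming a self-hosted inference server that speaks the OpenAI-compatible chat API (vLLM exposes one, for example); the endpoint URL and model name below are placeholders, not real services.

    # Sketch: same client code, now pointed at a self-hosted fine-tuned model.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://your-endpoint.example.com/v1",  # your own inference server
        api_key="your-token",                             # whatever auth your server expects
    )

    response = client.chat.completions.create(
        model="your-org/support-agent-7b",                # hypothetical fine-tuned SLM
        messages=[
            {"role": "system", "content": "You are a support agent for Acme Widgets."},
            {"role": "user", "content": "My order arrived damaged. What are my options?"},
        ],
        temperature=0.2,
    )
    print(response.choices[0].message.content)

Once the model is yours, the weights under your prompts stop changing out from under you, which is exactly the deprecation risk these teams are trying to escape.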

What $120M actually means

Revenue is a lagging indicator. It tells you what happened, not what's coming.

What I care about more:

120% net dollar retention. Customers expand because the platform solves real problems.

155% growth in signups year over year. The market is moving fast, and we're keeping pace.

The support tickets we don't get anymore. Features that used to require documentation and workarounds now just work.

The biggest signal is the shift in who's reaching out. Three years ago, it was individual developers. Now it's infrastructure teams at Fortune 500 companies evaluating us against hyperscalers and making the switch. We're supporting over 20 terabits per second of internal InfiniBand and Ethernet network capacity for teams training large models.

Where we're going

I believe agents will become the software, not just features wrapped in deterministic code. We're moving toward a world where businesses use a mix of off-the-shelf agents for common tasks and custom-built agents for their unique problems.

That changes what infrastructure needs to do. Databases, APIs, and services designed for human interaction patterns need fundamental redesigns. The most valuable skill won't be wrapping an AI model in deterministic code. It'll be building, testing, and iterating on the agent itself. New tools like CI/CD for prompts will become standard.
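
What might "CI/CD for prompts" look like in practice? One plausible shape, sketched as a toy test suite: every prompt or model change has to replay a set of known cases and keep producing acceptable answers before it ships. The run_agent stub below is a stand-in for however your agent actually calls its model; the cases are illustrative.

    # Toy prompt-regression suite in the spirit of "CI/CD for prompts".
    import pytest

    REGRESSION_CASES = [
        ("I want to cancel my subscription", "cancellation"),
        ("Where is my refund?", "refund_status"),
        ("Let me talk to a human", "handoff"),
    ]

    def run_agent(user_message: str) -> dict:
        """Placeholder: swap in a real call to your deployed agent."""
        text = user_message.lower()
        if "cancel" in text:
            return {"intent": "cancellation"}
        if "refund" in text:
            return {"intent": "refund_status"}
        return {"intent": "handoff"}

    @pytest.mark.parametrize("message,expected_intent", REGRESSION_CASES)
    def test_prompt_still_routes_intents(message, expected_intent):
        assert run_agent(message)["intent"] == expected_intent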

Runpod's job is to make that iteration loop as fast as possible. Remove the friction between having an idea and seeing it work. Whether that's a creator running Stable Diffusion or an enterprise deploying a fleet of fine-tuned agents.

The team behind this

I want to be clear about something. Pardeep and I started this, but we didn't build it alone.

The engineering team that ships features faster than our competitors. The support team that treats every ticket like it matters. The customers who file detailed bug reports and tell us exactly what's broken.

This milestone belongs to all of them.

To everyone who's been part of the Runpod journey so far: thank you. We're just getting started.

Zhen
