Careers at RunPod

All roles are remote/hybrid. Most of our team is based in New Jersey and SF.
That said, we have a preference for candidates based in SF who are enthusiastic about working in-person in SF.
Everyone on our team is technical, whether they're doing sales, operations, design, or product. Our customers are developers - everything we do is downstream from their needs. So naturally, having a strong grasp of existing GPU cloud workflows and being able to deeply resonate with the pain points of our customers is a strong plus. Being technical is not a hard requirement, but it is preferred.
Everyone wears multiple hats. Our sales team helps manage infrastructure, our growth team ships new features based on customer feedback, our engineering team helps onboard customers, our ML team runs benchmarks and creates custom deployments for enterprise customers, etc. At this stage, we need to move quickly. That means you may have to take on a couple of functions that aren't in your job description.
We would like you to grow into a leadership position as the team scales. We currently have 9 full-time people on the team, and we anticipate scaling to ~30 in the next 6-9 months. We're looking for people with a bias towards leadership, who can hire and manage talented builders as their function within RunPod scales to hundreds of employees.
Below is a list of open positions at RunPod. After you apply, we will provide a thorough outline of roles and responsibilities.

Open Positions

ML Engineer
We are looking for ML Engineers with strong backgrounds in optimizing inference on large language models.
  • You can develop and deploy holistic solutions, enabling users to train new models, fine-tune existing ones, or run efficient inference.
  • You can keep up with the latest in AI research, ensuring our tech stack remains up to par with best practices.
  • You can engage proactively with RunPod customers, grasping their challenges and consistently delivering valuable solutions that meet their needs.
  • You are proficient in identifying and circumventing common technical pitfalls.
If this sounds like you, reach out to careers@runpod.io and include your LinkedIn and/or portfolio.

FullStack Engineer
We are looking for a software engineer who can deliver end-to-end product functionality, from backend (NodeJS) to frontend (ReactJS / NextJS).
  • You have developed end-to-end features using NodeJS and ReactJS.
  • You can communicate and understand complex web architectures.
  • You have worked with open-source AI models and have a strong understanding of the current AI landscape.
If this sounds like you, reach out to careers@runpod.io and include your LinkedIn and/or portfolio.

Support Engineer
Generalist engineer who can communicate with clients, knows how to debug common issues, and has general knowledge of the web.
  • You have excellent communication skills.
  • You are a problem solver by any means necessary and can engineer complex solutions.
  • You can dig through logs, data sets, and other sources to find the root cause of issues.
If this sounds like you, reach out to careers@runpod.io and include your LinkedIn and/or portfolio.

Senior Systems Engineer
We are looking for a senior engineer who can solve complex problems to help accelerate AI adoption.
  • You can develop and optimize complex systems using Golang or Rust.
  • You have a proven track record of accomplishments.
  • You are a lone wolf and can work with a team when needed.
  • You strive for perfection but understand MVP delivery.
  • You can reduce container cold-starts for AI workloads.
  • You can optimize network storage to increase throughput and store LLMs at scale.
  • You can optimize container runtime for specific workloads to get the best performance.
If this sounds like you, reach out to careers@runpod.io and include your LinkedIn and/or portfolio.

UX Engineer
UX Engineer who specializes in pixel-perfect user interfaces.
  • You are a designer first, engineer second.
  • You can create pixel-perfect UI.
  • You can design and develop in ReactJS.
  • You have a strong understanding of workflows used for deploying and scaling AI models.
If this sounds like you, reach out to careers@runpod.io and include your LinkedIn and/or portfolio.

Customer Success
You make sure our customers are happy and solve their problems. You have a level of technical proficiency that allows you to diagnose customer issues and report them to our engineering team.
You are patient, can multi-process dozens of communication channels, and have a passion for creating an impeccable user experience.
If this sounds like you, reach out to careers@runpod.io and include your LinkedIn and/or portfolio.

The RunPod Engineering Stack

We've built RunPod atop dozens of frameworks (many of which we've written ourselves), but here are the primary stacks you'll be using:
  • AI/ML: Python and C++
  • RunPod website and user console: NodeJS, NextJS, GraphQL
  • The Cloud: Golang

The RunPod Team

Work is an incredible place when you're working with a team of people who are relentless about the mission and energized to help each other grow in every way. Through the highs and lows, we have shared many moments of laughter, tears, jokes, and joy.
Things that make our culture what it is:
  • We all have experience building products and hacking on GPUs. Many of our founding team members ran data centers and joined RunPod after integrating their hardware onto the platform.
  • We live and breathe Discord. We hate needless bureaucracy and make many of our most important decisions over Discord voice. Feel free to join our Discord and say hi!
  • We love new product ideas. Regardless of your official role, if you have a great idea and want to see it implemented, you can always make a PR. Nine times out of ten, we will be all for it.
  • We are a team of intrinsically curious and ambitious people. We ask a lot of questions, move quickly, and pivot on the fly. We value a bias towards action very highly and take pride in our work.
  • Our last offsite was in New Jersey. We raced go-karts and went to Dave and Buster's.

Our Value Prop to You

  • Compensation package with sign-on bonus, company equity, and benefits. We know how rare mission-driven talent is, and we strive to reflect this through ownership and pay.
  • Environment for growth and learning. You will have the opportunity to drive great impact and gain exposure to all functions of the company. Here, you can flex multiple realms of your skillset, strategic mindset, and creativity.
  • Accelerate innovation in GPU cloud infrastructure. We are leading the change in GPU cloud infrastructure against Big Cloud and outdated systems. You’ll be able to operate in a fast-paced environment and iterate quickly.
  • An energizing, ambitious team. Our team cares deeply about each other. We strive to elevate and uplift each other in our day-to-day work to do the best for one another. We don't believe in bureaucratic nonsense.
  • Supporting your wellbeing. We provide benefits to allow you to do your best work:
    1. Remote and in-person hybrid work options. We’re based in NJ and SF.
    2. Stipend to upgrade your work-from-home setup.
    3. Unlimited paid time off (PTO).
    4. Paid company off-sites, meetups, and team bonding events. You’ll get to see everyone outside of their Zoom box.

About RunPod

We are building cloud services to accelerate AI adoption.
Whether you're an experienced ML developer training a large language model, or an enthusiast tinkering with Stable Diffusion, we strive to make GPU compute as seamless and affordable as possible.
Our founding team has decades of cloud architecture and machine learning experience - we're deeply familiar with the pain points developers face when training, benchmarking, and scaling AI models in production.

RunPod's Founding Story

Our founding team comes from Comcast, where we led the cloud architecture division and cut costs by $100M per year.
We founded RunPod in March 2022 with 2 core insights: 1) AI infrastructure requirements are compounding every year, and will continue to grow exponentially over the next decade. 2) There aren't any AI-native cloud service providers built specifically to accelerate training, benchmarking, and inference workflows.
Existing providers like Big Cloud (AWS, GCP, Azure) have made it incredibly costly for developers, startups, and enthusiasts to access GPU resources. We knew we wouldn't get far as a fancy wrapper on top of Big Cloud, so we built our own infrastructure from the ground up.
In the early days, we didn't have the capital required to purchase thousands of GPUs, so we turned to the AI community for support. Hundreds of GPU owners across the world deeply resonated with our mission and listed their GPUs on Community Cloud - RunPod's first on-demand cloud platform.
As we scaled our capacity to the thousands, we saw more and more users reach out about needing larger clusters, higher reliability, and extremely fast networking speeds to train foundational models and deploy them in production. So we introduced Secure Cloud to the platform - GPUs we source and manage in some of the most reliable data centers across the world. With Secure Cloud, developers can access clusters of up to 1000x GPUs with incredibly high data transfer speeds, RAID 2 redundancy, localized network volumes, and best-in-class security, all at a 50%+ lower rate than Big Cloud.
Since then, we've built Serverless - autoscaling architecture that abstracts away all of the devops expertise required to scale infrastructure up and down for inference. We also launched Flashboot, our cache architecture that allows for less than 250ms P70 cold start times on hundreds of models.
We have lots of cool stuff on our product roadmap, and we're excited to bring on engineers who can help shape RunPod into the world's best platform for building and scaling AI.