Run MoonshotAI's Kimi-K2-Instruct on Runpod Clusters using H200 SXM GPUs and a 2TB shared network volume for seamless multi-node training. This guide walks through the setup.

1. Create a 2TB network volume. Use the CA-MTL-4 region (recommended for now).
2. Spin up a Pod using the official Runpod PyTorch template and mount the network volume you just created. Once the Pod is running, connect to JupyterLab.
3. Download the Model
4. Launch the Instant Cluster
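For step 3, the download can be run from a JupyterLab terminal or notebook cell on the Pod. Here is a minimal sketch, not the exact command from the original guide. It assumes the network volume is mounted at `/workspace` (check your Pod's mount path), that the weights come from the `moonshotai/Kimi-K2-Instruct` repository on Hugging Face, and that the `huggingface_hub` package is installed. The `models` subfolder and the helper names are illustrative.

```python
from pathlib import Path

# Assumptions (not from the original guide): mount point, repo id, folder layout.
VOLUME_ROOT = Path("/workspace")
REPO_ID = "moonshotai/Kimi-K2-Instruct"


def model_dir(volume_root: Path, repo_id: str) -> Path:
    """Return the folder on the shared volume where the weights will live."""
    # Keep only the model name so the path stays short: .../models/Kimi-K2-Instruct
    return Path(volume_root) / "models" / repo_id.split("/")[-1]


def download_model(volume_root: Path = VOLUME_ROOT, repo_id: str = REPO_ID) -> Path:
    """Download every file in the repo to the shared volume and return the path."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = model_dir(volume_root, repo_id)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_model()` once from the Pod should be enough: because the files land on the network volume, any cluster node that mounts the same volume can read them without downloading again.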
Then, on the cluster nodes:
1. Install on Node 0, using the shared volume.
2. Follow the Node 1 instructions.
3. You should see the following:
4. Run on the node whose IP serves as the host.
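The exact commands for these node steps are not reproduced here, so the following is only a sketch of how a two-node Ray plus vLLM launch is commonly wired together. It assumes 2 nodes with 8 GPUs each, Ray's default port 6379, and a model path on the shared volume. The IP address and helper names are hypothetical. The script only prints the commands; it does not run them.

```python
import shlex

RAY_PORT = 6379  # Ray's default port for the head node


def ray_head_cmd(port: int = RAY_PORT) -> list[str]:
    """Command for Node 0: start the Ray head process."""
    return ["ray", "start", "--head", f"--port={port}"]


def ray_worker_cmd(head_ip: str, port: int = RAY_PORT) -> list[str]:
    """Command for Node 1: join the Ray cluster started on Node 0."""
    return ["ray", "start", f"--address={head_ip}:{port}"]


def vllm_serve_cmd(model_path: str, host_ip: str, tp: int = 8, pp: int = 2) -> list[str]:
    """Command for the host node: serve the model across both nodes."""
    return [
        "vllm", "serve", model_path,
        "--host", host_ip,
        "--tensor-parallel-size", str(tp),    # GPUs per node (assumed 8)
        "--pipeline-parallel-size", str(pp),  # number of nodes (assumed 2)
        "--trust-remote-code",
    ]


head_ip = "10.65.0.2"  # hypothetical private IP of Node 0
for cmd in (
    ray_head_cmd(),
    ray_worker_cmd(head_ip),
    vllm_serve_cmd("/workspace/models/Kimi-K2-Instruct", head_ip),
):
    # Print each command so it can be reviewed and pasted on the right node.
    print(shlex.join(cmd))
```

The first command belongs on Node 0, the second on Node 1, and the third on the node whose IP is used as the host. The parallelism split is an assumption and may need adjusting for your cluster.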


Known Issues
As of July 21st, the released vLLM library is not up to date, so you need to build from the nightly builds.
https://github.com/MoonshotAI/Kimi-K2/issues/19
A uv environment on a Network Volume is slow to initialize Ray. We recommend running any Python environment on the machine itself instead of on the Network Volume.
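To check quickly whether the active Python environment sits on the Network Volume, something like the following sketch may help. It assumes the volume is mounted at `/workspace`; the helper name is illustrative.

```python
import os
import sys


def env_on_volume(env_prefix: str, volume_root: str) -> bool:
    """Return True if the environment directory lives under the volume mount."""
    env_prefix = os.path.abspath(env_prefix)
    volume_root = os.path.abspath(volume_root)
    # The environment is on the volume if the volume root is a path prefix of it.
    return os.path.commonpath([env_prefix, volume_root]) == volume_root


VOLUME_ROOT = "/workspace"  # assumed mount point of the network volume

if env_on_volume(sys.prefix, VOLUME_ROOT):
    print(f"Warning: {sys.prefix} is on the network volume; Ray may start slowly.")
else:
    print(f"OK: {sys.prefix} is on local disk.")
```

If the check warns, recreating the environment in a local directory on each node is one way to follow the recommendation above.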
Author profile: Brendan McKeag