Runpod now integrates directly with AI IDEs like Cursor and Claude Desktop using MCP. Launch pods, deploy endpoints, and manage infrastructure—right from your editor.

The way you build with AI is changing—and now, so is the way you interact with your infrastructure. We just shipped the official Runpod MCP server, unlocking first-class, chat-native access to your GPU fleet from any AI-first IDE.
Drop it into Cursor. Or Windsurf. Or Cline. Or Claude Desktop. If your editor speaks Model Context Protocol (MCP), it now speaks fluent Runpod.
No more context switching. No more curl commands. Just talk to your editor, and let your model do the heavy lifting—spin up pods, deploy endpoints, manage volumes, and more.
MCP (short for Model Context Protocol) is an open standard, built on JSON-RPC 2.0, that lets language model interfaces and tool providers speak the same language.
Instead of bolting together fragile glue code and one-off REST wrappers, a tool provider exposes its operations through a small set of standard message types (initialize, request, result, and so on), and the model decides which tools to invoke and when. The MCP server handles the call. The client (Cursor, Claude Desktop, etc.) handles the UI. Your model handles the logic.
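To make that message flow concrete, here is a minimal Python sketch of the JSON-RPC 2.0 envelopes involved. The `jsonrpc`, `id`, `method`, `params`, and `result` fields come from JSON-RPC 2.0, and `initialize` is the method MCP uses to open a session. Everything else (the helper names, the `example-editor` and `example-server` names, and the protocol version string) is an illustrative assumption, not something Runpod ships.

```python
import json
from typing import Optional


def make_request(request_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def make_result(request_id: int, result: dict) -> dict:
    """Build the matching JSON-RPC 2.0 success response (hypothetical helper)."""
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


def matches(request: dict, response: dict) -> bool:
    """A response belongs to a request when both carry the same id."""
    return request["id"] == response["id"]


# The client opens the session with an `initialize` request.
# The version string and client info are assumed example values.
init_request = make_request(
    1,
    "initialize",
    {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-editor", "version": "0.0.1"},
    },
)

# The server answers with a result that carries the same id.
init_response = make_result(
    1,
    {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": "example-server", "version": "0.0.1"},
    },
)

print(json.dumps(init_request, indent=2))
print(matches(init_request, init_response))
```

In practice your editor and the MCP server exchange these messages for you; the sketch only shows what travels over the wire.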
With the Runpod MCP server, that means full access to your Runpod account—directly from your editor, no context switch required.
Once you’ve plugged the MCP server into your setup, your LLM can access all of this (don’t worry, you can toggle each tool off and on at will, so what your LLM can do is completely within your control):
Pods: create-pod, list-pods, start-pod, stop-pod, delete-pod, get-pod
Serverless Endpoints: create-endpoint, list-endpoint, get-endpoint, delete-endpoint, update-endpoint
Templates: list-template, get-template, create-template, update-template, delete-template
Network Volumes: list-network-volumes, get-network-volume, create-network-volume, update-network-volume, delete-network-volume
Container Registry Auth: list-container-registry-auths, get-container-registry-auth, create-container-registry-auth, delete-container-registry-auth
Under the hood, each tool wraps the same REST operations you’re already familiar with, just simplified. Your LLM handles the parameters, validation, and error handling for you.
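One way to picture the per-tool toggle is as an allow-list over the tool names listed above. The sketch below is hypothetical: the `TOOL_CATALOG` dictionary and the `enabled_tools` helper are invented for illustration, and the real switch typically lives in your MCP client's settings rather than in code you write. Only the tool names themselves come from this post.

```python
from typing import Dict, Iterable, List

# Tool names as listed in this post, grouped by category.
TOOL_CATALOG: Dict[str, List[str]] = {
    "Pods": [
        "create-pod", "list-pods", "start-pod",
        "stop-pod", "delete-pod", "get-pod",
    ],
    "Serverless Endpoints": [
        "create-endpoint", "list-endpoint", "get-endpoint",
        "delete-endpoint", "update-endpoint",
    ],
    "Templates": [
        "list-template", "get-template", "create-template",
        "update-template", "delete-template",
    ],
    "Network Volumes": [
        "list-network-volumes", "get-network-volume", "create-network-volume",
        "update-network-volume", "delete-network-volume",
    ],
    "Container Registry Auth": [
        "list-container-registry-auths", "get-container-registry-auth",
        "create-container-registry-auth", "delete-container-registry-auth",
    ],
}


def all_tools() -> List[str]:
    """Flatten the catalog into one list, preserving category order."""
    return [tool for tools in TOOL_CATALOG.values() for tool in tools]


def enabled_tools(disabled: Iterable[str]) -> List[str]:
    """Return every tool that has not been switched off (hypothetical helper)."""
    disabled_set = set(disabled)
    unknown = disabled_set - set(all_tools())
    if unknown:
        raise ValueError(f"unknown tool(s): {sorted(unknown)}")
    return [tool for tool in all_tools() if tool not in disabled_set]


# Example: a cautious setup that switches off every delete-* tool.
destructive = [tool for tool in all_tools() if tool.startswith("delete-")]
safe = enabled_tools(destructive)
print(len(all_tools()), len(safe))  # prints "25 20"
```

Switching off the destructive tools first is a reasonable default while you get used to letting a model act on your account.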
Want to wire Runpod into Cursor? Just drop this JSON block into .cursor/mcp.json:
Swap in your Runpod personal API key from the console, restart your IDE, and you’re done.
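If you would rather script that step, the sketch below writes a server entry into `.cursor/mcp.json` from Python. Treat the specifics as assumptions: the top-level `mcpServers` key follows Cursor's MCP configuration format, but the `npx` launch command, the package name `@runpod/mcp-server`, and the `RUNPOD_API_KEY` environment variable are guesses at what the server expects, so check the repo's README for the exact values before relying on them.

```python
import json
from pathlib import Path


def build_cursor_config(api_key: str) -> dict:
    """Return a Cursor MCP config dict for a Runpod server entry.

    The command, package name, and env var name are ASSUMED, not confirmed.
    """
    return {
        "mcpServers": {
            "runpod": {
                "command": "npx",
                "args": ["-y", "@runpod/mcp-server@latest"],
                "env": {"RUNPOD_API_KEY": api_key},
            }
        }
    }


def write_cursor_config(project_dir: Path, api_key: str) -> Path:
    """Write or merge the entry into <project_dir>/.cursor/mcp.json."""
    config_path = project_dir / ".cursor" / "mcp.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)

    # Preserve any servers that are already configured in the file.
    existing = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = existing.setdefault("mcpServers", {})
    servers.update(build_cursor_config(api_key)["mcpServers"])

    config_path.write_text(json.dumps(existing, indent=2) + "\n")
    return config_path


# Preview the entry without touching the filesystem.
print(json.dumps(build_cursor_config("YOUR_RUNPOD_API_KEY"), indent=2))
```

Calling `write_cursor_config(Path("."), key)` from a project root would create or update the file in place. Because the key ends up in plain text, keep `.cursor/mcp.json` out of version control.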
Or just run the one-liner:
Now your model knows how to talk to Runpod.
Once set up, you can talk to your AI assistant like this:
“Create a serverless endpoint using my template called jacobs-comfyui.”
Behind the scenes, your IDE routes that natural-language request to your LLM, which selects the right MCP tool (create-endpoint), fills in the parameters, and fires it off to Runpod. A few seconds later, your endpoint is live.
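Here is a rough Python sketch of that round trip. The `tools/call` method, the `name` and `arguments` parameters, and the `content` and `isError` result fields follow the MCP specification, and -32602 is the JSON-RPC "invalid params" code. The `templateName` argument and the stand-in handler are hypothetical; the real create-endpoint tool may expect different parameters and, unlike this stub, actually calls Runpod.

```python
import json
from typing import Callable, Dict


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Wrap the model's tool choice in an MCP `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def fake_create_endpoint(arguments: dict) -> str:
    """Stand-in for the server-side tool; it makes no network call."""
    return f"created endpoint from template {arguments['templateName']}"


# Hypothetical registry mapping tool names to handlers.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "create-endpoint": fake_create_endpoint,
}


def handle(request: dict) -> dict:
    """Dispatch a `tools/call` request to its handler and build the reply."""
    params = request["params"]
    handler = HANDLERS.get(params["name"])
    if handler is None:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"},
        }
    text = handler(params["arguments"])
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    }


# The model picked create-endpoint and filled in the template name.
request = build_tool_call(7, "create-endpoint", {"templateName": "jacobs-comfyui"})
response = handle(request)
print(json.dumps(response, indent=2))
```

The division of labor is the point: the model only chooses a tool name and arguments, and the server owns the actual API call.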
No terminal. No docs. Just code and chat.
For Smithery-supported clients (like Claude Desktop):
Smithery handles install, build, and registration in one shot.
For Cursor, add this to .cursor/mcp.json:
Restart Cursor and ask something like:
“Spin up a 1×A100 pod using my template nightly-train.”
Your LLM will call create-pod and stream the pod ID back in seconds.
We’re big believers in developer flow. The future of infrastructure isn’t menus and dashboards—it’s context-aware, assistant-driven, and conversational. By building Runpod’s MCP server, we’re making it feel native to every AI-first editor you already use.
Less friction. Fewer tabs. More building.
Try it out. Clone the repo. Give it a star. And let us know what you want to see next.