
Runpod Blog.

Our team’s insights on building better
and scaling smarter.

Deep Cogito Releases Suite of LLMs Trained with Iterative Policy Improvement
Brendan McKeag
August 1, 2025

Deploy Deep Cogito’s Cogito v2 models on Runpod to experience frontier-level reasoning at lower inference costs. Choose from 70B to 671B parameter variants and leverage Runpod’s optimized templates and Clusters for scalable, efficient AI deployment.

AI Infrastructure
How to Run MoonshotAI’s Kimi-K2-Instruct on Runpod Instant Cluster
Brendan McKeag
July 25, 2025

Run MoonshotAI’s Kimi-K2-Instruct on Runpod Clusters using H200 SXM GPUs and a 2TB shared network volume for seamless multi-node training. This guide shows how to deploy with PyTorch templates, optimize Docker environments, and accelerate LLM inference with scalable, low-latency infrastructure.

AI Workloads
Comparing the 5090 to the 4090 and B200: How Does It Stack Up?
Brendan McKeag
July 25, 2025

Benchmark Qwen2.5-Coder-7B-Instruct across NVIDIA’s B200, RTX 5090, and 4090 to identify optimal GPUs for LLM inference—compare token throughput, cost per token, and memory efficiency to match your workload with the right performance tier.

Iterative Refinement Chains with Small Language Models: Breaking the Monolithic Prompt Paradigm
Brendan McKeag
July 18, 2025

As prompt complexity increases, large language models (LLMs) hit a “cognitive wall,” suffering up to 40% performance drops due to task interference and overload. By decomposing workflows into iterative refinement chains (e.g., the Self-Refine framework) and deploying each stage on serverless platforms like Runpod, you can maintain high accuracy, scalability, and cost efficiency.

AI Workloads
Introducing the New Runpod Referral & Affiliate Program
Emmett Fear
July 17, 2025

Runpod has enhanced its referral program with new features, including randomized rewards of up to $500, a premium affiliate tier offering 10% cash commissions, and continued lifetime earnings for existing users, creating more ways to earn while building the future of AI infrastructure.

Product Updates
Running a 1-Trillion Parameter AI Model In a Single Pod: A Guide to MoonshotAI’s Kimi-K2 on Runpod
Brendan McKeag
July 14, 2025

Moonshot AI’s Kimi-K2-Instruct is a trillion-parameter, mixture-of-experts open-source LLM optimized for autonomous agentic tasks. It offers 32 billion active parameters, Muon-trained performance rivaling proprietary models (89.5% MMLU, 97.4% MATH-500, 65.8% pass@1), and the ability to run inference on as little as 1 TB of VRAM using 8-bit quantization.

AI Workloads
The Dos and Don’ts of VACE: What It Does Well, What It Doesn’t
Brendan McKeag
June 27, 2025

VACE introduces a powerful all-in-one framework for AI video generation and editing, combining text-to-video, reference-based creation, and precise editing in a single open-source model. It outperforms alternatives like AnimateDiff and SVD in resolution, flexibility, and controllability, though character consistency and memory usage remain key challenges.

AI Workloads
