Our team’s insights on building better and scaling smarter.
Brendan McKeag
30 April 2025
Qwen3 Released: How Does It Stack Up?
Alibaba’s Qwen3 is here, with major performance improvements and a full range of models from 0.6B to 235B parameters. This post breaks down what’s new, how it compares to other open models, and what it means for developers.
GPU Clusters: Powering High-Performance AI (When You Need It)
Different stages of AI development call for different infrastructure. This post breaks down when GPU clusters shine—and how to scale up only when it counts.
Mixture of Experts (MoE): A Scalable AI Training Architecture
MoE models scale efficiently by activating only a subset of parameters. Learn how this architecture works, why it’s gaining traction, and how Runpod supports MoE training and inference.
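To make "activating only a subset of parameters" concrete, here is a minimal, illustrative sketch of top-k MoE routing in plain Python. The expert functions and router scores are toy placeholders, not anything from a real model or from Runpod's stack; the point is only that compute scales with k, not with the total number of experts.

```python
import math

def softmax(scores):
    """Convert raw router scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, router_scores, k=2):
    """Run only the k highest-scoring experts and mix their outputs.

    The remaining experts are skipped entirely, which is why MoE
    models can grow total parameter count without growing per-token
    compute at the same rate.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the selected experts
    return sum(probs[i] / norm * experts[i](token) for i in top)

# Toy "experts": scalar functions standing in for feed-forward blocks.
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x, lambda x: x * x]
out = moe_forward(3.0, experts, router_scores=[0.1, 2.0, 0.3, 1.5], k=2)
```

With k=2, only the two highest-scoring experts execute; the output is a weighted blend of their results, and the other experts cost nothing for this token.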
RunPod Global Networking Expands to 14 More Data Centers
RunPod’s global networking feature is now available in 14 new data centers, improving latency and accessibility across North America, Europe, and Asia.
Learn how to fine-tune large language models using Axolotl on RunPod. This guide covers LoRA, 8-bit quantization, DeepSpeed, and GPU infrastructure setup.
RTX 5090 LLM Benchmarks: Is It the Best GPU for AI?
See how the NVIDIA RTX 5090 stacks up in large language model benchmarks. We explore real-world performance and whether it’s the top GPU for AI workloads today.