
Why NVIDIA's Llama 3.1 Nemotron 70B Might Be the Most Reasonable LLM Yet

NVIDIA’s Llama 3.1 Nemotron 70B is outperforming larger and closed models on key reasoning tasks. In this post, Brendan McKeag tests it against a long-standing challenge: consistent, in-character roleplay with no internal monologue and no user coercion, and finds it finally up to the task.
Read article
AI Workloads

Why LLMs Can't Spell 'Strawberry' And Other Odd Use Cases

Large language models can write poetry and solve logic puzzles—but fail at tasks like counting letters or doing math. Here’s why, and what it tells us about their design.
Read article
Learn AI
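
The letter-counting failures the post above describes usually trace back to tokenization: models see subword tokens, not individual characters. A minimal sketch using the tiktoken library (an assumption here; the article may illustrate the point with a different tokenizer) shows how a word is split before a model ever sees it:

```python
# Minimal illustration of subword tokenization (assumes the `tiktoken` package is installed).
# The exact split depends on the tokenizer; the point is that the model never
# receives the word as a sequence of letters it could count.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several OpenAI models

for word in ["strawberry", "raspberry"]:
    token_ids = enc.encode(word)
    pieces = [enc.decode([t]) for t in token_ids]
    print(f"{word!r} -> {len(token_ids)} token(s): {pieces}")
```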

Evaluate Multiple LLMs Simultaneously Using Ollama on Runpod

Use Ollama to compare multiple LLMs side-by-side on a single GPU pod—perfect for fast, realistic model evaluation with shared prompts.
Read article
AI Workloads
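
As a rough illustration of the workflow that post describes, here is a minimal sketch that sends one shared prompt to several models through Ollama's local REST API (the model names and the default port 11434 are assumptions; swap in whatever you have pulled on your pod):

```python
# Send the same prompt to several locally served Ollama models and print each reply.
# Assumes `ollama serve` is running on the pod and the listed models have been pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint
MODELS = ["llama3.1", "mistral", "gemma2"]          # hypothetical model list
PROMPT = "Explain the difference between latency and throughput in two sentences."

for model in MODELS:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    print(f"--- {model} ---")
    print(resp.json()["response"].strip())
```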

Supercharge Your LLMs with SGLang: Boost Performance and Customization

Discover how to boost your LLM inference performance and customize responses using SGLang, an innovative framework for structured LLM workflows.
Read article
AI Workloads
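
For a feel of what a structured SGLang program looks like, here is a minimal sketch of its Python frontend (the endpoint URL and generation settings are assumptions; the post itself covers the framework in more depth):

```python
# Minimal SGLang program: a structured prompt with a named generation slot.
# Assumes an SGLang runtime is already serving a model at localhost:30000.
import sglang as sgl

@sgl.function
def summarize(s, text):
    s += sgl.system("You are a concise technical writer.")
    s += sgl.user(f"Summarize in one sentence: {text}")
    s += sgl.assistant(sgl.gen("summary", max_tokens=64, temperature=0.2))

if __name__ == "__main__":
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
    state = summarize.run(text="SGLang adds a programming layer on top of LLM inference.")
    print(state["summary"])
```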

Mastering Serverless Scaling on Runpod: Optimize Performance and Reduce Costs

Learn how to optimize your serverless GPU deployment on Runpod to balance latency, performance, and cost. From active and flex workers to FlashBoot and scaling strategy, this guide helps you build an efficient AI backend that won’t break the bank.
Read article
AI Infrastructure
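
The scaling knobs that guide discusses sit in front of a worker that is usually just a small handler. A minimal sketch with the runpod Python SDK (the handler body is a placeholder assumption) shows the shape of such a backend; active/flex worker counts, idle timeout, and FlashBoot are then configured on the endpoint itself:

```python
# Skeleton of a Runpod serverless worker: the endpoint's scaling settings
# (active workers, flex workers, idle timeout, FlashBoot) wrap around this handler.
# Assumes the `runpod` Python SDK is installed in the worker image.
import runpod

def handler(job):
    # `job["input"]` carries the JSON payload sent to the endpoint.
    prompt = job["input"].get("prompt", "")
    # Placeholder "inference"; a real worker would call a loaded model here.
    return {"output": f"echo: {prompt}"}

runpod.serverless.start({"handler": handler})
```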

Run Larger LLMs on Runpod Serverless Than Ever Before – Llama-3 70B (and beyond!)

Runpod Serverless now supports multi-GPU workers, enabling full-precision deployment of large models like Llama-3 70B. With optimized vLLM support, FlashBoot, and network volumes, it's never been easier to run massive LLMs at scale.
Read article
Product Updates
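
To make the multi-GPU angle concrete, here is a minimal offline-inference sketch with vLLM's Python API, sharding a large model across GPUs via tensor parallelism (the model name and GPU count are assumptions; Runpod's serverless vLLM worker exposes similar settings through its endpoint configuration):

```python
# Shard a large model across multiple GPUs with vLLM tensor parallelism.
# Assumes a multi-GPU worker (e.g. 2x 80 GB) and access to the model weights.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",  # assumed model id
    tensor_parallel_size=2,                        # split layers across 2 GPUs
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Why do 70B-class models need multi-GPU inference?"], params)
print(outputs[0].outputs[0].text)
```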

Announcing Runpod's New Serverless CPU Feature

Runpod introduces Serverless CPU: high-performance VM containers with customizable CPU options, ideal for cost-effective and versatile workloads not requiring GPUs.
Read article
Product Updates

Build what’s next.

The most cost-effective platform for building, training, and scaling machine learning models—ready when you are.
