Runpod AI Infrastructure Blog

Runpod product updates, AI infrastructure guides, GPU tutorials, and deployment patterns for developers building with cloud GPUs.

RAG vs. Fine-Tuning: Which Is Best for Your LLM?

Shaamil Karim

July 11, 2024

Retrieval-Augmented Generation (RAG) and fine-tuning are powerful ways to adapt large language models. Learn the key differences, trade-offs, and when to.

AI Workloads

How to Benchmark Local LLM Inference for Speed and Cost Efficiency

Jonmichael Hands

July 4, 2024

Explore how to deploy and benchmark LLMs locally using tools like Ollama and NVIDIA NIMs. This deep dive covers performance, cost, and scaling insights.

AI Workloads

Benchmarking LLMs: A Deep Dive into Local Deployment & Optimization

Jonmichael Hands

July 4, 2024

Curious how local LLM deployment stacks up? This post explores benchmarking strategies, optimization tips, and what DevOps teams need to know about.

AI Infrastructure

AMD MI300X vs. Nvidia H100 SXM: Performance Comparison on Mixtral 8x7B Inference

Marut Pandya

July 1, 2024

Runpod benchmarks AMD's MI300X against Nvidia's H100 SXM using Mistral's Mixtral 8x7B model. The results highlight performance and cost trade-offs across.

AMD MI300X vs. NVIDIA H100: Mixtral 8x7B Inference Benchmark

Marut Pandya

July 1, 2024

We benchmarked AMD’s MI300X against NVIDIA’s H100 on Mixtral 8x7B. Discover which GPU delivers faster inference and better performance-per-dollar.

Partnering with Defined AI to Bridge the Data Wealth Gap

Shaamil Karim

June 17, 2024

Runpod and Defined.ai launch a pilot program to provide startups with access to high-quality training data and compute, enabling sector-specific.

Product Updates

Run Larger LLMs on Runpod Serverless Than Ever Before – Llama-3 70B (and beyond!)

Brendan McKeag

June 6, 2024

Runpod Serverless now supports multi-GPU workers, enabling full-precision deployment of large models like Llama-3 70B. With optimized vLLM support.

Product Updates
