James Sandy

How to Run Serverless AI and ML Workloads on Runpod

Learn how to train, deploy, and scale AI/ML models using Runpod Serverless. This guide covers real-world examples, deployment best practices, and how serverless is unlocking new possibilities like real-time video generation.
Read article
Product Updates

How Much Can a GPU Cloud Save You? A Cost Breakdown vs On-Prem Clusters

We crunched the numbers: deploying 4x A100s on Runpod’s GPU cloud can save over $124,000 versus an on-prem cluster over three years. Learn why the cloud beats on-prem on flexibility, cost, and scale.
Read article
Cost Optimization
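The cloud-vs-on-prem comparison above comes down to simple arithmetic: pay-per-use hours against up-front hardware plus ongoing operations. The sketch below shows the shape of that calculation only — every figure in it is a hypothetical placeholder, not actual Runpod pricing or A100 hardware cost, so substitute your own numbers before drawing conclusions.

```python
# Hypothetical, illustrative numbers only -- real Runpod pricing and
# on-prem costs vary; plug in your own figures.
HOURS_PER_YEAR = 24 * 365
YEARS = 3

# Assumed cloud side: 4x A100 at a placeholder $/GPU-hour, billed only
# for the fraction of hours the cluster is actually busy.
gpu_hourly_rate = 1.89   # hypothetical rate
utilization = 0.40       # hypothetical average utilization
cloud_cost = 4 * gpu_hourly_rate * HOURS_PER_YEAR * YEARS * utilization

# Assumed on-prem side: one-time hardware purchase plus yearly
# power, cooling, and operations.
hardware_capex = 80_000  # hypothetical 4x A100 server price
yearly_opex = 25_000     # hypothetical power/cooling/staffing per year
onprem_cost = hardware_capex + yearly_opex * YEARS

savings = onprem_cost - cloud_cost
print(f"cloud:   ${cloud_cost:,.0f}")
print(f"on-prem: ${onprem_cost:,.0f}")
print(f"savings: ${savings:,.0f}")
```

The key lever is utilization: the cloud side scales with hours actually used, while the on-prem side is paid whether the GPUs are busy or idle.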

Quantization Methods Compared: Speed vs. Accuracy in Model Deployment

Explore the trade-offs between post-training, quantization-aware training, mixed precision, and dynamic quantization. Learn how each method impacts model speed, memory, and accuracy—and which is best for your deployment needs.
Read article
AI Workloads
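All the quantization methods the article compares share the same core arithmetic: mapping floats onto a small integer range via a scale and zero point. The minimal sketch below shows that round trip for post-training affine int8 quantization; real frameworks (PyTorch, TensorRT, etc.) add per-channel scales and calibration data, so treat this as the idea only, not any library's API.

```python
# Minimal post-training (affine) int8 quantization: map a float range
# onto [-128, 127], then dequantize and measure the round-trip error.

def quantize(values, num_bits=8):
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin)          # float step per int level
    zero_point = round(qmin - lo / scale)      # int that represents 0.0
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [0.1, -0.25, 0.9, 0.33, -0.8, 0.0]
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print("int8 values:", q)
print(f"max round-trip error: {max_err:.4f}")  # small but nonzero
```

That nonzero round-trip error is exactly the speed/accuracy trade-off the article discusses: int8 storage is 4x smaller than float32, paid for with a bounded loss of precision per weight.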

How to Fine-Tune LLMs with Axolotl on RunPod

Learn how to fine-tune large language models using Axolotl on RunPod. This guide covers LoRA, 8-bit quantization, DeepSpeed, and GPU infrastructure setup.
Read article
AI Workloads
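Part of why LoRA makes fine-tuning affordable is pure parameter counting: instead of updating a full d_out x d_in weight matrix, it trains two low-rank factors B (d_out x r) and A (r x d_in) with r much smaller than the layer width. The layer sizes below are illustrative only, not tied to any specific model from the guide:

```python
# Back-of-the-envelope LoRA savings: trainable parameters for a full
# fine-tune of one linear layer versus its rank-r LoRA adapter.

def lora_params(d_out, d_in, rank):
    full = d_out * d_in            # full fine-tune: every weight trains
    lora = rank * (d_out + d_in)   # LoRA: only B (d_out x r) and A (r x d_in)
    return full, lora

full, lora = lora_params(d_out=4096, d_in=4096, rank=8)
print(f"full fine-tune: {full:,} params")     # 16,777,216
print(f"LoRA (r=8):     {lora:,} params")     # 65,536
print(f"reduction:      {full / lora:.0f}x")  # 256x
```

Repeated across every attention and MLP layer, this is what lets a large model be fine-tuned on a single GPU, and it combines naturally with the 8-bit quantization the guide covers, since the frozen base weights can stay quantized while only the small adapters train in higher precision.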

Cost-Effective AI with Autoscaling on RunPod

Learn how RunPod autoscaling helps teams cut costs and improve performance for both training and inference. Includes best practices and real-world efficiency gains.
Read article
AI Workloads

Deploying Multimodal Models on RunPod

Multimodal models go beyond text, processing images, audio, and other data types. This guide shows how to deploy and scale them using RunPod’s infrastructure.
Read article
AI Workloads

Build what’s next.

The most cost-effective platform for building, training, and scaling machine learning models—ready when you are.