
Runpod Serverless Pricing Update
Runpod introduces new Serverless pricing with Flex and Active worker types, offering better scalability and up to 40% lower costs for consistent workloads.
Blog
Our team’s insights on building better and scaling smarter.


Falcon-180B is the largest open-source LLM to date, requiring 400GB of VRAM to run unquantized. This post explores how to deploy it on Runpod with A100s, L40s, and quantized alternatives like GGUF for more accessible use.

Reflections on generating conversational German audio with LLMs and Bark, highlighting common pitfalls in parsing, generation reliability, and the importance of fault-tolerant workflows.

This week’s roundup covers Alibaba’s vision-language model Qwen-VL, Meta’s new code-focused LLM Code Llama, and FACET, a benchmark for detecting bias in computer vision datasets.

Bench, Neuralangelo, and Marqo highlight this week’s updates—open-source tools for evaluating LLMs, reconstructing 3D scenes, and enabling GPU-powered vector search.

A simple guide to training custom video LoRAs using Diffusion-Pipe on Runpod—perfect for creators and AI enthusiasts.

Learn how to set up Stable Diffusion with ComfyUI on Runpod for fast, flexible AI image generation.
