
How to Use 65B+ Language Models on Runpod
Large language models like Guanaco 65B can run on Runpod with the right optimizations. Learn how to handle quantization, memory, and GPU sizing.
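
As a rough illustration of why quantization matters for GPU sizing, the sketch below estimates the memory needed for the weights alone at a few precisions. The function name and the nominal 65e9 parameter count are assumptions made for this example; real usage is higher once the KV cache, activations, and framework overhead are included.

```python
# Back-of-the-envelope estimate of weight memory for a large model.
# Assumption: memory ~= parameter count * bits per parameter / 8.
# This ignores the KV cache, activations, and runtime overhead.

def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N_PARAMS = 65e9  # nominal parameter count for a "65B" model (assumed)

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

By this estimate, 16-bit weights need about 130 GB, 8-bit about 65 GB, and 4-bit about 32.5 GB, so the quantization level largely determines how many GPUs, and of what size, a pod needs. Leave headroom beyond these figures for context and overhead.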

New 8k context models from TheBloke—like WizardLM, Vicuna, and Manticore—allow longer, more immersive text generation in Oobabooga. With more room for character memory and story progression, these models enhance AI storytelling.

Learn how to use Runpod's new Savings Plans to save up to 20% on Secure Cloud pods with monthly or quarterly commitments—ideal for users with high GPU workloads.

Runpod is now an official sponsor of StockDory, a rapidly evolving open-source chess engine that improves faster than Stockfish. StockDory offers deep positional insight, lightning-fast calculations, and full customization—making it ideal for anyone looking to explore AI-driven chess analysis.

While Oobabooga is a popular choice for text-based AI roleplay, KoboldAI offers a powerful alternative with smart context handling, more flexible editing, and better long-term memory retention. This guide compares the two frontends and walks through deploying KoboldAI on Runpod for writers and roleplayers looking for a deeper, more persistent AI interaction experience.

Runpod now offers access to NVIDIA’s powerful H100 GPUs, designed for generative AI workloads at scale. These next-gen GPUs deliver 7–12x performance gains over the A100, making them ideal for training massive models like GPT-4 or deploying demanding inference tasks.

Oobabooga has a 2048-token context limit, but with the Long Term Memory extension, you can store and retrieve relevant memories across conversations. This guide shows how to install the plugin, use the Character panel for persistent memory, and work around current context limitations.

