Runpod RoundUp 2 – 32k Token Context LLMs and New StabilityAI Offerings

This week's Runpod RoundUp covers major releases, including a Llama-2 model with 32k context support, SDXL 1.0's public release, and StabilityAI's new Stable Beluga LLMs.

Welcome to the Runpod Roundup for the week ending July 29, 2023. In this issue, we'll cover the past week's advancements in AI models, with a focus on new offerings that you can run in a Runpod instance right now: the new SDXL release and the latest LLM developments.

High Context Llama-2 Models Now Available

It's been less than two weeks since Meta, in partnership with Microsoft, released Llama-2, and the community has been hard at work...

togethercomputer has released a 32k token context version of Llama-2 7b that anyone can download from Huggingface. Several closed models (GPT-4, et al.) have offered 32k token context, but this is, I believe, the first freely available open-source model to make the jump to 32k. Many front ends, such as Oobabooga, do not yet support 32k context windows, but that is likely to change as these models become more commonplace.

conceptofmind has also released 16k context versions of Llama-2, called LLongMA-2, in 7b and 13b sizes.

(Could OpenAI also be looking into releasing an open-source LLM of their own in the not-too-distant future? Maybe...)

Stable Diffusion XL 1.0 Released For All

The long-awaited SDXL is finally out and available for use on Runpod - no research credentials required! Not to belabor the point, but SDXL really is the next generation (pun intended) of AI art, and we've got a blog post that will help you get it up and running in your pod.

Two AI-generated images: a tiger in work overalls on a dock and a panda astronaut in a cafe

StabilityAI Releases Stable Beluga 1 and 2 LLMs

It's clearly been a busy week for Stability AI, as they have also released two LLMs, Stable Beluga 1 and 2. The Stable Beluga 1 entry on Huggingface holds only delta weights (likely due to licensing issues with Llama-1), while Stable Beluga 2 is available to download as a complete model. The former is based on Llama-1 65b and the latter on Llama-2 70b, and both have been fine-tuned on Orca-style datasets modeled on Microsoft's Orca work. According to the press release, both models focus on intricate reasoning and on answering complex questions in specialized domains that require subject matter expertise, such as law and mathematical problem solving.
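If "delta weights" is a new term for you, here's a rough sketch of the general idea: a delta release usually ships only the difference between the fine-tuned weights and the base model's weights, so you need your own copy of the base model and add the two together to get something usable. The function name, tensor name, and numbers below are all made up for illustration, and plain Python lists stand in for real tensors. For the real thing, follow the instructions on the Stable Beluga 1 model page rather than this toy version.

```python
def apply_delta(base, delta):
    """Return fine-tuned weights as base + delta, tensor by tensor.

    Both arguments map a tensor name to a flat list of floats.
    This is a toy stand-in for real checkpoints, not an official script.
    """
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same tensor names")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for tensor {name!r}")
        # Element-wise addition recovers the fine-tuned values.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Hypothetical tensor name and values, chosen only for the example.
base_weights = {"layer0.weight": [1.0, 2.0, 3.0]}
delta_weights = {"layer0.weight": [0.5, -1.0, 0.25]}

merged_weights = apply_delta(base_weights, delta_weights)
print(merged_weights)
```

The practical upshot is that a delta-only release is not runnable by itself, which is why Stable Beluga 2 being a complete download is the more convenient of the two.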

Questions?

Feel free to reach out to Runpod directly if you have any questions about these latest developments, and we'll see what we can do for you!
