We've added two new public endpoints to the Runpod Hub. Both are purpose-built for video generation, and both are live right now.

P-Video is Pruna AI's multimodal video generation model, and its headline feature is a built-in draft mode that changes how you iterate.
Before committing to a full render, you can preview your 5-second 720p video in about 2.5 seconds. If the motion, framing, and timing look right, you run the full render, which finishes in roughly 10 seconds. That feedback loop makes a real difference when you're testing prompts or dialing in a concept.
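To put rough numbers on that feedback loop, here is a back-of-the-envelope sketch in Python. It only uses the approximate timings quoted above. It assumes that without draft mode every attempt costs a full render, that with draft mode you preview every attempt and fully render only the final one, and that queueing and network time are ignored. The function name is made up for illustration and is not part of any Runpod or Pruna API.

```python
# Approximate timings quoted in the post for a 5-second 720p clip.
DRAFT_SECONDS = 2.5   # draft preview
FULL_SECONDS = 10.0   # full render


def iteration_time(attempts: int, use_draft: bool) -> float:
    """Estimate the seconds spent waiting while trying `attempts` prompt variations.

    Assumption: with draft mode, each attempt is previewed as a draft and only
    the last one is rendered in full. Without it, every attempt is a full render.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    if use_draft:
        return attempts * DRAFT_SECONDS + FULL_SECONDS
    return attempts * FULL_SECONDS


if __name__ == "__main__":
    tries = 6
    print(f"With drafts:    {iteration_time(tries, use_draft=True):.1f} s")
    print(f"Without drafts: {iteration_time(tries, use_draft=False):.1f} s")
```

Under those assumptions, six prompt variations take about 25 seconds of waiting with drafts versus about 60 seconds without.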
Beyond speed, P-Video handles text-to-video, image-to-video, and audio-to-video through a single endpoint. Native audio generation is built in with dialogue, sound effects, and background music, so you're not managing a separate audio pipeline. You can also import your own audio tracks and sync them to the generated visuals.
Pricing: $0.02/sec at 720p.
Vidu Q3 comes from Shengshu Technology and is currently ranked #2 globally on Artificial Analysis benchmarks for AI video generation.
The standout capability here is native audio-video generation in a single pass (similar to LTX-2): dialogue, SFX, and background music generated simultaneously with the visuals, not added after. This produces tighter synchronization than post-processing approaches and removes an entire step from the workflow.
Clips go up to 16 seconds at up to 1080p, with support for multi-shot sequencing in a single generation. You can describe multiple camera angles and scene transitions in one prompt, and Q3 will handle the cuts. Text-to-video and image-to-video are both supported.
Pricing: $0.15/sec.
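If you want to budget a batch of clips, the per-second prices above are easy to turn into a quick estimate. The sketch below assumes the rate is billed per second of generated video, which the pricing lines imply but do not spell out. The dictionary keys are illustrative labels, not real endpoint IDs.

```python
from decimal import Decimal

# Per-second prices quoted in the post. The keys are hypothetical labels.
RATES = {
    "p-video-720p": Decimal("0.02"),
    "vidu-q3": Decimal("0.15"),
}


def clip_cost(rate_per_second: Decimal, seconds: int) -> Decimal:
    """Estimate the cost in dollars of one clip.

    Assumption: billing is per second of output video.
    """
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Decimal avoids floating-point rounding noise in currency math.
    return (rate_per_second * Decimal(seconds)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # A 5-second P-Video clip at 720p.
    print(f"P-Video, 5 s:  ${clip_cost(RATES['p-video-720p'], 5)}")   # $0.10
    # A maximum-length 16-second Vidu Q3 clip.
    print(f"Vidu Q3, 16 s: ${clip_cost(RATES['vidu-q3'], 16)}")       # $2.40
```

Check your actual usage on your Runpod billing page, since this estimate does not account for failed or retried generations.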
Create something cool with the endpoints? Hop into the new #built-on-runpod channel on Discord to show it off!