New Public Endpoints, Load Balancing Serverless, Billing Updates, Deploy Hub Listings as Pods

New Public Endpoints

We’ve added a batch of new Public Endpoints for you to try out, and more are on the way. Here’s the full list of new additions:

Pruna image edit/t2i: P-Image focuses on very fast inference, often generating a 768 × 1344 image in under three seconds. Excellent for rapid prototyping and fast iteration.
Nano Banana Pro: Google’s new iteration of its image editing model. Supports resolutions of up to 2K (with upscaling to 4K), with enhanced research capabilities to drive its editing process.
Sora 2 Pro I2V: OpenAI’s video generation model. It accurately models complex movements, generates synchronized audio, and supports 1792×1024 and 1024×1792 resolutions. You can generate up to 12 seconds of video through the endpoint.
InfiniteTalk: Synchronizes not only lips but also head movements, body posture, and facial expressions with audio input, creating more natural and comprehensive video animations.
Granite 4.0: A 32B MoE model from IBM with 9B active parameters. Its hybrid architecture focuses on fast throughput and efficiency, and the small active parameter count makes inference quick.
Deep Cogito 2.1 671B: An MoE model that matches or exceeds DeepSeek R1 performance while using fewer tokens, resulting in cheaper and faster inference at the same quality point. The older version has been removed to make way for the new model.

Private endpoints: Our Public Endpoints have additional configuration options and can be forked into a private version if you have special needs or use cases. What can be customized differs from endpoint to endpoint. Click Fork Private Endpoint in the UI for any Public Endpoint to send our team a message, and we will see what we can do for you.

Load balancing serverless endpoints: These endpoints provide direct HTTP access to workers, bypassing the traditional queue infrastructure for lower-latency responses. Unlike queue-based endpoints, which use standardized handlers and fixed /run or /runsync paths, load balancing endpoints let you build custom REST APIs with any HTTP framework (FastAPI, Flask, Express.js) and define your own URL paths, methods, and contracts.
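To make "define your own URL paths, methods, and contracts" concrete, here is a minimal sketch of a worker that serves a custom REST API. It uses only Python's standard library so it runs without extra dependencies; in practice you would likely use one of the frameworks named above. The paths /health and /v1/echo, the JSON shapes, and port 8000 are all assumptions made for illustration, not platform requirements.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(method, path, payload):
    """Map a (method, path) pair to a (status, body) response.

    Every path and payload shape here is hypothetical; the point is only
    that the worker, not a fixed /run or /runsync contract, decides them.
    """
    if method == "GET" and path == "/health":
        return 200, {"status": "ok"}
    if method == "POST" and path == "/v1/echo":
        text = payload.get("text") if isinstance(payload, dict) else None
        if not isinstance(text, str):
            return 400, {"error": "expected a JSON object with a string 'text' field"}
        return 200, {"echo": text}
    return 404, {"error": f"no route for {method} {path}"}


class Handler(BaseHTTPRequestHandler):
    def _send(self, status, body):
        # Serialize the response body as JSON and write it back to the client.
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._send(*route("GET", self.path, None))

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length) if length else b""
        try:
            payload = json.loads(raw) if raw else None
        except ValueError:  # covers malformed JSON and undecodable bytes
            self._send(400, {"error": "invalid JSON"})
            return
        self._send(*route("POST", self.path, payload))


def serve(port=8000):
    # Port 8000 is an arbitrary choice for local testing.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the server locally, after which a POST to /v1/echo with the body {"text": "hi"} returns {"echo": "hi"}. Keeping the routing logic in a plain function separate from the HTTP handler makes the contract easy to test without opening a socket.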

Deploy Hub Listings as Pods: You can now deploy listings from the Hub as a Pod in addition to a serverless endpoint. A Pod is well suited to troubleshooting, testing, and development because it removes a layer of complication: you can focus on the actual task handling and testing before you decide to move to the more economical serverless format.

Backup payment methods: You can now add a backup payment method that is charged automatically if your primary one fails. This can help prevent data loss caused by an unfunded account having its stopped Pods or network volumes terminated.
