Serverless | Migrating and Deploying Cog Images on Runpod Serverless from Replicate

A step-by-step guide to migrating a Cog image from Replicate to a Runpod Serverless endpoint using Docker and the cog-worker repo.

>_ docker build --tag user/repo:tag --build-arg COG_REPO=user --build-arg COG_MODEL=model_name --build-arg COG_VERSION=model_version .

>_ docker push user/repo:tag

Switching cloud platforms or migrating existing models can feel daunting, especially when it requires extra development work. This guide simplifies the process for anyone who has deployed models on replicate.com or built them with the Cog framework. In a few steps, you'll create a Runpod Serverless worker from an existing Replicate image. The tutorial assumes you are working in a Linux terminal and have Docker installed. As a running example, we'll migrate the lucataco/hotshot-xl model to a Runpod Serverless endpoint.

Step 1: Clone and Navigate to the cog-worker Repository
Begin by cloning the cog-worker repository, then change into the repository's root folder.


The cog-worker repository contains essential scripts and configuration files required for the migration.
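
The clone-and-enter step can be sketched as follows. The repository URL is an assumption (the article names only the cog-worker repo), so substitute the location you actually use. The script defaults to a dry run that prints each command; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
set -eu

# ASSUMPTION: the cog-worker repo is hosted at this URL; override REPO_URL if not.
REPO_URL="${REPO_URL:-https://github.com/runpod-workers/cog-worker.git}"

# The checkout folder is the last path segment of the URL without ".git".
REPO_DIR="$(basename "$REPO_URL" .git)"

# Print the command in dry-run mode (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run git clone "$REPO_URL"
run cd "$REPO_DIR"
```

Deriving the folder name from the URL keeps the two commands in sync if you point REPO_URL at a fork.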

Step 2: Identify Model Information
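On Replicate, find the username, model name, and version of the model you want to migrate. These correspond to the COG_REPO, COG_MODEL, and COG_VERSION build arguments used in Step 3.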

Required model information extracted from https://replicate.com/lucataco/hotshot-xl/versions
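
As a small sketch, the pieces of a Replicate model reference line up with the build arguments like this. The variable names mirror the arguments in the docker build command; the version is left as a placeholder because it has to be copied from the model's versions page.

```shell
#!/bin/sh
set -eu

# Model reference as it appears in the Replicate URL: <username>/<model name>.
MODEL_REF="lucataco/hotshot-xl"

COG_REPO="${MODEL_REF%%/*}"   # text before the first slash: the username
COG_MODEL="${MODEL_REF#*/}"   # text after the first slash: the model name

# PLACEHOLDER: copy the version ID from the model's versions page.
COG_VERSION="${COG_VERSION:-<model_version>}"

echo "COG_REPO=$COG_REPO"
echo "COG_MODEL=$COG_MODEL"
echo "COG_VERSION=$COG_VERSION"
```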

Step 3: Build and Push Docker Image
Build the Docker image, passing your model's details as build arguments. Once the image is built, push it to a container registry such as Docker Hub.

The --tag option sets the name and tag of your image. The --build-arg options pass the Replicate username (COG_REPO), model name (COG_MODEL), and version (COG_VERSION) into the build.
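
With the example model, the build and push commands might be filled in as in the sketch below. The image name user/repo:tag and the version placeholder are stand-ins, not real values. The script only prints the commands unless DRY_RUN=0 is set, so it can be reviewed before anything is built or pushed.

```shell
#!/bin/sh
set -eu

# Stand-in values: replace with your registry image name and Replicate details.
IMAGE="${IMAGE:-user/repo:tag}"
COG_REPO="${COG_REPO:-lucataco}"
COG_MODEL="${COG_MODEL:-hotshot-xl}"
COG_VERSION="${COG_VERSION:-<model_version>}"   # placeholder, see Step 2

# Print the command in dry-run mode (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run from the root of the cog-worker repository (the build context is ".").
run docker build --tag "$IMAGE" \
  --build-arg "COG_REPO=$COG_REPO" \
  --build-arg "COG_MODEL=$COG_MODEL" \
  --build-arg "COG_VERSION=$COG_VERSION" \
  .

run docker push "$IMAGE"
```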

Step 4: Create and Deploy a Serverless Endpoint on Runpod
In the Runpod console, create a serverless endpoint template that uses the container image you just pushed. Once the template is set up, deploy the endpoint. You can then send requests to your new endpoint.

Runpod template config form for cog-worker with container image, Docker command, and disk size fields
Runpod serverless endpoint settings for cog-worker with worker counts and prioritized GPU selection

Your Runpod Serverless endpoint is now ready to handle requests! Depending on your application, you may need to modify the handler file before building the image, for example if you intend to upload generated images to object storage.

The next step is to test your API with ReqBin. Every package has its own parameters to pass to the API, but the overall structure of the request is the same. Check out this article if you need to learn how to send a request through ReqBin to your serverless worker to see if it's ready for prime time. You may also want to check out our previous article on Serverless APIs, which includes an example using cURL at the bottom.
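
For a quick command-line check, a request can be sketched with cURL as below. The /runsync route and Bearer-token header follow Runpod's Serverless API; the endpoint ID, the RUNPOD_API_KEY variable name, and the prompt field in the payload are placeholders or assumptions, since each model defines its own input parameters. Without RUNPOD_API_KEY set, the script only prints what it would send.

```shell
#!/bin/sh
set -eu

# PLACEHOLDERS: take the endpoint ID and API key from your Runpod account.
ENDPOINT_ID="${ENDPOINT_ID:-your_endpoint_id}"
API_KEY="${RUNPOD_API_KEY:-}"

# Synchronous run route of the Serverless API for this endpoint.
URL="https://api.runpod.ai/v2/${ENDPOINT_ID}/runsync"

# ASSUMPTION: the model accepts a "prompt" input; check its parameter list.
PAYLOAD='{"input": {"prompt": "a short clip of waves at sunset"}}'

if [ -z "$API_KEY" ]; then
  # No key available: show the request instead of sending it.
  echo "Would POST to $URL"
  echo "$PAYLOAD"
else
  curl -sS -X POST "$URL" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d "$PAYLOAD"
fi
```

The model's parameters always go inside the top-level "input" object; only the fields within it change from model to model.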

With these steps, moving a Cog image from Replicate to a Runpod Serverless endpoint comes down to a Docker build, a push, and an endpoint deployment, with little or no extra development work.

Author profile: Justin Merrell
