Worker | Local API Server Introduced with runpod-python 0.10.0

Up to this point, developing a serverless worker has required test inputs to be supplied through a test_input.json file or with the --test_input argument. While these methods work fine, they don't replicate the interactive, request-driven nature of an API server. Today, we're excited to announce a significant improvement to your testing and development workflow.

Starting with the 0.10.0 release of runpod-python, you can now quickly launch local API servers for testing requests. This is achieved by calling your handler file with the --rp_serve_api argument.

Scale When Ready

Once you are satisfied that your worker functions correctly and you are ready for the scaling that Runpod Serverless offers, the transition is simple: create an endpoint template with the same image, and you're ready to go.

Introducing a local API server marks a big step in our ongoing commitment to making serverless development as easy and efficient as possible. We're excited to see how you'll leverage this new feature in your serverless applications.

Quick Example

Run Locally

Let's look at our IsEven example and see how we can serve it locally. First, navigate to the directory containing the handler file (named whatever.py in this example). Then call this file with the --rp_serve_api flag added: python whatever.py --rp_serve_api
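For readers who don't have the example at hand, here is a minimal sketch of what the logic in such a handler file might look like. It is not the official IsEven example: the function name handler, the input field name number, and the error message are assumptions made for illustration. The SDK entry point is shown only as a comment so the logic can be tried without runpod-python installed.

```python
# whatever.py: a hypothetical sketch of an IsEven handler, not the official example.


def handler(job):
    """Return True if the submitted number is even, False if it is odd."""
    job_input = job.get("input", {})
    number = job_input.get("number")  # "number" is an assumed field name

    # Reject anything that is not an integer. bool is a subclass of int in
    # Python, so it is excluded explicitly.
    if not isinstance(number, int) or isinstance(number, bool):
        return {"error": "Please provide an integer under input.number."}

    return number % 2 == 0


# In the real handler file, the last lines would hand this function to the SDK:
#   import runpod
#   runpod.serverless.start({"handler": handler})
```

Keeping the handler a plain function of the job dictionary means it can also be called directly in a Python shell before it is served.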

Once the command is executed, you should see output indicating that the server is running on port 8000.

To verify that it's working, navigate to http://localhost:8000/docs in your web browser. Here, you should see the API documentation page:

Runpod Test Worker local API server docs showing a POST /runsync endpoint and request schemas

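To test the API, you can use a tool such as Postman or curl to submit a POST request to your API, sending <Your JSON Payload> as the request body to <Your API Endpoint> (for the local server above, http://localhost:8000/runsync).

Remember to replace <Your JSON Payload> and <Your API Endpoint> with the appropriate values for your application.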
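If you would rather script the request than use Postman or curl, the sketch below sends the same kind of POST request using only the Python standard library. The helper names build_runsync_request and run_sync are hypothetical, and the payload shape used in the usage note (an input object with a number field) is an assumption about the IsEven example; adjust it to whatever your handler expects.

```python
import json
import urllib.request


def build_runsync_request(base_url, payload):
    """Build (but do not send) a POST request for the /runsync route."""
    url = base_url.rstrip("/") + "/runsync"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(base_url, payload, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_runsync_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the local server running, calling run_sync("http://localhost:8000", {"input": {"number": 3}}) would submit the request. Building the request separately from sending it makes it easy to inspect the URL and body before anything goes over the network.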

Postman POST request to localhost:8000/runsync returning 200 OK with output false

Standard Pod

As an alternative to running locally, we can also host our API as a standard Pod. This is particularly useful if you need to test GPU workloads and your local development environment does not have the required hardware.

First, you will need to build your Docker image. Then, head over to Runpod and create a new GPU Pod template.

Runpod template config with Docker image runpod/ai-api-example-iseven:dev and HTTP port 8000 exposed

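Use the same container image that you will use for serverless, but override the Docker command so that it includes the additional arguments: python whatever.py --rp_serve_api --rp_api_host='0.0.0.0'. Setting the host to 0.0.0.0 makes the server listen on all network interfaces rather than only on localhost, so it can be reached through the HTTP port exposed in the template. This starts an API server that you can now submit requests to.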

Postman POST request to a Runpod proxy /runsync URL returning 200 OK with output true

NOTE: Do not include these arguments when running as a serverless endpoint.

Author profile: Justin Merrell
