Deploy Python ML Models on Runpod, No Docker Needed

What if I told you that you can now deploy pure Python machine learning models on Runpod with zero stress? Fair warning: this is a bit of a hacky workflow at the moment. We'll be providing better abstractions in the future!

Prerequisites and Notes

This tutorial only works for environments installed purely from PyPI, so system-installed packages unfortunately won't work with it
A good understanding of virtual environments in Python
A network drive (see the Runpod network volumes guide)
Reasonable familiarity with the terminal

To keep this tutorial simple, I am going to do everything in the Jupyter interface (for editing Python files). You can follow the same steps in VS Code if that is the coding environment you are more comfortable with.

The Grand Idea

We're essentially using GPU Cloud to connect to and run our serverless function, and to update the handler that serverless uses.

And for updates:

When we update the handler file on the network drive via GPU Cloud, we also update it on serverless, because both read it from the same network drive.

Let's Get Started!

1. In Secure Cloud, click your network drive to select it. It should then appear in your top bar.

2. Start a pod with the Runpod PyTorch 2 template (you can use any runtime container you like) by choosing the pod you want and selecting the template.

Runpod pod setup with RTX 4090 and PyTorch 2 template, SSH terminal and Jupyter Notebook options checked

(ensure your network drive is selected on the pod)

3. Start the pod, open the JupyterLab interface, and then open a terminal.

4. In the terminal, create a Python virtual environment (make sure your current directory is /workspace).

First update Ubuntu, then create the virtual environment.

This should create a virtual environment in /workspace.

Next, activate the virtual environment.
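
The environment-creation step could be sketched in Python with the standard-library venv module. This is only an illustration, not the exact commands from the original post: the helper name create_venv and the directory name /workspace/venv are assumptions.

```python
import sys
import venv
from pathlib import Path


def create_venv(target: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment at `target` and return its interpreter path."""
    # EnvBuilder is the programmatic equivalent of `python -m venv <target>`.
    venv.EnvBuilder(with_pip=with_pip).create(target)
    if sys.platform == "win32":
        return target / "Scripts" / "python.exe"
    return target / "bin" / "python"


# Hypothetical usage on the pod (the "venv" directory name is an assumption):
# interpreter = create_venv(Path("/workspace/venv"))
```

Running the returned interpreter directly uses the environment's packages, so a script does not need a separate activation step.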

5. For this tutorial, I am going to develop and deploy Bark, a text-to-speech engine that produces realistic-sounding audio.

Let's install the package in the terminal, along with runpod and scipy.
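
As a rough sketch, the install could also be scripted against the environment's own interpreter. The Bark source below follows the install line in Bark's README, but whether the original post used it is an assumption, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Iterable

# Assumption: Bark is installed from its GitHub repository, as its README suggests.
BARK_SOURCE = "git+https://github.com/suno-ai/bark.git"
PACKAGES = (BARK_SOURCE, "runpod", "scipy")


def pip_install_command(venv_python: Path, packages: Iterable[str]) -> list[str]:
    """Build a pip command that installs into the given environment."""
    return [str(venv_python), "-m", "pip", "install", *packages]


def install(venv_python: Path, packages: Iterable[str] = PACKAGES) -> None:
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.run(pip_install_command(venv_python, packages), check=True)
```

Invoking pip through the environment's interpreter guarantees the packages land on the network drive rather than in the container's system Python.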

6. Create a Python file named "handler.py", save it as /workspace/handler.py, and put the Python code for running Bark in it.

Code explanation:
The handler basically returns the generated audio as a file.
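
A minimal sketch of what such a handler might look like follows. It is not the original post's code: the input field "text", the output key "audio_base64", and the base64 encoding are all assumptions made for illustration. The Bark, scipy, and runpod imports are deferred so the module can be loaded without them.

```python
import base64
import io


def synthesize(text: str) -> bytes:
    """Generate speech with Bark and return it as WAV file bytes."""
    # Imported lazily so the module loads even where Bark is not installed.
    from bark import SAMPLE_RATE, generate_audio
    from scipy.io.wavfile import write as write_wav

    audio = generate_audio(text)
    buffer = io.BytesIO()
    write_wav(buffer, SAMPLE_RATE, audio)
    return buffer.getvalue()


def handler(job, synth=synthesize):
    """Runpod handler: read the prompt from job["input"], return encoded audio."""
    text = (job.get("input") or {}).get("text")  # "text" is a hypothetical field name
    if not text:
        return {"error": "missing 'text' in input"}
    wav_bytes = synth(text)
    # JSON cannot carry raw bytes, so the WAV file is base64-encoded.
    return {"audio_base64": base64.b64encode(wav_bytes).decode("ascii")}


def main():
    """Start the serverless worker; call this at the bottom of handler.py."""
    import runpod

    runpod.serverless.start({"handler": handler})
```

In the real file, handler.py would end with a call to main() so the worker starts when the script is run. The synth parameter exists only so the handler can be exercised without loading the model.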

7. Write a startup file. This is the script that the serverless container will run when it starts up. We'll save it as "/workspace/pod-startup.sh".
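
The original startup script is a shell file. The logic it needs can be sketched in Python instead, purely as an illustration: find where the network drive is mounted, then run the handler with the virtual environment's interpreter. The mount points, the "venv" directory name, and both helper names are assumptions.

```python
from pathlib import Path

# Assumption: pods mount the network drive at /workspace, while serverless
# workers mount it at /runpod-volume.
CANDIDATE_ROOTS = (Path("/runpod-volume"), Path("/workspace"))


def find_volume_root(candidates=CANDIDATE_ROOTS) -> Path:
    """Return the first candidate directory that contains handler.py."""
    for root in candidates:
        if (root / "handler.py").is_file():
            return root
    raise FileNotFoundError("handler.py not found on any candidate volume path")


def startup_command(root: Path, venv_name: str = "venv") -> list[str]:
    """Build the command that runs the handler with the venv's interpreter."""
    # -u keeps Python's output unbuffered so logs show up immediately.
    return [str(root / venv_name / "bin" / "python"), "-u", str(root / "handler.py")]
```

Passing startup_command(find_volume_root()) to subprocess.run would then launch the worker.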

Testing the API out on Runpod in GPU Cloud

You can test the API by creating a test input JSON file at /workspace/test_input.json.
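
A test input file could be generated like this. The outer "input" key matches the job shape Runpod handlers receive. The inner "text" field is a hypothetical name that must match whatever your handler reads, and write_test_input is an illustrative helper.

```python
import json
from pathlib import Path


def write_test_input(path: Path, text: str) -> dict:
    """Write a Runpod-style test payload to `path` and return it."""
    payload = {"input": {"text": text}}  # "text" is a hypothetical field name
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload


# Hypothetical usage on the pod:
# write_test_input(Path("/workspace/test_input.json"), "Hello from Bark!")
```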

1. Deactivate the venv we're in by typing deactivate in the terminal.
2. Run sh pod-startup.sh to test it out.

This should show the generation process, so we can watch the job being run!

Testing the API out via serverless

Head over to Serverless and let's create a new serverless template.

1. Go to https://www.console.runpod.io/serverless/user/templates and create a new template.

2. Set its variables to the following:

Runpod template config for 'the bark template' with PyTorch container image and custom startup Docker command

3. Now set up a Runpod serverless API using the template, with your network volume connected.

Runpod serverless endpoint form for 'bark API' with worker counts, network volume, and queue delay scaling

Make sure you connect it to both your network volume and your template.

And ta-da, you've set up your API!

Actually test it out!

You can now make requests to it and download the generated file. Make sure you install the requests library for this with pip install requests.
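
A client along these lines could make the request and save the result. Treat it as a sketch under assumptions: it presumes the handler returns base64-encoded audio under a hypothetical "audio_base64" key and reads a hypothetical "text" input field, and the endpoint ID and API key are yours to supply.

```python
import base64
from pathlib import Path

API_BASE = "https://api.runpod.ai/v2"


def runsync_url(endpoint_id: str) -> str:
    """URL for a synchronous request to a serverless endpoint."""
    return f"{API_BASE}/{endpoint_id}/runsync"


def save_audio(response_json: dict, path: Path) -> Path:
    """Decode the base64 audio in a finished job's output and write it to disk."""
    if response_json.get("status") != "COMPLETED":
        raise RuntimeError(f"job not completed: {response_json.get('status')}")
    audio = base64.b64decode(response_json["output"]["audio_base64"])
    path.write_bytes(audio)
    return path


def request_speech(endpoint_id: str, api_key: str, text: str, path: Path) -> Path:
    """Send a prompt to the endpoint and download the generated audio."""
    import requests  # third-party: pip install requests

    response = requests.post(
        runsync_url(endpoint_id),
        headers={"Authorization": f"Bearer {api_key}"},
        json={"input": {"text": text}},
        timeout=600,
    )
    response.raise_for_status()
    return save_audio(response.json(), path)
```

Calling request_speech with your endpoint ID, API key, a prompt, and Path("speech.wav") would write the audio next to the script. If the job is still queued or running when the synchronous call returns, save_audio raises instead of writing a partial file.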
