September 3rd, 2025

Exploring Runpod Serverless: Create Workers From Templates

Eliot Cowley

Runpod Serverless is a cloud computing solution designed for short-lived, event-driven tasks. Runpod automatically manages the underlying infrastructure so you don’t have to worry about scaling or maintenance. You only pay for the compute time that you actually use, so you don’t pay when your application is idle.

You configure an endpoint for your Serverless application with compute resources and other settings, and workers process requests that arrive at that endpoint. You create a handler function that defines how workers process incoming requests and return results. Runpod automatically starts and stops workers based on demand to optimize resource usage and minimize cost.

When a client sends a request to your endpoint, it is put into a queue and waits for a worker to become available. A worker processes the request using your handler function and returns a result to the client.
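
At its core, a handler is just a Python function that takes the request event and returns a result. Here's a minimal sketch of that shape (the templates below flesh it out):

import runpod

def handler(event):
    # event["input"] contains whatever JSON the client sent in the request's "input" field
    return event["input"]  # replace with your own processing

runpod.serverless.start({"handler": handler})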

You can certainly create custom workers from scratch, but in most cases it’s easiest to start with a template. Runpod provides several templates to help you get started. Let’s create workers using a few of these templates.

What you’ll learn

In this blog post you’ll learn how to:

  • Create a Serverless worker from a template on GitHub
  • Test a worker on your local computer
  • Deploy a worker to Runpod Serverless from a GitHub repository

Requirements

To follow along, you'll need:

  • Python 3 and Git installed on your local computer
  • A GitHub account (for the deployment section)
  • A Runpod account with credits (for the deployment section)

worker-basic

The worker-basic template is a minimal Serverless example. When the endpoint receives a request, Runpod spins up a worker to execute the handler function, which in this case prints out some text and sleeps for a few seconds.

Let’s try testing this template locally:

  1. Open a terminal on your local computer.
  2. Clone the worker-basic repository on GitHub:
git clone https://github.com/runpod-workers/worker-basic.git
  3. Open the worker-basic folder in your preferred code editor. Take a look through the files:
  • Dockerfile: Configures the environment for a Docker container. Notice that it configures Python and installs the necessary packages before calling the handler function.
  • README.md: Instructions for deploying the worker.
  • requirements.txt: Sets the Python packages for Docker to install.
  • rp_handler.py: Script containing the handler function for the worker.
  • test_input.json: Mock input data to test the handler function.
  4. Create a Python virtual environment:
python -m venv venv
  5. Activate the Python virtual environment.
  • On macOS/Linux:
source venv/bin/activate
  • On Windows:
venv\Scripts\activate
  6. Install the Runpod SDK:
pip install runpod
  7. Run rp_handler.py. The script will automatically read test_input.json as input, passing it to the handler function as an event:
python rp_handler.py
  • You should get output similar to the following:
--- Starting Serverless Worker |  Version 1.7.13 ---
INFO   | Using test_input.json as job input.
DEBUG  | Retrieved local job: {'input': {'prompt': 'John Doe', 'seconds': 15}, 'id': 'local_test'}
INFO   | local_test | Started.
Worker Start
Received prompt: John Doe
Sleeping for 15 seconds...
DEBUG  | local_test | Handler output: John Doe
DEBUG  | local_test | run_job return: {'output': 'John Doe'}
INFO   | Job local_test completed successfully.
INFO   | Job result: {'output': 'John Doe'}
INFO   | Local testing complete, exiting.
  8. Take a look at test_input.json. Notice that the input object matches what the handler function reads. Now change the prompt and seconds fields and rerun the handler function (an example of the edited file appears after the output below). You should see output that matches the new input:
--- Starting Serverless Worker |  Version 1.7.13 ---
INFO   | Using test_input.json as job input.
DEBUG  | Retrieved local job: {'input': {'prompt': 'George Washington', 'seconds': 5}, 'id': 'local_test'}
INFO   | local_test | Started.
Worker Start
Received prompt: George Washington
Sleeping for 5 seconds...
DEBUG  | local_test | Handler output: George Washington
DEBUG  | local_test | run_job return: {'output': 'George Washington'}
INFO   | Job local_test completed successfully.
INFO   | Job result: {'output': 'George Washington'}
INFO   | Local testing complete, exiting.
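
For reference, the edited test_input.json that produces this output looks like this (the structure comes from the template; the values are the ones we changed):

{
  "input": {
    "prompt": "George Washington",
    "seconds": 5
  }
}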

In this example, the worker simply prints some text and sleeps for a given number of seconds. In a real application, you would replace this with functionality like running a Large Language Model (LLM) or performing some other compute-intensive operation. We will try doing this later.

Let’s look through rp_handler.py so we can understand how it works:

import runpod
import time  

def handler(event):
    print("Worker Start")
    input = event['input']
    
    prompt = input.get('prompt')  
    seconds = input.get('seconds', 0)  

    print(f"Received prompt: {prompt}")
    print(f"Sleeping for {seconds} seconds...")
    
    # Replace the sleep code with your Python function to generate images, text, or run any machine learning workload
    time.sleep(seconds)  
    
    return prompt 

if __name__ == '__main__':
    runpod.serverless.start({'handler': handler })

The handler(event) function is the entry point for the worker.

event is a dictionary containing the request input in the input key. Here, we store the input values in local variables, print them to the console, and sleep.

When we run the script, it calls runpod.serverless.start, which starts the Serverless worker with handler registered as the handler function. When run locally, the SDK uses test_input.json as the job input, runs the job once, and exits.
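
Tip: if you'd rather not edit test_input.json, recent versions of the runpod SDK also accept test input on the command line, for example:

python rp_handler.py --test_input '{"input": {"prompt": "Ada Lovelace", "seconds": 2}}'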

We will learn how to deploy a worker later - for now, let’s check out another template.

worker-template

  1. Open a terminal on your local computer.
  2. Clone the worker-template repository on GitHub:
git clone https://github.com/runpod-workers/worker-template.git
  3. Open the worker-template folder in your preferred code editor. Look through the files - in particular, let's look at the Dockerfile. Note that it uses the runpod/base image, which includes CUDA, multiple versions of Python, uv, Jupyter Notebook, and common dependencies.
  4. Create a Python virtual environment:
python -m venv venv
  5. Activate the Python virtual environment.
  • On macOS/Linux:
source venv/bin/activate
  • On Windows:
venv\Scripts\activate
  6. Install the Runpod SDK:
pip install runpod
  7. Run handler.py. The script will automatically read test_input.json as input, passing it to the handler function as an event:
python handler.py
  • You should get output similar to the following:
--- Starting Serverless Worker |  Version 1.7.13 ---
INFO   | Using test_input.json as job input.
DEBUG  | Retrieved local job: {'input': {'name': 'John Doe'}, 'id': 'local_test'}
INFO   | local_test | Started.
DEBUG  | local_test | Handler output: Hello, John Doe!
DEBUG  | local_test | run_job return: {'output': 'Hello, John Doe!'}
INFO   | Job local_test completed successfully.
INFO   | Job result: {'output': 'Hello, John Doe!'}
INFO   | Local testing complete, exiting.
  8. Take a look at test_input.json. Notice that the input object matches what the handler function reads. Now change the name field and rerun the handler function (again, an example of the edited file follows the output). You should see output that matches the new input:
--- Starting Serverless Worker |  Version 1.7.13 ---
INFO   | Using test_input.json as job input.
DEBUG  | Retrieved local job: {'input': {'name': 'George Washington'}, 'id': 'local_test'}
INFO   | local_test | Started.
DEBUG  | local_test | Handler output: Hello, George Washington!
DEBUG  | local_test | run_job return: {'output': 'Hello, George Washington!'}
INFO   | Job local_test completed successfully.
INFO   | Job result: {'output': 'Hello, George Washington!'}
INFO   | Local testing complete, exiting.
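
As with worker-basic, the edited test_input.json for this run looks like this:

{
  "input": {
    "name": "George Washington"
  }
}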

In this example, the worker simply prints some text. In a real application, you would replace this with functionality like running a Large Language Model (LLM) or performing some other compute-intensive operation. We will try doing this later.

Let’s look through handler.py so we can understand how it works:

"""Example handler file."""

import runpod

# If your handler runs inference on a model, load the model here.
# You will want models to be loaded into memory before starting serverless.

def handler(job):
    """Handler function that will be used to process jobs."""
    job_input = job["input"]

    name = job_input.get("name", "World")

    return f"Hello, {name}!"

runpod.serverless.start({"handler": handler})

As the comments mention, if your handler function uses an LLM, you should load it at the start of your script rather than inside the handler function itself, so that the model isn't reloaded every time the handler runs.
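
Here's a minimal sketch of that pattern, assuming a Hugging Face transformers pipeline (the model and pipeline call are illustrative, not part of the template):

import runpod
from transformers import pipeline

# Load the model once at import time so it stays in memory across jobs
generator = pipeline("text-generation", model="gpt2")

def handler(job):
    """Run text generation on the incoming prompt."""
    prompt = job["input"].get("prompt", "")
    outputs = generator(prompt, max_new_tokens=50)
    return outputs[0]["generated_text"]

runpod.serverless.start({"handler": handler})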

The handler(job) function is the entry point for the worker.

job is a dictionary containing the request input in the input key. Here, we read the input value name and return a greeting built from it.

The runpod.serverless.start call starts the Serverless worker and registers handler as the function that processes incoming jobs.

Deploy a worker from GitHub

Now that we have learned how to create a simple worker from a template, let’s learn how to deploy it:

  1. Sign in to GitHub and fork the worker-basic or worker-template repository. Alternatively, you can create a new repository and copy one of the template’s files into it.
  2. Open the Settings page in the Runpod Console.
  3. Under Connections, find the GitHub card and select Connect.
  4. Sign in to your GitHub account.
  5. Choose which repositories Runpod can access:
  • All repositories: Access to all current and future repositories.
  • Only select repositories: Choose specific repositories. In this case, make sure you select the template repository that you forked.
  6. GitHub redirects you back to your Runpod settings, where you should see that your GitHub account is now connected. You can edit the connection settings at any time by selecting Edit Connection.
  7. In the left sidebar, under Manage, select Serverless.
  8. Select New Endpoint.
  9. Under Import Git Repository, use the search bar to find the repository that you forked from the worker template and select it.
  10. Configure the deployment settings:
  • Select which Branch to deploy from.
  • Enter the Dockerfile Path from the root of the repository.
  • Select Next.
  11. Configure the endpoint settings:
  • Enter an Endpoint Name.
  • Select a GPU Configuration. For this example, the 16 GB GPU is sufficient. (Note: make sure you have credits in your Runpod account.)
  • Select Deploy Endpoint.
  12. If Runpod successfully deploys your endpoint, it redirects you to the endpoint's page. Select the Builds tab to check on the initial build and wait for it to finish. Once it's finished, wait for Runpod to roll out the workers.
  13. Select the Requests tab. You should see a sample request similar to the following:
{
  "input": {
    "prompt": "Hello World"
  }
}
  14. Select Run to send the request to the endpoint.
  15. Runpod sends the request to the queue, where it waits for an available worker. When a worker becomes available, Runpod assigns the request to the worker, and the worker executes the handler function using the given input. You can view the status of the request in the console.
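
You can also send requests to your deployed endpoint programmatically instead of through the console. Here's a minimal sketch using Python's requests library (the endpoint ID and API key are placeholders to replace with your own; /runsync waits for the result, while /run returns a job ID you can poll):

import requests

ENDPOINT_ID = "YOUR_ENDPOINT_ID"    # shown on the endpoint's page in the console
API_KEY = "YOUR_RUNPOD_API_KEY"     # create one in your Runpod account settings

response = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Hello World"}},
)
print(response.json())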

Next steps

Congratulations, you have successfully created a worker from a template repository and deployed it from GitHub! These examples were very basic, but there are many other more practical templates available, which we will explore in future blog posts. You can also check them out yourself on GitHub.

Try modifying your handler function to do something more interesting, like having an LLM process a query, or running compute-intensive code. You can also implement GitHub Actions for Continuous Integration/Continuous Deployment to automatically test and deploy every time you push to your repository.
