Author

Eliot Cowley

Date

October 28, 2025


Model fine-tuning is the process of adapting a pre-trained machine learning model to perform better on a specific task or dataset. This technique allows for improved performance and efficiency compared to training a model from scratch, as it leverages the knowledge already learned by the model.

Fine-tuning is ideal when you have limited data but want to enhance model performance. It is also useful when your task differs significantly from the task the model was originally trained on.

Fine-tuning is a subset of transfer learning, which uses the knowledge an existing model already has as the starting point for learning new tasks. It’s easier and cheaper to hone an existing model’s capabilities than to train a new model from scratch. For example, you can fine-tune an existing Large Language Model (LLM) to adjust its tone when responding to inquiries, or give it knowledge specific to your domain or business.

What you’ll learn

In this blog post you’ll learn how to:

Deploy a pod on Runpod based on the axolotl-runpod template
SSH into the pod
Fine-tune a Llama 3 model using LoRA
Prompt the fine-tuned model using an interactive UI

Requirements

Create a Runpod account
Add money to your Runpod account
Set up SSH for your Runpod account
Create a Hugging Face user access token (not necessary for this tutorial, but required for gated models)

1. Deploy a pod

You can fine-tune a model on Runpod using Axolotl, an open-source tool for fine-tuning AI models. Let’s deploy a pod that will fine-tune a model based on a dataset, and then run that model so we can test how it has changed.

Log in to the Runpod Console and select Pod Templates from the left sidebar.
Search for “axolotl” and select the axolotl-runpod template. This template uses an official Axolotl Docker image.

Select Deploy Pod.

Select a GPU to use to train and run the model. GPUs with more memory (VRAM) typically cost more but can handle larger models and train faster, while GPUs with less memory are cheaper but slower. I went with the RTX A4000, a previous-generation NVIDIA GPU. Make sure that you choose an NVIDIA GPU, because the template expects one.
Enter a Pod Name.
Leave the other settings at their defaults and select Deploy On-Demand.

Wait for the pod to initialize. When the pod is ready, a green circle is displayed next to its name.
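
When picking a GPU in the steps above, a rough estimate of how much memory a model's weights occupy can help. The sketch below is back-of-the-envelope arithmetic only: the function name and the dtype table are illustrative, and real training needs additional memory for activations, gradients, and optimizer state, so treat the numbers as a lower bound.

```python
# Rough estimate of the GPU memory needed just to hold a model's weights.
# Training needs more on top of this (activations, gradients, optimizer
# state), so this is a lower bound and not a sizing guarantee.

BYTES_PER_PARAM = {
    "fp32": 4.0,  # 32-bit floats
    "bf16": 2.0,  # 16-bit floats, common for fine-tuning
    "int4": 0.5,  # 4-bit quantized weights
}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the memory (in GB, 1 GB = 1e9 bytes) used by the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    models = [("1B model", 1_000_000_000), ("8B model", 8_000_000_000)]
    for name, params in models:
        for dtype in BYTES_PER_PARAM:
            print(f"{name} in {dtype}: ~{weight_memory_gb(params, dtype):.1f} GB")
```

For the one-billion-parameter model fine-tuned later in this post, the weights alone come to roughly 2 GB in 16-bit precision, which is why a modest GPU is enough for a LoRA run.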

2. Explore the workspace

Now that we have a pod that has Axolotl up and running, let’s access the pod from a terminal on our local machine and see the files and directories in our workspace.

If you have added a public SSH key to your Runpod account, you will see a command that you can copy and paste into a terminal on your local machine to connect to your pod.

In your terminal, you should see a welcome message from Axolotl.

Let’s explore the files in our workspace. By default, you should be in /workspace/axolotl. Enter dir to see the files and folders in this directory. There’s a lot of stuff here!

To train a model with Axolotl, we must use a configuration file. Axolotl provides example configuration files for many different models. Enter cd examples and then dir to list them.

Let’s look at the Llama 3 examples. Llama is a series of LLMs by Meta, and Llama 3 is the previous generation. Enter cd llama-3 and then dir to list the example configuration files.

There are a lot of confusingly named YAML files here, many of them sounding the same. Let’s look specifically at lora-1b.yml.

LoRA, which stands for Low-Rank Adaptation, is a fine-tuning technique that keeps the original model weights frozen and trains only a small set of additional low-rank matrices. This adapts the model to new contexts without retraining all of its parameters, so our fine-tuning will be faster and less costly than full retraining. The 1b in the filename means that the model has one billion parameters.
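
To make the idea concrete, here is a minimal pure-Python sketch of a low-rank update. It is not Axolotl's implementation. It assumes a single frozen weight matrix W of shape d x k, two small trainable matrices B (d x r) and A (r x k), and a scaling factor alpha; the helper names and all sizes are illustrative.

```python
# Minimal LoRA sketch: the frozen weight W (d x k) is left untouched and only
# two small matrices, B (d x r) and A (r x k), are trained. The effective
# weight is W + (alpha / r) * (B @ A). All sizes here are illustrative.


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the rank (rows of A)."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d, k, r):
    """Compare trainable parameter counts: full fine-tuning vs. LoRA."""
    return {"full": d * k, "lora": r * (d + k)}


if __name__ == "__main__":
    w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen 2 x 3 weight
    b = [[1.0], [2.0]]                      # trainable 2 x 1 matrix
    a = [[0.5, 0.0, -0.5]]                  # trainable 1 x 3 matrix
    print(lora_effective_weight(w, b, a, alpha=1.0))
    print(trainable_params(d=2048, k=2048, r=32))
```

With these example sizes, a 2048 x 2048 layer has about 4.2 million weights, while a rank-32 adapter for it trains only 131,072 values. That difference is where the savings in time and memory come from.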
‍
Let’s read through the configuration file. Open it up in a text editor:

nano lora-1b.yml

Notice the first few fields.

The model we will fine-tune is NousResearch/Llama-3.2-1B, a Llama 3.2 text model with one billion parameters that even lower-end hardware can run.

We will train it on the teknium/GPT4-LLM-Cleaned dataset, a GPT-4 LLM instruction dataset with OpenAI disclaimers and refusals filtered out.
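
Because the configuration is plain YAML, the two fields described above are easy to pick out programmatically. The sketch below embeds a short excerpt and extracts them using only the standard library. The excerpt's layout is an assumption based on the description above (base_model and datasets/path are Axolotl configuration keys), and read_key_fields is a hypothetical helper, so check the result against the actual file on your pod.

```python
import re

# Assumed excerpt of the top of lora-1b.yml. Only the model and dataset names
# come from the text above; the exact layout of the real file may differ.
CONFIG_EXCERPT = """\
base_model: NousResearch/Llama-3.2-1B

datasets:
  - path: teknium/GPT4-LLM-Cleaned
"""


def read_key_fields(config_text: str) -> dict:
    """Pull the base model and dataset paths out of Axolotl-style YAML text.

    This is a regex shortcut for illustration, not a real YAML parser.
    """
    base_model = re.search(r"^base_model:\s*(\S+)", config_text, re.MULTILINE)
    dataset_paths = re.findall(r"^\s*-\s*path:\s*(\S+)", config_text, re.MULTILINE)
    return {
        "base_model": base_model.group(1) if base_model else None,
        "datasets": dataset_paths,
    }


if __name__ == "__main__":
    print(read_key_fields(CONFIG_EXCERPT))
```

Swapping in a different base model or dataset for your own fine-tuning run generally means changing these same two fields.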

Press Ctrl+X to exit the Nano text editor. (On a Mac keyboard this is still the Control key, not Cmd.)

3. Fine-tune a model

Okay, we’ve familiarized ourselves with the pod’s workspace and chosen a configuration file. Now let’s go ahead and fine-tune our model!

Enter cd /workspace/axolotl to navigate back to the starting directory so that our model is saved in the correct place. Now enter the following command:

axolotl train examples/llama-3/lora-1b.yml

This may take a while depending on the GPU you chose for your pod. Once Axolotl finishes fine-tuning the model, you should see something like the following message:

Training completed! Saving trained model to ./outputs/lora-out.

Let’s test the model to see if it responds to prompts in a way that aligns with the dataset. Enter the following command to start interactively prompting the model we just fine-tuned. The first argument is the location of the configuration file that we used to train the model; the --lora-model-dir argument is the location of the fine-tuned model:

axolotl inference examples/llama-3/lora-1b.yml --lora-model-dir="./outputs/lora-out" --gradio

Axolotl should generate URLs that you can open in your browser to prompt the model using a UI called Gradio. Open the public URL.

Enter a prompt from the dataset, such as:

Give three tips for staying healthy.

Check the model’s output against the expected output from the dataset. For example, in my testing, the above prompt resulted in the following output:

1. Eat nutritious foods and exercise regularly
2. Get plenty of sleep and avoid stressors like nicotine, alcohol or excess caffeine intake.
3. Manage your weight by eating a balanced diet and engaging in physical activity at least 30 minutes per day.

Whereas the expected output from the dataset is:

1. Eat a balanced and nutritious diet: Make sure your meals are inclusive of a variety of fruits and vegetables, lean protein, whole grains, and healthy fats. This helps to provide your body with the essential nutrients to function at its best and can help prevent chronic diseases.
2. Engage in regular physical activity: Exercise is crucial for maintaining strong bones, muscles, and cardiovascular health. Aim for at least 150 minutes of moderate aerobic exercise or 75 minutes of vigorous exercise each week.
3. Get enough sleep: Getting enough quality sleep is crucial for physical and mental well-being. It helps to regulate mood, improve cognitive function, and supports healthy growth and immune function. Aim for 7-9 hours of sleep each night.

It’s close, but not quite the same. This is one of the drawbacks of using LoRA: because it trains only a small fraction of the model’s parameters, the result can track the dataset less precisely than full fine-tuning. But it’s still pretty good!
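
Eyeballing the two answers works, but a rough similarity score makes the comparison repeatable across prompts. The sketch below uses Python's standard difflib module. The word-level ratio is a crude stand-in for a real evaluation metric, word_similarity is a hypothetical helper, and the shortened strings are placeholders for the full outputs above.

```python
from difflib import SequenceMatcher


def word_similarity(generated: str, expected: str) -> float:
    """Return a 0..1 similarity ratio between two texts, compared word by word.

    Case and punctuation at word edges are ignored so that small formatting
    differences do not dominate the score.
    """

    def words(text: str) -> list:
        stripped = (w.strip(".,:;!?").lower() for w in text.split())
        return [w for w in stripped if w]

    return SequenceMatcher(None, words(generated), words(expected)).ratio()


if __name__ == "__main__":
    # Shortened placeholders; paste the full model output and dataset answer.
    generated = "1. Eat nutritious foods and exercise regularly"
    expected = "1. Eat a balanced and nutritious diet"
    print(f"similarity: {word_similarity(generated, expected):.2f}")
```

A score of 1.0 means the word sequences are identical and 0.0 means they share nothing. Running the same prompts through models trained with different methods gives a quick, if rough, way to compare them.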

Next steps

Congratulations, you’ve fine-tuned a model based on a dataset! Runpod and Axolotl enable you to take existing models and adapt them to new contexts, without requiring you to create your own model from scratch. Here are some things you can do to take this further:

As you saw, our fine-tuned model didn’t exactly match our dataset’s expected output. Try fully fine-tuning a model using another of Axolotl’s example configuration files (examples/llama-3/fft-8b.yaml) and check the output against the dataset’s expected output. Because this configuration fully fine-tunes an eight-billion-parameter model, the output should match the dataset more closely. Expect it to need considerably more GPU memory and time than the 1B LoRA run.
Try fine-tuning a model using Quantized Low-Rank Adaptation (QLoRA). QLoRA is similar to LoRA, but it also quantizes the base model, storing its weights at lower numerical precision (for example, 4 bits instead of 16). As a result, fine-tuning with QLoRA uses even less memory than LoRA, but it can also produce less precise output. Axolotl provides example configuration files that use QLoRA, such as examples/llama-3/qlora.yml, which fine-tunes an eight-billion-parameter model. Compare the time it takes to fine-tune a model using full fine-tuning, LoRA, and QLoRA.
Runpod also offers an Axolotl serverless template. Try spinning up an endpoint and fine-tuning a model by sending a JSON request.
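
The precision tradeoff mentioned in the QLoRA item above can be seen with a toy example. The sketch below applies simple symmetric 4-bit rounding to a handful of weights and prints the round-trip error. It is a deliberate simplification: QLoRA itself uses a 4-bit NormalFloat (NF4) data type with block-wise scaling, and the function names and sample weights here are hypothetical.

```python
# Toy symmetric ("absmax") quantization to signed 4-bit integers in [-7, 7].
# Real QLoRA uses the NF4 data type with per-block scales; this only
# illustrates why lower-precision storage loses some information.


def quantize_4bit(weights):
    """Map floats to integers in [-7, 7] plus one shared scale factor."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.70, -0.35, 0.12, 0.01, -0.66]
    quantized, scale = quantize_4bit(weights)
    restored = dequantize(quantized, scale)
    for w, q, r in zip(weights, quantized, restored):
        print(f"{w:+.2f} -> {q:+d} -> {r:+.2f} (error {abs(w - r):.3f})")
```

Each weight now needs 4 bits instead of 16 or 32, at the cost of a small error per value. Small weights such as 0.01 collapse to zero in this toy scheme, which is one reason real schemes use finer-grained scaling.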

Note: When you’re done with your pod, don’t forget to terminate it; otherwise it will keep costing you money!

How to fine-tune a model using Axolotl
