I've found that many users run the Automatic1111 Stable Diffusion repo not only as a GUI, but also as an API layer. If you're trying to scale a service on top of A1111, shaving a few seconds off your start time can be really important. If you need to make your Automatic1111 install start faster, this is the article for you!

We will be referencing the files found in this repository for this blog post: https://github.com/runpod/containers/tree/main/serverless-automatic

There are two major performance optimizations that we will cover in this blog post:

1) Make sure that the needed Hugging Face files are cached

2) Pre-calculate the model hash

Both of these optimizations are taken care of in the Dockerfile line that runs the cache.py script:

The cache.py script simply imports and runs a few functions from webui and modules in Automatic1111:

If you run this script from the command line against an Automatic1111 installation, you will find that it does two major things:

1) It will download some files and store them in the Hugging Face cache (/root/.cache/huggingface).

If you don't do this prior to launching your serverless template, it will have to download these files on every cold start! Yikes!

2) It will calculate the model hash and store it in /workspace/stable-diffusion-webui/cache.json. Automatic1111 otherwise does this by default on launch. Alternatively, you can disable hashing with the --no-hashing command-line argument.
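To make the two effects above concrete, here is a minimal standalone sketch of the general idea. It is not the code from cache.py or from Automatic1111: the function names, the chunk size, and the JSON layout (file path mapped to its mtime and SHA-256) are all assumptions made for illustration. It shows how a build step could confirm that a cache directory is already populated, and how a hash computed once can be reused on later starts as long as the file has not changed.

```python
import hashlib
import json
import os


def dir_has_files(path):
    """Return True if `path` contains at least one file anywhere below it.

    A missing directory simply yields no entries, so the result is False.
    """
    return any(files for _, _, files in os.walk(path))


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large model files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_sha256(model_path, cache_path):
    """Return the model's SHA-256, reusing a stored value when possible.

    The JSON layout used here is hypothetical:
    {model_path: {"mtime": <float>, "sha256": <hex string>}}
    """
    try:
        with open(cache_path, "r", encoding="utf-8") as f:
            cache = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}

    mtime = os.path.getmtime(model_path)
    entry = cache.get(model_path)
    if entry and entry.get("mtime") == mtime:
        # File unchanged since the hash was stored: skip the expensive read.
        return entry["sha256"]

    sha = sha256_of_file(model_path)
    cache[model_path] = {"mtime": mtime, "sha256": sha}
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(cache, f)
    return sha
```

Calling cached_sha256 once while the image is being built means later calls only read the small JSON file, which is the same reason pre-calculating the hash helps cold starts. Similarly, a check like dir_has_files("/root/.cache/huggingface") at build time would catch an image that shipped without the downloaded files.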

Here's the comparison before and after:

[Image: startup before the optimizations]

[Image: startup after the optimizations]

We have found that the startup time for Automatic1111 is very CPU-bound, which means that a faster CPU will yield a faster startup time. In our experience, startup time scales linearly with single-core CPU performance.

If you look closely, you will see that a relatively long time is still spent importing both the PyTorch and Gradio modules. The next blog post will look at ways to optimize these import times. Stay tuned!
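If you want to check import costs on your own hardware, a rough measurement is easy to sketch. The helper below is hypothetical and not part of Automatic1111. It times the first import of a module by name and returns None if the module is already loaded, since a repeat import is nearly free and would give a misleading number.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return the seconds taken to import `module_name` for the first time.

    Returns None if the module is already in sys.modules, because only the
    first import in a process pays the real loading cost.
    """
    if module_name in sys.modules:
        return None
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start
```

In a fresh interpreter inside your container, calling time_import("torch") and then time_import("gradio") gives one number per module. Note that the second figure excludes anything the first import already loaded. For a per-module breakdown, CPython's built-in -X importtime option prints the time spent in every import.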

Reduce Your Serverless Automatic1111 Start Time
