Zhen Lu

DIY Deep Learning Docker Container

May 7, 2022

Are you tired of using someone else's container, only to find out it has the wrong versions of your tools installed? Maybe you've reinstalled everything from scratch every time you wanted to start over and thought to yourself, "this is a waste of time"? I've personally been there: I'd rather get straight to work than fight with tooling I don't need. Honestly, though, it's pretty easy to build your own Docker container and customize it to your needs. Do that once, and you can start fresh, with your own tools, every time.

In this blog post, we'll go over the fundamentals of building your own Docker image for machine learning and pushing it to Docker Hub. I'll use the custom TensorFlow image that I built for Runpod as an example. The finished Dockerfiles can be found on GitHub. Let's get started.

How to Start

First, you'll want to sign up for an account on Docker Hub. If you aren't familiar with it, Docker Hub is like GitHub for Docker container images. Once you push your image, you'll be able to pull it and use it wherever you want. Save your credentials in your favorite password manager for later.

Next, you'll want to find a suitable base image. If you're a purist, you can start with a minimal image like ubuntu, or with something that has CUDA already installed, like one of the nvidia/cuda images. I'm going to start with the tensorflow/tensorflow:latest-gpu image, since it already comes with TensorFlow installed and I know I'm going to want TF 2.8.0.

To start a Dockerfile with a base image, you want to create a file called "Dockerfile" and add the following lines to it:
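A sketch of what that opening looks like, using the TensorFlow base image mentioned above (the `base` alias is an arbitrary label you choose):

```dockerfile
# Start from the TensorFlow GPU image, named "base" so that later
# build stages or instructions can refer to it by name.
FROM tensorflow/tensorflow:latest-gpu AS base
```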

You could also just use the following, if you don't think that you will want to refer to the base image name later.
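In that case, the FROM line needs no alias:

```dockerfile
FROM tensorflow/tensorflow:latest-gpu
```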

The following lines instruct Docker to use bash as the default shell instead of sh:
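This is done with the SHELL instruction, which changes the shell Docker uses for subsequent RUN commands:

```dockerfile
# Use bash (instead of the default "/bin/sh -c") for RUN instructions
SHELL ["/bin/bash", "-c"]
```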

Now for the fun part: we get to customize the stuff that gets installed in our docker image!

In this image, I am going to do a few things:

  • Fix the public key issue that NVIDIA has right now
  • Run apt-get update/upgrade to patch Ubuntu vulnerabilities
  • Install utilities like wget/openssh
  • Upgrade pip
  • Install Jupyter Lab
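Put together, those steps might look roughly like the RUN instructions below. The key ID and signing-key URL reflect NVIDIA's April 2022 CUDA repository key rotation, and the exact package names are assumptions; adjust them to your base image.

```dockerfile
# Replace NVIDIA's rotated CUDA repository signing key (April 2022)
RUN apt-key del 7fa2af80 && \
    apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub

# Patch the OS and install basic utilities
RUN apt-get update --yes && \
    apt-get upgrade --yes && \
    apt-get install --yes --no-install-recommends wget openssh-server && \
    rm -rf /var/lib/apt/lists/*

# Upgrade pip and install Jupyter Lab
RUN pip install --upgrade pip && \
    pip install jupyterlab
```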

As you can see, it's super easy to automate what you would otherwise have installed manually: just write the install commands after Docker's RUN instruction. The benefit here is that your installed utilities get cached in the Docker image layers, so you won't have to wait for them to install the next time you want to use this development environment.

The last thing that we'll do is give Docker a start command. This defines what your Docker image will do when you start it. In this case, I define a start script (start.sh) in the same directory:
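Those instructions might look like this, assuming start.sh sits next to the Dockerfile:

```dockerfile
# Copy the start script into the container root and make it executable
ADD start.sh /
RUN chmod +x /start.sh

# Run the start script when the container launches
CMD [ "/start.sh" ]
```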

The ADD instruction copies the script into the root of the container file system, the RUN chmod instruction makes it executable, and the CMD instruction tells Docker to run start.sh when the container starts.

Here's what start.sh looks like in the same directory as your Dockerfile:
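A sketch of such a script; the PUBLIC_KEY and JUPYTER_PASSWORD environment variable names are assumptions here, so match them to whatever your platform actually passes in:

```bash
#!/bin/bash

# Start the SSH daemon if a public key was provided via the environment
if [[ -n "$PUBLIC_KEY" ]]; then
    mkdir -p ~/.ssh
    echo "$PUBLIC_KEY" >> ~/.ssh/authorized_keys
    chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
    service ssh start
fi

# Start Jupyter Lab in the background if a password was provided
if [[ -n "$JUPYTER_PASSWORD" ]]; then
    jupyter lab --allow-root --no-browser --ip=0.0.0.0 \
        --ServerApp.token="$JUPYTER_PASSWORD" &
fi

# Both services run in the background, so keep the container alive
sleep infinity
```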

This just says to run the OpenSSH daemon if a public key is provided in the environment, and to run Jupyter Lab if a Jupyter password is provided in the environment. Both processes run in the background, so we also need a sleep infinity command to keep the Docker container from exiting immediately.

To build your container, go to the folder you have your Dockerfile in, and run
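Something like the following, substituting your own Docker Hub namespace and image name:

```shell
docker build -t runpod/tensorflow:latest .
```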

In this case my Docker Hub namespace is runpod, my image name is tensorflow, and my tag is latest.

Once your image is built, you can push it by first logging in
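Use the Docker CLI, which will prompt for the Docker Hub credentials you saved earlier:

```shell
docker login
```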

Then running
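The push command uses the same namespace, name, and tag the image was built with:

```shell
docker push runpod/tensorflow:latest
```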

Your image should get uploaded to Docker Hub, where you can check it out!

This just scratches the surface of what you can do with docker containers, but it's a good example to get your feet wet.
