Docker build patterns

Posted by Matthias Noback

The "builder pattern"

As a programmer you may know the Gang-of-Four Builder design pattern. The Docker builder pattern has nothing to do with that. The builder pattern describes the setup that many people use to build a container image. It involves two Docker images:

  1. a "build" image with all the build tools installed, capable of creating production-ready application files.
  2. a "service" image capable of running the application.

Sharing files between containers

At some point the production-ready application files need to be copied from the build container to the host machine. There are several options for that.

1. Using docker cp

This is what the Dockerfile for the build image roughly looks like:

# in Dockerfile.build

# take a useful base image
FROM ...

# install build tools
RUN install build-tools

# create a /target directory for the executable
RUN mkdir /target

# copy the source code from the build context to the working directory
COPY source/ .

# build the executable
RUN build --from source/ --to /target/executable

To build the executable, simply build the image:

# tag the image as "build", use Dockerfile.build instead of the default
# Dockerfile, and use the current directory as the build context
docker build \
    -t build \
    -f Dockerfile.build \
    .

To be able to reach in and grab the executable, you first create a container (not a running one) based on the build image:

# name the container "build" and base it on the "build" image
docker create \
    --name build \
    build

You can now copy the executable file to your host machine using docker cp:

docker cp build:/target/executable ./executable
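
Putting this together, the whole docker cp approach can be coordinated with a small (Bash) script. This is only a sketch; Dockerfile.service is a hypothetical file that COPYs the executable into the service image:

#!/usr/bin/env bash
set -e

# build the "build" image, compiling the executable as part of the image build
docker build -t build -f Dockerfile.build .

# create (but don't run) a container, so we can reach into its filesystem
docker create --name build build

# copy the executable to the host, then clean up the container
docker cp build:/target/executable ./executable
docker rm build

# bake the executable into the service image
docker build -t service -f Dockerfile.service .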

2. Using bind-mount volumes

I don't think making the compile step part of the image's build process is good design. I like container images to be reusable. In the previous example, whenever the source files are modified, you have to rebuild the build image itself, whereas I'd like to just run the same build image again.

This means that the compile step should instead be moved to the ENTRYPOINT or CMD instruction, and that the source/ files shouldn't be part of the build context, but should be mounted as a bind-mount volume inside the running build container:

# in Dockerfile.build
FROM ...
RUN install build-tools

ENTRYPOINT build --from /project/source/ --to /project/target/executable

With this setup, we first build the build image, then run it:

# same build process
docker build \
    -t build \
    -f Dockerfile.build \
    .

# now we *run* the container
# --rm removes the container after it exits;
# -v bind-mounts the entire project directory at /project
docker run \
    --name build \
    --rm \
    -v `pwd`:/project \
    build

Every time you run the build container it will compile the files in /project/source/ and produce a new executable in /project/target/. Since /project is a bind-mount volume, the executable file is automatically available on the host machine in target/ - there's no need to explicitly copy it from the container.

Once the application files are on the host machine, it will be easy to copy them to the service image, since that is done using a regular COPY instruction.
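
To make that last step concrete, the service image's Dockerfile could look roughly like the sketch below. Dockerfile.service and the base image are placeholders, in the same pseudo-style as the build Dockerfile above:

# in Dockerfile.service

# take a minimal base image, suitable for running the application
FROM ...

# copy the executable that the build container wrote to the bind-mounted
# target/ directory on the host
COPY target/executable .

# run the application
CMD ["./executable"]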

The "multi-stage build"

A feature that has just been introduced to Docker is the "multi-stage build". It aims to solve the problem that, for the build process described above, you need two Dockerfiles plus a (Bash) script to coordinate the build and get the files where they need to be, with a short detour via the host filesystem.

With a multi-stage build (see Alex Ellis's introductory article on this feature), you can describe the build process in one file:

# in Dockerfile

# these are still the same instructions as before
FROM ...
RUN install build-tools
RUN mkdir /target
RUN build --from /source/ --to /target/executable

# another FROM; this defines the actual service container
FROM ...
COPY --from=0 /target/executable .
CMD ["./executable"]

There is only one image to be built. The resulting image will be the one defined last. It will contain the executable copied from the first, intermediate "build" image (which will be disposed of afterwards).

Note that this requires the source files to be inside the build context. Also note that the build image itself is not reusable; you can't run it again and again after you've made changes to the code; you have to build the image again. Since Docker will cache previously built image layers, this should still be fast, but it's something to be aware of.
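
As a side note, stages can also be given a name, so that the COPY instruction doesn't have to refer to the first stage by its index. A sketch, in the same pseudo-style as above:

# in Dockerfile

# name the first stage "builder"
FROM ... AS builder
RUN install build-tools
RUN mkdir /target
RUN build --from /source/ --to /target/executable

# the service stage copies from the stage by name
FROM ...
COPY --from=builder /target/executable .
CMD ["./executable"]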

Pipes & filters

I recently saw a question passing by on Twitter about how to get generated files out of a container. People suggested using bind-mount volumes, as described above. Nobody suggested docker cp. But the question prompted me to think of yet another solution: why not stream the file to stdout? It has several major advantages:

  1. The data doesn't have to end up in a file that would only be moved or deleted later anyway; it can stay in memory, which offers fast access.
  2. Using stdout allows you to send the output directly to some other process using the pipe operator (|). Other processes may modify the output and pass it along again, or store the final result in a file (inside the service image, for example).
  3. The exact location of files becomes irrelevant. There's no coupling through the filesystem if you only use stdin and stdout: the build container doesn't have to put its files in /target, and the build script doesn't have to look in /target either. They just pass along data (a minimal single-file sketch of this follows below).
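
Here's a minimal sketch of that idea for a single file, in the same spirit as the tar example below (the "executable" is just a placeholder):

# in Dockerfile.build
FROM ubuntu
RUN echo "I am an executable" > /executable

# write the file to stdout when a container is run from this image
ENTRYPOINT ["/bin/cat", "/executable"]

Build the image, then run it and redirect the container's stdout straight into a file on the host:

docker build -t build -f Dockerfile.build .
docker run --rm build > executable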

In case you want to stream multiple files between containers, I think good-old tar is a very good option.

Take the following Dockerfile for example: it creates an "executable" plus a supporting file and, when run as a container, wraps them in a tar archive which it streams to stdout:

FROM ubuntu
RUN mkdir /target
RUN echo "I am an executable" > /target/executable
RUN echo "I am a supporting file" > /target/supporting-file
ENTRYPOINT tar --create /target

To build this image, run:

docker build -t build -f docker/build/Dockerfile ./

Now run a container using the build image:

docker run --rm --name build build

The archive generated by tar will be sent to stdout. It can then be piped into another process, like tar itself, to extract the files again:

docker run --rm --name build build \
    | tar --extract --verbose

If you want another container to accept the archive, pipe it into that container's stdin (run the container with -i, so that its stdin stays open):

docker run --rm --name build build \
    | docker run -i [...]
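
What the receiving container does with the archive is up to you. As one illustration (the app-data volume and the /app directory are made-up names), it could simply unpack the archive into a named volume:

docker run --rm --name build build \
    | docker run -i --rm -v app-data:/app ubuntu \
        tar --extract --verbose --file - --directory /app

Since tar strips the leading / from member names, the files end up under /app/target/ inside the app-data volume.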

Conclusion

We discussed several patterns for building Docker images. I prefer separate Dockerfiles for the build and service images (instead of a single multi-stage Dockerfile). And as an alternative to writing files to a bind-mount volume, I really like the option of making the build container stream a tar archive to stdout.

I hope there was something useful for you in here. If you find anything that can be improved/added, please let me know!

Comments
Lucas Rangit Magasweran

The stdout pipe trick is great! Although I prefer not to put anything in the ENTRYPOINT or CMD directive of the Dockerfile. That way, you don't have half of the command in the Dockerfile and half in a script or some documentation:

docker run --rm -v `pwd`:/project --name build build tar --create /target | tar --extract --verbose

I think something like this would be more efficient:

FROM ubuntu
RUN mkdir /target
RUN echo "I am an executable" > /target/executable
RUN echo "I am a supporting file" > /target/supporting-file
RUN tar --create /target > /target.tar
ENTRYPOINT ["/bin/cat", "/target.tar"]

What do you think? This way you don't need to run tar every time you invoke docker run.

Hmm, but tar without compression is basically the same as cat, just with the ability to multiplex many files.

So this method may be useful if you build on a remote machine and just want the output.