The "builder pattern"
As a programmer you may know the Gang-of-Four Builder design pattern. The Docker builder pattern has nothing to do with that. The builder pattern describes the setup that many people use to build a container. It involves two Docker images:
- a "build" image with all the build tools installed, capable of creating production-ready application files.
- a "service" image capable of running the application.
Sharing files between containers
At some point the production-ready application files need to be copied from the build container to the host machine. There are several options for that.
1. Using docker cp

This is roughly what the Dockerfile for the build image looks like:

```dockerfile
# in Dockerfile.build

# take a useful base image
FROM ...

# install build tools
RUN install build-tools

# create a /target directory for the executable
RUN mkdir /target

# copy the source code from the build context to the working directory
COPY source/ .

# build the executable
RUN build --from source/ --to /target/executable
```
To build the executable, simply build the image:

```bash
# tag the image as "build" (-t), use Dockerfile.build (-f),
# and use the current directory as the build context
docker build -t build -f Dockerfile.build .
```
In order to be able to reach in and grab the executable, you should first create a container (not a running one) based on this image:

```bash
# create a container named "build", based on the "build" image
docker create --name build build
```
You can now copy the executable file to your host machine using:

```bash
docker cp build:/target/executable ./executable
```
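Putting the steps above together, the whole build-extract-clean-up cycle can be sketched as one short script. This is just a convenience wrapper around the commands already shown; the image and container names match the examples above, and the final `docker rm` disposes of the temporary container again:

```shell
#!/bin/sh
set -e

# build the image that compiles the executable
docker build -t build -f Dockerfile.build .

# create (but don't run) a container so we can reach into its filesystem
docker create --name build build

# copy the artifact out, then dispose of the temporary container
docker cp build:/target/executable ./executable
docker rm build
```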
2. Using bind-mount volumes
I don't think making the compile step part of the build process of the container is good design. I like container images to be reusable. In the previous example, when the source files are modified, you need to rebuild the build image itself. But I'd just like to run the same build image again.
This means that the compile step should instead be moved to the ENTRYPOINT instruction, and that the source/ files shouldn't be part of the build context, but mounted as a bind-mount volume inside the running container:
```dockerfile
# in Dockerfile.build
FROM ...
RUN install build-tools
ENTRYPOINT build --from /project/source/ --to /project/target/executable
```
This way, we first build the build image, then run it:

```bash
# same build process as before
docker build -t build -f Dockerfile.build .

# now we *run* the container: --rm removes it after running,
# and -v bind-mounts the entire project directory at /project
docker run --name build --rm -v `pwd`:/project build
```
Every time you run the build container it will compile the files in /project/source/ and produce a new executable in /project/target/. Since /project is a bind-mount volume, the executable file is automatically available on the host machine in target/ - there's no need to explicitly copy it from the container.
Once the application files are on the host machine, it's easy to copy them to the service image, since that is done using a regular COPY instruction.
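For completeness, a minimal sketch of what such a service Dockerfile could look like. The base image, paths, and file names here are assumptions for illustration; the only given is that the build run above left the executable in target/ on the host:

```dockerfile
# in Dockerfile (the service image)
FROM ubuntu

# copy the previously built executable from the host into the image
COPY target/executable /app/executable

CMD ["/app/executable"]
```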
The "multi-stage build"
A feature that has just been introduced to Docker is the "multi-stage build". It aims to solve the issue that, for the above build process, you need two Dockerfiles and a (Bash) script to coordinate the build process and get the files where they need to be, with a short detour via the host filesystem.

With a multi-stage build (see Alex Ellis's introductory article on this feature), you can describe the build process in one file:
```dockerfile
# in Dockerfile

# these are still the same instructions as before
FROM ...
RUN install build-tools
RUN mkdir /target
RUN build --from /source/ --to /target/executable

# another FROM; this defines the actual service image
FROM ...
COPY --from=0 /target/executable .
CMD ["./executable"]
```
There is only one image to be built. The resulting image will be the one defined last. It will contain the executable copied from the first, intermediate "build" image (which will be disposed of afterwards).
Note that this requires the source files to be inside the build context. Also note that the build image itself is not reusable; you can't run it again and again after you've made changes to the code; you have to build the image again. Since Docker will cache previously built image layers, this should still be fast, but it's something to be aware of.
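As a side note: instead of referring to the first stage by its index (--from=0), Docker also lets you name stages with FROM ... AS and reference them by name, which is less brittle if stages are later added or reordered. A sketch, with the placeholder base images replaced by ubuntu and a dummy "build" step, purely for illustration:

```dockerfile
# name the first stage "builder"
FROM ubuntu AS builder
RUN mkdir /target
RUN echo "I am an executable" > /target/executable

# copy from the named stage instead of --from=0
FROM ubuntu
COPY --from=builder /target/executable .
CMD ["./executable"]
```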
Pipes & filters
I recently saw this question passing by on Twitter:
> Learning Docker so dumb q probably - if a docker image generates binary output to a file, how do I copy to host?
>
> — Raymond Camden (@raymondcamden) April 1, 2017
People suggested using bind-mount volumes, as described above. Nobody suggested docker cp. But the question prompted me to think of another solution for getting generated files out of a container: why not stream the file to stdout? This has several major advantages:
- The data doesn't have to end up in a file anymore, only to be moved/deleted later anyway - it can stay in memory (which offers fast access).
- Using stdout allows you to send the output directly to some other process, using the pipe operator (|). Other processes may modify the output, then do the same thing, or store the final result in a file (inside the service image, for example).
- The exact location of files becomes irrelevant. There's no coupling through the filesystem if you only use stdin and stdout. The build container wouldn't have to put its files in /target, and the build script wouldn't have to look in /target either. They just pass along data.
In case you want to stream multiple files between containers, I think good old tar is a very good option. Take the following Dockerfile for example, which creates an "executable", then wraps it in a tar archive which it streams to stdout:

```dockerfile
FROM ubuntu
RUN mkdir /target
RUN echo "I am an executable" > /target/executable
RUN echo "I am a supporting file" > /target/supporting-file
ENTRYPOINT tar --create /target
```
To build this image, run:
```bash
docker build -t build -f docker/build/Dockerfile ./
```
Now run a container using the build image:

```bash
docker run --rm -v `pwd`:/project --name build build
```
The archive generated by tar will be sent to stdout. It can then be piped into another process, like tar itself, to extract the files again:
```bash
docker run --rm -v `pwd`:/project --name build build \
    | tar --extract --verbose
```
If you want another container to accept an archive, pipe it in through stdin (create the container in interactive mode):

```bash
docker run --rm -v `pwd`:/project --name build build \
    | docker run -i [...]
```
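As a hypothetical end-to-end example of the above (the plain ubuntu receiver and the omission of the volume mount are my own assumptions, not part of the original setup), the receiving container could simply be another tar that unpacks the archive inside its own filesystem:

```shell
# stream the tar archive out of the "build" container and unpack it
# inside a second, interactive (-i) container; tar reads from stdin
# and extracts by default into the receiving container's filesystem
docker run --rm build \
    | docker run -i --rm ubuntu tar --extract --verbose
```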
We discussed several patterns for building Docker images. I prefer separate build files (instead of a multi-stage build with one Dockerfile). And as an alternative to writing files to a bind-mount volume, I really like the option of making the build image stream a tar archive to stdout.
I hope there was something useful for you in here. If you find anything that can be improved/added, please let me know!