Docker build patterns

Posted by Matthias Noback

The "builder pattern"

As a programmer you may know the Gang-of-Four Builder design pattern. The Docker builder pattern has nothing to do with that. The builder pattern describes the setup that many people use to build a container. It involves two Docker images:

  1. a "build" image with all the build tools installed, capable of creating production-ready application files.
  2. a "service" image capable of running the application.

Sharing files between containers

At some point the production-ready application files need to be copied from the build container to the host machine. There are several options for that.

1. Using docker cp

This is what the Dockerfile for the build image roughly looks like:

# in Dockerfile.build

# take a useful base image
FROM ...

# install build tools
RUN install build-tools

# create a /target directory for the executable
RUN mkdir /target

# copy the source code from the build context into source/ in the working directory
COPY source/ source/

# build the executable
RUN build --from source/ --to /target/executable

To build the executables, simply build the image:

# -t build: tag the image as "build"
# -f Dockerfile.build: use Dockerfile.build
# . : use the current directory as build context
docker build \
    -t build \
    -f Dockerfile.build \
    .

In order to be able to reach in and grab the executable, you should first create a container (not a running one) based on the given image:

# --name build: name the container "build"
# build: use the "build" image
docker create \
    --name build \
    build

You can now copy the executable file to your host machine using docker cp:

docker cp build:/target/executable ./executable
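
The three steps above can be wrapped in a small shell function. This is a minimal sketch, assuming the image, container and file names used so far; the function name extract_executable is hypothetical, and the final docker rm is an addition that cleans up the stopped container so the name "build" can be reused on the next run:

```shell
#!/usr/bin/env bash

# Hypothetical helper: builds the "build" image, creates a container
# from it, copies the executable out, then removes the container.
extract_executable() {
    # tag the image as "build", using Dockerfile.build and the
    # current directory as build context
    docker build -t build -f Dockerfile.build . || return 1

    # create (but don't start) a container named "build"
    docker create --name build build || return 1

    # copy the executable from the container to the host
    docker cp build:/target/executable ./executable
    local status=$?

    # assumption: remove the container so the name can be reused
    docker rm build

    return "$status"
}
```

Calling extract_executable from the project directory should leave the executable next to Dockerfile.build.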

2. Using bind-mount volumes

I don't think making the compile step part of the build process of the container is good design. I like container images to be reusable. In the previous example, when the source files are modified, you need to rebuild the build image itself. But I'd just like to run the same build image again.

This means that the compile step should instead be moved to the ENTRYPOINT or CMD instruction, and that the source/ files shouldn't be part of the build context, but bind-mounted into the running build container:

# in Dockerfile.build
FROM ...
RUN install build-tools

ENTRYPOINT ["build", "--from", "/project/source/", "--to", "/project/target/executable"]

This way, we should first build the build image, then run it:

# same build process
docker build \
    -t build \
    -f Dockerfile.build \
    .

# now we *run* the container:
# --rm removes the container after running it
# -v bind-mounts the entire project directory at /project
docker run \
    --name build \
    --rm \
    -v "$(pwd)":/project \
    build

Every time you run the build container it will compile the files in /project/source/ and produce a new executable in /project/target/. Since /project is a bind-mount volume, the executable file is automatically available on the host machine in target/ - there's no need to explicitly copy it from the container.

Once the application files are on the host machine, it will be easy to copy them to the service image, since that is done using a regular COPY instruction.
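
Both images can then be built from one small script. The following is a sketch under assumptions: Dockerfile.service and the "service" tag are hypothetical names, and the service Dockerfile is assumed to contain a COPY instruction that picks up target/executable from the build context:

```shell
#!/usr/bin/env bash

# Hypothetical coordination function for the two-image setup.
build_service() {
    # build the reusable build image
    docker build -t build -f Dockerfile.build . || return 1

    # run it: compiles source/ and writes target/executable on the host
    docker run --rm -v "$(pwd)":/project build || return 1

    # build the service image; Dockerfile.service (assumed name) is
    # expected to COPY target/executable into the image
    docker build -t service -f Dockerfile.service .
}
```

After a change to the source files, only the second and third step do real work, since the build image itself is unchanged and comes from Docker's layer cache.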

The "multi-stage build"

A feature that has just been introduced to Docker is the "multi-stage build". The build process described above needs two Dockerfiles, plus a (Bash) script that coordinates the build and gets the files where they need to be, with a short detour via the host filesystem. The multi-stage build aims to remove that overhead.

With a multi-stage build (see Alex Ellis's introductory article on this feature), you can describe the build process in one file:

# in Dockerfile

# these are still the same instructions as before
FROM ...
RUN install build-tools
RUN mkdir /target
# the source files have to be copied in from the build context
COPY source/ /source/
RUN build --from /source/ --to /target/executable

# another FROM; this defines the actual service container
FROM ...
COPY --from=0 /target/executable .
CMD ["./executable"]

There is only one image to be built. The resulting image will be the one defined last. It will contain the executable copied from the first, intermediate "build" image (which will be discarded afterwards).

Note that this requires the source files to be inside the build context. Also note that the build image itself is not reusable; you can't run it again and again after you've made changes to the code; you have to build the image again. Since Docker will cache previously built image layers, this should still be fast, but it's something to be aware of.
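
If you do want to get at the intermediate build image, Docker lets you name a stage and build only up to that stage. A minimal sketch, assuming the first FROM line in the Dockerfile above is written as "FROM ... AS build"; the stage name "build", the function names and the "service" tag are choices made for this illustration only:

```shell
#!/usr/bin/env bash

# Build only the stage named "build" and tag the result, so it can be
# inspected or used with docker create / docker cp as before.
build_stage_only() {
    docker build --target build -t build .
}

# Build the complete multi-stage Dockerfile; the final stage becomes
# the image tagged "service" (hypothetical tag).
build_final_image() {
    docker build -t service .
}
```

With a named stage, the COPY instruction in the second stage can also refer to it by name (COPY --from=build) instead of by index (--from=0).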

Pipes & filters

I recently saw this question passing by on Twitter:

People suggested using bind-mount volumes, as described above. Nobody suggested docker cp. But the question prompted me to think of another solution for getting generated files out of a container: why not stream the file to stdout? It has several major advantages:

  1. The data doesn't have to end up in a file anymore, only later to be moved/deleted anyway - it can stay in memory (which offers fast access).
  2. Using stdout allows you to send the output directly to some other process, using the pipe operator (|). Other processes may modify the output, then do the same thing, or store the final result in a file (inside the service image for example).
  3. The exact location of files becomes irrelevant. There's no coupling through the filesystem if you only use stdin and stdout. The build container wouldn't have to put its files in /target, and the build script wouldn't have to look in /target either. They just pass along data.

In case you want to stream multiple files between containers, I think good-old tar is a very good option.

Take the following Dockerfile for example, which creates an "executable", then wraps it in a tar file which it streams to stdout:

FROM ubuntu
RUN mkdir /target
RUN echo "I am an executable" > /target/executable
RUN echo "I am a supporting file" > /target/supporting-file
ENTRYPOINT ["tar", "--create", "--file", "-", "/target"]

To build this image, run:

docker build -t build -f docker/build/Dockerfile ./

Now run a container using the build image:

docker run --rm -v `pwd`:/project --name build build

The archive generated by tar will be sent to stdout. It can then be piped into another process, like tar itself, to extract the files again:

docker run --rm -v `pwd`:/project --name build build \
    | tar --extract --verbose

If you want another container to accept an archive, pipe it in through stdin (create the container in interactive mode):

docker run --rm -v `pwd`:/project --name build build \
    | docker run -i [...]
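
The piping itself can be tried without Docker. A minimal sketch, using two temporary directories as stand-ins for the build container's /target and for the receiving side (the variable names src and dst are made up for this illustration); passing "--file -" makes it explicit that tar should write to stdout and read from stdin, instead of relying on the default of the tar implementation at hand:

```shell
#!/usr/bin/env bash

# stand-ins for the producer's output directory and the consumer's
# destination directory
src="$(mktemp -d)"
dst="$(mktemp -d)"

echo "I am an executable" > "$src/executable"
echo "I am a supporting file" > "$src/supporting-file"

# producer: write the archive to stdout; consumer: read it from stdin
tar --create --file - -C "$src" . \
    | tar --extract --file - -C "$dst"

# both files have arrived without the two sides agreeing on a path
ls "$dst"
```

Neither side knows where the other keeps its files; the only thing they share is the archive format on the pipe.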

Conclusion

We discussed several patterns for building Docker images. I prefer separate build files (instead of a multi-stage build with one Dockerfile). And as an alternative to writing files to a bind-mount volume, I really like the option to make the build image stream a tar archive.

I hope there was something useful for you in here. If you find anything that can be improved/added, please let me know!


Making money with open source, etc.

Posted by Matthias Noback

So, here's a bit of a personal blog post for once.

Symfony trademark policy

I saw this tweet:

Meaning: "Following a formal notice, I removed the tutorials that are related to Symfony from the Site. There will be no future videos on the Framework."

I got mad. I had heard of other people having trouble with Sensiolabs' trademark policy and I once had difficulty getting permission to use "Symfony" in my "Hexagonal Symfony training tour" (which became the "Hexagonal Architecture training tour"). So I tweeted about it:

I thought it would be good to speak up. It's a really scary thing to do anyway. It's not in my nature to voice criticism publicly like this. I know that doing so will make things harder, and close certain doors.

Regret

However, not long after the initial tweet by @grafikart_fr, Fabien Potencier (lead maintainer of Symfony) tweeted:

And also:

This made me feel really bad about myself. I don't want to play a part in someone's sad day. Of course.

As you know I’ve always been a big fan of Symfony and Fabien's work on it. I have learned a tremendous amount of things from using it, documenting it and creating open source tools for it. So: dear Fabien, I'm sorry for my act of public shaming. It isn't good style.

Here's the lesson I'm trying hard to learn in my life: voicing criticism isn't equal to saying that everything you are, do or create sucks. In fact, my girlfriend warns me about this when she gives me feedback and I "cringe": "I'm not saying that your personality sucks, I love you, don't forget that." I think that this somehow applies in this situation too. Fabien, please don't feel rejected.

Earning money with open source

An interesting perspective on this topic was brought up by Jordi Boggiano:

It was interesting because I didn't consider it to be relevant at first. It has never occurred to me that Fabien didn't earn enough money to justify all his open source work. Well, maybe he has, but that doesn't matter. Earning money with open source work is rather hard to accomplish. One example of this is Jordi's own experience with maintaining Composer. And I can very much relate to this too. Many people, including rich people, and many companies, including very successful ones, assume they can demand a lot of free work from open source maintainers. Starting to ask money for essentially the same thing is often met with outrage (or silent neglect), so it's really hard to run an "open source" business.

For the past few months I've been trying to pick up what you could call my "master plan". I hope to organize workshops and do freelance programming jobs, thereby financing all the unpaid "open source" work that I love to do, like blogging, speaking at meetups and conferences, and open source programming. I've learned two things already:

  1. You can't take it for granted that people will pay you for work you've always done for free (like book writing, or organizing workshops).
  2. You need to sell yourself and your products. Marketing is very important.

Marketing is also very annoying, and personally I'm just way too modest to keep throwing commercial messages at my audience. It's quite a personal victory to even show a banner for my new book on my blog. I know though, that in order to do all this free work, I need to sell stuff too, so I hereby assume that you'll understand that I'll do a little bit more work in the future to properly market my products.

Conclusion

What I'm saying is that I totally get that being lead maintainer of an open source framework (or package manager, etc.), and trying to compensate somewhat for everything you've given away for free, can be very difficult. And I totally understand that the Symfony trademark needs to be protected too. The Symfony brand has been carefully cultivated. I just hope that Sensiolabs will be more careful about dismissing some - I think - proper uses of the Symfony trademark and logo. I assume they will, based on what Fabien already tweeted tonight:


Microservices for everyone - The introduction

Posted by Matthias Noback

I'm happy to share with you the first chapter of my new book, Microservices for everyone. I'm still in the process of writing it and intend to release parts of it during the next weeks. If you're interested, sign up on the book's landing page and receive a 25% discount when the first chapter gets released.

If you're curious about the table of contents so far: you'll find it in the PDF version of the introduction.


Introduction

Scepticism

I can almost hear you think: "Bah, microservices. Nothing good could come from that!"...

We replaced our monolith with micro services so that every outage could be more like a murder mystery.

— Honest Status Page @honest_update

I am absolutely certain that this is a solvable problem, but nonetheless, it may scare you away from considering a microservice architecture as a viable choice for your company. Especially since you already receive reminders of what a bad choice that may be on a daily basis (at least if you're on Twitter):

If one piece of your web of microservices suffers an outage and your whole system crashes and burns, then you have a distributed monolith.

Matt Jordan

Which makes me wonder: how does this differ from when we have a single application? If something goes wrong in a monolith we usually throw an exception, and let it crash the application, right? Is it even possible to make our system as resilient as gets depicted here? When the disaster is too big, there may be nothing we can do to recover from it. Still, I'm certain that with a few simple implementation patterns we can make our microservice system much more resilient than any monolith we have encountered so far.

If your microservices must be deployed as a complete set in a specific order, please put them back in a monolith and save yourself some pain.

Matt Stine

This sounds like good advice though. One of the main design goals of microservices is that services can be created and removed on-the-fly and other services shouldn't produce any failures because of that. If this is not the case, indeed we should get back to our monolith. But not too fast! There are some good solutions available.

Microservices without asynchronous communication are as good as writing monolith app.

Ajey Gore

Asynchronous, event-driven communication is one way to approach the dependency problem. But it is not the one and only solution. In fact, as we will see later on, synchronous communication is still a viable solution. It needs a bit of extra work though. And as soon as you find out how to solve things in an asynchronous fashion, you'll be looking for other places where you can switch from synchronous to asynchronous communication.

I'm sure there are plenty of teams that have decided to make their next project a microservices project, which took a lot of research and a lot of work and the project may have ended up in quite a bad shape after all. There are many reasons for this. Probably most of those reasons are the same as for any other kind of software project: the regular problems related to estimations, deadlines, budgets, etc. Or, as often happens, developers were eager to try something new, to escape from the suffocating work on the "legacy system", seeking their salvation in a microservices architecture. Or, they were able to run their services on their own machine, but had trouble getting the whole system up and running, monitored and all, in an actual production environment.

Optimism

While still surrounded by microservice negativism, the tech community has in fact been floating on a wave of microservices hype. Trusting my built-in "tech radar" and "mood calculator" though, it feels like we're almost past this hype. If I look around me, we're more in the assess phase: "This could be something for us, let's find out." And I agree, it's time to prove that this can work. It's my current belief that we need the following ingredients for that:

  • We need to put a lot of focus on our domain and create (but also continuously refine) a suitable model for it. In order to do so, we need to apply Domain-Driven Design (DDD), take a sincere interest in the business domain and find out how we can contribute by creating software.
  • We need to consciously and continuously look for ways to refine our service boundaries and how services are connected.
  • We need to develop some organizational awareness, and look for bottlenecks in the way teams are structured and how they communicate.
  • We need to adopt a "devops" mindset, since we need to be able to set up and manage the infrastructure that runs our services.

That's a nice little list, but it might represent a lot of work for you. Not necessarily programming work, but learning work. And this is generally the hardest kind of work. As Alberto Brandolini puts it: "Learning is the bottleneck". This quote itself is derived from Dan North's article on Deliberate Discovery, in which he says that it's not learning but "ignorance" which is the "single greatest impediment to throughput". Looking at the list of ingredients above, you may well find that you're not quite ready to start your microservice journey, or maybe you are, but your team is not. You may not have much experience with DDD, you may not be concerned with organizational structures, and you may not like fixing things on a server. And above all, you may feel that you don't have the time to learn it all.

My first comforting message to you is: you are not alone. Looking at online lists of resources on various topics that might interest programmers, it becomes apparent that the target audience is expected to only take an interest in programming languages, programming techniques, OOP principles, patterns, frameworks and libraries. Take a look at lists of resources like Java Annotated Monthly from IntelliJ, or in my own community, PlanetPHP or PHPDeveloper.org, and you'll notice that almost nobody seems to concern themselves with Domain-Driven Design or devops.

My second comforting message to you is: it's not too late to catch up. In fact, at this very moment it's easier than ever before. All over Europe local communities are gathering in meetup groups about Domain-Driven Design and devops. It's not just local meetups, there are international conferences on these topics too, like DDD Europe, DDDx, DockerCon, etc. And besides a large number of learning initiatives, we now have a lot of powerful yet easy-to-use tools available. We can create stand-alone deployable artifacts for our software using Docker, and deploy them using Docker Swarm, or Kubernetes, or integrated, even more user-friendly solutions on top of these container orchestration tools.

I'm confident that learning about microservices will pay off. It should shake out many issues that had so far been hidden from sight, swept under the carpet, or worked around for ages. Think of issues like:

  1. Spaghetti code; everything knows about everything and can use any function or piece of data available in the entire system.
  2. Single-person, delayed deployments; only one person in the organization knows how to deploy the application, and does so only every week, month or quarter.
  3. Teams breaking the applications of other teams; they have trouble integrating their applications, which almost never succeeds in one go. Hence, they fear releasing their software (or only dare to do it in a coordinated fashion, late at night).
  4. Teams not being able to decide upon the best course of action, hence doing a lot of rework, or delivering sub-optimal solutions.
  5. Vendor lock-in; hosting providers that offer only a certain set of services (e.g. Nginx, MySQL, Memcache; while you would like to use Apache, Cassandra and Redis).
  6. Scaling issues; in order to accommodate higher demand, you've only focused on performance optimizations in the request-response flow, applying patches everywhere, caching results, etc. There is a limit to what your current vertically scaled setup can handle, but you don't know how to prepare for the next step.

Adopting microservices is going to make all these issues clear, out in the open and ready to get fixed:

  1. You can and need to start isolating data, and related behaviors.
  2. You and everyone on your team will be able to release their software and, if you want, even deploy it to production servers.
  3. You will be forced to explicitly define contracts for each service: how can other services communicate with it, which events does this service expose, etc. You have to think harder about explicit interfaces and focus on use cases first.
  4. Because the size of each service is relatively small, most of your design decisions only reach as far as the boundaries of the service itself. This means you can try radically new approaches and fall back on more traditional solutions if it doesn't work out as expected.
  5. Working with microservices allows you to try different types of databases and different technologies in general. This offers even more opportunities to experiment.
  6. With microservices, scaling gets another dimension. Instead of looking for bigger, stronger machines, you can now invest in more, yet simpler machines. Resource usage will be distributed more evenly, in particular when you start using asynchronous communication.

Whether or not you are actually going to create microservices, the things you're going to learn about modularization, team work, domain modelling, and operations are useful either way.

Why I have to write this book

I've been developing web applications since 2003. At the risk of sounding like an old man: I've seen many things come and go. A couple of years ago I realized that my work as a software developer has become much more interesting than it was before. My activities started to stretch further than my keyboard could reach. With the advent of Domain-Driven Design and the Docker ecosystem, I feel more empowered to deliver useful software than I ever did before.

I feel that I'm the right person to write this book, as I enjoy writing, but I also enjoy reading. I've read a lot about microservices and adjacent topics, like DDD, continuous delivery, messaging integration, Docker, etc. It's crucial to note though that so far I have not had the opportunity to work on a large microservice system. So this book won't contain wild stories from the trenches. This is a book about the technologies involved in a microservices architecture, focusing mostly on the software development involved, and how you can make the best design decisions.

This book is not so much about showing you in overly enthusiastic words that you're crazy if you don't do microservices. It's about my hypothesis that over the past few years the tech community has been working their way towards the peak of Microservice Impediments. I want to prove that we are at a point where the overhead of implementing a microservice architecture starts being smaller and smaller, and is currently at least small enough to justify it, even for smaller teams. You don't have to be Netflix or Amazon to benefit from building your software as microservices.

We've reached the peak of *Microservice Impediments*

Design guidelines for this book

Since writing a book is a daunting task which can quickly get out of hand, I find that I'm better off with an explicit list of guidelines. This should help me decide on a case-by-case basis if I should write more, or if I should stop writing.

Since I want to combine insights from Domain-Driven Design (DDD) with practices of Continuous Delivery we're going to use Docker to create containers running single services. Each service encapsulates part of the overall domain model, so they are bounded contexts. This isn't a book about Docker though, nor about DDD, so I won't explain everything to you. Instead, I'll give you:

Short summaries, a little bit of background, quotes, and pointers.

Microservices come with an entire ecosystem. It would be impossible to provide you with the best solutions, the ultimate or ideal ones. In fact, I couldn't do that, because each situation requires a different solution, to be determined based on the particular context. In this book I want to provide simple solutions (to prove that you don't really need to put a lot of work in it to arrive at a minimum viable solution). So:

A few lines of code should be enough to show that it can work.

Like every tech book, this book will show you the happy path. Since many people have been talking quite negatively about microservices, warning about their dangers—and rightly so—it would be unfair to ignore the problems. So:

For each overly simplistic solution, add a list of things to consider once you really start implementing microservices.

I'd like to make the code examples in this book as general as possible, in order to be read and understood by people who are familiar with any programming language. Like in my previous books:

Code samples will be written in PHP.

If you know Java: PHP is much like Java anyway, just ignore the dollar signs. The main reason to choose PHP is that it's my "native" language. However, an important second reason is that the PHP community needs to be shown that, even though they work with a language that may not be so well-designed, they can still do great things with it.

Rigor?

If you have read my previous book, Principles of Package Design, you know that I'm generally a man of rigor. I want things to be exactly right. Given a technical subject, I want to know what I'm talking about, so I'll investigate it until I'm sure that: 1. I don't say anything about it that's incorrect, and 2. I won't give anyone bad advice about it. So far, this approach has worked out well. I'm not a troubled perfectionist. I just know that the only way to go fast is to go well. However, I must admit that sometimes I get lost in a subject. In particular in the areas in which I'm less well versed, like operations. I'm learning my way in infrastructure-land, but the situation for me is sub-optimal at the moment and everything is still a time-sink, like programming was when I first started with it. In fact, I know that I do not know. The danger is, of course, that what I do or preach in this area is not the best thing one could do or preach. I've decided to let go of the feeling of uneasiness coming with that and to rely more on the feedback that will automatically fly back to me, the moment I publish something. I'm slowly, but surely, accepting that I don't have to know everything about everything and that help will always be given to those who ask for it.

Ethics

If I ever want to finish this book, I can't make it complete, rigorous, nor entirely correct. I'll have to cut some corners. In order to keep things moral though, for the both of us, I have to define my own ethics first. I like how Nassim Nicholas Taleb explicitly defines his own ethos in the introduction of his pretty heavy book Antifragile—things that gain from disorder. I never finished that one, but nevertheless got some interesting ideas from it (this is the least a book should offer to me, if it stops doing that, I'll put it away).

Nassim writes that "If the subject is not interesting enough for me to look it up independently, for my own curiosity or purposes, and I have not done so before, then I should not be writing about it at all, period." Of course, external sources are not banned, nor deemed worthless. But he doesn't want his writing to be directed by these. "Only distilled ideas, ones that sit in us for a long time, are acceptable—and those that come from reality." This is something I'd like to do myself too. I've read many books on software development and have developed lots of software, and would like to speak both from experience and existing knowledge about ideas that have been boiling for quite some time now.

In order to keep myself high-spirited, I'll use my internal compass, which revolves around procrastination. I know it's a pretty negative concept and I'm sure you all have some experience with it. There are times when I think I have to do activity A, while I'm craving to do activity B and often just start doing B anyway. As long as the lives around me don't completely derail, there are many good aspects about doing B, while not doing A. Again, Nassim has some interesting words about it:

A very intelligent group of revolutionary fellows in the United Kingdom created a political movement […] based on opportunistically delaying the revolution. […] In retrospect, it turned out to be a very effective strategy, not so much as a way to achieve their objectives, but rather to accommodate the fact that these objectives are moving targets. Procrastination turned out to be a way to let events take their course and give the activists the chance to change their minds before committing to irreversible policies.

I often find that not doing A will let me discover things about A that were wrong about it. Sometimes A isn't the best thing you can do to achieve a certain goal. There may be more effective ways. Maybe A is not helpful at all, or even harmful. Besides, B is nicer to do, more energizing at this moment. And it might put you on a trail to some place else, or provide you with a more interesting and compelling journey.

This is why I'll make writing as fun and interesting as possible. I'll likely come up with exotic topics, interesting implementation discussions, fun little open-source libraries, and if I notice my attention drifting away, or if I find myself bored by the writing itself, I'll focus on some other topic, trusting that you would otherwise get bored too.

Overview of the contents

The order in which I'm going to discuss all relevant topics is not certain yet, but this is a list of some of the keywords for this book:

Exploring the domain, event storming, bounded contexts, services, Docker, continuous delivery, CRUD, CQRS, event sourcing, shared database, synchronous HTTP calls, asynchronous messaging, ports & adapters, user interface, storage, serialization.

To be continued

For now, this is all that's available of Microservices for everyone. If you like it, please let me know.
