How to make Sculpin skip certain sources

Posted by Matthias Noback

Whenever I run the Sculpin generate command to generate a new version of the static website that is this blog, I notice that a lot of useless files get copied from the project's source/ directory to the project's output/ directory. All the files in the output/ directory will eventually get copied into a Docker image based on nginx (see also my blog series on Containerizing a static website with Docker). And since I'm on hotel wifi right now, I figured this was the time to shave any unnecessary weight off this Docker image.

My biggest mistake was not googling for the quickest way to prevent certain sources from being copied to output/. Instead, I set out to hook into Sculpin's event system. I thought it would be a good idea to create an event subscriber and subscribe it to the Sculpin::EVENT_BEFORE_RUN event. Subscribers to this event receive a so-called SourceSetEvent, which allows them to mark certain sources as "should be skipped".

Sculpin is built on many Symfony components and it turned out to be quite easy to set up a traditional event subscriber, which I called SkipSources:

namespace SculpinTools;

use Sculpin\Core\Event\SourceSetEvent;
use Sculpin\Core\Sculpin;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

final class SkipSources implements EventSubscriberInterface
{
    /**
     * @var string[]
     */
    private $patterns = [];

    public function __construct(array $patterns)
    {
        $this->patterns = $patterns;
    }

    public function skipSourcesMatchingPattern(SourceSetEvent $event): void
    {
        // see below
    }

    public static function getSubscribedEvents(): array
    {
        return [
            Sculpin::EVENT_BEFORE_RUN => ['skipSourcesMatchingPattern']
        ];
    }
}

You can create your own Symfony-style bundles for a Sculpin project, but in this case defining a simple service in sculpin_kernel.yml seemed like a fine option to me too:

# in app/config/sculpin_kernel.yml

services:
    skip_sources:
        class: SculpinTools\SkipSources
        arguments:
            # more about this below 
            - ["components/*", "_css/*", "_js/*"]
        tags:
            - { name: kernel.event_subscriber }

Due to the presence of the kernel.event_subscriber tag, Symfony makes sure to register this service as an event subscriber for the events returned by its getSubscribedEvents() method.

Looking for a way to use glob-like patterns to filter out certain sources, I stumbled upon PHP's fnmatch() function. With that in place, the code for the skipSourcesMatchingPattern() method ended up being quite simple:

foreach ($event->allSources() as $source) {
    foreach ($this->patterns as $pattern) {
        if (fnmatch($pattern, $source->relativePathname())) {
            $source->setShouldBeSkipped();
        }
    }
}

It matches each source against each of the patterns, based on the source's relative pathname, since nothing outside of the source/ directory is relevant anyway. The patterns themselves are passed in as the event subscriber's first constructor argument: simply a list of glob-like string patterns.
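To get a feel for how fnmatch() treats these glob-like patterns, here is a small stand-alone sketch. The pathnames are made up for illustration; note that by default fnmatch()'s * also matches directory separators:

```php
<?php

// The same matching loop, applied to some hypothetical relative pathnames
$patterns = ['components/*', '_css/*', '_js/*'];
$paths = [
    'components/jquery/jquery.min.js',
    '_css/main.scss',
    '_posts/2017-06-27-skip-sources.md',
];

foreach ($paths as $path) {
    $skipped = false;
    foreach ($patterns as $pattern) {
        if (fnmatch($pattern, $path)) {
            $skipped = true;
            break;
        }
    }
    echo $path, ' => ', $skipped ? 'skipped' : 'kept', "\n";
}
// prints:
// components/jquery/jquery.min.js => skipped
// _css/main.scss => skipped
// _posts/2017-06-27-skip-sources.md => kept
```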

My solution turned out to be quite an effective way to mark certain files as "should be skipped", which was my goal.

La grande finale

Just like in my previous blog post, I eventually ran into another possible solution, one that's actually built into Sculpin: a simple ignore configuration key that allows you to ignore certain sources using glob-like patterns. It uses a rather elaborate pattern matching utility based on code from Ant. I'm not sure whether that library and fnmatch() have "feature parity", though.
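For reference, a sketch of what that built-in configuration could look like, using the same patterns as before (the exact key and file are best checked against Sculpin's documentation):

```yaml
# in app/config/sculpin_kernel.yml
sculpin:
    ignore:
        - "components/*"
        - "_css/*"
        - "_js/*"
```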

Turns out, all my extra work wasn't required after all. A simple Google search would have sufficed!

So I removed all of this code and configuration from my project. But I still wanted to share my journey with you. And who knows, it could just be useful to have an example lying around of how to register an event subscriber and hook into Sculpin's build lifecycle...

PHP Sculpin

Making a Docker image ready for use with Swarm Secrets

Posted by Matthias Noback

Here's what I wanted to do:

  • Run the official redis image as a service in a cluster set up with Docker Swarm.
  • Configure a password to be used by connecting clients. See also Redis's AUTH command. The relevant command-line option when starting the redis server would be --requirepass.

This is just a quick post, sharing what I've figured out while trying to accomplish all of this. I hope it's useful in case you're looking for a way to make a container image (an official one or your own) ready to be used with Docker Secrets.

I started out with this docker-compose.yml configuration file, which I provided as an option when running docker stack deploy:

version: '3.1'

services:
    redis:
        image: redis
        command: redis-server --requirepass $(cat /run/secrets/db_password)
        secrets:
            - db_password

secrets:
    db_password:
        file: db_password.txt

This configuration defines the db_password secret, the (plain text) contents of which should be read from the db_password.txt file on the host machine. The (encrypted) secret will be stored inside the cluster. When the redis service gets launched on any node in the cluster, Docker shares the (decrypted) secret with the container, by means of mounting it as a file (i.e. /run/secrets/db_password) inside that container.

The naive solution above looked simple and I thought that it might just work. However, I got this error message:

Invalid interpolation format for "command" option in service "redis": "redis-server --requirepass $(cat /run/secrets/db_password)"

Docker Compose does variable substitution on commands and thinks that $(...) is invalid syntax (it's expecting ${...}). I escaped the '$' by adding another '$' in front of it: redis-server --requirepass $$(cat /run/secrets/db_password). New errors:

Reading the configuration file, at line 2
>>> 'requirepass "$(cat" "/run/secrets/db_password)"'
Bad directive or wrong number of arguments

Bad stuff. I thought I'd just have to wrap the value in quotes: redis-server --requirepass "$(cat /run/secrets/db_password)". Now everything seemed to be fine and the Redis service was up and running, except that the password wasn't set to the contents of the db_password secret. Instead, when I tried to connect to the Redis server, the password turned out to be the literal string "$(cat /run/secrets/db_password)"...

At this point I decided: let's not try to make this thing work from inside the docker-compose.yml file. Instead, let's define our own ENTRYPOINT script for a Docker image that is built on top of the existing official redis image. In this script we can simply read the contents of the db_password file and use it to build up the command.

The Dockerfile would look something like this:

FROM redis:3.2.9-alpine
COPY override_entrypoint.sh /usr/local/bin/
ENTRYPOINT ["override_entrypoint.sh"]

And the override_entrypoint.sh script mentioned in it could be something like this:

#!/usr/bin/env sh
set -eux

# REDIS_PASSWORD_FILE is expected to be set in the container's environment,
# e.g. to /run/secrets/db_password

# Read the password from the password file
PASSWORD="$(cat "${REDIS_PASSWORD_FILE}")"

# Forward to the entrypoint script from the official redis image
exec docker-entrypoint.sh redis-server --requirepass "${PASSWORD}"

After building, tagging, and pushing the image, and using it in my docker-compose.yml file, I could finally make this work.

I was almost about to conclude that it would be smart not to try and fix everything in docker-compose.yml, and simply define a new image that solves my use case perfectly. However, the advantage of being able to pull in an image as-is is quite big: I don't have to rebuild my images whenever a new official image is released. This means I won't have to keep up with changes that might break my own modifications in unexpected ways. Also, by adding my own entrypoint script, I'm bypassing some of the logic in the existing entrypoint script. For example, with my new script it's impossible to run the Redis CLI.

La grande finale

Then I came across another example of running a command, and I realized that maybe I had been using the wrong syntax for my command. After all, there already appeared to be some kind of problem with the command being chopped up and escaped in unexpected ways. So I tried the alternative, array-based syntax for commands:

command: ["redis-server", "--requirepass", "$$(cat /run/secrets/db_password)"]

No luck; again the problem was that the password was taken literally instead of being evaluated. Then I remembered there was an option to provide a shell command as an argument (using sh -c), to be evaluated just like a string you'd pass to eval(). This turned out to be the final solution:

command: ["sh", "-c", "redis-server --requirepass \"$$(cat /run/secrets/db_password)\""]
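The reason this works can be demonstrated outside of Docker (the file path below is made up): $(...) command substitution is a shell feature, so something actually has to run a shell for it to be evaluated, and sh -c provides exactly that:

```shell
# Simulate the secret file (hypothetical path)
echo 's3cret' > /tmp/db_password

# Without a shell in between, the $(...) would reach redis-server as a
# literal string; `sh -c` evaluates it first:
sh -c 'echo redis-server --requirepass "$(cat /tmp/db_password)"'
# prints: redis-server --requirepass s3cret
```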

I hope this saves you some time, some day.

Docker Docker Swarm

The case for singleton objects, façades, and helper functions

Posted by Matthias Noback

Last year I took several Scala courses on Coursera. It was an interesting experience and it has brought me a lot of new ideas. One of these is the idea of a singleton object (as opposed to a class). It has the following characteristics:

  • There is only one instance of it (hence it's called a "singleton", but isn't an implementation of the Singleton design pattern).
  • It doesn't need to be explicitly instantiated (it doesn't have the traditional static getInstance() method). In fact, an instance already exists when you first want to use it.
  • There doesn't have to be built-in protection against multiple instantiations (as there is and can only be one instance, by definition).

Converting this notion to PHP is impossible, but if it were possible, you could do something like this:

namespace {
    class Foo
    {
        public function bar(): void
        {
            // ...
        }
    }

    object Foo
    {
        public function baz(): void
        {
            // ...
        }
    }
}

Since object Foo would already be an object, its methods wouldn't have to be marked static in order to be used anywhere.

// Use a local instance of class Foo:
$foo = new Foo();
$foo->bar();

// Use the globally available singleton object (requires no instantiation):
Foo->baz();

Inside object Foo you can also use $this if you like:

object Foo
{
    private $dependency;

    public function baz(): void
    {
        $this->dependency->doSomething();
    }
}

Of course such a collaborating object needs to be instantiated, for which we could define a parameterless constructor (in Scala this wouldn't require a constructor method; the initialization could take place where you define the class properties themselves). We could simulate the fact that it's impossible to instantiate object Foo yourself by making its constructor private:

object Foo
{
    private $dependency;

    private function __construct()
    {
        $this->dependency = // new ...();
    }

    public function baz(): void
    {
        $this->dependency->doSomething();
    }
}

An interesting characteristic of singleton objects is that they can act as companion objects, somewhat like "friend" classes in other languages. If you define both a regular class and a singleton object with the same name, they have access to each other's properties and methods, even the private ones:

class Foo
{
    public function bar(): void
    {
        Foo->privateHelperMethod();
    }
}

object Foo
{
    private function privateHelperMethod(): void
    {
        // ...
    }
}

The case for singleton objects

Regular classes offer the advantage of using dependency injection (DI) to change an object's behavior without changing its code (which is what we mean by the Open/Closed Principle). As I think this setup is always preferable to hard-coded dependencies, I've run into many cases where I wanted to offer both options, in order to improve the user experience.

For example, a couple of months ago I created a JSON serializer, based on some very simple design principles and assumptions (which would deserve a blog post of their own, I think). What's important in the context of this article is that users of this library should be able to serialize or deserialize objects with one simple function call. It should work without any configuration, as long as their objects are in line with the assumptions of the library itself. Something like this:

$serializedObject = JsonSerializer::serialize($object);

$deserializedObject = JsonSerializer::deserialize(get_class($object), $serializedObject);

Very soon though, the main class started to require some dependencies (which could be injected as constructor arguments). This completely messed up the user experience: instead of using public static methods, the client would now have to instantiate a JsonSerializer and call a public (instance) method on it:

$serializer = new JsonSerializer(
    new DefaultTypeResolver(),
    // ...
);
$serializedObject = $serializer->serialize($object);

It would be a shame to create more than one instance of the JsonSerializer, as it would not be possible to reuse dependencies or configuration. Soon we'd be back at remembering why we started using dependency injection in the first place. The client would be better off getting the JsonSerializer injected as a constructor argument: pre-configured, ready to use.

However, some solutions don't call for the overhead that comes with dependency injection, in particular the educational code I often write these days, which is usually trying to prove entirely different points than "use dependency injection".

To allow two different usages, one with static methods (to allow global usage), one with instance methods (to allow local, as well as potentially customized, usage), you could go for a setup like this:

class JsonSerializer
{
    private $typeResolver;

    private function __construct()
    {
        $this->typeResolver = new DefaultTypeResolver();

        // set up other dependencies
    }

    public static function serialize($object): string
    {
        return (new self())->doSerialize($object);
    }

    private function doSerialize($object): string
    {
        // do the real work
    }
}

// Usage:
JsonSerializer::serialize($object);

This doesn't reuse dependencies and creates new instances of JsonSerializer all the time. It also doesn't allow for customization of behavior, so it doesn't yet offer the alternative option to users: setting up an instance of JsonSerializer themselves and injecting it where they need it. You could easily adapt the class to allow for this to happen though:

class JsonSerializer
{
    private $typeResolver;

    public function __construct(TypeResolver $typeResolver)
    {
        $this->typeResolver = $typeResolver;

        // set up other dependencies
    }

    private static function createInstance(): JsonSerializer
    {
        return new self(
            new DefaultTypeResolver()
        );
    }

    public static function serialize($object): string
    {
        return self::createInstance()->doSerialize($object);
    }

    public function doSerialize($object): string
    {
        // do the real work
    }
}

// Usage (with default dependencies):
JsonSerializer::serialize($object);

// Or (with custom dependencies):
$serializer = new JsonSerializer(
    new CustomTypeResolver()
);
$serializer->doSerialize($object);

Several things start to feel wrong about this:

  1. Creating and configuring an instance of JsonSerializer has now become a responsibility of the class itself. Mixing these concerns makes the class feel a bit cluttered to me, even more so since they could easily be moved to a dedicated factory which knows how to set up a JsonSerializer instance for you. Anyway, this is not how we/I usually write classes.
  2. We have serialize() and doSerialize() methods, which is not really nice, since one of them is just a proxy and the other one is the real deal, but has "do" in front of it (which feels old-fashioned and really silly; it could be anything of course, but I'd rather have no prefix or suffix at all).

This is where I felt there really was a need for a singleton object: one that is pre-configured, always and globally available, e.g.

class JsonSerializer
{
    private $typeResolver;

    public function __construct(TypeResolver $typeResolver)
    {
        $this->typeResolver = $typeResolver;
    }

    public function serialize($object): string
    {
        // do the real work
    }
}

object JsonSerializer
{
    private $serializer;

    private function __construct()
    {
        $this->serializer = new JsonSerializer(
            new DefaultTypeResolver()
        );
    }

    public function serialize($object): string
    {
        return $this->serializer->serialize($object);
    }

    // ...
}

// Usage (DI-ready):
$serializer = new JsonSerializer(
    new CustomTypeResolver(),
    ...
);
$serializer->serialize($object);

// Or (stand-alone):
JsonSerializer->serialize($object);

Of course, that's not possible with PHP. We don't have singleton objects. The go-to solution for all things singleton is static, which takes away the option to work with a constructor:

class JsonSerializer
{
    private static $serializer;

    public static function serialize($object): string
    {
        if (self::$serializer === null) {
            self::$serializer = new JsonSerializer(
                new DefaultTypeResolver()
            );
        }

        return self::$serializer->serialize($object);
    }

    // ...
}

There is only ever one instance in use, but this gives us really ugly and cluttered code. By the way, there are slight variations which may or may not make things better, e.g. using a local static variable for the serializer:

class JsonSerializer
{
    public static function serialize($object): string
    {
        static $serializer;

        $serializer = $serializer ?? new JsonSerializer(
            new DefaultTypeResolver()
        );

        return $serializer->serialize($object);
    }

    // ...
}
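To show the local-static variation actually working end to end, here is a self-contained sketch; the DefaultTypeResolver, the Money class, and the serialization logic are hypothetical stand-ins for the real dependencies:

```php
<?php

// Hypothetical stand-in for the serializer's real dependency
class DefaultTypeResolver
{
    public function resolve($object): string
    {
        return get_class($object);
    }
}

class JsonSerializer
{
    private $typeResolver;

    public function __construct(DefaultTypeResolver $typeResolver)
    {
        $this->typeResolver = $typeResolver;
    }

    public static function serialize($object): string
    {
        // The one shared instance lives in a local static variable
        static $serializer;
        $serializer = $serializer ?? new self(new DefaultTypeResolver());

        return $serializer->doSerialize($object);
    }

    public function doSerialize($object): string
    {
        // Made-up serialization logic, just enough to demonstrate the pattern
        return json_encode([
            'type' => $this->typeResolver->resolve($object),
            'data' => get_object_vars($object),
        ]);
    }
}

class Money
{
    public $amount = 100;
}

// Static usage, no setup required:
echo JsonSerializer::serialize(new Money()), "\n";
// prints: {"type":"Money","data":{"amount":100}}
```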

This design is "pretty nice", and it finally enables clients to use the JsonSerializer either as a global/static service or as an injected service. The only problem is that you can't have two classes with the same name in one namespace. So we'd have to pick one of the following solutions:

  • Rename one of these classes (if that would make sense at all).
  • Move one class to a different namespace (the last part of which might be \Singleton).

There's one remaining question for me: what if we want to allow some clients to inject a JsonSerializer instance, and other clients to use the singleton? What if we also want the actual instance of JsonSerializer used by the singleton (the one that gets stored in its private static $serializer attribute) to be the exact same instance as the one clients get injected? That would mean maximum reuse: if we have a customized instance in one place, the singleton implementation uses that exact same instance too.

This sounds like a hard problem, but it's actually quite easy to solve. We could just add a setSerializer() method to the singleton class:

class JsonSerializer
{
    private static $serializer;

    /**
     * @internal This method is not part of the client API
     */
    public static function setSerializer(JsonSerializer $serializer): void
    {
        self::$serializer = $serializer;
    }

    public static function serialize($object): string
    {
        if (self::$serializer === null) {
            self::$serializer = new JsonSerializer(
                new DefaultTypeResolver()
            );
        }

        return self::$serializer->serialize($object);
    }

    // ...
}

Façades

Maybe you noticed it already, but with this approach we're getting really close to the concept of a façade, as you may know it from working with Laravel.

The only difference is that, instead of setting an instance of one service by calling some static set[...]() method on the singleton class, a Laravel application sets a service locator on the singleton class (which is called a "façade" in Laravel jargon). Equivalent code for this functionality would look something like this:

class JsonSerializer
{
    private static $serviceLocator;

    public static function setServiceLocator(ServiceLocator $serviceLocator): void
    {
        self::$serviceLocator = $serviceLocator;
    }

    public static function serialize($object): string
    {
        return self::$serviceLocator->get('json_serializer')->serialize($object);
    }

    public static function deserialize(string $type, string $data): object
    {
        return self::$serviceLocator->get('json_serializer')->deserialize($type, $data);
    }

    // ...
}

Of course, Laravel has a base class which contains the bulk of the "resolve" logic that would otherwise have to be duplicated over and over again.
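For completeness, such a base class could be sketched like this. The ServiceLocator is reduced to a plain array and all names are made up, so this illustrates the mechanism rather than Laravel's actual implementation:

```php
<?php

// Minimal façade base class: subclasses only name the service they proxy to,
// and __callStatic() forwards every static call to that service.
abstract class Facade
{
    /** @var array<string, object> a deliberately primitive service locator */
    private static $serviceLocator = [];

    public static function setServiceLocator(array $serviceLocator): void
    {
        self::$serviceLocator = $serviceLocator;
    }

    // Each concrete façade returns the id of the service it stands in for
    abstract protected static function serviceId(): string;

    public static function __callStatic(string $method, array $arguments)
    {
        $service = self::$serviceLocator[static::serviceId()];

        return $service->{$method}(...$arguments);
    }
}

// A concrete façade for a hypothetical "uppercaser" service:
final class Upper extends Facade
{
    protected static function serviceId(): string
    {
        return 'uppercaser';
    }
}

class Uppercaser
{
    public function shout(string $text): string
    {
        return strtoupper($text);
    }
}

Facade::setServiceLocator(['uppercaser' => new Uppercaser()]);

echo Upper::shout('hello'), "\n";
// prints: HELLO
```

The key move is __callStatic(): the façade itself defines no service methods at all, so every static call is resolved through the service locator at call time.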

Helper functions

Laravel goes even further with this concept and introduces helper functions, which use façades behind the scenes, yet make using complicated services even simpler. In our case, these helper functions could look something like this:

function json_serialize($object): string
{
    return JsonSerializer::serialize($object);
}

function json_deserialize(string $type, string $data): object
{
    return JsonSerializer::deserialize($type, $data);
}

Though they are mostly simple proxies, these functions are great at hiding complicated code and dependencies, and at providing a "drop-in" service wherever you are in the code base.

Conclusion

I'm not advocating the use of façades and helper functions per se. I assume they have their place in specific parts of your application (just as the Laravel documentation states, by the way). My point is that, as long as you build both options into your libraries, you can offer a great developer experience: an out-of-the-box working solution of a useful service, which can still be used in a project that uses dependency injection everywhere.

PHP Design singleton façades Laravel