PHP & Symfony
About PHP and Symfony2 development

Behind the scenes at Coolblue

Posted by Matthias Noback

Leaving Qandidate, off to Coolblue

After my very interesting conversation with the developers behind the Broadway framework for CQRS and event sourcing, the day wasn't over for me yet. I walked about one kilometer to the north to meet Paul de Raaij, who is a senior developer at Coolblue, a company which sells and delivers all kinds of - mostly electrical - consumer devices. Their headquarters are very close to the new and shiny Rotterdam Central station. The company itself is doing quite well: with 1000+ employees, they keep needing more office space.

Paul showed me around all the departments and offices, which nowadays span three floors. There are developer teams everywhere. It's not a PHP-only company: though PHP is well represented, there are also .NET and Delphi developers. Coolblue runs on quite a lot of software and hardware.

Heritage

Developers at Coolblue have learnt to call the legacy software they maintain "heritage". "Legacy software" often has a negative ring to it, while in fact it's what has enabled the company to be so successful, so it doesn't really deserve that negative image. I don't fully agree with this approach, since most of the relevant literature on the subject speaks of "legacy software", which, to me personally, doesn't mean anything bad. I'm well aware that anything I write today will be "legacy" tomorrow, because, quite literally, other people will inherit that piece of code and need to maintain it. In my dictionary, "legacy software" isn't a bad thing (though I know that in practice it often is, so I understand this little play on words).

New things: microservices

Paul mentioned that there is an ongoing struggle amongst developers who would rather try "new" things but feel stuck in the "old" way of doing things. Paul argues that it's always possible to try something "new" in an older project as well. The infrastructure may not be there for it yet, but that makes introducing it all the more challenging, as well as more satisfying. I fully agree with Paul on this; I too like to work on an older code base and introduce new concepts to it. In my personal experience, thinking that you're better off working on a green-field application, because there you can do everything "the right way", often turns out to be quite a fallacy. I'm sure you'll recognize this sentiment as well.

At Coolblue, "new things" currently means event sourcing and microservices. They have introduced several microservices so far. Microservices are one of these things Coolblue developers have been wanting to introduce for quite some time. It turned out to be not that hard, but, according to Paul, the key was to keep it small at first. They started by extracting several smaller and less crucial parts from the main application into microservices. You can read about some of their experiences on devblog.coolblue.nl.

New things: event sourcing

Paul and others have done quite some research with regard to event sourcing as well. They haven't yet taken the step of implementing it in their software. Take a look at this slide deck to get an impression of what it might look like for them when they do.

Paul made an interesting observation with regard to "new things": there is often a mismatch between what a developer thinks of themselves and what that same developer thinks of other developers. When listening to meetup or conference talks, you may start thinking that you're way behind on current developments in the world of (PHP) software development. Paul at first felt the same, but noticed that when you actually talk to developers about what you're doing, it may well turn out that you're doing just fine.

Teams

Developer teams at Coolblue are separated based on the features they work on. There is a team working on "pre-cart", i.e. everything related to the presentation of the products, their categories, etc. Then there's a "post-cart" team, which works on actually making the sale, handling payment, etc. Paul himself mostly moves from team to team, helping everyone solve any issues they may be facing. This way he gets a nice overview, which enables him to carry knowledge from each team to the others. It also helps prevent the same mistakes from being made in different teams.

Walking through the corridors, we pass a lot of "team rooms". The walls are made of glass, but each team is still nicely separated from the others. They can see, but not hear, each other, meaning that they can focus on what they're working on while still feeling part of the organization. Each team appears to consist of about 6 people. This is the right amount, according to Paul; adding more people to a team doesn't make anything easier. Each team has a nice poster attached to the outside wall, showing passers-by what the team's goals are for the current quarter.

Operations

Although Coolblue has an "operations" department, developers are supposed to take responsibility for the deployment and functional correctness of the work they deliver. Pull requests need to contain any changes required for configuration management (Puppet) as well as logging and monitoring. Developer teams monitor the runtime performance of the software and any problems that occur in production environments.

Paul says that the organization is gradually moving towards a "devops" culture, where developers are fully able to deploy and configure the code they produce. I'm curious how this will work out. I think it's a great idea. However, I've also noticed that many developers are quite reluctant to learn all the subtleties of all the available "operations" tools. They may prefer to do what they're already good at: developing software (for which they already have to learn so many tools).

On the other hand, creating a working software product doesn't end with merging the pull request. It has to work on a production server as well, so you have to think about what's required to run the software and keep it running. So it's definitely justifiable to ask developers to learn about "servers" as well.

My current hypothesis is that growing a devops culture is easier when the tools have already been selected, so you don't have to learn to work with both Ansible and Puppet, Linux and Windows, etc. It looks like Coolblue is lucky to be in this situation: its developers only need to learn to work with a small number of operations-related tools.

Learning and sharing

Coolblue regularly hosts "behind the scenes" meetups. Paul himself and Victor Welling regularly speak at these, as well as at other meetups and conferences. Developers are also encouraged to participate in public speaking, and for those who don't want to, there are regular opportunities to present something internally. I agree with Paul that this is very valuable for development teams, and of course it's good for the "attraction factor" of the company. Coolblue itself is growing very fast, which is why they currently use the slogan "Looking for 100 Developers". To me this sounds like a pretty tall order, since every company I know of struggles with attracting and keeping new (and good) developers. I wish them the best of luck of course, knowing that those developers will be in good hands.

Categories: PHP

Tags: micro services event sourcing legacy


Meeting the Broadway team - talking DDD, CQRS and event sourcing

Posted by Matthias Noback

Visiting Qandidate in Rotterdam

Last Thursday I had the honor of meeting (part of) the Broadway team: Gediminas Šedbaras, Willem-Jan Zijderveld and Fritsjan, three very smart developers working for Qandidate in central Rotterdam. A short walk from the train station brought me to their office, where we talked about CQRS, event sourcing and some of the practical aspects of using the Broadway framework.

As you may have read, I've been experimenting with Broadway extensively during these last weeks. Already knowing a lot about the theory behind it, it was really nice to see how smoothly everything worked. I had some questions however, which have now been answered (thank you, guys, for taking the time to do this!).

Snapshotting

For example, I wanted to know about snapshotting and what they thought of it. As I saw in the list of open issues for Broadway, some people are interested in this. When you're doing event sourcing, before an aggregate can undergo any changes, it needs to be reconstituted based on previous events, which are all stored in the event store. When the event store contains many, many events for a given aggregate, this process can become too slow for use in a production environment. Snapshotting solves the issue by storing the current state of an aggregate and only replaying events that occurred after the time of the snapshot.
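To make this a bit more concrete, here's a minimal sketch of snapshot-based loading. The snapshot store, its methods, the MyAggregate class and the playhead bookkeeping are all my own assumptions here, not an existing Broadway API:

class SnapshottingRepository
{
    private $eventStore;
    private $snapshotStore;

    public function __construct($eventStore, $snapshotStore)
    {
        $this->eventStore = $eventStore;
        $this->snapshotStore = $snapshotStore;
    }

    public function load($aggregateId)
    {
        $snapshot = $this->snapshotStore->latest($aggregateId);

        if ($snapshot === null) {
            // No snapshot yet: replay the full event history
            return MyAggregate::reconstituteFrom($this->eventStore->load($aggregateId));
        }

        // Start from the aggregate state captured in the snapshot...
        $aggregate = $snapshot->aggregate();

        // ...and only replay the events recorded after it was taken
        $aggregate->replay(
            $this->eventStore->loadFromPlayhead($aggregateId, $snapshot->playhead() + 1)
        );

        return $aggregate;
    }
}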

At Qandidate they haven't experienced this problem in a production application yet. They do feel that snapshotting is the proper solution to a real problem. But before resorting to it (it may complicate maintenance a bit, see the section "Correcting mistakes" below), you should reconsider your design first. When your aggregates undergo so many changes that the need for snapshotting arises, you might have another problem. Possibly an important concept is missing from your domain model (this is what DDD people like to say a lot ;)). Or you may be solving a "read" issue on the "write" side of your application. Still, in some cases the need for snapshotting is legitimate, so Broadway will probably provide an off-the-shelf solution for it at some point (there is an open issue discussing this feature).

Open source vs company work

This brought me to the question of how the Qandidate developers manage to divide their attention between "regular" work and open source work. Broadway is not just a small library to maintain; it's an actual framework. It still doesn't have an awful lot of users, but nevertheless, it takes a serious amount of time to deal with its issues and pull requests. Currently, the team is allowed to spend some time on this during working hours, which is really great. Adding new features doesn't often have a high priority though, since these are often features requested by the community, for which the company itself feels no urgency. Judging by the way some of the current pull requests are being handled, I personally feel that the current situation is just fine though.

Event store management

Currently several features are being worked on which are related to "event store management". The current version of Broadway's event store can't be queried for events of a certain type (the class of an event). Being able to do so would be nice, since it would allow you to replay just certain events and let read model projectors process just a slice of all the events. Some people want to take this even further and be able to query for data inside stored events. This requires query/indexing capabilities within JSON blobs (event objects are always serialized to simple arrays, then persisted as JSON strings). This is impossible with a MySQL database, but it was suggested that it might become possible if the events were stored in a PostgreSQL database instead of the MySQL database that is currently used. The Broadway team itself is not fully convinced that querying the event store should be made very convenient, but they imagine it can be useful in some high-performance environments.
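To sketch what querying inside stored events could look like with PostgreSQL's JSON support (the events table layout here is hypothetical, not Broadway's actual schema):

// Assumes a PDO connection to PostgreSQL and a hypothetical "events"
// table with a jsonb "payload" column holding the serialized event
$statement = $pdo->prepare(
    'SELECT uuid, playhead, payload
     FROM events
     WHERE payload @> CAST(:criteria AS jsonb)'
);

// The @> operator matches rows whose payload contains this key/value pair
$statement->execute([
    'criteria' => json_encode(['email' => 'user@example.com']),
]);

$matchingEvents = $statement->fetchAll(\PDO::FETCH_ASSOC);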

Replaying events

Event stores are very powerful once they are combined with projectors. A projector can subscribe to domain events and produce read models based on what changed in the write model. In general, each use case (a list of things, a detailed view of a thing, a list of the same things but this time just for administrators, etc.) requires a new read model projector as well as a new read model. The recommendation by the Broadway team, however, is not to follow this advice blindly: if read model A has just one extra field compared to read model B, you might as well combine them and adjust the query a bit.
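To give an idea, here's a sketch of a projector in the spirit of Broadway's conventions. The domain event, read model and repository are made up for the example; Broadway's base Projector class dispatches events to apply<EventName>() methods by naming convention:

use Broadway\ReadModel\Projector;

class InvitationListProjector extends Projector
{
    private $repository;

    public function __construct($repository)
    {
        // A read model repository (e.g. the in-memory or ElasticSearch one)
        $this->repository = $repository;
    }

    // Called by the base class whenever an InvitationAccepted event occurs
    protected function applyInvitationAccepted(InvitationAccepted $event)
    {
        $readModel = $this->repository->find($event->invitationId());
        $readModel->markAccepted();

        $this->repository->save($readModel);
    }
}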

Combining read models

I always assumed that each read model should correspond to one use case and that a read model query should return all the data required for the view, no more, no less. I asked the people from Qandidate about this, and it turns out this isn't always feasible. For example, projector A, which updates read model A, might need to use read model repository B to incorporate data from it into read model A. By doing this, read model A becomes sensitive to changes in read model B, which are not automatically reflected in A. So you have to enhance the projector of A to subscribe to the same events B was already subscribed to. This duplicates some of the effort, as well as some of the write-to-read translation knowledge. In practice you may need to run separate queries, for example in your controller, in order to combine and provide the right data for your views (e.g. templates).

Correcting mistakes

My next question was: using event sourcing, is it easy to recover from a mistake? The first part of the answer focused on the read side: when you accidentally destroy some read model, or all read models (it happens to everyone, right?), it's extremely easy to reconstruct them, by simply loading all events in the event store and letting the read model projectors do their work again. Reprocessing the entire history of your application might be a matter of minutes (if you're lucky, of course).
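A naive sketch of such a replay. Note that loadAll() is an assumed method, not part of Broadway's event store interface as far as I know; $projectors is simply a list of read model projectors:

// Feed every stored event to every projector again
foreach ($eventStore->loadAll() as $domainMessage) {
    foreach ($projectors as $projector) {
        // Broadway projectors are event listeners: handle() dispatches
        // the domain message to the right apply*() method
        $projector->handle($domainMessage);
    }
}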

Looking at the write side, things are a bit more difficult. It's hard to correct a mistake that you made in the design of your model, for instance when aggregate boundaries need to lie somewhere else, events need different data, events were not generated when they should have been, etc. As the Broadway team told me: these mistakes are easy to fix as long as the code hasn't been released yet; once it's running in production, it's a different story.

When the internal state of an aggregate needs to change (i.e. the values of its properties), it's not that bad, since that state is reconstituted from the event store for each change anyway. Once snapshotting (see above) is a supported feature, it will be harder to change the internals of an aggregate, since a snapshot contains the exact internal state of an aggregate at some point in time. So once you have stored a snapshot of an aggregate, changing one of its property names isn't that simple anymore. The solution would be to remove the existing snapshots and generate them again.

Upcasting

When events themselves need to change, because a design mistake was made or a new feature has to be implemented, things get a bit harder. The event store will be filled with serialized "version 1" events, while newly created events will be "version 2" events. A technique which the Broadway team suggested, and which they are still working on (a PR is open for it), is called upcasting. Upcasting can be described as a way to migrate your events to newer versions: each time you want to change an event, you write a bit of code that converts the older serialized form to the format of the newer events. The "upcasted" event is never stored in the event store, since the event store is truly append-only and should only ever contain the events as they happened back then.
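As a sketch of the idea (all names and the serialized format are made up; Broadway's actual upcasting API may look different once the PR lands):

class MemberInvitedUpcaster
{
    public function supports(array $serializedEvent)
    {
        return $serializedEvent['type'] === 'member_invited'
            && $serializedEvent['version'] === 1;
    }

    public function upcast(array $serializedEvent)
    {
        // Version 2 splits "name" into separate first and last name fields
        $parts = explode(' ', $serializedEvent['payload']['name'], 2);

        $serializedEvent['payload']['first_name'] = $parts[0];
        $serializedEvent['payload']['last_name'] = isset($parts[1]) ? $parts[1] : '';
        unset($serializedEvent['payload']['name']);

        $serializedEvent['version'] = 2;

        return $serializedEvent;
    }
}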

You can read more about what the Broadway team envisions in the documentation of Axon, a CQRS framework which served as a source of inspiration for Broadway itself.

Team expansion, knowledge sharing

Although I personally love the ideas behind CQRS and event sourcing, looking at the project I've been creating with Broadway, I can imagine it would be hard to step in as a new developer who is used to doing (Symfony) projects in a more traditional way. I asked the Qandidate developers about this, and they could imagine it as well. There are a lot of "moving parts", and a pretty large amount of code to be understood: commands, events, entities, value objects, projectors, read models, processors, etc. It can be hard to get (and keep!) a full understanding of the flow of the application. A recommendation is to create and maintain some large visual overviews of the application flow. These are useful when you need to explain to a new team member what's going on, but they also serve as an aid in discussions about parts that have to be changed.

Testing things

Working with Broadway, I wasn't much bothered by the number of classes I had to produce. They are small and focused anyway. I noticed that they are often quite simple as well, probably because Commands and Queries, Writes and Reads, are completely separated. It turns out that these classes are much easier to unit test as well, especially because Broadway comes with a lot of base classes for PHPUnit which offer scenario-style testing (given-when-then).
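For instance, a command handler test reads roughly like this. The member-related classes are made up for the example; the scenario API comes from Broadway's testing helpers (check the version you're using for the exact signatures):

use Broadway\CommandHandling\Testing\CommandHandlerScenarioTestCase;
use Broadway\EventHandling\EventBusInterface;
use Broadway\EventStore\EventStoreInterface;

class InviteMemberHandlerTest extends CommandHandlerScenarioTestCase
{
    protected function createCommandHandler(EventStoreInterface $eventStore, EventBusInterface $eventBus)
    {
        // The handler under test, wired to an in-memory event store
        return new MemberCommandHandler(new MemberRepository($eventStore, $eventBus));
    }

    /** @test */
    public function it_invites_a_registered_member()
    {
        $this->scenario
            ->withAggregateId('member-1')
            ->given([new MemberRegistered('member-1', 'Matthias')])   // history
            ->when(new InviteMember('member-1'))                      // the command
            ->then([new MemberInvited('member-1')]);                  // expected new events
    }
}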

All of this allows you to unit test everything without too much effort. Most of the actual effort of developing an event-sourced application goes where it's required the most: the domain model. However, I also noticed that unit testing isn't sufficient if you want to trust your application to function correctly: a lot can go wrong at the configuration level. Command handlers, event subscribers, metadata enrichers, etc. - everything has to be defined as a service and registered properly. I also noticed that my read models sometimes contained mistakes, because I paid only little attention to their quality; my read model unit tests sometimes didn't catch those mistakes.

So I started feeling the need for functional or acceptance tests. The Qandidate developers themselves like to use the modelling by example approach, where you run acceptance tests twice: with and without the infrastructure layer "enabled". This turned out to be a very useful approach for them (and I agree: it's a great idea).

Asynchronous command/event handling

For SimpleBus I've created a RabbitMQ integration, which can be used when you'd like to send commands to a messaging server and have them handled by another process. It can also be used to broadcast events to other applications, again by sending them to a messaging server.

Broadway doesn't come with built-in support for asynchronous operations. But at Qandidate they have created a custom event processor which does exactly this: it sends events to RabbitMQ, thereby allowing "offline" processes to do some heavy work in the background. The Broadway team is a bit hesitant to also process commands asynchronously, even though they have kept strictly to CQRS, so it shouldn't be a big problem. However, it would require a lot more work on the UI side, which so far they haven't considered worth the effort.
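As an illustration, here's a sketch of what such an event publisher could look like using php-amqplib. This is my own sketch, not Qandidate's actual processor; the exchange name and the serialization format are assumptions:

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

class RabbitMqEventPublisher
{
    private $channel;

    public function __construct(AMQPStreamConnection $connection)
    {
        $this->channel = $connection->channel();

        // A fanout exchange broadcasts every event to all bound queues
        $this->channel->exchange_declare('domain_events', 'fanout', false, true, false);
    }

    // Broadway calls event listeners with a DomainMessage; its payload
    // is the domain event itself (assumed here to serialize to an array)
    public function handle($domainMessage)
    {
        $message = new AMQPMessage(
            json_encode($domainMessage->getPayload()->serialize()),
            ['delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT]
        );

        $this->channel->basic_publish($message, 'domain_events');
    }
}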

CRUD vs event sourcing

When I recreated (part of) an existing CRUD-style application using Broadway, I could immediately solve about five severe problems the original version had, just by applying CQRS and using event sourcing. It occurred to me that CQRS/event sourcing might be one of those things which, once you've tried it, you never want to let go of. When asked about this, the Qandidate developers agreed: it's hard to go back to CRUD afterwards. On the other hand, for some types of applications event sourcing doesn't make sense, and the CRUD style might suffice or even be better. Some clients require this as well - it's more of an Excel-based, data-driven approach to software.

Shortcomings, problems, etc.

I asked the Broadway team what they think is Broadway's biggest shortcoming. It turned out their main concern was with the tools related to the read model. They think that Broadway's job should end with calling the read model projector. After that, it's all up to the implementer: Broadway users are free to use whatever type of database (MySQL, MongoDB, ElasticSearch, etc.) to store their read data. Since Broadway itself comes bundled with just an ElasticSearch (and an in-memory) adapter for read model repositories, this may not be clear to its users. They may feel stuck with ElasticSearch and have to work around some of its issues. For example, using the "factory" settings, an ElasticSearch query returns only 500 results.

For me personally, this wouldn't really count as a shortcoming of Broadway itself. Its users just need to know something about the technology stack they're using and shouldn't count on everything to "just work". It reminds me of the law of leaky abstractions. No matter how nice your abstractions are, at some point, you need to deal with the underlying, low-level details.

Conclusion

I had a great time meeting the guys at Qandidate. Again, a big thank you for taking the time to answer my questions and a very interesting conversation in general. I hope Broadway itself will gain some more interest in the community. It really deserves to get a lot of attention.

If you'd like to find out more about Broadway, check out the Qandidate labs blog and, for starters, the article "Bringing CQRS and Event Sourcing to PHP. Open sourcing Broadway!".

Categories: PHP

Tags: DDD event sourcing CQRS


Refactoring the Cat API client - Part III

Posted by Matthias Noback

In the first and second part of this series we've been doing quite a bit of work to separate concerns that were once all combined into one function.

The major "players" in the field have been identified: there is an HttpClient and a Cache, used by different implementations of CatApi to form a testable, performing client of The Cat Api.

Representing data

We have been looking at behavior and the overall structure of the code, but we didn't yet look at the data that is being passed around. Currently everything is a string, including the return value of CatApi::getRandomImage(). When calling the method on an instance, we are "guaranteed" to retrieve a string. I say "guaranteed" because PHP would allow anything to be returned: an object, a resource, an array, etc. And though in the case of RealCatApi::getRandomImage() we can be sure it is a string, because we explicitly cast the return value to one, we can't be sure it will be useful to the caller of this function. It might be an empty string, or a string that doesn't contain a URL at all, like 'I am not a URL'.

class RealCatApi implements CatApi
{
    ...

    /**
     * @return string URL of a random image
     */
    public function getRandomImage()
    {
        try {
            $responseXml = ...;
        } catch (HttpRequestFailed $exception) {
            ...
        }

        $responseElement = new \SimpleXMLElement($responseXml);

        return (string) $responseElement->data->images[0]->image->url;
    }
}

To make our code more robust and trustworthy, we can do a better job at making sure we return a proper value.

One thing we could do is verify the post-conditions of our method:

$responseElement = new \SimpleXMLElement($responseXml);

$url = (string) $responseElement->data->images[0]->image->url;

if (filter_var($url, FILTER_VALIDATE_URL) === false) {
    throw new \RuntimeException('The Cat Api did not return a valid URL');
}

return $url;

Though correct, this would be pretty bad for readability. It would get worse if multiple functions all needed the same kind of post-condition validation. We'd want a way to reuse this validation logic. In any case, the return value is still a meaningless string. It would be nice if we could preserve the knowledge that it's a URL. That way, any other part of the application that uses the return value of CatApi::getRandomImage() would be aware of the fact that it's a URL and would never mistake it for, say, an email address.

A value object representing a URL

Instead of writing post-conditions for CatApi::getRandomImage() implementations, we could write pre-conditions for image URLs. How can we make sure that a value like an image URL can't exist, except when it's valid? By turning it into an object and preventing it from being constructed using something that's not a valid URL:

class Url
{
    private $url;

    public function __construct($url)
    {
        if (!is_string($url)) {
            throw new \InvalidArgumentException('URL was expected to be a string');
        }

        if (filter_var($url, FILTER_VALIDATE_URL) === false) {
            throw new \RuntimeException('The provided URL is invalid');
        }

        $this->url = $url;
    }
}

This type of object is called a value object.

Creating it with invalid data is impossible:

new Url('I am not a valid URL');

// RuntimeException: "The provided URL is invalid"

So every instance of Url that you encounter can be trusted to represent a valid URL. We can now change the code of RealCatApi::getRandomImage() to return a Url instance:

$responseElement = new \SimpleXMLElement($responseXml);

$url = (string) $responseElement->data->images[0]->image->url;

return new Url($url);

Usually value objects have methods to create them from, and convert them back to, primitive types, in order to prepare them for persistence, reload them from it, or reconstruct them based on other value objects. In this case we might end up with a fromString() factory method and a __toString() method. This paves the way for parallel construction methods (e.g. fromParts($scheme, $host, $path, ...)) and specialized "getters" (host(), isSecure(), etc.). Of course, you shouldn't implement these methods before you actually need them.

class Url
{
    private $url;

    private function __construct($url)
    {
        $this->url = $url;
    }

    public static function fromString($url)
    {
        if (!is_string($url)) {
            ...
        }

        ...

        return new self($url);
    }

    public function __toString()
    {
        return $this->url;
    }
}

Finally, we need to modify the contract of getRandomImage() and make sure the default image URL will also be returned as a Url object:

class RealCatApi implements CatApi
{
    ...

    /**
     * @return Url URL of a random image
     */
    public function getRandomImage()
    {
        try {
            $responseXml = ...;
        } catch (HttpRequestFailed $exception) {
            return Url::fromString('http://cdn.my-cool-website.com/default.jpg');
        }

        $responseElement = new \SimpleXMLElement($responseXml);

        return Url::fromString((string) $responseElement->data->images[0]->image->url);
    }
}

Of course, this change should also be reflected in the Cache interface and any class implementing it, like FileCache, where you should convert from and to Url objects:

class FileCache implements Cache
{
    ...

    public function put(Url $url)
    {
        file_put_contents($this->cacheFilePath, (string) $url);
    }

    public function get()
    {
        return Url::fromString(file_get_contents($this->cacheFilePath));
    }
}
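For completeness, the corresponding Cache interface would then end up looking something like this (a sketch; the exact docblocks are up to you):

interface Cache
{
    public function put(Url $url);

    /**
     * @return Url
     */
    public function get();
}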

Parsing the XML response

One last part of the code I'd like to improve is this:

$responseElement = new \SimpleXMLElement($responseXml);

$url = (string) $responseElement->data->images[0]->image->url;

I personally don't like SimpleXML, but that's not the problem here. What's really wrong is the fact that we're assuming we get a valid XML response, containing a root element, which contains one <data> element, which contains at least one <images> element, which contains one <image> element, which contains one <url> element, the value of which is supposed to be a string (the image URL). At any point in this chain, an assumption may be wrong, which might cause a PHP error.

We want to take control of this situation: instead of letting PHP throw errors at us, we define our own exceptions, which we can catch and handle if we want. Again, we should hide all the knowledge about the exact element names and the hierarchy of the response XML inside an object. That object should also deal with any exceptional situations. The first step is to introduce a simple DTO (data transfer object) which represents an "image" response from the Cat API.

class Image
{
    private $url;

    public function __construct($url)
    {
        $this->url = $url;
    }

    /**
     * @return string
     */
    public function url()
    {
        return $this->url;
    }
}

As you can see, this DTO hides the fact that there are also <data>, <images>, <image>, etc. elements in the original response. We are only interested in the URL, and we make it accessible through a simple getter (url()).

Now the code in getRandomImage() could become:

$responseElement = new \SimpleXMLElement($responseXml);
$image = new Image((string) $responseElement->data->images[0]->image->url);

$url = $image->url();

This doesn't help much yet, since we're still stuck with that fragile piece of XML traversal code.

Instead of creating the Image DTO inline, we'd be better off delegating its creation to a factory, which will be given the knowledge about the expected XML hierarchy.

class ImageFromXmlResponseFactory
{
    public function fromResponse($response)
    {
        $responseElement = new \SimpleXMLElement($response);

        $url = (string) $responseElement->data->images[0]->image->url;

        return new Image($url);
    }
}

We only need to make sure to inject an instance of ImageFromXmlResponseFactory into RealCatApi, and then we can reduce the code in RealCatApi::getRandomImage() to:

$image = $this->imageFactory->fromResponse($responseXml);

$url = $image->url();

return Url::fromString($url);

Moving code around like this gives us the opportunity to make something better of it. Small classes are easier to test. As mentioned before, the tests we have in place for our classes at this point generally only test the "happy path". A lot of edge cases aren't covered by them. For example, what happens when:

  • The response body is empty
  • It contains invalid XML
  • It contains valid XML with an unexpected structure
  • ...

Moving the XML processing logic to a separate class allows you to focus completely on the behavior surrounding that particular subject. It even allows you to use a true TDD style of programming, where you first define the situations (like the ones in the list of edge cases above) and the expected results.
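As a sketch of what driving those edge cases with tests could look like (InvalidResponse is a hypothetical exception class; the factory would need to catch SimpleXML's errors and rethrow it):

class ImageFromXmlResponseFactoryTest extends \PHPUnit_Framework_TestCase
{
    /** @test */
    public function it_fails_when_the_response_body_is_empty()
    {
        // InvalidResponse is a hypothetical exception the factory would throw
        $this->setExpectedException('InvalidResponse');

        $factory = new ImageFromXmlResponseFactory();
        $factory->fromResponse('');
    }

    /** @test */
    public function it_fails_when_the_xml_has_an_unexpected_structure()
    {
        $this->setExpectedException('InvalidResponse');

        $factory = new ImageFromXmlResponseFactory();
        $factory->fromResponse('<response><unexpected/></response>');
    }
}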

Conclusion

This concludes the "Refactoring the Cat API client" series. I hope you liked it. If you have other suggestions for refactoring, leave a comment at the bottom of this page.

Categories: PHP

Tags: OOP refactoring
