There's no such thing as an optional dependency

Posted on by Matthias Noback

On several occasions I have tried to explain my opinion about "optional dependencies" (also known as "suggested dependencies" or "dev requirements") and I'm doing it again:

> There's no such thing as an optional dependency.

I'm talking about PHP packages here and specifically those defined by a composer.json file.

What is a dependency?

First let's make sure we all agree about what a dependency is. Take a look at the following piece of code:

namespace Gaufrette\Adapter;

use Gaufrette\Adapter;
use \MongoGridFS;

class GridFS implements Adapter
{
    private $gridFS;

    public function __construct(MongoGridFS $gridFS)
    {
        $this->gridFS = $gridFS;
    }

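    public function read($key)
    {
        $file = $this->find($key);

        return ($file) ? $file->getBytes() : false;
    }

    // Helper not shown in the original excerpt; assumed here to look the file up by name.
    private function find($key)
    {
        return $this->gridFS->findOne($key);
    }
}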

This GridFS class is part of the Gaufrette filesystem abstraction library, though I heavily modified it.

To determine all the dependencies of this code we can ask the following question:

> What is needed to run this code?

You need to think of several things:

  1. Which PHP version is needed to run the code without getting a syntax error? Maybe you even need a specific patch version (like 5.3.6) because of a bug in older 5.3 versions that could interfere with your code.

  2. Which PHP extensions should be installed?

  3. Which PEAR libraries should be installed?

  4. Which other packages should be installed?

In the case of the GridFS class, the PHP version should be at least 5.3, because the code uses namespaces. The \MongoGridFS class should also be available. It is part of the mongo PECL extension for PHP and has only been available since version 0.9.0 of that extension, so we have to state this version constraint explicitly. Finally, it appears that no other packages are needed to use the GridFS class. So if we were to create a composer.json file for a package that contains the GridFS class, it would look like this:

{
    ...,
    "require": {
        "php": ">=5.3",
        "ext-mongo": ">=0.9.0"
    },
    ...
}

Now this is an exhaustive list of the dependencies of the package that contains the GridFS class: when these dependencies are installed, nothing stands in the way of using this class in your application.

The actual list of dependencies of knplabs/gaufrette

As I already mentioned, the GridFS class is part of the Gaufrette library, which provides a filesystem abstraction layer so you can store files on different types of filesystems without worrying about the details of those filesystems. Let's take a look at the composer.json file of this library:

{
    "name": "knplabs/gaufrette",
    "require": {
        "php": ">=5.3.2"
    },
    "require-dev": {
        ...
    },
    "suggest": {
        ...
        "amazonwebservices/aws-sdk-for-php": "to use the legacy Amazon S3 adapters",
        "phpseclib/phpseclib": "to use the SFTP",
        "doctrine/dbal": "to use the Doctrine DBAL adapter",
        "microsoft/windowsazure": "to use Microsoft Azure Blob Storage adapter",
        "ext-zip": "to use the Zip adapter",
        "ext-apc": "to use the APC adapter",
        "ext-curl": "*",
        "ext-mbstring": "*",
        "ext-mongo": "*",
        "ext-fileinfo": "*"
    },
    ...
}

After what we've discussed above, this is quite a surprise: the library says it has only one actual dependency: a PHP version that is at least 5.3.2. Everything else is either a "dev" requirement or a "suggested" requirement.

Of course, people who have been using Composer and Packagist for some time (like myself) have become quite used to this way of advertising the dependencies of a package. But it is just wrong. As we concluded earlier, ext-mongo is a true dependency of the GridFS class, yet according to the composer.json file it is only a suggested dependency.

This means that if I want to use the class in my project, it is not sufficient to require just the knplabs/gaufrette package. I also have to add ext-mongo as a requirement to my own project. That is semantically wrong: it is not my project that needs the mongo extension, it is the knplabs/gaufrette package that actually needs it. Besides, how do I know which version of ext-mongo I have to choose? Dependencies listed under the suggest key in composer.json don't come with version constraints, so I have to figure them out myself.
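
To make the problem concrete, here is a sketch of what the composer.json of an application that uses the GridFS adapter ends up looking like. The project name acme/my-app is made up, and the knplabs/gaufrette constraint is only an assumption for illustration; the ext-mongo constraint is the one derived earlier, which the application author has to work out alone:

```json
{
    "name": "acme/my-app",
    "description": "Hypothetical application, for illustration only",
    "require": {
        "php": ">=5.3.2",
        "knplabs/gaufrette": "~0.1",
        "ext-mongo": ">=0.9.0"
    }
}
```

The ext-mongo line is the odd one out: it documents a need of the GridFS adapter, not of the application itself.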

Not just this package

knplabs/gaufrette is not the only package out there that advertises actual, required dependencies as "suggested" dependencies. It is a convenient way for package maintainers to put a lot of different classes in a package that may or may not be needed by users. Since using those classes is optional, their dependencies are made optional too. But package maintainers forget that dependencies are never optional. They are always required, since the code would not be executable without them.

The solution

What package maintainers should do is split their packages. In the case of knplabs/gaufrette this means there should be a knplabs/gaufrette package containing all the generic code for filesystem abstraction. Then each specific adapter, like the GridFS class, should live in its own package, e.g. knplabs/gaufrette-mongo-gridfs. This package itself has the following dependencies:

{
    ...,
    "require": {
        "php": ">=5.3",
        "knplabs/gaufrette": "~0.1",
        "ext-mongo": ">=0.9.0"
    }
}

No hidden dependencies there: everything is truly required for using the code in this package.

On the other hand, the knplabs/gaufrette package has almost no dependencies anymore, and the "adapter" packages are listed under the suggest key:

{
    "require": {
        "php": ">=5.3.2"
    },
    "suggest": {
        "knplabs/gaufrette-mongo-gridfs": "For storing files using Mongo GridFS",
        ...
    }
}

This approach has many advantages:

  1. The main package will be very stable. There are almost no reasons for it to change anymore, since all the moving parts are inside the "adapter" packages.

  2. The adapter packages can have different specialists as maintainers. For instance, knplabs/gaufrette-mongo-gridfs can be maintained by someone who knows all about MongoDB.

  3. Users don't have to keep track of available updates for parts of the library they don't use.

  4. Users don't have to manually add extra dependencies to their projects (which means they don't have to worry about version constraints for them).
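
From the user's side, the last point could look like the following sketch. The project name acme/my-app and the version constraint are assumptions for illustration, and knplabs/gaufrette-mongo-gridfs is the proposed package from above, not one that is known to exist:

```json
{
    "name": "acme/my-app",
    "description": "Hypothetical application, for illustration only",
    "require": {
        "knplabs/gaufrette-mongo-gridfs": "~0.1"
    }
}
```

Composer would then pull in knplabs/gaufrette and check for ext-mongo on its own, because both are listed as requirements of the adapter package.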

So keep in mind, next time you are tempted to add a suggested dependency to your package: is it an actual dependency of (part of) the code in this package? Then split the package and reinstate that dependency as a true requirement. If all the code in the package works perfectly well without that suggested dependency, then you are indeed allowed to advertise it as a suggested dependency.
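
By contrast, a legitimate suggestion might look like the sketch below. The package acme/slugger and its behaviour are made up for illustration: assume every class in it checks for the intl extension at runtime and falls back to plain ASCII handling when the extension is missing.

```json
{
    "name": "acme/slugger",
    "description": "Hypothetical package, for illustration only",
    "require": {
        "php": ">=5.3"
    },
    "suggest": {
        "ext-intl": "Improves handling of non-ASCII characters; the package falls back to ASCII-only behaviour without it"
    }
}
```

Here no code path fails when the suggested extension is absent, so listing it under suggest does not hide a real requirement.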

Want to know more?

I'm working on a book about package design principles, "Principles of PHP Package Design", based on the work of Robert Martin. You may register yourself as an "interested reader" and receive a considerable discount when the first part of the book becomes available next week.

You may also want to read some of the articles about package coupling by Paul Jones (Frameworks are good, components are awesome!, Symfony components: sometimes decoupled, sometimes not). He maintains the Aura framework and components and does a great job when it comes to package coupling.

Comments
Adam Balogh

The idea itself is great, but what about the development requirements? Where should I put the require-dev dependencies? I would not bother with those.

l3l0

I agree with you 100%. Basically we should do it for each adapter in Gaufrette and each adapter should have its own maintainer... but currently this can be difficult to achieve :)

Matthias Noback

Thanks for your opinion about the matter (especially since you are one of the core maintainers of Gaufrette!). I understand there can be practical difficulties with this :)

GromNaN

You are absolutely right that, for a user of a library, requiring 1 package for 1 feature must install only/all the required packages.

The general pattern for Composer packages is to have 1 repository = 1 library.
If we want to create a separate package for each sub-requirement (adapter), we must create many new Git repositories and split the code. That would be more work to maintain.

We could find a solution to have many libraries, with different requirements, pointing to the same Git repository.

Matthias Noback

That's an interesting point. Lukas Smith mentioned something like that too: maintaining many repositories gives some overhead.
Well, first of all, I don't share that experience. The overhead of having another repository is just really small, you only need to set up some things once, like a composer.json file, a license file, a readme file (could be really simple), a .gitignore file (always the same), a test bootstrap file, you need to create the repository on GitHub, you need to register the package on Packagist, set up continuous integration on Travis, etc. This is quite a hassle. However, it will take no more than half an hour to do all this. Besides, everything mentioned here could be easily automated. I didn't actually search for such tools, but I'm sure they exist or could easily be created. Maybe I'll do it myself :)
After the initial "slow" start for setting up a new package, things will become really easy for the maintainer(s) of those packages. I sure hope that more people will take this approach.

Lukas

The overhead of creating a new repo is not the problem since this is a one time effort. However there is daily stuff: Coordinating new releases, cross cutting changes (these should ideally be very few when doing decoupling right), constantly filled up travis-ci queue, tickets and PRs scattered across many repos etc. This can quickly add up.

Marc Morera Merino

Well, working with just one repository and splitting all subrepositories with a good post-commit and post-tag hook, this nightmare becomes just a very simple workflow.

I really like what Matthias proposes, but I can easily understand the complexity of maintaining many packages without a splitting process.

Matthias Noback

Thanks for pointing this out Lukas! I can imagine with these big projects such overhead really starts to add up. Do you think there are quick wins when it comes to the major pain points?

Lukas

Hmm not sure about quick wins. We are not using subtree splits but this seems to be a popular option by several projects. Building CLI tools to help is another approach (gush, Fabien has his own thing etc). We also build our own dashboards to aggregate some key bits of info (http://ghag.dantleech.com/#, http://cmf.davidbu.ch) though there are also some generic tools (http://williamdurand.fr/Tra..., https://waffle.io). Paying travis ci to get a dedicated queue is probably a bit out of reach for most.

cordoval

a question also i want to ask, should Gush persist the composer.lock on the repo? why yes or no? in your opinion

Sebastiaan Stok

It depends. For an application (including phar-only) it's best to have your versions locked so that everything works as expected when installing (especially if you're using a dev version).

For a library it does not make sense to keep the lock file, but could be handy to figure out what versions have changed that break your tests (this is something Johannes (schmittjoh) once mentioned).

cordoval

We have done so on Gush now. Thanks @sstok!

cordoval

i think one of the unexploited features of composer is the provide key. I tried to use this in Gush but it just did not work. It does work on cmf but i think you can almost ignore it since it is not relevant for the use of composer. I honestly think it should be removed from composer.

Gush has adapters https://github.com/gushphp/... but i removed the dependencies back because they often cause problems when installing a package. So i guess if someone wants to install Github Adapter one should clone the adapter and it will pull the main Gush package?
See the problem?

Now if we say that we want to install Gush with support for github adapter then we should probably create a deploy or build package that pulls these other two then.

Ultimately I was thinking pulling all the adapters into the Gush repository since the idea is to switch for any kind of support, however that would make Gush a fat beast. It is already a beast :), i mean including Github, enterprise, bitbucket, gitlab, etc, even a Git abstraction.

I have also seen how Behat phar extensions are coupled but more needs to be said about how to approach the clustering of packages and its dependencies when integrating.

Matthias Noback

Good point - it is quite difficult to accomplish what I mention here when the result of building your project should be a .phar file ;) Such a project would definitely benefit from a composer.lock file (but it seems you have settled that question already!). The same goes for Composer itself, which people will upgrade whatever compatibility feature is included (like SVN support, when they only use Git anyway).
But even though people will only install Gush as a .phar file, it would still make sense for you to have separate repositories/packages. It will probably be easier to maintain them separately (different people can maintain the different adapters). You will have one "gush/gush-core" package with files shared by all the adapters, upon which the adapter packages can depend. Then you could create a new "gush/boxed-gush" package, which requires all the available adapter packages and contains the tools needed to build the .phar file. What do you think?

Paul M. Jones

Great post, and glad to see others taking up the banner on this. Thanks for the link-backs; for one more related to package decoupling, may I suggest http://paul-m-jones.com/arc... (“On Decoupling and Dependencies”) ?

Matthias Noback

Thanks Paul. As I was diving deeper in the archive of your blog, I saw that one too. It's certainly ironic: I actually remember reading that article a while ago when I hadn't given so much thought on the subject. At the time I didn't agree with you at all!

Now that I read it again, there are so many interesting things going on in that debate, and it's mainly people-things. We'll be in touch I guess!