Packages: the case for clones

Posted by Matthias Noback

Don't reinvent the wheel

There is this ongoing discussion in the PHP community (and I guess in every software-related community) about reinventing wheels. A refreshing angle in this debate came from an article by Phil Sturgeon pointing to the high number of "duplicate" packages available on Packagist. I agree with Phil:

Sometimes these are carbon copies of other packages, but often they are feature-weak versions of established packages.

It doesn't make sense to do the same thing over and over again. I personally try not to make this mistake: if I want to write code that "already exists", at least I don't publish it on Packagist.

However, I recently got into the business of "recreating stuff" myself. I released a set of packages related to commands, events, command buses and event buses called SimpleBus. Of course, I'm not the first one to write this kind of code. But here's the thing (and I'm quoting Phil again):

The golden rule is: If they are different, then awesome!

So in my defense, just a quick list of things that I want to offer with SimpleBus, which are different from existing packages. I wanted to offer:

  • Small packages with very simple features that can be used separately from each other.
  • Bridge packages to unlock some powerful features when combining some of the packages in one application.
  • Separate packages for framework integration (in this case Symfony).
  • Separate packages for integration with persistence libraries (in this case Doctrine).

Existing solutions combined way too many features in one package, which I consider the reason why people like me end up "reinventing the wheel". I don't want to have all that code in my project. And I don't want to keep upgrading a package when none of its changes are relevant to me.

When I created the SimpleBus packages, my hope was that maybe these would become the reusable components for other people's efforts to create full-fledged CQRS/event sourcing applications. My packages are very abstract, simple and are certainly not "fancy" in any way. Still, they offer the flexibility needed to implement anything you need with them.

A lack of good package design

So I admit, I've been rewriting some existing code (still, there are subtle differences, which I'll discuss in another post). But why did I do it? Because existing solutions lack good package design. Now, what is good package design? Let me quickly explain the Package Design Principles (they are much less known than the SOLID principles of class design):

  1. The Release/reuse equivalence principle - You can only reuse code that you release. Release only things that you can reuse.
  2. The Common reuse principle - Release classes together which are reused together. If you use one class in a package, you use (almost) all of the other classes too. In other words: if some classes in a package can be used without using all the other classes, they deserve to be in another package (this is the Interface segregation principle for packages).
  3. The Common closure principle - Packages should have one reason to change. If a package has multiple reasons to change, then split the package (this is the Single responsibility principle for packages).

These first three principles are about package cohesion. They tell you what should be in a package and when it's time to split a package. The second set of principles is about package coupling:

  1. The Acyclic dependencies principle - The dependency graph of packages in a project should have no cycles.
  2. The Stable dependencies principle - Packages should only depend on packages which are more stable than themselves. Stable packages are less likely to change. A stable package has no dependencies, but is only being depended upon, i.e. it's a responsible package. Unstable packages are dependent packages, and no other package depends upon them, i.e. they are very irresponsible.
  3. The Stable abstractions principle - The more stable a package is, the more abstract things it contains (abstract classes and interfaces). The more unstable a package is, the more concrete things it contains (classes).
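
The Acyclic dependencies principle lends itself to an automated check. What follows is only a minimal sketch, written in Python for brevity (nothing in it is PHP-specific). It assumes the dependency graph has already been extracted into a dictionary that maps each package to the packages it requires; the `find_cycle` helper and all package names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical dependency graphs: package -> packages it requires.
ACYCLIC = {
    "app": ["bus-bridge"],
    "bus-bridge": ["command-bus", "event-bus"],
    "command-bus": [],
    "event-bus": [],
}
CYCLIC = {"a": ["b"], "b": ["c"], "c": ["a"]}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of package names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {package: WHITE for package in graph}
    path: List[str] = []

    def visit(package: str) -> Optional[List[str]]:
        colour[package] = GREY
        path.append(package)
        for dependency in graph.get(package, []):
            state = colour.get(dependency, WHITE)
            if state == GREY:
                # Back edge: the dependency is already on the current path.
                return path[path.index(dependency):] + [dependency]
            if state == WHITE:
                cycle = visit(dependency)
                if cycle is not None:
                    return cycle
        path.pop()
        colour[package] = BLACK
        return None

    for package in graph:
        if colour[package] == WHITE:
            cycle = visit(package)
            if cycle is not None:
                return cycle
    return None
```

With these example graphs, `find_cycle(ACYCLIC)` returns `None`, while `find_cycle(CYCLIC)` returns the offending path `["a", "b", "c", "a"]`.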

If all package maintainers followed these package design principles when creating and releasing packages, there would be much more quality in the world of (PHP) packages. In most cases when I did reinvent the wheel, existing packages that provided more or less the same solution had issues with at least two of the above-mentioned design principles.

For example, many packages on Packagist provide both library and framework integration code. This makes them violate the Common reuse principle: if I want to reuse it in an application with a different framework, I'm not using half of the classes in the package. These packages also violate the Stable dependencies principle by depending on an entire framework to actually work. A framework is by definition a very unstable package because it is such a dependent package itself and therefore likely to change.
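
The stability argument can be made concrete with the instability metric usually attributed to Robert C. Martin: I = Ce / (Ca + Ce), where Ce counts outgoing and Ca counts incoming dependencies. The following is a rough sketch in Python (chosen only for brevity), assuming dependencies are counted per package rather than per class. The graph, the package names and both helper functions are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical graph: a library that requires an entire framework.
GRAPH = {
    "my-app": ["mailer-library"],
    "other-app": ["mailer-library"],
    "mailer-library": ["big-framework"],
    "big-framework": ["http-kernel", "routing", "templating"],
    "http-kernel": [],
    "routing": [],
    "templating": [],
}


def instability(graph: Dict[str, List[str]]) -> Dict[str, float]:
    """Compute I = Ce / (Ca + Ce) for every package in the graph."""
    packages = set(graph)
    for dependencies in graph.values():
        packages.update(dependencies)
    outgoing = {p: len(set(graph.get(p, []))) for p in packages}  # Ce
    incoming = {p: 0 for p in packages}  # Ca
    for dependencies in graph.values():
        for dependency in set(dependencies):
            incoming[dependency] += 1
    scores = {}
    for p in packages:
        total = incoming[p] + outgoing[p]
        # An isolated package is treated as stable (assumption of this sketch).
        scores[p] = outgoing[p] / total if total else 0.0
    return scores


def sdp_violations(graph: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """List (package, dependency) pairs where the dependency is less stable."""
    scores = instability(graph)
    return sorted(
        (package, dependency)
        for package, dependencies in graph.items()
        for dependency in dependencies
        if scores[dependency] > scores[package]
    )
```

In this made-up graph, `big-framework` scores 0.75 and `mailer-library` scores about 0.33, so `sdp_violations(GRAPH)` reports the single pair `("mailer-library", "big-framework")`: the library depends on something less stable than itself.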

Another example: many libraries on Packagist provide some abstract code and a set of concrete classes for multiple persistence libraries (Doctrine ORM, MongoDB, Propel, etc.). These packages violate the Stable dependencies principle because they depend on so many other packages, which are themselves quite unstable. But they also violate the Stable abstractions principle by containing both concrete and abstract things. And, just like the framework-specific packages they violate the Common reuse principle because if I use Doctrine ORM, then I don't use all the MongoDB and Propel-related classes.
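
The relation between stability and abstractness can also be expressed in numbers. Martin's accompanying metrics are abstractness, A = abstract types / all types, and the distance from the so-called main sequence, D = |A + I - 1|, where I is the instability of the package (0 for a fully stable package, 1 for a fully unstable one). The sketch below is in Python for brevity and uses entirely hypothetical packages and counts; it only illustrates how splitting a mixed package moves the parts closer to D = 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageStats:
    name: str
    abstract_types: int  # interfaces and abstract classes
    concrete_types: int  # instantiable classes
    instability: float   # I, between 0 (stable) and 1 (unstable)

    @property
    def abstractness(self) -> float:
        """A = abstract types / all types (0.0 for an empty package)."""
        total = self.abstract_types + self.concrete_types
        return self.abstract_types / total if total else 0.0

    @property
    def distance(self) -> float:
        """D = |A + I - 1|, the distance from the main sequence."""
        return abs(self.abstractness + self.instability - 1)


# Hypothetical numbers: one mixed package versus the same code split up.
ALL_IN_ONE = PackageStats("all-in-one", 4, 12, 0.5)
CORE = PackageStats("core", 4, 0, 0.0)
ORM_ADAPTER = PackageStats("orm-adapter", 0, 4, 1.0)

for stats in (ALL_IN_ONE, CORE, ORM_ADAPTER):
    print(stats.name, stats.abstractness, stats.distance)
```

Under these assumed numbers the mixed package sits at a distance of 0.25, while both the purely abstract, stable core and the purely concrete, unstable adapter land exactly on the main sequence.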

Package consumer intuition

I think that even though the package design principles are not well known yet, many of you already knew what I was talking about. And I think the lack of good package design is one of the reasons why people create their own packages instead of using existing ones. There's always something "wrong" about other people's packages and unfortunately, not even part of such a package can be reused, because everything is so tied together.

If everybody would at least publish only small packages, each implementing one fairly small feature or just providing some very generic classes and interfaces, there would be far less reinventing of the wheel.

A connection with PHP-FIG efforts

There is an interesting connection between the world-wide struggle for truly reusable PHP packages and the same kind of struggle for establishing a collection of absolutely generic, always reusable interfaces which several PSR proposals are trying to accomplish. I think one of the main goals here is interoperability - when you use the PSR interfaces, you should be able to switch between logger, cache or event dispatcher libraries, without thinking twice and without the need to adapt any of your existing code.

Establishing these interfaces has, however, proven to be a long and difficult process. The reason for this has been explained very well by Anthony Ferrara in his "An Open Letter To PHP-FIG":

They are trying to create a reasonably generic solution (99% solution). So they need to handle a HUGE range of needs and requirements.

It's very hard to establish the definitive shape of any kind of interface, since every project has its own specific characteristics. It is just as difficult to find an all-encompassing solution for command and event buses, because you either have to provide all possible alternatives or be very opinionated. If you choose the first strategy, your package will violate the Common reuse principle again, because nobody needs all of the stuff your package provides. If you choose the second strategy, your package will not be reused often, since people are likely to disagree with opinionated code. So the third strategy, which I'd say is the winning one, is to offer only very unopinionated code for reuse: very generic, very simple code.

I'd like to quote Anthony again on a very general recommendation which he directed to PHP-FIG, while I would in fact direct it to anyone who creates reusable PHP packages:

Please stop trying to solve generic problems. Solve the 50% problem, not the 99% problem.

If everybody does this, I'm sure we will be able to do better at preventing duplicate packages and duplicate coding efforts. Until then, I'm not surprised by the number of package clones out there.

P.S. Also, creating all this stuff ourselves can be fun too, right? ;)

Comments
Eugene Leonovich

I have a doubt about the Common reuse principle. In theory, it sounds good and I totally agree with it, but in practice, this can lead to a lot of small packages which can become a support nightmare. For example, I wrote a library (https://github.com/rybakit/...), which has 8 backend drivers at the moment. To follow the principle I have to split the package into 10 packages, 9 of which will only have one file. And so I wonder if it's worth it. Thanks.

Matthias Noback

Hi Eugene, thanks for bringing this up. There is a golden middle of course. You cannot fully apply this principle. But combined with others, indeed, the best option is to move these adapter classes out of the core package. Some of the reasons:

- You now use suggested dependencies instead of required dependencies which brings all sorts of trouble and doesn't reflect the true dependencies of your package.

- Your users have to keep track of updates of this package even if changes are irrelevant to them (because they are related to an adapter which they don't use).

- Even though the core of your package is quite stable, the adapters make it unstable, because of their concreteness and their dependency on something external. This means things may easily change/break.

I also acknowledge the amount of extra work to maintain a bunch of packages instead of just one. However, as I noticed myself, this burden isn't that heavy. It requires some initial setup work, which may be overcome when the tools become better/more integrated. Afterwards, you'll notice that some packages almost never need maintenance, because they are very stable, and other packages only change because of changes in external dependencies. Also, splitting the packages allows them to have a different lifecycle (e.g. different versions), depending on how they evolve. And maybe you can find other people who would like to maintain just one adapter library, so you will have one more thing that you don't have to worry about.

By the way, I think my new book might interest you: https://leanpub.com/princip... It's full of these kinds of discussions.

Eugene Leonovich

Hi Matthias, thank you for such a detailed answer. It's much appreciated.

Matthias Noback

My pleasure!

Hari K T

@matthiasnoback I was thinking maybe Packagist should provide a different place to post framework-specific packages.

E.g., I don't know whether a Symfony bundle is of any use in Zend Framework. So bundles could probably be moved to a symfony.packagist, specific to Symfony as a framework. I am not saying all the Symfony components should go there. Likewise, Laravel packages would go to a Laravel-specific place, Zend to Zend, Aura to Aura, etc.

I am not sure how others feel about this, though. It would make it much easier to find the right package.

Matthias Noback

Yes, that might make sense. Laravel has this already. And there is knpbundles.com for Symfony bundles. However, I think we would win a lot more if people actually offered both libraries and bundles and used the Composer suggest key to suggest framework adapters.

Hari K T

My idea behind keeping a separate place is to make it easier to find useful PHP components, rather than seeing a lot of framework-bound components that have nothing to do with what you are looking for.

philsturgeon

These framework-specific package directories scare me a bit. Lots of Laravel developers only look at http://packalyst.com/, and when they can't find a package that does what they want they build it themselves and post it up there, furthering the framework-specific cycle.

Instead of asking people to hide their framework specific code away from others, we should probably keep trying to educate people about the benefits of framework agnostic code.

Hari K T

I don't disagree with you Phil. Maybe Composer could have a different tag to find the best reusable components, by counting the dependencies or somehow?

Matthias Noback

Then again, the number of dependencies is not sufficient for this kind of quality assessment. The ideal situation is I think: only library packages are listed on Packagist. Then each of the search results can list a subset of framework-specific integration packages (or other packages from the "suggest" key in their composer.json file). This however, is ideal and will never be real ;)

philsturgeon

It does not seem like the job of Packagist to pass judgement on the quality of the code, it is just a directory service.

Another service is in production by a third party which DOES intend to pass judgement on code quality and various metrics, which seems like a better way to go about it. :)

Hari K T

I am not saying Packagist's job is to judge code quality, but if it can improve the search, that is not bad ;)

devosc

When a new package is added and it has a dependency on another package, it increases the Favourites count for that dependency. Also knowing how many other packages depend on that package can be a better indicator and should factor into it.

devosc

Knowing how many other packages are dependent on that package might help?

devosc

The code looks good, however I think it should be implemented using Generators.

Matthias Noback

I think your comment belongs to a different post, which is it?

devosc

I get confused sometimes... It was to do with the handlers. It looks like it would be better to take advantage of 5.5's Generators and not have to create the entire array in memory first.

Matthias Noback

Ah, I get it, yes. However, I wanted to keep this PHP 5.4. Even though it's just a small step to 5.5 and that makes it even more "cool" ;) Maybe later...

devosc

Being able to traverse the object is what I would be looking for (or thinking of) the most.