Packages: the case for clones
Matthias Noback
Don’t reinvent the wheel
There is this ongoing discussion in the PHP community (and I guess in every software-related community) about reinventing wheels. A refreshing angle in this debate came from an article by Phil Sturgeon pointing to the high number of “duplicate” packages available on Packagist. I agree with Phil:
Sometimes these are carbon copies of other packages, but often they are feature-weak versions of established packages.
It doesn’t make sense to do the same thing over and over again. I personally try not to make this mistake: if I want to write code that “already exists”, at least I don’t publish it on Packagist.
However, I recently got into the business of “recreating stuff” myself. I released a set of packages related to commands, events, command buses and event buses called SimpleBus. Of course, I’m not the first one to write this kind of code. But here’s the thing (and I’m quoting Phil again):
The golden rule is: If they are different, then awesome!
So in my defense, just a quick list of things that I want to offer with SimpleBus, which are different from existing packages. I wanted to offer:
- Small packages with very simple features that can be used separately from each other.
- Bridge packages to unlock some powerful features when combining some of the packages in one application.
- Separate packages for framework integration (in this case Symfony).
- Separate packages for integration with persistence libraries (in this case Doctrine).
Existing solutions combine way too many features in one package, which I consider the reason people like me end up “reinventing the wheel”. I don’t want to have all that code in my project. And I don’t want to keep upgrading a package when none of its changes are relevant to me.
When I created the SimpleBus packages, my hope was that they might become the reusable components for other people’s efforts to create full-fledged CQRS/event sourcing applications. My packages are very abstract and simple, and are certainly not “fancy” in any way. Still, they offer the flexibility needed to implement anything you need with them.
A lack of good package design
So I admit, I’ve been rewriting some existing code (there are still subtle differences, which I’ll cover in another post). But why did I do it? Because existing solutions lack good package design. Now, what is good package design? Let me quickly explain the Package Design Principles (they are much less well known than the SOLID principles of class design):
- The Release/reuse equivalence principle - You can only reuse code that you release. Release only things that you can reuse.
- The Common reuse principle - Release classes together which are reused together. If you use one class in a package, you use (almost) all of the other classes too. In other words: if some classes in a package can be used without using all the other classes, they deserve to be in another package (this is the Interface segregation principle for packages).
- The Common closure principle - Packages should have one reason to change. If a package has multiple reasons to change, then split the package (this is the Single responsibility principle for packages).
These first three principles are about package cohesion. They tell you what should be in a package and when it’s time to split a package. The second set of principles is about package coupling:
- The Acyclic dependencies principle - The dependency graph of packages in a project should have no cycles.
- The Stable dependencies principle - A package should only depend on packages which are more stable than itself. Stable packages are less likely to change. A maximally stable package has no dependencies and is only depended upon, i.e. it’s a responsible package. A maximally unstable package depends on other packages while no other package depends on it, i.e. it’s a very irresponsible package.
- The Stable abstractions principle - The more stable a package is, the more abstract things it contains (abstract classes and interfaces). The more unstable a package is, the more concrete things it contains (classes).
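The two coupling principles that concern the dependency graph can be checked mechanically. The following is a minimal sketch, not a real tool: the package names are hypothetical, the code is written in Python only for brevity, and the instability score uses the common metric I = Ce / (Ca + Ce), where Ce is the number of packages a package depends on and Ca is the number of packages depending on it. Treating a package without any connections as fully stable is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical packages: each key requires the packages listed as its value.
DEPENDENCIES = {
    "app": ["bus"],
    "bus": ["framework"],
    "framework": ["http", "routing", "container"],
    "http": [],
    "routing": [],
    "container": [],
}


def find_cycle(deps):
    """Return one dependency cycle as a list of package names, or None."""
    white, grey, black = 0, 1, 2
    colour = defaultdict(int)  # every package starts out white (unvisited)
    path = []

    def visit(package):
        colour[package] = grey
        path.append(package)
        for dep in deps.get(package, []):
            if colour[dep] == grey:
                # dep is already on the current path, so we closed a cycle
                return path[path.index(dep):] + [dep]
            if colour[dep] == white:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        colour[package] = black
        return None

    for package in deps:
        if colour[package] == white:
            cycle = visit(package)
            if cycle:
                return cycle
    return None


def instability(deps):
    """I = Ce / (Ca + Ce): 0.0 is maximally stable, 1.0 maximally unstable."""
    packages = set(deps) | {d for required in deps.values() for d in required}
    incoming = defaultdict(int)
    for required in deps.values():
        for dep in set(required):
            incoming[dep] += 1
    scores = {}
    for package in packages:
        outgoing = len(set(deps.get(package, [])))
        total = incoming[package] + outgoing
        # Assumption: a package with no connections at all counts as stable.
        scores[package] = outgoing / total if total else 0.0
    return scores


def stable_dependency_violations(deps):
    """List (package, dependency) pairs where the dependency is less stable."""
    scores = instability(deps)
    return [
        (package, dep)
        for package, required in deps.items()
        for dep in required
        if scores[dep] > scores[package]
    ]


print(find_cycle(DEPENDENCIES))
print(stable_dependency_violations(DEPENDENCIES))
```

In this made-up graph there is no cycle, but the library package "bus" depends on "framework", which has more outgoing dependencies relative to its dependents and therefore scores as less stable than the package that relies on it.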
If all package maintainers followed these package design principles when creating and releasing packages, the overall quality of (PHP) packages would be much higher. In most cases where I did reinvent the wheel, the existing packages that provided more or less the same solution violated at least two of the above-mentioned design principles.
For example, many packages on Packagist provide both library and framework integration code. This makes them violate the Common reuse principle: if I want to reuse such a package in an application with a different framework, I’m not using half of the classes in the package. These packages also violate the Stable dependencies principle by depending on an entire framework to actually work. A framework is by definition a very unstable package because it is itself such a dependent package and is therefore likely to change.
Another example: many libraries on Packagist provide some abstract code and a set of concrete classes for multiple persistence libraries (Doctrine ORM, MongoDB, Propel, etc.). These packages violate the Stable dependencies principle because they depend on so many other packages, which are themselves quite unstable. But they also violate the Stable abstractions principle by containing both concrete and abstract things. And, just like the framework-specific packages, they violate the Common reuse principle because if I use Doctrine ORM, then I don’t use all the MongoDB and Propel-related classes.
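One way out is to keep the abstraction in a small core package and to ship each concrete implementation in a package of its own. The sketch below illustrates that split in a single file, again in Python for brevity. The names EventStore, InMemoryEventStore and record_order_placed are hypothetical and only stand in for “an interface”, “an adapter for one persistence library” and “application code”.

```python
from abc import ABC, abstractmethod


# --- would live in a small, abstract "core" package (hypothetical) ---
class EventStore(ABC):
    """The only thing consumers and adapters need to agree on."""

    @abstractmethod
    def append(self, stream_id, event):
        """Add one event to the given stream."""

    @abstractmethod
    def load(self, stream_id):
        """Return all events of the given stream, oldest first."""


# --- would live in its own adapter package, one per persistence library ---
class InMemoryEventStore(EventStore):
    """Stand-in for a concrete adapter, e.g. one built on top of an ORM."""

    def __init__(self):
        self._streams = {}

    def append(self, stream_id, event):
        self._streams.setdefault(stream_id, []).append(event)

    def load(self, stream_id):
        return list(self._streams.get(stream_id, []))


# --- application code depends on the abstraction only ---
def record_order_placed(store, order_id):
    store.append(order_id, {"type": "OrderPlaced", "order_id": order_id})


store = InMemoryEventStore()
record_order_placed(store, "42")
print(store.load("42"))
```

With this layout the core package contains only abstract things and has no dependencies, while somebody who uses one persistence library installs exactly one adapter package and none of the others.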
Package consumer intuition
I think that even though the package design principles are not well known yet, many of you already knew what I was talking about. And I think the lack of good package design is one of the reasons why people create their own packages instead of using existing ones. There’s always something “wrong” about other people’s packages and unfortunately, not even part of such a package can be reused, because everything is so tied together.
If everybody at least published only small packages, each implementing one reasonably small feature or just providing some very generic classes and interfaces, there would be far less need for people to reinvent the wheel.
A connection with PHP-FIG efforts
There is an interesting connection between the world-wide struggle for truly reusable PHP packages and the same kind of struggle for establishing a collection of absolutely generic, always reusable interfaces which several PSR proposals are trying to accomplish. I think one of the main goals here is interoperability - when you use the PSR interfaces, you should be able to switch between logger, cache or event dispatcher libraries, without thinking twice and without the need to adapt any of your existing code.
Establishing these interfaces has, however, proven to be a long and difficult process. The reason for this has been explained very well by Anthony Ferrara in his “An Open Letter To PHP-FIG”:
They are trying to create a reasonably generic solution (99% solution). So they need to handle a HUGE range of needs and requirements.
It’s very hard to establish the definitive shape of any kind of interface, since every project has its own specific characteristics. It is just as difficult to find an all-encompassing solution for command and event buses, because you either have to provide all possible alternatives, or be very opinionated. If you choose the first strategy, your package will violate the Common reuse principle again, because nobody needs all of the stuff your package provides. If you choose the second strategy, your package will not be reused often, since people are likely to disagree with opinionated stuff. So the third strategy, which I’d say is the winning strategy, is to offer only very unopinionated code for reuse. Very generic, very simple code.
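To give an idea of how little code “very generic, very simple” can mean, here is a minimal sketch of a message bus that does nothing but pass a message through a chain of middlewares. It is an illustration only and not the actual SimpleBus API; the class and function names are hypothetical, and it is written in Python for brevity.

```python
class MessageBus:
    """Passes a message through a chain of middlewares, in order."""

    def __init__(self, middlewares):
        self._middlewares = list(middlewares)

    def handle(self, message):
        def call(index, current):
            # Stop quietly once the end of the chain has been reached.
            if index < len(self._middlewares):
                middleware = self._middlewares[index]
                middleware(current, lambda m: call(index + 1, m))

        call(0, message)


log = []


def logging_middleware(message, next_):
    log.append(f"handling {message!r}")
    next_(message)


def handler_middleware(message, next_):
    # A real application would look up and call a handler for the message here.
    log.append(f"handled {message!r}")
    next_(message)


bus = MessageBus([logging_middleware, handler_middleware])
bus.handle("RegisterUser")
print(log)
```

The bus itself has no opinion about what a message is, how handlers are found, or whether anything is logged. All of that lives in middlewares, which could be shipped as separate packages.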
I’d like to quote Anthony again on a very general recommendation which he directed to PHP-FIG, while I would in fact direct it to anyone who creates reusable PHP packages:
Please stop trying to solve generic problems. Solve the 50% problem, not the 99% problem.
If everybody does this, I’m sure we will be able to do better at preventing duplicate packages and duplicate coding efforts. Until then, I’m not surprised by the number of package clones out there.
P.S. Also, creating all this stuff ourselves can be fun too, right? ;)