Last week I wrote about when to add an interface to a class. The article finishes with the claim that classes from the application's domain don't usually need an interface. The reason is that domain code isn't going to be swapped out with something else. This code is the result of careful modelling work that's done based on the business domain that you work with. And even if you work on, say, two financial software projects in a row, you'll find that the models you produce for each of them will be different in many subtle (if not radical) ways. Paradoxically, you'll find that in practice a domain model can sometimes be reused after all. There are some great examples out there. In this article I explain different scenarios of where and how reuse could work.
In "Facts and Fallacies of Software Engineering" (2002), Robert Glass speaks about reuse-in-the-small:
Reuse-in-the-small (libraries of subroutines) began nearly 50 years ago and is a well-solved problem.
Reuse in software is quite possible, but only if the amount of shared code is relatively small. Examples of components with the right size in the PHP ecosystem would be:
- Flysystem (filesystem abstraction)
- ProxyManager (proxy generation)
- JMSSerializer (object serialization)
- Symfony Validator (validation)
And so on. The idea is: if the library is basically a utility function that "got a bit out of hand", but is flexible at the same time, supporting many different use cases, then we can speak of successful reuse. This works in particular because PHP package maintainers tend to set a very high standard for themselves: every quality indicator should be green, 100%, etc. And so it can happen that these packages have millions and millions of downloads.
By the way, I also count frameworks as successful reuse-in-the-small: most of them are a collection of utility-like libraries anyway, and they rarely get in the way in terms of flexibility, at least in my experience; you can build any web application on top of any of them.
Reuse-in-the-large and software diversity
If a reusable component is too large, many aspects of it will be irrelevant, or even counter-productive, for its users. This leads to objections like:
- If I use this component in my project, I'll be downloading way too much stuff that I'll never actually need.
- Considerable parts of this component don't work as I want them to, so maybe I should roll my own.
- This component is good today, since we're in the prototype phase, but probably in about a year, it will limit us.
In Glass's terms, we're talking about reuse-in-the-large, and he posits that:
Reuse-in-the-large (components) remains a mostly unsolved problem, even though everyone agrees it is important and desirable.
Software projects in general are very diverse. Every project comes with its own requirements, its own domain experts, its own team, and everything about it is special (although some product owners would be better off not thinking that their project was so very special). Still, one might say, there should be some common ground. Some domain knowledge will be potentially reusable, like a Money class, or a DateTime class, right?
If there are enough common problems across projects and even application domains, then component-based approaches will eventually prevail. If, as many suspect, the diversity of applications and domains means that no two problems are very similar to one another, then only those common housekeeping functions and tasks are likely to be generalized, and they constitute only a small percentage of a typical program’s code.
Over the past few years we've had some excellent articles making the rounds, which prove the point that "no two problems are the same". I'd like to mention Ross Tuck's "Precision Through Imprecision: Improving Time Objects" and Mathias Verraes's "Type Safety and Money". In these articles, we learn by example how design decisions are based on domain knowledge, that different situations require different decisions, and that designs resulting from those decisions can't be useful in every other situation. Trying to use things like a Money object in situations where they just don't belong is very much like the saying: "trying to fit a round peg in a square hole".
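To make this concrete, here is a minimal sketch of how two contexts could end up with incompatible models of "money". It is not taken from either of the articles mentioned above; the class names InvoiceMoney and UnitPrice, the rounding rule, and the two contexts (invoicing and per-unit pricing) are assumptions made up for illustration, and the sketch is in Python rather than PHP.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class InvoiceMoney:
    """Money as a hypothetical invoicing context might model it:
    whole cents in a single currency, rounded immediately."""

    cents: int
    currency: str

    def add(self, other: "InvoiceMoney") -> "InvoiceMoney":
        if self.currency != other.currency:
            raise ValueError("cannot add amounts in different currencies")
        return InvoiceMoney(self.cents + other.cents, self.currency)

    def times(self, factor: Decimal) -> "InvoiceMoney":
        # Domain decision (assumed): round half up to whole cents right away.
        exact = Decimal(self.cents) * factor
        rounded = exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
        return InvoiceMoney(int(rounded), self.currency)


@dataclass(frozen=True)
class UnitPrice:
    """Money as a hypothetical pricing context might model it:
    fractions of a cent matter, so rounding is postponed."""

    amount: Decimal
    currency: str

    def times(self, quantity: Decimal) -> "UnitPrice":
        # Domain decision (assumed): keep full precision, never round here.
        return UnitPrice(self.amount * quantity, self.currency)


invoice_line = InvoiceMoney(100, "EUR").times(Decimal("1.005"))
price_for_three = UnitPrice(Decimal("1.0049"), "EUR").times(Decimal("3"))
```

Both classes are reasonable in their own context, but neither can stand in for the other: the invoicing model throws away precision the pricing model needs, and the pricing model allows amounts that can't be printed on an invoice.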
In particular, reusing domain code will make us feel like we have to ignore or work around certain aspects of it. Which is why, instead of reusing this code from some shared location, we might be better off copying it. By copying the code and using it in a different situation, we can find out if the code really has potential for reuse. And it gets even better: it'll be easier to find the right abstractions, since we'll be able to clearly see what's essential to the thing we're trying to reuse, once we see how it may serve other use cases.
We won't have any of these advantages if we aim to reuse the code from the outset. Which is why Glass gives us one of his "rules-of-three":
[...] a reusable component should be tried out in three different applications before it will be sufficiently general to accept into a reuse library.
Reuse-in-the-even-larger: reusing entire subdomains
It's funny that, while reuse-in-the-large is deemed an unsolved problem, today we see several larger reusable software projects, which are (according to Glass, against all odds) very successful. We see large reusable components for e-commerce software, like Sylius, Spryker, Spark, etc. These are not just oversized utility functions; they provide complete solutions for running commercial internet-based businesses, and offer features like inventory management, payments and invoicing, and so on. There are bound to be a lot of domain decisions in that code. This contradicts the reuse-in-the-large problem of software diversity. Even though no two problems/projects are the same, these components still aim to solve many problems at once, and given their popularity, we should conclude that these reusable components are indeed examples of successful reuse-in-the-large.
How to explain this?
In part, I think, because most of us don't want to spend time building the equivalent components from the ground up. As it says on the Spark homepage:
Spark is a Laravel package that provides scaffolding for all of the stuff you don't want to code.
Another reason might be that these components are still relatively small and focused - in most cases there's the option to write your own component to replace the third-party ones. Limiting the scope of a package to a small part of the domain therefore contributes to its success. At that point you can make a conscious decision: which part of the domain is special in your case, which part requires more careful modelling, how can your application make a difference?
Eric Evans has written extensively about how and when reusing a domain model could work, in his book "Domain-Driven Design" (2003). He explains how you can divide the overall domain of an application or software system into several subdomains. He calls this "strategic distillation", because the aim is to find out what your core domain is, separating it from the other subdomains, including so-called "generic" ones. Evans summarizes this as follows:
Identify cohesive subdomains that are not the motivation for your project. Factor out generic models of these subdomains and place them in separate MODULES. Leave no trace of your specialties in them. Once they have been separated, give their continuing development lower priority than the CORE DOMAIN, and avoid assigning your core developers to the tasks (because they will gain little domain knowledge from them). Also consider off-the-shelf solutions or published models for these GENERIC SUBDOMAINS.
For application developers this is useful advice, because it helps you focus your development effort on areas where your application can stand out amongst many others. At the same time, it will help you decide for which parts of your application you should rather use an existing library or external service, also known as an "off-the-shelf solution".
It's also useful advice for package developers: application developers looking for off-the-shelf solutions are the potential users of the packages that you publish. So whenever you consider extracting part of your application into a reusable package, consider if it can be used by others to help them get to their core domain quicker.
Once more, for application developers this is crucial advice: when using these third-party solutions in your own applications, consider once again if you've done your strategic distillation right. Can you be certain that the off-the-shelf solution won't get in your way when you're continuously improving the code for your core domain?
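One way to keep an off-the-shelf solution from getting in your way is to let the core domain depend on a small interface that your application owns, and to put the third-party component behind an adapter. The following is only a sketch of that idea: ThirdPartyBillingClient and its create_document method are a made-up stand-in, not the API of any real package, and all other names are hypothetical as well.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Invoice:
    customer_id: str
    total_cents: int


class InvoiceSender(Protocol):
    """Interface owned by the application; the core domain only knows this."""

    def send(self, invoice: Invoice) -> str: ...


class ThirdPartyBillingClient:
    """Stand-in for an off-the-shelf billing package (hypothetical API)."""

    def create_document(self, payload: dict) -> dict:
        return {"id": f"doc-{payload['customer']}", "status": "queued"}


class ThirdPartyInvoiceSender:
    """Adapter: translates our Invoice into the vendor's payload format."""

    def __init__(self, client: ThirdPartyBillingClient) -> None:
        self._client = client

    def send(self, invoice: Invoice) -> str:
        response = self._client.create_document(
            {"customer": invoice.customer_id, "amount": invoice.total_cents}
        )
        return response["id"]


class InMemoryInvoiceSender:
    """Home-grown replacement, e.g. for tests or after outgrowing the vendor."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, invoice: Invoice) -> str:
        self.sent.append(invoice)
        return f"local-{len(self.sent)}"


def close_order(customer_id: str, total_cents: int, sender: InvoiceSender) -> str:
    """Use case in the core domain; it does not know which sender it was given."""
    return sender.send(Invoice(customer_id, total_cents))


vendor_reference = close_order(
    "c42", 1999, ThirdPartyInvoiceSender(ThirdPartyBillingClient())
)
```

Here the interface sits around a generic subdomain, not around the domain model itself. If the vendor's component stops fitting, only the adapter has to be replaced or rewritten; close_order and the Invoice class stay as they are.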
Reuse-in-the-small is definitely possible. Reuse-in-the-large is deemed to be a mostly unsolved problem, because no two problems/projects are alike, but practice proves otherwise. There are reusable components covering entire subdomains, which are nonetheless quite successful. The chance of success is bigger if such a reusable component is used to cover a generic subdomain. Using an off-the-shelf solution in such a case helps you save development effort which can instead be redirected to the core domain. Another requirement is that the component is flexible enough to let you replace the parts that don't fit well with your requirements, and/or that the component is small enough to be replaced or rewritten in its entirety.
Here's something interesting to watch: a talk by Eric Evans called "Exploring Time". In it, he discusses date/time calculations in terms of instants and intervals. This could be considered a generic subdomain that's part of almost every larger domain. Pointing out that existing date/time handling APIs don't make a lot of sense, he comes up with better models. One big conclusion for me is that modelling efforts in generic subdomains will improve the models in many ways. Even though we don't know all the situations in which a generic model will be used, we can still do a better job by thinking extensively about the domain and researching its aspects.
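To give an impression of what modelling time as points in time and intervals could look like, here is a small sketch. It is not the model presented in the talk; the Interval class, the day_of helper, and the choice for half-open intervals in UTC are my own assumptions for the sake of illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Interval:
    """Half-open interval [start, end): includes start, excludes end."""

    start: datetime
    end: datetime

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError("end must not be before start")

    def contains(self, instant: datetime) -> bool:
        return self.start <= instant < self.end

    def overlaps(self, other: "Interval") -> bool:
        return self.start < other.end and other.start < self.end

    def duration(self) -> timedelta:
        return self.end - self.start


def day_of(year: int, month: int, day: int) -> Interval:
    """A calendar day in UTC, modelled as an interval instead of a point."""
    start = datetime(year, month, day, tzinfo=timezone.utc)
    return Interval(start, start + timedelta(days=1))


march_1 = day_of(2024, 3, 1)
march_2 = day_of(2024, 3, 2)
```

Because the end of an interval is excluded, two consecutive days share a boundary without overlapping, and the first moment of March 2 belongs to March 2 only. Whether that is the right decision depends, once again, on the domain.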
Very good article.
What I saw about reuse-in-the-large: yes, it can answer your needs at the beginning, but the more the company grows, the less this kind of tool will fit.
A concrete example: I worked with Magento for an e-commerce business. After 3 years, they had to rewrite everything with Symfony. Why?
Simply because the domain got more specific and didn't fit in the framework anymore. It was difficult to use Magento's architecture for more and more data (EAV...), and impossible to replace some components of the framework.
To me, tools which provide reuse-in-the-large are attractive for a lot of companies, and that's why they are successful:
- Companies in early phases need to ship their product and start bringing in money very quickly, whatever the technical cost later. "If we don't do it now, there won't be any later anyway".
- A lot of companies ignore that reusing everything can lead to rewriting everything. A towel can be used multiple times, so code can be as well. It's tempting to think that way and I saw it everywhere, coupled with a bad understanding of DRY.
A small clarification: I've worked mostly in startups and small web agencies.
Thanks for sharing your story here! I think your assessment about startup phases is correct. And I don't think it's a big problem, but development teams should be aware of the trade-off and realize that they may need to put in a lot more time to take ownership of the core components.