Negative architecture, and assumptions about code

Posted by Matthias Noback

In "Negative Architecture", Michael Feathers speaks about certain aspects of software architecture leading to some kind of negative architecture. Feathers mentions the IO Monad from Haskell (functional programming) as an example, but there are parallel examples in object-oriented programming. For example, by using layers and the dependency inversion principle you can "guarantee" that a certain class, in a certain layer (e.g. domain, application) won't do any IO - no talking to a database, no HTTP requests to some remote service, etc.

[...] guarantees form a negative architecture - a set of things that you know can’t happen in various pieces of your system. The potential list is infinite but that doesn’t stop this perspective from being useful. If you’re using a particular persistence technology in one part of your system, it is valuable to be able to say that it is only used in that place and no place else in the system. The same can be true of other technologies. Knowing what something is not able to do reduces the number of pitfalls. It puts architecture on solid ground.

I find that this is a great advantage of making any kind of agreement between team members: about design principles being applied, where to put certain classes, etc. Even making a clear distinction between different kinds of tests can be very useful. It's good to know that if a test is in the "unit test" directory, it won't do any IO. These tests are just some in-memory verifications of specified object behavior. So we should be able to run them in a very short amount of time, with no internet connection, no dependent services up and running, etc.
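To make the layering guarantee a bit more concrete, here's a minimal sketch of dependency inversion in PHP. All names in it (`UserRepository`, `RegisterUser`, `InMemoryUserRepository`) are hypothetical and only serve as an illustration. The application service depends on an interface, so a unit test can hand it an in-memory implementation and never touch a database or the network:

```php
<?php
declare(strict_types=1);

// Domain layer: a plain object and an abstraction, no IO in here.
final class User
{
    private $email;

    public function __construct(string $email)
    {
        $this->email = $email;
    }

    public function email(): string
    {
        return $this->email;
    }
}

interface UserRepository
{
    public function save(User $user): void;

    public function getByEmail(string $email): User;
}

// Application layer: depends on the interface only.
final class RegisterUser
{
    private $users;

    public function __construct(UserRepository $users)
    {
        $this->users = $users;
    }

    public function register(string $email): void
    {
        $this->users->save(new User($email));
    }
}

// Lives next to the unit tests: keeps everything in memory.
final class InMemoryUserRepository implements UserRepository
{
    private $users = [];

    public function save(User $user): void
    {
        $this->users[$user->email()] = $user;
    }

    public function getByEmail(string $email): User
    {
        if (!isset($this->users[$email])) {
            throw new RuntimeException('User not found: ' . $email);
        }

        return $this->users[$email];
    }
}

// The kind of in-memory verification a "unit test" directory would contain
$repository = new InMemoryUserRepository();
(new RegisterUser($repository))->register('info@example.com');
echo $repository->getByEmail('info@example.com')->email(), PHP_EOL;
```

A real implementation of `UserRepository` (talking to an actual database) would live in the infrastructure layer. As long as nothing in the domain or application layer refers to it directly, the "no IO here" guarantee holds.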

A negative architecture is what's "between the lines". You can look at all the code and describe a positive architecture for it, but given certain architecture or design rules, there's also the background; things we can be sure will never happen. Like a User entity sending an email. Or an HTML template doing a database migration. Or an HTTP client wiping out the entire disk drive. And so on and so on; I'm sure you can think of some other funny stuff that could happen (and that would require instant, nightmarish fixing).

It all sounds really absurd, but toned down a little, these examples are actually quite realistic if you work with a legacy project that's in bad shape. After all, these scenarios are all possible; developers could have implemented them if they wanted to. We can never be certain that something is definitely not happening in our system, in particular if we don't have tests for that system. We can only have a strong feeling that something won't be happening.

The legacy project I'm working on these days isn't quite as bad as the ones we're fantasizing about now. But still, every now and then I stumble upon an interesting class, method, statement, which proves that my picture of the project's "negative architecture" isn't accurate enough. This always comes with the realization that "apparently, this can happen". Let's take a look at a few examples.

"An object does nothing meaningful in its constructor"

I've learned over time to "do nothing" in my constructor; just to accept constructor arguments and assign them to properties. Like this:

final class SomeService
{
    private $foo;
    private $bar;

    public function __construct($foo, $bar)
    {
        $this->foo = $foo;
        $this->bar = $bar;
    }
}

Recently, I replaced some static calls to a global translator function with calls to an injected translator service, which I added as a new constructor argument. On the last line of the constructor I assigned the translator service to a new attribute (after all, that's what I've learned to do with legacy code - add new things at the end):

final class SomeService
{
    // ...

    private $translator;

    public function __construct(..., Translator $translator)
    {
        // ...

        $this->translator = $translator;
    }
}

However, it turned out that the method that needed the (now injected) translator was already being called in the constructor:

public function __construct(..., Translator $translator)
{
    // ...
    $this->aMethodThatNeedsTheTranslator(...);

    $this->translator = $translator;
}

So before I assigned the translator to its dedicated attribute, another method already attempted to use it. This resulted in a nasty, fatal runtime error. It made me realize that my assumption was that in this project, a constructor would never do anything. So even though this was a very rare case, it is something to be aware of: apparently I don't have a fully accurate picture of the "negative" architecture of the system.
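Here's a reduced, runnable sketch of this situation. The `Translator` class and the `translatedLabel()` method are hypothetical stand-ins, not the actual project code. The first class reproduces the problem; the second shows one possible fix, assuming we can't (yet) remove the work from the constructor:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the project's translator service.
final class Translator
{
    public function translate(string $key): string
    {
        return strtoupper($key);
    }
}

final class BrokenService
{
    private $translator;
    private $label;

    public function __construct(Translator $translator)
    {
        // The constructor already "does something" ...
        $this->label = $this->translatedLabel();

        // ... so an assignment added at the end comes too late.
        $this->translator = $translator;
    }

    private function translatedLabel(): string
    {
        // $this->translator is still null when the constructor calls this
        return $this->translator->translate('label');
    }
}

final class FixedService
{
    private $translator;
    private $label;

    public function __construct(Translator $translator)
    {
        // Assign all dependencies first, then do the remaining work.
        $this->translator = $translator;
        $this->label = $this->translatedLabel();
    }

    public function label(): string
    {
        return $this->label;
    }

    private function translatedLabel(): string
    {
        return $this->translator->translate('label');
    }
}

try {
    new BrokenService(new Translator());
} catch (Error $error) {
    // Calling a method on null throws an Error in PHP 7 and later
    echo $error->getMessage(), PHP_EOL;
}

echo (new FixedService(new Translator()))->label(), PHP_EOL;
```

Moving the assignment up is only the smallest possible fix. A cleaner design would move the method call out of the constructor altogether.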

By the way, a related rule follows from the general rule that nothing gets done inside a constructor: assigning values to attributes doesn't have to happen in a specific order. I want this to be the case, since I want to have the option of swapping the order when it enhances readability.

"An object doesn't fetch dependencies statically if it gets any of its dependencies injected"

Related to the story about my misunderstanding of constructor arguments in some parts of the project, is the implicit rule I thought would apply to every class in the project: an object will get all its dependencies injected. I would at least like it if every object was designed like that. But I've learned to be sceptical when I encounter a service class with no (or weird) constructor arguments:

class LegacyService
{
    public function __construct()
    {
        // ...
    }
}

Since this is a service, I expect it to do something with the help of some other objects. Apparently, it doesn't get those objects injected, so I assume it fetches them from a global static place (see also my previous article "Road to dependency injection").

When looking at a class and seeing that some of its dependencies get injected, I will conclude that all of its dependencies get injected. I don't expect any of its methods to make a call to the service locator. So this is another example of negative design: we're assuming that something can never happen. I don't need to explain to you how my assumption turned out to be wrong on several occasions...
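A small sketch of what such a surprise can look like. All names here (`ServiceLocator`, `Mailer`, `Logger`, `MixedService`, `InjectedService`) are made up for the example; the service locator in particular is just an assumption about what a typical legacy one looks like:

```php
<?php
declare(strict_types=1);

interface Mailer
{
    public function send(string $to, string $body): void;
}

interface Logger
{
    public function log(string $message): void;
}

// Hypothetical global service locator, as often found in legacy code.
final class ServiceLocator
{
    private static $services = [];

    public static function set(string $id, object $service): void
    {
        self::$services[$id] = $service;
    }

    public static function get(string $id): object
    {
        if (!isset(self::$services[$id])) {
            throw new RuntimeException('Unknown service: ' . $id);
        }

        return self::$services[$id];
    }
}

final class MixedService
{
    private $mailer;

    // The signature suggests that the mailer is the only dependency ...
    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function notify(string $to): void
    {
        // ... but the logger is fetched statically, hidden from callers.
        ServiceLocator::get('logger')->log('Notifying ' . $to);
        $this->mailer->send($to, 'Hello!');
    }
}

final class InjectedService
{
    private $mailer;
    private $logger;

    // Every dependency is visible in the constructor signature.
    public function __construct(Mailer $mailer, Logger $logger)
    {
        $this->mailer = $mailer;
        $this->logger = $logger;
    }

    public function notify(string $to): void
    {
        $this->logger->log('Notifying ' . $to);
        $this->mailer->send($to, 'Hello!');
    }
}

// Simple in-memory implementations, for illustration only.
final class NullMailer implements Mailer
{
    public function send(string $to, string $body): void
    {
    }
}

final class InMemoryLogger implements Logger
{
    public $messages = [];

    public function log(string $message): void
    {
        $this->messages[] = $message;
    }
}

$logger = new InMemoryLogger();
(new InjectedService(new NullMailer(), $logger))->notify('info@example.com');
print_r($logger->messages);

try {
    // Nobody registered a 'logger' service; the constructor gave no hint
    (new MixedService(new NullMailer()))->notify('info@example.com');
} catch (RuntimeException $exception) {
    echo $exception->getMessage(), PHP_EOL;
}
```

With `InjectedService`, the constructor tells the whole story. With `MixedService`, you only find out about the logger by reading every method (or by running into the exception).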

"If a class is not marked final, it has at least one subclass"

Something that has changed for me personally over the last couple of years is my use of the final keyword in front of a class declaration. There are excellent reasons for doing so, which can be summarized as: the option to extend a class should be one of its explicit use cases. A class should be consciously designed for extension; extending it should not be possible by default.

This would be an excellent example of a negative architecture: "you'll find that we never extend classes, except when they are non-final". Nowadays, when I see a "naked" class declaration, I'm assuming it will have at least one subclass. If it doesn't, then even in a legacy project I add final to it, as a first step to protect its design and prevent abuse.

Anyway, to make the distinction clearer between a class with subclasses and a class without any subclasses by design, you should make potential parent classes abstract.
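As a minimal sketch (the `Report` and `SalesReport` classes are invented for this example), the combination of abstract and final leaves no "naked" class declarations behind:

```php
<?php
declare(strict_types=1);

// Explicitly designed as a parent class: abstract, can't be instantiated.
abstract class Report
{
    public function render(): string
    {
        $title = $this->title();

        return $title . PHP_EOL . str_repeat('=', strlen($title));
    }

    abstract protected function title(): string;
}

// Explicitly not designed as a parent class: final.
final class SalesReport extends Report
{
    protected function title(): string
    {
        return 'Sales';
    }
}

echo (new SalesReport())->render(), PHP_EOL;

// PHP rejects both of the following:
// new Report();                                 // an abstract class can't be instantiated
// class MonthlySalesReport extends SalesReport {} // a final class can't be extended
```

Every class now states its intention: it's either meant to be extended, or it's meant to be used as it is.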

"If a property is not marked "private", it's used by at least one subclass"

In the same spirit, I noticed that when I see a protected attribute, I assume that it's needed and actually used by one of the subclasses. Of course, in the real world, it turns out that often there aren't even any subclasses, and these properties are just protected by default. If you can apply this rule to your project, it will be extremely useful. One helpful aspect is that your IDE or static analysis tool will be able to mark certain properties as unused (e.g. only used in the constructor) or as dynamically declared (e.g. not declared, but defined ad hoc).

class LegacyClass
{
    /*
     * Is this property protected on purpose or by-default?
     * The class isn't final, so... I guess it's on purpose.
     */ 
    protected $foo;
}

"An abstract class won't use methods that aren't part of its published interface"

There's a lot that I/we don't expect a class to do. One of these things is: calling a method that hasn't been defined first. The fun thing is, PHP being a dynamic language, it's totally possible to make a call in a parent class to a method that only gets defined in a subclass (or in a trait):

abstract class Foo
{
    public function foo()
    {
        $this->bar();
    }

    // bar() is strictly speaking not known to Foo
}

final class Bar extends Foo
{
    // bar() is only defined in subclasses of Foo

    public function bar()
    {
        // ...
    }
}

// Foo is abstract, so we instantiate the subclass
$bar = new Bar();
// not a problem at all!
$bar->foo();

Since PHP is a dynamic language, and everything only gets resolved at runtime, it's not a problem that bar() is an unknown method to Foo, as long as it's available at runtime. However, this dynamic language trick isn't generally considered good practice, so when reading code, most of us will not even consider it a possibility.

However - you can already see it coming - I introduced a bug while refactoring a piece of code, doing exactly this, while apparently not all subclasses had this particular method. I thought this rule definitely belonged to the negative architecture of the application: no class would use a method that's not part of the published interface.
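One way to turn this assumption into something PHP itself checks is to declare the method as abstract in the parent class. This is a sketch of that option, reusing the `Foo` and `Bar` names from the example above (the return types are added for the illustration):

```php
<?php
declare(strict_types=1);

abstract class Foo
{
    public function foo(): string
    {
        return $this->bar();
    }

    // bar() is now part of the published interface of Foo
    abstract public function bar(): string;
}

final class Bar extends Foo
{
    public function bar(): string
    {
        return 'bar';
    }
}

echo (new Bar())->foo(), PHP_EOL;

// A concrete subclass without bar() is now rejected as soon as it's declared:
// final class Baz extends Foo {}
```

With the abstract declaration in place, a subclass that forgets to implement bar() fails when the class is loaded, instead of when foo() happens to be called on it.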

"If a class has public instance methods, it won't have any public static methods"

In my world, there are two types of classes:

  1. Classes which need to be instantiated and then offer one or more object or instance methods.
  2. Classes which don't need to be or can't be instantiated, but still offer one or more static methods.

I consider mixing these two characteristics bad design. However, you can do it. It's just that nobody does it. Except when they do. Again, I introduced a regression in our code base, because I assumed all the classes would obey my classification. Again, it would be an awesome rule to be able to trust blindly.

Note that an important exception to this rule would be a named constructor, which is by definition a public static method. I find that domain objects can benefit from such a static method. So probably the first type of classes should be described as: "Classes which need to be instantiated (possibly using one or more static, named constructor methods) and then offer one or more object or instance methods." (thanks for the hint, Sylvain Robez-Masson!).
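For completeness, a minimal sketch of such a named constructor. The `EmailAddress` class is a hypothetical domain object; the private constructor is a design choice of this example, making the static method the only way to create an instance:

```php
<?php
declare(strict_types=1);

final class EmailAddress
{
    private $value;

    // Private: instances can only be created through a named constructor
    private function __construct(string $value)
    {
        $this->value = $value;
    }

    // The one allowed kind of public static method on an instantiable class
    public static function fromString(string $value): self
    {
        if (filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Invalid email address: ' . $value);
        }

        return new self($value);
    }

    public function asString(): string
    {
        return $this->value;
    }
}

echo EmailAddress::fromString('info@example.com')->asString(), PHP_EOL;

try {
    EmailAddress::fromString('not an email address');
} catch (InvalidArgumentException $exception) {
    echo $exception->getMessage(), PHP_EOL;
}
```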

Conclusion

I think "negative architecture" is an interesting way to look at applications. Also, enumerating the above examples was a surprising exercise for me. As far as I remember, these rules aren't written down in any of the big programming books. But we still live by them, and create for ourselves an image of what is and is not in our applications; what can or can't happen in them. This is very helpful, because it makes it safer for us to make changes to the code. So it would be a good idea to verify and enforce these rules on a regular basis, when refactoring and when doing code reviews.

Another thought I just had: for most - if not all - of the examples above one could easily write a static analyzer, looking for cases of them in a given code base (or extend existing ones like PHP_CodeSniffer or PHPStan).
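To give an idea of how little is needed for a first version, here's a rough sketch of a check for the "not final means it has a subclass" rule. Strictly speaking it's not a static analyzer: it uses PHP's reflection API at runtime, and it assumes that all classes to be checked are already loaded and passed in as a complete list. The function name and the example classes are made up:

```php
<?php
declare(strict_types=1);

/**
 * Returns the names of concrete, non-final classes in $classNames
 * that none of the other classes in $classNames extend.
 *
 * @param string[] $classNames
 * @return string[]
 */
function findClassesThatCouldBeFinal(array $classNames): array
{
    $candidates = [];

    foreach ($classNames as $className) {
        $class = new ReflectionClass($className);

        // Interfaces, traits, abstract and final classes already state their intent
        if ($class->isInterface() || $class->isTrait() || $class->isAbstract() || $class->isFinal()) {
            continue;
        }

        $hasSubclass = false;
        foreach ($classNames as $otherClassName) {
            if (is_subclass_of($otherClassName, $className)) {
                $hasSubclass = true;
                break;
            }
        }

        if (!$hasSubclass) {
            $candidates[] = $className;
        }
    }

    return $candidates;
}

// Hypothetical classes to run the check against
class Base
{
}

final class Child extends Base
{
}

class Lonely
{
}

abstract class Template
{
}

print_r(findClassesThatCouldBeFinal([
    Base::class,
    Child::class,
    Lonely::class,
    Template::class,
]));
```

In this example only `Lonely` gets reported: it's neither final nor abstract, and nothing extends it. A proper rule for an existing static analysis tool would work on the parsed source code instead of on loaded classes, but the logic would be similar.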
