When to add an interface to a class

Posted on by Matthias Noback

I'm currently revising my book "Principles of Package Design". It covers lots of design principles, like the SOLID principles and the lesser known Package (or Component) Design Principles. When discussing these principles in the book, I regularly encourage the reader to add more interfaces to their classes, to make the overall design of the package or application more flexible. However, not every class needs an interface, and not every interface makes sense. I thought it would be useful to enumerate some good reasons for adding an interface to a class. At the end of this post I'll make sure to mention a few good reasons for not adding an interface too.

If not all public methods are meant to be used by regular clients

A class always has an implicit interface, consisting of all its public methods. This is how the class will be known to other classes that use it. An implicit interface can easily be turned into an explicit one by collecting all those public methods (except for the constructor, which should not be considered a regular method), stripping the method bodies and copying the remaining method signatures into an interface file.

// The original class with only an implicit interface:

final class EntityManager
{
    public function persist($object): void
    {
        // ...
    }

    public function flush($object = null): void
    {
        // ...
    }

    public function getConnection(): Connection
    {
        // ...
    }

    public function getCache(): Cache
    {
        // ...
    }

    // and so on
}

// The extracted - explicit - interface:

interface EntityManager
{
    public function persist($object): void;

    public function flush($object = null): void;

    public function getConnection(): Connection;

    public function getCache(): Cache;

    // ...
}

However, regular clients of EntityManager won't need access to the internally used Connection or Cache object which can be retrieved by calling getConnection() or getCache() respectively. You could even say that the implicit interface of the EntityManager class unnecessarily exposes implementation details and internal data structures to clients.

By copying the signatures of these methods to the newly created EntityManager interface, we missed the opportunity to limit the size of the interface as it gets exposed to regular clients. It would be most useful if clients only needed to depend on the methods they need. So the improved EntityManager interface should only keep persist() and flush().

interface EntityManager
{
    public function persist($object): void;

    public function flush($object = null): void;
}

You may know this strategy from the Interface segregation principle, which tells you not to let clients depend on methods they don't use (or shouldn't use!).
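As a minimal sketch of how the narrowed interface could be used: the implementation keeps its extra public methods, but regular clients only type-hint the interface. The names DefaultEntityManager, RegisterUser and flushedCount() are hypothetical and only serve the illustration, and Connection is reduced to an empty placeholder class.

```php
<?php
declare(strict_types=1);

// Placeholder for the internally used connection (hypothetical).
final class Connection
{
}

// The narrow interface that regular clients depend on.
interface EntityManager
{
    public function persist($object): void;

    public function flush($object = null): void;
}

// Hypothetical implementation: it still offers getConnection(),
// but that method is not part of the published interface.
final class DefaultEntityManager implements EntityManager
{
    private $connection;
    private $scheduled = [];
    private $flushed = [];

    public function __construct(Connection $connection)
    {
        $this->connection = $connection;
    }

    public function persist($object): void
    {
        $this->scheduled[] = $object;
    }

    public function flush($object = null): void
    {
        // For brevity this sketch ignores $object and flushes everything.
        $this->flushed = array_merge($this->flushed, $this->scheduled);
        $this->scheduled = [];
    }

    public function getConnection(): Connection
    {
        return $this->connection;
    }

    // Only here so the sketch can be inspected; not part of the interface.
    public function flushedCount(): int
    {
        return count($this->flushed);
    }
}

// A regular client: it type-hints the interface, so it can only
// rely on persist() and flush().
final class RegisterUser
{
    private $entityManager;

    public function __construct(EntityManager $entityManager)
    {
        $this->entityManager = $entityManager;
    }

    public function register(string $username): void
    {
        $user = new stdClass();
        $user->username = $username;

        $this->entityManager->persist($user);
        $this->entityManager->flush();
    }
}
```

Code that genuinely needs the connection can still depend on the concrete class, while everything else stays unaware of it.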

If the class uses I/O

Whenever a class makes some call that uses I/O (the network, the filesystem, the system's source of randomness, or the system clock), you should definitely provide an interface for it. The reason is that in a test scenario you want to replace that class with a test double, and you need an interface for creating that test double. An example of a class that uses I/O is the CurlHttpClient:

// A class that uses IO:

final class CurlHttpClient
{
    public function get(string $url): string
    {
        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

        // This call uses the network!
        $result = curl_exec($ch);

        // ...

        return $result;
    }
}

// An explicit interface for HTTP clients like CurlHttpClient

interface HttpClient
{
    public function get(string $url): string;
}

Introducing an interface for such classes is usually an excellent opportunity to apply the Dependency inversion principle as well: make sure the interface is more abstract than the class. A first step would be to remove the specificity from the class name and methods, as we did in the example: we went from CurlHttpClient to HttpClient, hiding the fact that Curl is used to do the actual work. The next step would be to find out how this interface will be used. For example, is it used to load user data from a remote service, like in the AuthenticationManager class below?

final class AuthenticationManager
{
    private $client;

    public function __construct(HttpClient $client)
    {
        $this->client = $client;
    }

    public function authenticate(Request $request): void
    {
        $username = $request->request->get('username');

        // Encode the username so it is safe to use in a query string
        $userData = json_decode($this->client->get('/user?username=' . urlencode($username)), true);

        // ...
    }
}

In that case, we could take the design to the next level by acknowledging that AuthenticationManager doesn't really need an HttpClient, but rather a "user data provider". This is a more abstract concept, which can easily be modelled as an interface:

// A proper abstraction for something that "provides user data":

interface UserDataProvider
{
    public function getByUsername(string $username): array;
}

final class AuthenticationManager
{
    private $userDataProvider;

    public function __construct(UserDataProvider $userDataProvider)
    {
        $this->userDataProvider = $userDataProvider;
    }

    public function authenticate(Request $request): void
    {
        $username = $request->request->get('username');

        $userData = $this->userDataProvider->getByUsername($username);

        // ...
    }
}

Introducing the UserDataProvider abstraction makes the AuthenticationManager class much more flexible, allowing us to plug in a different strategy for providing user data. It will also make it easier to provide test doubles for the dependencies of AuthenticationManager. Instead of preparing an HttpClient stub which returns a carefully crafted HTTP response object, we can now simply return an array of user data.
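A test double for this interface can be very small. The following is only a sketch: the class name InMemoryUserDataProvider and the shape of the user data array are assumptions made for the sake of the example.

```php
<?php
declare(strict_types=1);

interface UserDataProvider
{
    public function getByUsername(string $username): array;
}

// Hypothetical test double: returns canned user data from memory,
// so no HTTP call (and no HTTP response parsing) is involved.
final class InMemoryUserDataProvider implements UserDataProvider
{
    private $users;

    public function __construct(array $users)
    {
        $this->users = $users;
    }

    public function getByUsername(string $username): array
    {
        if (!isset($this->users[$username])) {
            throw new RuntimeException('Unknown user: ' . $username);
        }

        return $this->users[$username];
    }
}

// In a test, inject this instead of the HTTP-based implementation:
$provider = new InMemoryUserDataProvider([
    'matthias' => ['username' => 'matthias', 'active' => true],
]);

$userData = $provider->getByUsername('matthias');
```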

If you'd like to know more about using test doubles to replace I/O calls, take a look at my article series on "Mocking at architectural boundaries".

If the class depends on third-party code

If there is some third-party code (e.g. from a package you don't maintain yourself) that is used in your class, it can be wise to isolate the integration of your code with this third-party code and hide the details behind an interface. Good reasons to do so are:

  • The (implicit) interface wouldn't be how you would’ve designed it yourself.
  • You're not sure if the package is safe to rely on.

Let's say you need a diffing tool to calculate the differences between two multi-line strings. There's an open source package (nicky/funky-diff) which provides more or less what you need, but the API is a bit off. You want a string with pluses and minuses, but the class in this package returns a list of ChunkDiff objects:

class FunkyDiffer
{
    /**
     * @param array $from Lines
     * @param array $to Lines to compare to
     * @return array|ChunkDiff[]
     */
    public function diff(array $from, array $to): array
    {
        // ...
    }
}

Besides offering a strange API, the package is being "maintained" by someone you've never heard of (and it has 15 open issues and 7 pull requests). So you need to protect the stability of your package and you define your own interface. Then you add an Adapter class which implements your interface, yet delegates the work to the FunkyDiffer class:

// Your own interface:

interface Differ
{
    public function generate(string $from, string $to): string;
}

// The Adapter class:

final class DifferUsesFunkyDiffer implements Differ
{
    private $funkyDiffer;

    public function __construct(FunkyDiffer $funkyDiffer)
    {
        $this->funkyDiffer = $funkyDiffer;
    }

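    public function generate(string $from, string $to): string
    {
        // Note: this assumes every ChunkDiff can be cast to a string;
        // if not, convert each chunk to plus/minus lines here first.
        return implode(
            "\n", 
            $this->funkyDiffer->diff(
                explode("\n", $from),
                explode("\n", $to)
            )
        );
    }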
}

The advantage of this approach is that from now on you can always switch to a different library, without changing the bulk of your code. Only the adapter class needs to be rewritten to use that other library.
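To illustrate that switch, here is a sketch of a second implementation of the same interface that doesn't use any library at all. NaiveLineDiffer is a hypothetical name, and the comparison is deliberately crude (it only reports lines that occur in one text but not in the other, ignoring their positions), so it is no replacement for a real diffing algorithm.

```php
<?php
declare(strict_types=1);

interface Differ
{
    public function generate(string $from, string $to): string;
}

// Hypothetical alternative implementation without third-party code.
// It lists removed lines with a minus and added lines with a plus.
final class NaiveLineDiffer implements Differ
{
    public function generate(string $from, string $to): string
    {
        $fromLines = explode("\n", $from);
        $toLines = explode("\n", $to);

        $output = [];

        // Lines that only occur in the original text
        foreach (array_diff($fromLines, $toLines) as $line) {
            $output[] = '-' . $line;
        }

        // Lines that only occur in the new text
        foreach (array_diff($toLines, $fromLines) as $line) {
            $output[] = '+' . $line;
        }

        return implode("\n", $output);
    }
}

$differ = new NaiveLineDiffer();
$diff = $differ->generate("a\nb", "a\nc");
```

Clients that depend on Differ don't need to change when the adapter is swapped for a class like this one.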

By the way, a good old Façade might be an option here too, since it would hide the use of the third-party implementation. However, due to the lack of an explicit interface, you wouldn't be able to experiment with alternative implementations. If the code is part of a package, the same goes for its users: they won't be able to write their own implementation of a "differ".

// A simple Façade, no explicit interface:

final class Differ
{
    public function generate(string $from, string $to): string
    {
        $funkyDiffer = new FunkyDiffer();

        // delegate to FunkyDiffer
    }
}

If you want to introduce an abstraction for multiple specific things

If you want to treat different, specific classes in some way that is the same for every one of them, you should introduce an interface that covers their common ground. Such an interface is often called an "abstraction", because it abstracts away the details that don't matter to the client of that interface. A nice example is the VoterInterface from the Symfony Security component. Every application has its own authorization logic, but Symfony’s AccessDecisionManager doesn't care about the exact rules. It can deal with any voter you write, as long as it implements VoterInterface and works according to the instructions provided by the documentation of that interface. An example of such an implementation:

final class MySpecificVoter implements VoterInterface
{
    public function vote(
        TokenInterface $token, 
        $subject, 
        array $attributes
    ): int {
        // ...
    } 
}

In the case of the VoterInterface, the package maintainers serve the users of their package by offering them a way to provide their own authorization rules. But sometimes an abstraction is only there for the code in the package itself. In that case too, don’t hesitate to add it.
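The idea of treating all voters in the same way can be sketched as follows. This is a heavily simplified stand-in and not Symfony's actual API: the Voter interface, its method signature, the DecisionManager class and both voters are assumptions made up for this example, and the "first grant wins" strategy is only one of several possible strategies.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for a voter abstraction (not Symfony's interface).
interface Voter
{
    const GRANTED = 1;
    const ABSTAIN = 0;

    public function vote(string $username, string $attribute): int;
}

// Application-specific rule: the admin may do anything.
final class AdminVoter implements Voter
{
    public function vote(string $username, string $attribute): int
    {
        return $username === 'admin' ? self::GRANTED : self::ABSTAIN;
    }
}

// Application-specific rule: everybody may view.
final class ViewVoter implements Voter
{
    public function vote(string $username, string $attribute): int
    {
        return $attribute === 'view' ? self::GRANTED : self::ABSTAIN;
    }
}

// The generic part: it doesn't know any of the concrete rules,
// it only knows how to ask each voter for its vote.
final class DecisionManager
{
    private $voters;

    /**
     * @param Voter[] $voters
     */
    public function __construct(array $voters)
    {
        $this->voters = $voters;
    }

    public function decide(string $username, string $attribute): bool
    {
        foreach ($this->voters as $voter) {
            if ($voter->vote($username, $attribute) === Voter::GRANTED) {
                return true;
            }
        }

        return false;
    }
}

$decisionManager = new DecisionManager([new AdminVoter(), new ViewVoter()]);
```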

If you foresee that the user wants to replace part of the object hierarchy

In most cases, a final class is the best thing you can create. If a user doesn't like your class, they can simply choose not to use it. However, if you're building up a hierarchy of objects you should introduce an interface for every class. That way the user can replace a particular piece of logic somewhere in that hierarchy with their own logic. It will make your code useful in as many situations as possible.

A nice example comes from Tactician, which offers a command bus implementation.

The package ships with a CommandBus class. It's a class, because its implicit interface isn't larger than its explicit interface would be - the only public method is handle().

class CommandBus
{
    // ...

    public function __construct(array $middleware)
    {
        // ...
    }

    public function handle($command)
    {
        // ...
    }

    // ...
}

To set up a working CommandBus instance, you need to instantiate a number of "middleware" classes, which all implement the Middleware interface:

interface Middleware
{
    public function execute($command, callable $next);
}

This is an example of an interface that was introduced as an abstraction, allowing the package maintainer to treat multiple specific things in some generic way, as well as to allow users to plug in their own specific implementations.

One of those middlewares is the CommandHandlerMiddleware, which itself needs a "command name extractor", a "handler locator" and a "method name inflector". All of these have a default implementation inside the package (the command name is the class name, the handler for a command is kept in memory, the handle method is "handle" plus the name of the command):

$handlerMiddleware = new CommandHandlerMiddleware(
    new ClassNameExtractor(),
    new InMemoryLocator([...]),
    new HandleClassNameInflector()
);

$commandBus = new CommandBus(
    [
        // ...,
        $handlerMiddleware,
        // ...
    ]
);

Each collaborating object that gets injected into CommandHandlerMiddleware can easily be replaced by re-implementing the interfaces of these objects (CommandNameExtractor, HandlerLocator and MethodNameInflector respectively). Because CommandHandlerMiddleware depends on interfaces, not on concrete classes, it will remain useful for its users, even if they want to replace part of the built-in logic with their own logic. For example, they may want to retrieve the command handler from their favorite service locator.

By the way, adding an interface for those collaborating objects also helps the user to decorate existing implementations of the interface by using object composition.
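A sketch of such a decorator is shown below. The HandlerLocator interface here is a simplified stand-in written for this example and is not copied from Tactician; ArrayLocator and RecordingLocator are hypothetical classes as well.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for a "handler locator" interface.
interface HandlerLocator
{
    public function getHandlerForCommand(string $commandName): object;
}

// A basic implementation that keeps handlers in memory.
final class ArrayLocator implements HandlerLocator
{
    private $handlers;

    public function __construct(array $handlers)
    {
        $this->handlers = $handlers;
    }

    public function getHandlerForCommand(string $commandName): object
    {
        if (!isset($this->handlers[$commandName])) {
            throw new RuntimeException('No handler for ' . $commandName);
        }

        return $this->handlers[$commandName];
    }
}

// The decorator: it implements the same interface, adds behavior
// (recording every lookup) and delegates the real work to the
// locator it wraps. No subclassing is needed.
final class RecordingLocator implements HandlerLocator
{
    private $inner;
    private $lookups = [];

    public function __construct(HandlerLocator $inner)
    {
        $this->inner = $inner;
    }

    public function getHandlerForCommand(string $commandName): object
    {
        $this->lookups[] = $commandName;

        return $this->inner->getHandlerForCommand($commandName);
    }

    public function lookups(): array
    {
        return $this->lookups;
    }
}

$handler = new stdClass();
$locator = new RecordingLocator(new ArrayLocator(['RegisterUser' => $handler]));
$found = $locator->getHandlerForCommand('RegisterUser');
```

Because the decorator only relies on the interface, it works with any implementation, and the wrapped class can stay final.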

For everything else: stick to a final class

If your situation doesn't match any of the ones described above, most likely the best thing you can do is not to add an interface, and just stick to using a class, preferably a final class. The advantage of marking a class as "final" is that subclassing is no longer an officially supported way of modifying the behavior of a class. This saves you from a lot of trouble later on when you're changing that class: you won’t have to worry about users who rely on your class's internals in some unexpected way. This advice applies to both package and application developers by the way.

Classes that almost never need an interface are:

  • Classes that model some concept from your domain.
  • Classes that otherwise represent stateful objects (as opposed to classes that represent stateless services).
  • Classes that represent a particular piece of business logic, or a calculation.

What these types of classes have in common is that it's neither necessary nor desirable to swap their implementations out. More specifically, some good examples of these classes are:

  • Entities
  • Value objects
  • Domain services
  • Application services

The reason is that these are classes from the domain or application layer. A domain model should model parts of your business domain, and it doesn't make sense to prepare for certain elements of that model to be replaced. Making things as concrete and specific as possible is usually good advice in this realm. As for application services: they reflect the use cases of the application these services belong to. It would be weird if you'd aim for replaceability and extensibility of these classes. They are unique to this application. So again: let them be plain old classes.


Improving your software project by being intolerant

Posted on by Matthias Noback

During the holiday I read a book mentioned to me by Pim Elshoff: "Skin in the game", by Nassim Nicholas Taleb. Discussing this concept of "skin in the game" with Pim had made me curious about the book. It's not as thick as one of Taleb's other, possibly more famous books, "Antifragile". "Skin in the game" is a much lighter book, and also quite polemical, making it an often uncomfortable, but fun reading experience. I can easily see how people could get mad about this book or aggressive towards its author (not that I'm encouraging or approving of that aggression!). While reading it, I was reminded of Nietzsche and his contempt for the "common man", whom Taleb calls "the intellectual": someone who has no "skin in the game", let alone "soul in the game". Taleb's ideas are interesting, just like Nietzsche's are, but they could easily be abused, most likely by misinterpreting them.

Intolerance wins

Something that's controversial, yet interesting for me as a developer in a team - from Chapter 2, "The Most Intolerant Wins: The Dominance of the Stubborn Minority":

Society doesn't evolve by consensus, voting, majority, committees, verbose meetings, academic conferences, tea and cucumber sandwiches, or polling; only a few people suffice to disproportionately move the needle. All one needs is an asymmetric rule somewhere—and someone with soul in the game. And asymmetry is present in about everything.

Taleb, Nassim Nicholas. Skin in the Game: Hidden Asymmetries in Daily Life (p. 83). Penguin Books Ltd. Kindle Edition.

This made me aware of something that I had observed several times before, whenever I was part of a software development team - or "software development society": the biggest impact on quality (to be interpreted in many different ways) isn't made by reaching consensus, or voting, or being tolerant of other people's opinions or ways of coding at all. What makes the biggest impact is the opposite: being intolerant of all those different styles and approaches, and being stubborn about your own way. As the title of the chapter says: "the most intolerant wins".

For example, how do you improve the coding style of a project? Maybe by discussing which styles you all like, and agreeing on one, then agreeing that all of you will apply it from now on? No, you can only achieve a consistent coding style by enforcing it everywhere and being very intolerant about it (that is, only allowing commits that conform to this particular style).

Another example: would you leave it up to your fellow developers to decide whether they will use static methods to fetch dependencies, or inject all dependencies as constructor arguments? If you allow both, the result will be a mess. You can't even rely on a single class using either of these options - a class may even use both at the same time! (See also "Negative architecture, and assumptions about code"). You will be considered a tolerant developer, but it's not for the good of the project. Again, intolerance is needed to drastically improve and guard the quality of your project.

One more example: if you know that the project would be better off with a Docker setup, do you ask management for permission to make it so? Do you vote? You won't get the majority vote, let alone get scheduled time for it during regular work hours. It may very well be that you have the biggest chance of success when you just do it. There's one rule though: if things become messy, take full responsibility and adapt: you made a mistake, just roll it back (or don't roll it out in the first place). This is where you make sure you have skin in the game.

Do more than you talk

This leads me to quoting Taleb again:

[...] always do more than you talk. And precede talk with action. For it will always remain that action without talk supersedes talk without action.

Taleb, Nassim Nicholas. Skin in the Game: Hidden Asymmetries in Daily Life (p. 122). Penguin Books Ltd. Kindle Edition.

Over the past 20 years I've had lots of conversations with developers and managers that eventually turned out to be just talk, no action. I don't know about your experiences (would love to hear about them though!), but this is a very common characteristic of "conversations in tech": lots of meetings, lots of discussions, cool ideas, great plans, but none of it is going to happen. For many reasons of course, like:

  • The managers in the meeting just want to keep the developers happy, so they allow them to make great plans, which will never be executed.
  • The developers are not happy about their job anymore, so they construct this shared dream about how things could be, "if only...".
  • Everybody realizes that the great plans are too great to ever be finished, so nobody even dares to start doing the work.
  • The people who talk, feel that they must reach consensus for everything. They never reach it.

I'm sure that, from experience, you can add several more reasons to this list.

I find that some of my most impactful work on projects in the past has been when I preceded talk with action. I did something, believing it was the right thing to do, putting in some of my own time, and proving "that it could work". I did this work outside of the system of employer-employee, in unbillable time, and without (a lot of) consensus. Without exception, starting a change resulted in other people jumping in, supporting, and completing the work.

In fact, this is what I recommend people do when I get asked (and I often get this same question): "How can I change X in my team?" My first piece of advice is always: don't ask, just do it. Of course, this comes with certain ethical clauses, like:

  • you shouldn't be wasting your employer's time,
  • you shouldn't do what you know will harm your relationship with other team members,
  • etc.

Still, within these margins you can easily accomplish much more than you thought was even possible, given the current project and the project team.

Having skin in the game, being antifragile

When reading "Antifragile", I was struck by something Taleb writes about being self-employed:

Further, for a self-employed person, a small (nonterminal) mistake is information, valuable information, one that directs him in his adaptive approach;

Taleb, Nassim Nicholas. Antifragile: Things that Gain from Disorder (p. 85). Penguin Books Ltd. Kindle Edition.

I always thought this to be a good reason to be freelancing: the mistakes you make force you to adapt, to learn. You get immediate feedback from what you do, and use it to become a better freelancer, because you want to make your business sustainable and not end up with a bad reputation, becoming unhireable.

However, reconsidering this, I think it's not smart to link "being self-employed" to "learning from mistakes". What matters is the feedback loop involved in the work. If you could make a mistake, but it will never be linked to you, or you will never hear about it anyway, there's no way you could learn from it. And this is what may very well happen to any freelance software developer: a mistake you made (a bug you produced, a bad design you imposed, a coupling issue you introduced), may surface years after you made it. Since companies usually can't pay (a team of) freelance developers for longer than one or two years, you are very likely to miss the mistakes that you make, and therefore not learn from them. You'll likely even make the same mistakes in other projects for years to come.

So, do I recommend that you don't start freelancing? Well, no. But I do recommend that you look for ways in which you can tap into the mistakes you produce, to evaluate how things are going, and to look for ways in which you can improve.


More code comments

Posted on by Matthias Noback

Recently I read a comment on Twitter by Nikola Poša.

He was providing us with a useful suggestion, one that I myself have been following ever since reading "Clean Code" by Robert Martin. The paraphrased suggestion in that book, as well as in the tweet, is to consider a comment to be a naming issue in disguise, and to solve that issue, instead of keeping the comment. By the way, the book has some very nice examples of how comments should and should not be used.

The code comment as a naming issue

I think in most cases, indeed, we don't need comments. Common suggestions, which Nikola makes as well, are:

  • Introduce a variable with a meaningful name, which can hold the result of a complicated expression.
  • Introduce a method with a meaningful name, which can hide a piece of complicated logic.

The "hiding" that a method does, provides you with two interesting options:

  1. The method name can summarize a number of statements.
  2. The method name can represent a concept that is of a higher level than the lower-level things that are going on inside the method.

Option 1 is useful, but if you can go for option 2, it'll be the most valuable option.

Consider Nikola's example:

if ($date->hour < 9 || $date->hour > 17) { //Off-hours?
}

Option 1 would entail something like:

if (!$this->hourIsBetween9And17($date->hour)) { //Off-hours?
}

Option 2 would be much more interesting and useful:

if ($this->isAnOffHour($date->hour)) {
}

This introduces abstraction, where the client doesn't care about the concrete details of determining whether or not a certain hour is an off-hour, but only wants to know if the given hour is indeed an off-hour.
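The concrete details then live in one place, behind the method name. A minimal sketch, assuming a hypothetical OfficeHours class that hosts the method and taking the boundaries 9 and 17 from the example above:

```php
<?php
declare(strict_types=1);

// Hypothetical home for the rule; the class name is an assumption.
final class OfficeHours
{
    private const OPENING_HOUR = 9;
    private const CLOSING_HOUR = 17;

    // The name expresses the concept; the comparison is a detail
    // that callers no longer need to know about.
    public function isAnOffHour(int $hour): bool
    {
        return $hour < self::OPENING_HOUR || $hour > self::CLOSING_HOUR;
    }
}

$officeHours = new OfficeHours();
$earlyMorningIsOff = $officeHours->isAnOffHour(8);
```

If the definition of an off-hour changes later, only this method needs to change, and no comment has to be kept in sync with it.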

#NoComments?

I realized I'm always applying the following rule: if I feel like writing a comment, I look for a way in which I can change the code so that I don't have to write a comment anymore. Also, when I encounter a comment in an existing code base, I apply the same rule, and look for ways to remove the comment. Because, as you may know, code can't lie, but comments can, so let's get rid of them before they become a risk.

However, contrary to my own advice, I also realized that I've actually been writing more and more comments over the past few months. So what's going on?

A little soul-searching revealed that I've been adding code comments that can be categorized as follows:

  1. Explanatory comments: ~70%
  2. TODO comments: ~20%
  3. WTF comments: ~10%

Explanatory comments

I've been adding explanatory comments to parts in the code that need an explanation, or a canonical description of the situation. Most often these comments can be found near a couple of statements in the domain model. The comment then explains why a certain business rule should be applied, and how the code beneath the comment actually accomplishes this. Of course, I try to match the words in the comment with the words in the code itself, so they are tied together.

Before adding an explanatory comment, I often try to rephrase the explanation as a test first. This is usually the better option, since it will make sure that the code and its documentation will never diverge. The test is the documentation that describes a piece of the code, and if the code is no longer truthful to that description, the test will fail. An even better option is to apply behavior-driven development, where tests can actually be written as stories with examples, and they will serve as automated tests at the same time.

Still, some things just need a whole paragraph of explaining, and in that case, I happily put the explanation in the code, and take my time (and a long-form commenting style like /* ... */) to explain what's going on.

"Here's a question: why don't you create a separate document, instead of a code comment?"

Good question; in my experience most developers don't care about documentation. They don't read it, don't write it, and don't update it. This means that documentation becomes untrustworthy very, very quickly. And still, nobody cares. All everybody cares about is diving into the code and making that change that needs to be done. So, knowing ourselves, I think it's much better to add a comment, than to write separate documentation.

"Okay, another question: don't you think a commit message would be more suitable for explanatory comments?"

I agree, a commit message is great for a bit of explanation from your side. However, commit messages aren't as readily available as code comments are. They aren't searchable. They are not "in your face" - you don't see them automatically when you open the code. In fact, you'd have to dive into the history of that code, to figure out all the reasoning behind a piece of code. And besides, everybody would have to make small commits and write proper commit messages, which definitely doesn't happen in the real world.

By the way, I often stumble upon comments like this:

autoSelect: false, // important !!!! 

I think these are meant to be explanatory, but what they're missing is the "why" part of the explanation. Whenever you add an explanatory comment, make sure you don't forget it!

TODO comments

Besides adding explanatory comments to code, TODO comments are my favorite when doing a major refactoring, Mikado style. I sprinkle lots of these little comments across the area of the code base that I'm working on. You have to remember the one rule that should be applied when using TODO comments: when merging/releasing your work, those TODOs should be gone. This means that, as part of a code review, you should verify that no TODOs were added inside the new branch. Just like comments in general become outdated very quickly, TODO comments expire very soon as well. What happens most of the time is:

  • The TODO comment never was an intention-of-future-work, it actually was an apology in disguise.

    // TODO we should inject this service as a constructor dependency instead
    
  • The work to be done is too much and will never be done.

    // TODO rewrite using Twig
    
  • The TODO comment marks some technical debt that was introduced. The work will also never be done.

    // TODO make it possible to use this with a different tenant
    

From years of experience with TODO comments, my advice is to turn them into DONE comments right away, and move the comment to your commit message.

WTF comments

These comments are very much like TODO comments, because you're basically saying: "this ain't right", but you're also saying: "We're not going to fix this". Either because it works (even if that's a miracle), or because it would take too much time to make it right. You should still add a comment to explain what's going on, so the next person who reads this code doesn't have to spend a lot of time trying to figure it out, just like you did now.

Conclusion

When I write a comment, I do it so that the next person who looks at this piece of code doesn't have to figure it out again. Fun fact: this is the same approach I have to many other things in life. For example, I don't want to spend time and energy on the same thought again and again, in particular if it's a troublesome thought. Same for things I have to do: if I think of it once, I write it down, so I never have to be interrupted by the thought again (until I check my TODO list, that is!).

While thinking about code comments, I became aware of an implicit rule about how I write code: I always aim to de-personify it. I try to write code that doesn't show hints of the author as a person (with emotions, habits, years of experience, etc.). When adding the types of comments discussed earlier, the code starts to have more signs of human authorship. And I actually think that's a good thing: we're working on this code together, but we're also experiencing it. Adding comments increases ownership of the code, and shows empathy towards other programmers, like our current and future team members. And to ourselves, when we dive back into the code after a holiday break. I'd say: happy commenting!
