Testing first? Or letting the test force you to write the code

Matthias Noback

April 27, 2020

I recently read Kent Beck’s classic on test-driven development (“TDD by Example”) and I think one of the most powerful ideas in this book is to have a test practically force you to write the code that will make it pass. I find that using tests this way has a deep impact on how I write code, and it also makes me fully aware of how often I write code that is not justified by any test. I just write that code because I want to write it, or because I somehow know I have to write it. It’s the same kind of knowing I experience in a meeting where we discuss new functionality and all the programmers in the room, including myself, start jumping to a solution.

That’s because from experience we know what kind of coding solution we will most likely encounter once we start working on the new feature. But writing that code immediately doesn’t give us any confidence that what we write is correct, nor that it is actually helpful. It often happens that we’ve programmed ourselves into a corner and realize that we didn’t write the code that should actually have been written. I’m sure there is some kind of availability bias at work here, where we choose the solution that comes to mind most easily. But availability doesn’t mean it’s a good solution for this particular situation, nor that it is a good solution at all. It only means that it worked well in a situation we experienced previously (and even then, it could have been a bad solution if we were never able to tune into the feedback from real-world usage).

What we need is some way to keep us “in check”. The code we write should be justified by real criteria: the behavior of the system itself. Tests are a great way of specifying that behavior, verifying that the system actually produces it, and ensuring that we write only the code necessary to support it.

Following from this idea comes the technique of triangulation. We don’t just write our tests first, to let them determine what code we write. We treat the unit under test as an object that needs more than one way of establishing the correctness of its behavior. Aside from its origin in geometry, in many sciences triangulation has come to mean: using more than one method to prove something. In software testing triangulation means that we try different inputs to prove that the production code is correct. As an example, we test that a calculation gives the correct answer. We then show for a second input that the calculation is still correct. This ensures that the production code isn’t tailored to the first input, and is dynamic enough to also show the correct behavior when provided with the second input. When using BDD, different inputs are often called “examples”, and they serve the same purpose. Using realistic data, they prove that different situations that could happen in real-life scenarios will be dealt with in the correct way.
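A minimal sketch of triangulating a calculation could look like the following. The functions totalPriceFirstAttempt() and totalPrice(), and the ticket price example itself, are hypothetical and only serve as illustration. The assertSame() function is a small stand-in for a test framework assertion, so the snippet runs without PHPUnit.

```php
<?php
declare(strict_types=1);

// Hypothetical example: the total price, in cents, of a number of tickets.

// With only the first example, a hard-coded return value is enough to pass.
function totalPriceFirstAttempt(int $ticketPriceInCents, int $quantity): int
{
    return 5000; // tailored to the single input (2500, 2)
}

// The second example fails against the hard-coded value,
// which forces us to write the general calculation.
function totalPrice(int $ticketPriceInCents, int $quantity): int
{
    return $ticketPriceInCents * $quantity;
}

// Minimal stand-in for a test framework assertion (an assumption of this sketch).
function assertSame($expected, $actual): void
{
    if ($expected !== $actual) {
        throw new RuntimeException(
            'Expected ' . var_export($expected, true) . ', got ' . var_export($actual, true)
        );
    }
}

// First example: both implementations pass.
assertSame(5000, totalPriceFirstAttempt(2500, 2));
assertSame(5000, totalPrice(2500, 2));

// Second example: only the general implementation gives the right answer.
assertSame(9000, totalPrice(3000, 3));
assertSame(5000, totalPriceFirstAttempt(3000, 3)); // still 5000, not the expected 9000
```

The first example alone cannot tell the two implementations apart. The second example is what justifies the multiplication.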

In the realm of unit testing, there’s one particular application of the triangulation technique that I find useful and would like to share. I’m using it to determine if an object needs to keep the state that you intuitively want it to keep. How do you know if data has to be assigned to an object’s property?

To answer this question, as a thought experiment, let’s switch paradigms. Suppose we were using a functional programming language, and all we had was functions and data (and nothing like objects, which combine the two). How do you know which data should be provided as a function’s arguments? The simplest answer is: the data that you need in that function. Back to object-oriented programming: what data should be assigned to properties? Of course: the data that you need. This raises the question: how do you know if you need that data?

Generally, I find there are several reasons for needing data:

1. The data needs to be there for clients to take it out and use it.
2. The data needs to be persisted, so it stays inside an object and an ORM will take it out and serialize it to a database.
3. The data is used in some way by the object itself.

In CRUD-areas of an application, data goes in and out of objects. So there you find reasons 1 and 2 for assigning data to properties. In objects that support a more action-oriented workflow, e.g. objects that are more like state machines than data containers, you’ll often find reason 3. This makes it a good match for (DDD) aggregates, which are all about protection of state and making state transitions.

If you have a data container object, it is usually not a good candidate for unit testing, since there is no behavior that needs to be specified. The only thing you should test is the whole process of getting data in, getting it out into the database, then loading it from the database (that is, if you’re worried you’re making some assignment mistake like $this->userId = $groupId).

Test triangulation for object state

If you have an aggregate (or maybe just call it an entity), you may have some fun using triangulation to force yourself to write code. As an example, I like to assign nothing to a property, in fact, not even introduce a property, until the test in some way wants me to do it. Consider this example of a unit test for a Concert entity. Cancelling it should produce a domain event.

/**
 * @test
 */
public function it_can_be_cancelled(): void
{
    $concert = $this->aConcert();

    $concert->cancel();

    self::assertArrayContainsObjectOfClass(
        ConcertWasCancelled::class,
        $concert->releaseEvents()
    );
}

Easy enough. Because you’ve encountered this situation many times, you know what’s coming: besides recording that domain event, you also add a property like bool $wasCancelled to the Concert class and set it to true.

final class Concert
{
    // ...

    private bool $wasCancelled = false;

    // ...

    public function cancel(): void
    {
        $this->events[] = new ConcertWasCancelled($this->concertId);

        $this->wasCancelled = true;
    }
}

Too much code! Remove the line $this->wasCancelled = true and run the test; it will still pass. Of course, my advice would not be to remove code and see if tests fail (although that’s another useful area of software testing called mutation testing). My advice is:

Write production code only when it’s needed.
Let the test determine if it’s needed.

So which desired behavior will eventually force us to add that $wasCancelled property? In this example it turns out to be the test that describes the behavior of Concert when we try to cancel the same concert again. It’s up to you to decide: should we throw an exception, or ignore that second request? At the very least, we wouldn’t want another domain event to be recorded:

/**
 * @test
 */
public function cancelling_the_concert_twice_has_no_effect(): void
{
    $concert = $this->aConcert();
    $concert->cancel();
    $concert->releaseEvents(); // the first time we cancel the concert, an event will be recorded

    $concert->cancel();

    self::assertCount(0, $concert->releaseEvents()); // the second time, no events will be recorded
}

This test fails, but it will pass as soon as we modify the cancel() method to return early if $this->wasCancelled is already true.

final class Concert
{
    // ...

    private bool $wasCancelled = false;

    // ...

    public function cancel(): void
    {
        if ($this->wasCancelled) {
            return;
        }

        $this->events[] = new ConcertWasCancelled($this->concertId);

        $this->wasCancelled = true;
    }
}

I think this is a rather nice example of how triangulation forces you to do something in your code. A single test method wasn’t sufficient to justify adding a property. A second test method was needed to accomplish that.
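For readers who want to run the example, here is a self-contained sketch of the end result. The constructor, the string concert ID, the event class body and the implementation of releaseEvents() are assumptions, since the original snippets leave them out. Plain checks replace the PHPUnit assertions so that it runs as a standalone script.

```php
<?php
declare(strict_types=1);

// Assumed shape of the domain event; the original only shows its constructor call.
final class ConcertWasCancelled
{
    private string $concertId;

    public function __construct(string $concertId)
    {
        $this->concertId = $concertId;
    }
}

final class Concert
{
    private string $concertId;

    /** @var object[] */
    private array $events = [];

    private bool $wasCancelled = false;

    public function __construct(string $concertId)
    {
        $this->concertId = $concertId;
    }

    public function cancel(): void
    {
        // Justified by the second test: cancelling twice has no effect.
        if ($this->wasCancelled) {
            return;
        }

        $this->events[] = new ConcertWasCancelled($this->concertId);

        $this->wasCancelled = true;
    }

    /**
     * Assumed behavior: return the recorded events and forget them.
     *
     * @return object[]
     */
    public function releaseEvents(): array
    {
        $events = $this->events;
        $this->events = [];

        return $events;
    }
}

$concert = new Concert('concert-1');

$concert->cancel();
$firstEvents = $concert->releaseEvents();

$concert->cancel();
$secondEvents = $concert->releaseEvents();

// First test: cancelling records exactly one ConcertWasCancelled event.
if (count($firstEvents) !== 1 || !($firstEvents[0] instanceof ConcertWasCancelled)) {
    throw new RuntimeException('Expected one ConcertWasCancelled event');
}

// Second test: cancelling again records nothing.
if (count($secondEvents) !== 0) {
    throw new RuntimeException('Expected no events after the second cancel()');
}
```

If you remove the early return from cancel(), only the second check fails, which is exactly the justification for the property.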

Only add getters when you need them

A related situation is the one where data has to get out of the object. You’d normally add a getter for that. But what if the only one that needs the data is the unit test itself? As an example, if we had not used domain events, the unit test for cancelling a concert would look like this:

/**
 * @test
 */
public function it_can_be_cancelled(): void
{
    $concert = $this->aConcert();

    $concert->cancel();

    self::assertTrue($concert->wasCancelled());
}

I use the rule: don’t add a getter if the test is the only one that needs it. In fact, don’t add any method if the test is the only one that needs it. When one of the clients is interested in information carried by the object, the test for that client will force you to add that getter at some point anyway. I’m explaining this and other rules about object testing in my book, the Object Design Style Guide. Another rule that I think is very helpful is not to test the happy path of a constructor. Mainly because that will look something like this:

/**
 * @test
 */
public function it_can_be_created(): void
{
    $concert = new Concert('The name', '2020-03-04', '20:00');

    self::assertEquals('The name', $concert->getName());
    self::assertEquals('2020-03-04', $concert->getDate());
    self::assertEquals('20:00', $concert->getTime());
}

This test doesn’t provide any justification at all for the fact that the data has to go into the object and be assigned to a property there. Likewise for all those getters: why does the data need to come out again? A test isn’t a real client of an object. These methods should earn their place because some other client needs them.
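By contrast, here is a sketch of a getter that is justified by a real client. The TicketOffice class and its sellTicket() method are hypothetical, as is the simplified Concert class without events. The point is only that the test for the client, not a test for Concert itself, is what makes wasCancelled() necessary.

```php
<?php
declare(strict_types=1);

final class Concert
{
    private bool $wasCancelled = false;

    public function cancel(): void
    {
        $this->wasCancelled = true;
    }

    // Added only because TicketOffice, a real client, needs to ask this question.
    public function wasCancelled(): bool
    {
        return $this->wasCancelled;
    }
}

// Hypothetical client of Concert.
final class TicketOffice
{
    public function sellTicket(Concert $concert): string
    {
        if ($concert->wasCancelled()) {
            throw new LogicException('Cannot sell tickets for a cancelled concert');
        }

        return 'ticket';
    }
}

$ticketOffice = new TicketOffice();

// Test for the client: a ticket can be sold for a concert that still takes place.
$ticket = $ticketOffice->sellTicket(new Concert());

// Test for the client: selling a ticket for a cancelled concert is refused.
$cancelledConcert = new Concert();
$cancelledConcert->cancel();

$refused = false;
try {
    $ticketOffice->sellTicket($cancelledConcert);
} catch (LogicException $exception) {
    $refused = true;
}

if ($ticket !== 'ticket' || !$refused) {
    throw new RuntimeException('TicketOffice did not behave as specified');
}
```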

As a general recommendation: think about all the code you write and why you write it; is it because you think you need it, or because a test proves that you need it?