This article is part 2 of my review of the book "Fifty quick ideas to improve your tests". I'll continue to share some of my personal highlights with you.
Replace multiple steps with a higher-level step
If a test executes multiple tasks in sequence that form a higher-level action, often the language and the concepts used in the test explain the mechanics of test execution rather than the purpose of the test, and in this case the entire block can often be replaced with a single higher-level concept.
When writing a unit test for some complicated object, you may be collecting some data first, then putting that data into value objects, maybe wrapping them in other value objects, before you can finally pass them to the constructor of the object under test. At that point you may have to call several methods on the object before it's in the right state. The same goes for acceptance tests. You may need quite a number of steps before you get to the point where you can finally write a 'When' clause.
I find that often the steps leading up to the 'When' clause (or to the 'Act' part of a unit test) can be summarized as one thing. This is what is meant by the "single higher-level concept". So instead of enumerating everything that has happened to the system, you look for a single step that describes what your starting point is. For instance, instead of:
Given the user logged in
And they have created a purchase order
And they added product A to it, with a quantity of 10
And they placed the purchase order
When the user cancels the purchase order
Then ...
You could summarize the 'Given' steps as follows:
Given a placed purchase order for product A, with a quantity of 10
When the user cancels the purchase order
Then ...
Once you start looking for ways to make things smaller and simpler, easier to follow and read, you'll find that many of the details you previously added to the smaller steps are completely irrelevant for the higher-level step. In our case, the product and the quantity may be completely irrelevant for the scenario that deals with cancelling the purchase order. So the final version may well be:
Given a placed purchase order
When the user cancels the purchase order
Then ...
When implementing the step definition for this single, simplified step, you still need to provide sensible default data. But it happens behind the scenes. This technique hides the details that are irrelevant, and only shows the relevant ones to the person reading the scenario. It will be a lot easier to understand what's being tested with this particular scenario.
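As a sketch of how such a step definition could hide the default data behind the scenes (the `PurchaseOrder` class and helper names here are hypothetical, not taken from the book):

```python
class PurchaseOrder:
    def __init__(self):
        self.lines = []
        self.placed = False
        self.cancelled = False

    def add_product(self, product, quantity):
        self.lines.append((product, quantity))

    def place(self):
        self.placed = True

    def cancel(self):
        if not self.placed:
            raise RuntimeError("Only a placed purchase order can be cancelled")
        self.cancelled = True


def given_a_placed_purchase_order():
    """The single higher-level step: sensible defaults live here,
    hidden from the scenario text."""
    order = PurchaseOrder()
    order.add_product("product A", quantity=10)  # irrelevant to the scenario
    order.place()
    return order


order = given_a_placed_purchase_order()
order.cancel()
print(order.cancelled)  # True
```

The scenario only mentions "a placed purchase order"; which product it contains, and in what quantity, is a detail the step definition decides on its own.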
I find that this approach works really well with unit tests too. One test case may show how a complicated object can be constructed, another test case won't just repeat all those steps, but will summarize it properly (like we did for the above scenario). This often requires the introduction of a private method in the test class, which wraps the individual steps. The method is the abstraction, its name the higher-level concept.
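A minimal sketch of that idea in Python's unittest (the `Invoice` and `Money` classes are hypothetical): one test shows the mechanics of construction explicitly, the other summarizes them behind a named private method:

```python
import unittest


class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency


class Invoice:
    def __init__(self, lines):
        self.lines = lines

    def total(self):
        return sum(line.amount for line in self.lines)


class InvoiceTest(unittest.TestCase):
    def test_it_can_be_constructed_from_lines(self):
        # This test demonstrates the construction mechanics in full.
        lines = [Money(100, "EUR"), Money(50, "EUR")]
        invoice = Invoice(lines)
        self.assertEqual(150, invoice.total())

    def test_total_of_an_invoice_with_two_lines(self):
        # This test hides the same mechanics behind a named helper.
        invoice = self._an_invoice_with_two_lines()
        self.assertEqual(150, invoice.total())

    def _an_invoice_with_two_lines(self):
        # The private method is the abstraction; its name is the
        # higher-level concept.
        return Invoice([Money(100, "EUR"), Money(50, "EUR")])
```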
Specification, acceptance tests, living documentation
When creating your artifacts, remember the three roles they must serve at different times: now as a specification, soon as acceptance tests, and later as living documentation. Critique the artifact from each perspective. How well does it serve each distinct role? Is it over-optimised for one role to the detriment of others?
In workshops I often point out that testing is a crucial part of the development workflow. You need it to verify that the unit of code you've been working on works as expected. To improve this perspective on writing tests, you could aim for not just verifying correctness of a unit, but specifying it. This implies some kind of test-first approach, where you specify expected behavior. The next level would be to consider the future. What happens when you're done writing this code? Could someone else still understand what's going on? Do you only have technical specifications, or also domain-level specifications?
For unit tests, the danger is that you'll be doing only technical verifications. You may be testing methods, arguments and return values instead of higher-level behaviors. It happens when you're just repeating the logic of the class inside its unit test. You find yourself looking into the box (white box testing), instead of treating the box as an object with proper boundaries.
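As an illustration of that difference (with a hypothetical `Thermostat` class): a white-box test mirrors the class's internal logic, while a behavior-oriented test states the expectation in domain terms:

```python
class Thermostat:
    def __init__(self, target):
        self._target = target
        self.heating = False

    def record_temperature(self, measured):
        # Internal rule: start heating when more than 0.5 below target.
        self.heating = measured < self._target - 0.5


# White-box style: the assertion repeats the implementation's formula.
t = Thermostat(target=20.0)
t.record_temperature(19.0)
assert t.heating == (19.0 < 20.0 - 0.5)  # looking into the box

# Behavior style: the assertion states the expected behavior.
t = Thermostat(target=20.0)
t.record_temperature(19.0)
assert t.heating  # a room well below target should be heated
```

The first assertion stays green even if the formula is wrong, because the test and the code share the same logic; the second one actually pins down a behavior.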
The best thing you can do is to think hard about the method names of a unit test. They are your chance to describe the behaviors at a higher level. It's all we do really. We start with some higher-level expected behavior, and implement it in terms of lower-level concrete code steps. A method is what we often use to hide the irrelevant information, so that at a higher level you can have meaningful conversations without drowning in details. That's why a method name is good not when it tells you what happens inside the method, but when it gives the higher-level term for what happens. E.g. a repository's save() method describes what it does, but not how it does it.
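A minimal sketch of that naming principle (the class and its storage mechanism are hypothetical): the method name states the what, the body hides the how:

```python
class OrderRepository:
    def __init__(self):
        self._storage = {}  # stand-in for a real database

    def save(self, order_id, order):
        # "save" names the higher-level concept; the mechanics
        # (serialization, storage) remain hidden inside.
        self._storage[order_id] = dict(order)

    def get_by_id(self, order_id):
        return dict(self._storage[order_id])


repo = OrderRepository()
repo.save(1, {"status": "placed"})
print(repo.get_by_id(1))  # {'status': 'placed'}
```

Callers can have a conversation about "saving an order" without ever knowing whether that means an SQL INSERT, a file write, or a dictionary assignment.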
I find that once you have the proper level of abstraction in your specifications, the actual acceptance criteria will be present there too. And then, those specifications, besides serving as acceptance tests, can be considered living documentation. They are documentation in the sense that they document the acceptance criteria (any business rule that's involved), and they are living, because when something changes on either side, this will have an immediate effect on the other side. This keeps the documentation automatically up-to-date, and the code in sync with the documentation.
Checking whether a test performs its three roles well - being specification, acceptance test and living documentation - is like a test where the test itself is the subject under test.
Business-oriented versus technical-oriented testing
Both the technical and the business aspects are important, and they both need to be tested, but teams will often get a lot more value out of two separate sets of tests rather than one mixed-role test.
I think this is correct. But it also turns out that you can't make an easy distinction between these different roles. At least, trying to do so by separating the tests by type (unit, acceptance, integration, system) doesn't work well. There doesn't have to be such a strict distinction between these types of test at all. At least not in the sense that one type is business-oriented and the other technology-oriented.
A [...] common way of confusing coverage and purpose is thinking that acceptance tests need to be executed at a service or API layer.
To avoid this pitfall, make the effort to consider an area of coverage separately from the purpose of a test. Then you’re free to combine them. For example, you can have business-oriented unit tests, or technical end-to-end checks.
When you have a layered architecture (as described in "Layers, ports & adapters - Part 2, Layers"), you'll have an application layer which doesn't directly depend on any real infrastructure (so no database, no UI). This means you can write your acceptance tests against this application layer (a.k.a. the "API layer"). This results in fast acceptance tests, which naturally don't talk about URLs or HTML elements, or about "what's in the database", which is great. I think this idea originated from a 2015 article, "The Forgotten Layer of the Test Automation Pyramid".
However, that's not the complete truth. You can mix and match however you like. It's possible to write unit tests which are more like acceptance tests. You can write acceptance tests which are more like system tests and run against the actual running application (for an approach that mixes both, see "Introducing Modelling by Example"), including the UI, the database, even the web server if you like (preferably, I'd say). I realize now that I've been doing things like that, but was still teaching the stricter separation of unit and acceptance tests. I find that at least the unit tests for your domain code can be mainly business-oriented.
Another interesting test for tests
I found another interesting idea in the book. The authors encourage us to reflect on what we do when a test fails. They talk about a company that uses the following rule:
If people changed the test to make it pass, it was marked as bad.
I'm not sure if this approach is really feasible, but there is something intriguing about the rule. The idea is that the test specifies behavior, embodying acceptance criteria. If nothing has changed about those, the test shouldn't need to be modified. In theory, only the production code that implements the behavior should be updated. A change to the test means that either the specification itself was incorrect (in which case that would have been the starting point of making the change, not the production code), or that the test was bad in the sense that it wasn't a good (enough) specification after all...