Mocking at architectural boundaries: persistence and time

Posted by Matthias Noback

More and more I've come to realize that I've been mocking less and less.

The thing is, creating test doubles is a very dangerous activity. For example, what I often see is something like this:

$entityManager = $this->createMock(EntityManager::class);


Or, what appears to be better, since we'd be mocking an interface instead of a concrete class:

$entityManager = $this->createMock(ObjectManagerInterface::class);
// ...

To be very honest, there isn't a big difference between these two examples. If this code is part of, say, a unit test for a repository class, we're not testing many of the aspects of the code that actually deserve to be tested.

For example, by creating a test double for the EntityManager, we're assuming that it will work well with any objects we'll pass to it. If you've ever debugged an issue with an EntityManager, you know that this is a bad assumption. Anything may go wrong: a mistake in the mapping, missing configuration for cascading persist/delete behavior, an issue with the database credentials, availability of the database server, network connectivity, a missing or invalid database schema, etc.

In short, a unit test like this doesn't add any value, except that it verifies correct execution of the code you wrote (something a linter or static analysis tool may be able to do as well). There's nothing in this test that ensures the code will actually work once it has been deployed and is being used.

The general rule to apply here is "Don't mock what you don't own" (see the excellent book "Growing Object-Oriented Software, Guided by Tests", or an article on the topic by Eric Smith, "That's Not Yours"). Whenever I've brought up this rule in discussions with developers, I've always met with resistance. "What else is there to mock?" "Isn't mocking meant to replace the slow, fragile stuff with something that is fast and stable?"

Mock across architecturally significant boundaries

Of course we want to use mocks for that. And we need to, since our test suite will become very slow and fragile if we don't do it. But we need to do it in the right place: at the boundaries of our application.

My reasoning for "when to mock" is always:

  1. If you encounter the need for some information or some action that isn't available to you in the memory of the currently running program, define an interface that represents your query (in case you need to know something) or command (in case you want to do something). Put the interface in the "core" of your application (in the domain or application layer).
  2. Use this interface anywhere you want to send this query or command.
  3. Write at least one implementation for the interface, and make sure all the clients of the interface get this implementation injected as a constructor argument.

Mocking "persistence"

To fix the EntityManager example above we need to take a step back and articulate our reason for using the EntityManager in the first place. Apparently, we were in need of persisting an object. This is not something the running application could do naturally (the moment it stops, it forgets about any object it has in memory). So we had to reach across the application's boundaries, to an external service called a "database".

Because we always consider reusing things that are already available in our project, we just decided to go with the previously installed EntityManager to fulfill our needs. If we had followed the steps described above, however, we would've ended up in a different place:

  1. I need to persist (not just any object, but) my Article entity, so I define an interface that represents the action I need it to perform:

    interface ArticleRepository
    {
        public function persist(Article $article): void;
    }
  2. I use this interface everywhere in my code.

  3. I provide a default implementation for it, one that uses my beloved EntityManager:

    final class ORMArticleRepository implements ArticleRepository
    {
        public function persist(Article $article): void
        {
            // here we *will* use the EntityManager
        }
    }

Note that there's nothing about EntityManagers in the interface we've defined. Also, there's no generic object-type parameter, but a very specific one (Article). Finally, there's nothing "crazy" about the interface, like first having to call persist() and then flush(). It's now one thing: we want the ArticleRepository to take care of persisting an Article entity.
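
The payoff shows up in the unit tests for code that uses the interface. Here's a minimal sketch of such a test; the ArticlePublisher service and its publish() method are made up for this example, only the ArticleRepository interface comes from the code above:

use PHPUnit\Framework\TestCase;

final class ArticlePublisherTest extends TestCase
{
    public function testItPersistsThePublishedArticle(): void
    {
        // We mock our own interface, not the ORM:
        $articleRepository = $this->createMock(ArticleRepository::class);
        $articleRepository
            ->expects($this->once())
            ->method('persist')
            ->with($this->isInstanceOf(Article::class));

        $publisher = new ArticlePublisher($articleRepository);
        $publisher->publish('My first article');
    }
}

No mapping configuration, no database connection: this test only verifies that the service communicates correctly with an abstraction we own.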

Mocking "time"

Another example: I need to know what the current date/time is. This information isn't available inside the running program; I need to reach outside of it. Normally, I'd simply summon the language's native date/time representation of "now", i.e. new DateTime('now'). Instead, following the same steps as described above:

  1. I define an interface that represents my query:

    interface Clock
    {
        public function currentTime(): DateTime;
    }
  2. I use this interface in my code:

    final class ORMArticleRepository implements ArticleRepository
    {
        public function __construct(Clock $clock)
        {
            // ...
        }

        public function findNotYetPublishedArticles(): array
        {
            $now = $this->clock->currentTime();
            // look for articles that will be published
            // at some point in the future...
        }
    }
  3. I write a standard implementation for it, one that uses the most common way of determining the current date/time. This involves the system clock and the server's time zone:

    final class SystemClock implements Clock
    {
        public function currentTime(): DateTime
        {
            return new DateTime('now');
        }
    }

In unit tests for the core of your application, you can safely mock Clock to give you some deterministic return value. The unit test will be a pure unit test, since it won't involve a system call to get the current date/time (as opposed to when we'd just call new DateTime('now') on the spot).
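
If you prefer not to use a mocking framework for something this simple, a hand-written test double works just as well. A minimal sketch (the name FixedClock is mine):

// A hand-written test double: it always returns the same instant.
final class FixedClock implements Clock
{
    private $currentTime;

    public function __construct(DateTime $currentTime)
    {
        $this->currentTime = $currentTime;
    }

    public function currentTime(): DateTime
    {
        return $this->currentTime;
    }
}

In a test, you'd simply inject new FixedClock(new DateTime('2018-01-23 12:00:00')) into the object you're testing.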

Consider passing time as context

In terms of dependency injection versus context passing, you should also consider determining the current time once in the infrastructure layer (e.g. in the controller), and passing it as a value to any other objects that need this information.
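
A quick sketch of that alternative (the method name findArticlesNotYetPublishedAt() is made up for this illustration):

// In the controller (infrastructure layer), establish "now" once...
$now = $clock->currentTime();

// ...then pass it on as a plain value, so deeper layers don't
// even need a Clock dependency:
$articles = $articleRepository->findArticlesNotYetPublishedAt($now);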

Since we'll use a test double for Clock everywhere, the code in SystemClock itself won't be exercised during a test run. So we need another kind of test, a so-called integration test, which proves that SystemClock functions correctly, given that an actual system clock is available for retrieving the current time.
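
Such an integration test can be very small. A sketch (the one-second margin is arbitrary):

use PHPUnit\Framework\TestCase;

final class SystemClockTest extends TestCase
{
    public function testItAgreesWithTheSystemTime(): void
    {
        $currentTime = (new SystemClock())->currentTime();

        // Allow a small margin, since time passes between the two calls:
        $difference = abs(time() - $currentTime->getTimestamp());
        self::assertLessThanOrEqual(1, $difference);
    }
}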

Your unit tests will be fast, and your integration test will show you any mistakes in your assumptions about the third-party code or hardware that you're implicitly relying on.

Replacing language features

Another style of "mocking" time involves overriding built-in time-related functions. A pretty smart solution for this is provided by Symfony's PHPUnit Bridge library, which overrides calls to functions like time() and sleep(). Though this library is useful for testing low-level infrastructure/framework code, I don't recommend using it in scenarios like the one above, for two reasons:

  • Only part of the quality improvement comes from a faster-running test suite. The biggest improvement comes from defining your own abstractions. With that comes a clearer sense of what it is you actually need: do you need the current date or the current time, with what precision, is a time zone involved, or even relevant, etc.
  • Replacing system calls involves assumptions about what the behavior in real life will be. These assumptions go untested and will result in bugs that are hard to figure out.

Mocking with architectural consequences

Another big benefit of this approach is that it forces you to think through what your significant architectural boundaries are; and enforce them with polymorphic interfaces. This allows you to manage the dependencies across those boundaries so that you can independently deploy (and develop) the components on either side of the boundary.

Uncle Bob, "When to Mock"

By following the above steps, you've basically been applying the Dependency Inversion Principle. And you'll be better off for it. The results are:

  • Loose coupling: you have the option to change the implementation without touching all the clients of the interface.
  • Intention-revealing interfaces: interfaces that express your questions and commands more clearly than the framework, library, or language feature does.

If you keep making these interfaces for things that deal with "the world outside your application", you'll end up with two things:

  • An accidentally emerging infrastructure "layer", which contains all the code that connects to external services, data providers and devices.
  • An accidentally emerging hexagonal architecture, which is good for testing, since it allows you to use application features without all those slow and brittle external services, data providers and devices.

To me it's fascinating how proper techniques for creating test doubles could make a difference at the level of an application's architecture.


We've discussed two common areas of an application that need mocking: persistence and time. We've also seen how correctly mocking things has a healthy effect on your application's architecture. In another article we'll discuss a few other things that need mocking: the filesystem, the network and... randomness.


Local and remote code coverage for Behat

Posted by Matthias Noback

Why code coverage for Behat?

PHPUnit has several built-in options for generating code coverage data and reports. Behat doesn't. As Konstantin Kudryashov (@everzet) points out in an issue asking for code coverage options in Behat:

Code coverage is controversial idea and code coverage for StoryBDD framework is just nonsense. If you're doing code testing with StoryBDD - you're doing it wrong.

He's right I guess. The main issue is that StoryBDD isn't about code, so it doesn't make sense to calculate code coverage for it. Furthermore, the danger of collecting code coverage data and generating coverage reports is that people will start using it as a metric for code quality. And maybe they'll even set management targets based on coverage percentage. Anyway, that's not what this article is about...

In my Advanced Application Testing workshop I just wanted to collect code coverage data to show how different types of tests (system, acceptance, integration and unit) touch different parts of the code, and how this evolves over time, when we're replacing parts of the system under test, switching test types to achieve shorter feedback loops, etc.

The complication is that, when it comes to running our code, Behat does two things: it executes code in the same project, and/or (and this complicates the situation a bit) it triggers code remotely, when it uses Mink to talk to a web application running in another process.

This means that if you want to have Behat coverage, you'll need to do two things:

  1. Collect local code coverage data.
  2. Instruct the remote web application to collect code coverage data itself, then fetch it.

Behat extensions for code coverage

For the workshop, I created two Behat extensions that do exactly this:

  1. LocalCodeCoverageExtension
  2. RemoteCodeCoverageExtension

The second one makes use of an adapted version of the LiveCodeCoverage tool I published earlier.

You have to enable these extensions in your behat.yml file:

default:
    extensions:
        BehatLocalCodeCoverage\LocalCodeCoverageExtension:
            target_directory: '%paths.base%/var/coverage'
        BehatRemoteCodeCoverage\RemoteCodeCoverageExtension:
            target_directory: '%paths.base%/var/coverage'
    suites:
        acceptance:
            # ...
            local_coverage_enabled: true
        system:
            mink_session: default
            # ...
            remote_coverage_enabled: true

Local coverage doesn't require any changes to the production code, but remote coverage does: you need to run a tool called RemoteCodeCoverage, and let it wrap your application/kernel in your web application's front controller (e.g. index.php):

use LiveCodeCoverage\RemoteCodeCoverage;

$shutDownCodeCoverage = RemoteCodeCoverage::bootstrap(
    (bool)getenv('CODE_COVERAGE_ENABLED'), // e.g. use an environment variable (see below)
    sys_get_temp_dir(),
    __DIR__ . '/../phpunit.xml.dist'
);

// Run your web application now...

// This will save and store collected coverage data:
$shutDownCodeCoverage();

From now on, a Behat run will generate a coverage file (.cov) in ./var/coverage for every suite that has coverage enabled (the name of the file is the name of the suite).

The arguments passed to RemoteCodeCoverage::bootstrap() allow for some fine-tuning of its behavior:

  1. Provide your own logic to determine if code coverage should be enabled in the first place (this example uses an environment variable for that). This is important for security reasons. It helps you make sure that the production server won't expose any collected coverage data.
  2. Provide your own directory for storing the coverage data files.
  3. Provide the path to your own phpunit.xml(.dist) file. This file is used for its code coverage filter configuration. You can exclude vendor/ code for example.
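
As an example of that last point, a minimal filter configuration might look like this (this uses the <whitelist> element from the PHPUnit configuration format that is current at the time of writing; the src directory is just an example):

<phpunit>
    <!-- ... -->
    <filter>
        <!-- Only collect coverage data for our own code: -->
        <whitelist>
            <directory suffix=".php">src</directory>
        </whitelist>
    </filter>
</phpunit>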

Combining coverage data from different tools and suites

If you also instruct PHPUnit to generate "PHP" coverage files in the same directory, you will end up with several .cov files for every test run, one for every suite/type of test.

vendor/bin/phpunit --testsuite unit --coverage-php var/coverage/unit.cov
vendor/bin/phpunit --testsuite integration --coverage-php var/coverage/integration.cov
vendor/bin/behat --suite acceptance
vendor/bin/behat --suite system

# These files will be created (and overwritten during subsequent runs):
# var/coverage/unit.cov, var/coverage/integration.cov,
# var/coverage/acceptance.cov, var/coverage/system.cov

Finally, you can merge these files using phpcov:

phpcov merge --html=./var/coverage/html ./var/coverage
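
In case you don't have phpcov yet: it's distributed as a PHAR and as a Composer package, so something like this should get you going (assuming the phpunit/phpcov package, installed as a development dependency):

composer require --dev phpunit/phpcov
vendor/bin/phpcov merge --html=./var/coverage/html ./var/coverage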

You'll get a nice HTML report, and for every line of code it shows you which test or feature file "touched" it:

Merged code coverage report

I hope these extensions prove to be useful; please let me know if they are. For now, the biggest downside is slowness: running code with Xdebug enabled is already slow, and collecting code coverage data along the way makes it even slower. Still, the benefits may be big enough to justify this.


Call to conference organisers: pay your workshop instructors

Posted by Matthias Noback

A little background: speakers don't get paid

Speakers like myself don't get paid for doing a talk at a tech conference. That's why I call this work "open source". People will get a video or audio recording of the talk, including separately viewable or downloadable slides for free. The idea is, a conference costs a lot of money to organise. It is quite expensive to fly in all those speakers. So there's no money to pay the speakers for all their work (for me personally it's about 80 hours of preparation, plus time spent travelling, usually half a day before and after the conference). Speakers get their travel costs reimbursed, they often get two nights at a hotel, and a ticket to the conference. Plus, they get advertising for their personal brand (increasing their reputation as an expert, a funny person, or just a person with more Google results for their name).

Workshops: also not paid

Ever since I realized that creating workshops is something I like a lot, I started submitting them to conferences as well. This, to me, is a whole different story. There are many hours of educational experience, preparation, and, again, travel going into a workshop. And delivering a workshop always costs a lot more energy than a single talk does. Often you won't get paid for any of that. Nevertheless, a workshop earns the conference organisers a lot more money than a talk does.

Workshops are often planned within the days before the main conference. Ticket prices range from 200 to 800 euro. It often happens to me that I'm delivering my workshop in front of 20-25 people. This means that the conference organisers receive 4000 to 20.000 euros per day per workshop. Surely, there will be costs involved. But even after subtracting those, a workshop instructor will often generate thousands of euros in revenue.

To be fair

I don't think it would be fair to pay workshop instructors the entire amount either (after subtracting the costs). Having a conference organise your workshop has value too:

  • They have reach: hundreds of people will know about your workshop.
  • They deal with all the administrative work (payments, registration, refunds, etc.).
  • They take care of the well-being of the attendees (parking directions, food, drinks, etc.).

Call to conference organisers

Still, I wanted to point out how wrong the situation is for tutorial/workshop instructors at many PHP conferences I know of. I want to ask you all, conference organisers: next time you organise workshop days for your attendees, make sure to pay your instructors.

Some useful suggestions, which I've learnt from conference organisers I spoke with:

  1. Pay instructors for their work day: €1000,- (plus the usual reimbursement of travel costs, and a hotel night). This isn't quite enough to cover preparation time, but it's a reasonable amount to me.
  2. Let instructors share in the revenue after costs, e.g. give them 50%. This makes up for the day and for the required preparation, and it will make for happy instructors.

In fact, Laracon EU is a conference where they do this. Shawn McCool, one of its organisers, said to me:

Paying people for their work is the right thing to do, for both ethics and sustainability.

I totally agree with him. Now, do the right thing and make that change!
