Mocking at architectural boundaries: the filesystem and randomness

Matthias Noback

March 6, 2018

In a previous article, we discussed “persistence” and “time” as boundary concepts that need mocking by means of dependency inversion: define your own interface, then provide an implementation for it. There were three other topics left to cover: the filesystem, the network and randomness.

Mocking the filesystem

We already covered “persistence”, but only in the sense that we sometimes need a way to make in-memory objects persistent. After a restart of the application we should be able to bring back those objects and continue to use them as if nothing happened.

Besides object persistence, managed by object repositories, we also encounter the need to store any number of (unstructured) bytes: a PDF file, an image, a security key, etc. Whenever that need arises, we may reach out to the filesystem, or some external file storage system, and store our data on it.

As soon as an object starts talking to a local or a remote file system, we won’t be able to write a unit test for it though. We can’t just “mock out” a file system. There are too many details about communicating with the file system that should not go untested. If we mocked it, e.g. by overriding built-in functions like fopen() and fwrite(), we might be making lots of assumptions that turn out to be wrong once the software is running in production (e.g. the directory doesn’t exist, it isn’t writable, the disk is full, etc.).

This is where a tool like vfsStream could help: it replaces the real file system with something that behaves just like an actual file system, leveraging PHP’s built-in “stream” abstraction. It might be a useful tool, but I suggest using it only in situations where you can’t introduce your own, better abstraction (e.g. in bad cases of legacy code).

A more powerful alternative for dealing with file systems in a test scenario might be a “file system abstraction” library, like Gaufrette or Flysystem, which does the same kind of thing: it offers abstractions that hold true for all adapter implementations (e.g. FTP, S3, GridFS). Hence, we can mock the filesystem, and trust that Flysystem or Gaufrette will get the actual implementation right.

This is wonderful; we don’t have to worry about all the low-level details, in the same way that Doctrine ORM and DBAL deal with a lot of the nitty-gritty details of database communication on our behalf. However, even if Doctrine DBAL offers “database abstraction”, and Flysystem offers “filesystem abstraction”, these aren’t our abstractions. Most often, our applications don’t really need a database, or a filesystem. They need a way to persist an object, or to store a piece of data.

As described in the previous post, the best solution is to define your needs as an interface first, then implement this interface using the real thing. When you write unit tests for client code of that interface, you’re free to create a test double for it. The integration tests you’ll write for the implementation of the interface itself should prove that you made the right assumptions about the third-party code, or the hardware involved. This means that your integration test should test the real thing. In case you use Flysystem with an FTP adapter, your integration test verifies that your code works well with Flysystem and its FTP adapter. So don’t replace the FTP adapter with an in-memory adapter or something similar: although its API may be the same, it’ll behave in completely different ways at runtime, and those are exactly the ways you want to capture in an integration test.
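To make this a bit more concrete, here is a minimal sketch of what such an interface could look like. All names in it (DocumentStorage, LocalDiskDocumentStorage, InMemoryDocumentStorage, ReportExporter) are hypothetical and only serve as an illustration. To keep the example self-contained, the “real” implementation talks to a local directory using plain PHP functions; in an actual project it might just as well wrap Flysystem or Gaufrette.

```php
<?php

// Hypothetical interface: it describes what the application needs
// ("store a piece of data"), not how that is accomplished.
interface DocumentStorage
{
    public function store(string $name, string $contents): void;

    public function retrieve(string $name): string;
}

// The "real thing" (here: a local directory). This is the class that an
// integration test should exercise against an actual disk.
final class LocalDiskDocumentStorage implements DocumentStorage
{
    private $directory;

    public function __construct(string $directory)
    {
        $this->directory = rtrim($directory, '/');
    }

    public function store(string $name, string $contents): void
    {
        // file_put_contents() returns false on failure
        if (@file_put_contents($this->directory . '/' . $name, $contents) === false) {
            throw new RuntimeException('Could not store ' . $name);
        }
    }

    public function retrieve(string $name): string
    {
        $contents = @file_get_contents($this->directory . '/' . $name);
        if ($contents === false) {
            throw new RuntimeException('Could not retrieve ' . $name);
        }

        return $contents;
    }
}

// Test double, meant for unit tests of client code only.
final class InMemoryDocumentStorage implements DocumentStorage
{
    private $documents = [];

    public function store(string $name, string $contents): void
    {
        $this->documents[$name] = $contents;
    }

    public function retrieve(string $name): string
    {
        if (!isset($this->documents[$name])) {
            throw new RuntimeException('Could not retrieve ' . $name);
        }

        return $this->documents[$name];
    }
}

// Client code: it only knows about our own interface.
final class ReportExporter
{
    private $storage;

    public function __construct(DocumentStorage $storage)
    {
        $this->storage = $storage;
    }

    public function export(string $reportId, string $body): void
    {
        $this->storage->store('report-' . $reportId . '.txt', $body);
    }
}

// In a unit test for the client code, the test double is enough:
$storage = new InMemoryDocumentStorage();
(new ReportExporter($storage))->export('42', 'Total: 100');
echo $storage->retrieve('report-42.txt'); // prints "Total: 100"
```

The unit test for ReportExporter never touches a disk. The integration test for LocalDiskDocumentStorage, on the other hand, should point it at a real (temporary) directory, since that’s where wrong assumptions about permissions or missing directories will surface.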

Mocking “randomness”

Generating random values is another example of something the application itself can’t do, unless it’s using predictable (pseudo-)randomness, like that produced by PHP’s mt_rand() function. In that case, although it may seem like you can’t make that function’s return value deterministic in a test, you can actually “seed” the generator by providing a number to mt_srand():

$seed = 1000;
mt_srand($seed);

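// returns the same number on every run for this seed (8 when this was
// written; the exact value may differ between PHP versions)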
$randomNumber = mt_rand(1, 10);

As you can imagine, this isn’t “true” randomness. You’d need to ask the computer for such a thing, probably through some convenience function like random_int(). Calling that function directly makes the outcome unpredictable, which makes the calling code unsuitable for a unit test. That’s why you’d need to mock requests for randomness.

Again, start with your own interface. This is a great opportunity to think about what you’re really looking for: a random int within a certain range (ask yourself, what would you call this int?), random bytes, something with a random length, etc. Make sure this is reflected in the interface, and provide an implementation for it which uses the underlying, lower-level call to the random number/byte generator.
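As a rough illustration, suppose the random int we’re looking for is the outcome of rolling a die. The names below (Die, RandomDie, LoadedDie, Player) are made up for this example; the only real PHP function involved is random_int().

```php
<?php

// Hypothetical interface, named after what the int means to the application.
interface Die
{
    /** Returns a value between 1 and 6 (inclusive). */
    public function roll(): int;
}

// Real implementation: delegates to the lower-level random number generator.
final class RandomDie implements Die
{
    public function roll(): int
    {
        return random_int(1, 6);
    }
}

// Test double: always "rolls" the value it was given.
final class LoadedDie implements Die
{
    private $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }

    public function roll(): int
    {
        return $this->value;
    }
}

// Client code: depends on the interface, not on random_int().
final class Player
{
    private $position = 0;

    public function takeTurn(Die $die): void
    {
        $this->position += $die->roll();
    }

    public function position(): int
    {
        return $this->position;
    }
}

// In a unit test, the outcome is now fully predictable:
$player = new Player();
$player->takeTurn(new LoadedDie(4));
echo $player->position(); // prints 4
```

In production you’d pass a RandomDie to the player. A test for RandomDie itself can’t assert an exact value, but it can still verify that every roll stays within the promised range.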

Mocking “the network”

Maybe you’ve noticed that I skipped a previously promised discussion about mocking “the network”. Since it’s a bigger topic, I decided to save it for another article.