In part 1 of this short series (which ends with this article) we covered how you can test-drive the queries in a repository class. Returning query results is only part of the job of a repository, though. The other part is to store objects (entities) and retrieve them again, using something like a save() and a getById() method, and possibly to delete them. Some people implement these two jobs in one repository class; others prefer to use two or even more repositories. When you have a separate write and read model (CQRS), the read model repositories will have the querying functionality (e.g. find me all the active products), and the write model repositories will have the store/retrieve/delete functionality.

In particular if you write your own mapping code (like I've been doing a lot recently), you need to write some extra tests to verify that the persistence-related activities of your repository function correctly.

Writing tests for store and retrieve actions

When you're testing a class, you're actually specifying its behavior. But you're doing that from an outsider's perspective. The test case uses an instance of your class (the subject-under-test). It calls some methods on it and checks if the resulting behavior is as expected.

How would you specify the behavior of a save() method? You could say about the repository: "It can save an entity". How would you verify that a given repository class implements this specification correctly? If the repository used Doctrine ORM, you could set up a mock for the EntityManager or a similar class/interface, and verify that the repository passes the object to its persist() method. However, as explained earlier, within repository classes mocks aren't allowed. The other option would be to make a call to that save() method and afterwards look inside the database to verify that the expected records have been inserted. However, this would tie the test to the implementation of the repository.

It seems there's no easy way in which you can find out if a call to save() has worked. But let's think about that save() method. Why is it there? Because we want to be able to later retrieve the entity that it saves. So, what if we test our save() method by also introducing its counterpart, getById()? That way, we can indirectly find out if save() has worked: getById() is expected to return an object that's equal to the object you've just persisted.

In other words, a black box test for save() can be written if you combine that test with getById():

public function it_can_save_and_retrieve_an_entity(): void
{
    // Create a basic version of the entity and store it
    $originalEntity = ...;
    $this->repository->save($originalEntity);

    // Now load it from the database
    $entityFromDatabase = $this->repository->getById($originalEntity->entityId());

    // Compare it to the entity you created for this test
    self::assertEquals($originalEntity, $entityFromDatabase);
}
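
To make the round trip concrete, here is a minimal, self-contained sketch of the same idea with hand-written mapping code. Everything in it is an assumption made for illustration: the Product entity, its toRow()/fromRow() mapping methods, the products table, and the use of an in-memory SQLite database through PDO (which needs the pdo_sqlite extension). To keep it runnable without PHPUnit, a loose == comparison stands in for the by-value check that self::assertEquals() performs in the test above.

```php
<?php
declare(strict_types=1);

// Everything below is hypothetical: the entity, the table layout and the
// SQLite in-memory database only serve to make the sketch runnable.
final class EntityNotFound extends RuntimeException
{
}

final class Product
{
    private string $id;
    private string $name;
    private bool $active;

    public function __construct(string $id, string $name, bool $active)
    {
        $this->id = $id;
        $this->name = $name;
        $this->active = $active;
    }

    public function entityId(): string
    {
        return $this->id;
    }

    // Hand-written mapping: entity state to a database row
    public function toRow(): array
    {
        return ['id' => $this->id, 'name' => $this->name, 'active' => (int) $this->active];
    }

    // Hand-written mapping: database row back to an entity
    public static function fromRow(array $row): self
    {
        return new self((string) $row['id'], (string) $row['name'], (bool) (int) $row['active']);
    }
}

final class SqliteProductRepository
{
    private PDO $connection;

    public function __construct(PDO $connection)
    {
        $this->connection = $connection;
    }

    public function save(Product $product): void
    {
        // INSERT OR REPLACE is SQLite syntax; another database needs its own upsert
        $statement = $this->connection->prepare(
            'INSERT OR REPLACE INTO products (id, name, active) VALUES (:id, :name, :active)'
        );
        $statement->execute($product->toRow());
    }

    public function getById(string $id): Product
    {
        $statement = $this->connection->prepare('SELECT id, name, active FROM products WHERE id = :id');
        $statement->execute(['id' => $id]);
        $row = $statement->fetch(PDO::FETCH_ASSOC);
        if ($row === false) {
            throw new EntityNotFound('Product not found: ' . $id);
        }

        return Product::fromRow($row);
    }
}

$connection = new PDO('sqlite::memory:');
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$connection->exec('CREATE TABLE products (id TEXT PRIMARY KEY, name TEXT NOT NULL, active INTEGER NOT NULL)');
$repository = new SqliteProductRepository($connection);

// Save, load again, and compare by value (not by object identity)
$originalEntity = new Product('product-1', 'Keyboard', true);
$repository->save($originalEntity);
$entityFromDatabase = $repository->getById($originalEntity->entityId());

var_dump($originalEntity == $entityFromDatabase);
```

The comparison only succeeds if every property survives the trip through the mapping code, which is the point of the test: a forgotten column or a wrong type cast in toRow() or fromRow() shows up as a failed comparison.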

State changes, child entities, etc.

Usually an entity doesn't get persisted just once. You'll be modifying it, persisting it again, adding child entities to it, removing them again, etc. So, for every situation like this, I like to write another test method, showing that all this really works, e.g.

public function it_can_save_child_entities(): void
{
    // Create a basic version of the entity and store it
    $originalEntity = ...;
    // Add some child entity
    $originalEntity->addChildEntity(...);
    $this->repository->save($originalEntity);

    // Now load it from the database
    $entityFromDatabase = $this->repository->getById($originalEntity->entityId());

    // Compare it to the entity as we've set it up for this test
    self::assertEquals($originalEntity, $entityFromDatabase);
}

Sometimes it makes sense to add intermediate save() and getById() calls, e.g.

public function it_can_save_child_entities_added_after_the_first_save(): void
{
    // Create a basic version of the entity and store it
    $originalEntity = ...;
    $this->repository->save($originalEntity);
    // Load and save again, now with an added child entity
    $originalEntity = $this->repository->getById($originalEntity->entityId());
    $originalEntity->addChildEntity(...);
    $this->repository->save($originalEntity);

    // Now load it from the database
    $entityFromDatabase = $this->repository->getById($originalEntity->entityId());

    // Compare it to the entity as we've set it up for this test
    self::assertEquals($originalEntity, $entityFromDatabase);
}
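
A small sketch can show what the intermediate save() and getById() calls buy you. It assumes a hypothetical Order aggregate and a deliberately buggy repository that only writes child rows the first time an entity is saved; plain arrays stand in for the database tables so the example runs on its own. None of these names come from a real library.

```php
<?php
declare(strict_types=1);

// Hypothetical aggregate with child entities (order lines), invented for this sketch
final class Order
{
    private string $id;
    /** @var string[] */
    private array $lines = [];

    public function __construct(string $id)
    {
        $this->id = $id;
    }

    public function entityId(): string
    {
        return $this->id;
    }

    public function addChildEntity(string $line): void
    {
        $this->lines[] = $line;
    }

    /** @return string[] */
    public function lines(): array
    {
        return $this->lines;
    }
}

// Deliberately buggy repository: child rows are only written on the first save
final class BuggyOrderRepository
{
    /** @var array<string, true> stand-in for an "orders" table */
    private array $orderRows = [];
    /** @var array<string, string[]> stand-in for an "order_lines" table */
    private array $lineRows = [];

    public function save(Order $order): void
    {
        $id = $order->entityId();
        if (isset($this->orderRows[$id])) {
            // BUG: an existing order is "updated" without touching its lines
            return;
        }
        $this->orderRows[$id] = true;
        $this->lineRows[$id] = $order->lines();
    }

    public function getById(string $id): Order
    {
        if (!isset($this->orderRows[$id])) {
            throw new RuntimeException('Order not found: ' . $id);
        }
        $order = new Order($id);
        foreach ($this->lineRows[$id] as $line) {
            $order->addChildEntity($line);
        }

        return $order;
    }
}

// Child entity added before the first save: the bug goes unnoticed
function roundTripWithChildAddedBeforeFirstSave(BuggyOrderRepository $repository): bool
{
    $originalEntity = new Order('order-1');
    $originalEntity->addChildEntity('line A');
    $repository->save($originalEntity);

    return $originalEntity == $repository->getById('order-1');
}

// Save, reload, add a child entity, save again: the bug is exposed
function roundTripWithChildAddedAfterReload(BuggyOrderRepository $repository): bool
{
    $repository->save(new Order('order-2'));

    $originalEntity = $repository->getById('order-2');
    $originalEntity->addChildEntity('line A');
    $repository->save($originalEntity);

    return $originalEntity == $repository->getById('order-2');
}

var_dump(roundTripWithChildAddedBeforeFirstSave(new BuggyOrderRepository())); // bool(true)
var_dump(roundTripWithChildAddedAfterReload(new BuggyOrderRepository()));     // bool(false)
```

Both scenarios end with the same in-memory object, but only the second one exercises the update path of save(), so only the second one notices that the added child entity never reached storage.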

Deleting entities

Finally, a repository may offer a delete() method. This one needs testing too. Deleting is always scary, in particular if you somehow forget to add proper WHERE clauses to your DELETE statements (who didn't, at least once?).

So we should verify that everything related to a single entity has been deleted, but nothing else. How can you do this? Well, if you want black box testing again, you could save two entities, delete one, and check that the other one still exists:

public function it_can_delete_an_entity(): void
{
    // Create the entity
    $originalEntity = ...;
    $originalEntity->addChildEntity(...);
    $this->repository->save($originalEntity);

    // Create another entity
    $anotherEntity = ...;
    $anotherEntity->addChildEntity(...);
    $this->repository->save($anotherEntity);

    // Now delete that other entity
    $this->repository->delete($anotherEntity);

    // Verify that the first entity still exists
    $entityFromDatabase = $this->repository->getById($originalEntity->entityId());
    self::assertEquals($originalEntity, $entityFromDatabase);

    // Verify that the second entity we just removed, can't be found
    $this->expectException(EntityNotFound::class);
    $this->repository->getById($anotherEntity->entityId());
}

Or, if you like, you could let go of the black box aspect and populate the database with some entity and child entity records that you want to prove will still exist after you delete a single entity.
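
As a rough sketch of that non-black-box variant: seed the tables with plain SQL, run the delete, and count what is left. The schema (orders and order_lines), the deleteOrder() function standing in for a repository's delete() method, and the countRows() helper are all hypothetical, and an in-memory SQLite database via PDO is used only to keep the example self-contained.

```php
<?php
declare(strict_types=1);

// Hypothetical schema: table and column names are invented for this sketch
function createSchema(PDO $connection): void
{
    $connection->exec('CREATE TABLE orders (id TEXT PRIMARY KEY)');
    $connection->exec('CREATE TABLE order_lines (order_id TEXT NOT NULL, description TEXT NOT NULL)');
}

// Stand-in for a hand-written delete() method; note the WHERE clauses
function deleteOrder(PDO $connection, string $orderId): void
{
    $connection->prepare('DELETE FROM order_lines WHERE order_id = :id')->execute(['id' => $orderId]);
    $connection->prepare('DELETE FROM orders WHERE id = :id')->execute(['id' => $orderId]);
}

// Test helper: table and column names are fixed by the test, never user input
function countRows(PDO $connection, string $table, string $column, string $orderId): int
{
    $statement = $connection->prepare("SELECT COUNT(*) FROM {$table} WHERE {$column} = :id");
    $statement->execute(['id' => $orderId]);

    return (int) $statement->fetchColumn();
}

$connection = new PDO('sqlite::memory:');
$connection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
createSchema($connection);

// Seed two entities with child records directly, bypassing the repository
$connection->exec("INSERT INTO orders (id) VALUES ('order-1'), ('order-2')");
$connection->exec(
    "INSERT INTO order_lines (order_id, description)
     VALUES ('order-1', 'A'), ('order-1', 'B'), ('order-2', 'C')"
);

deleteOrder($connection, 'order-2');

// The untouched entity keeps its records, the deleted one leaves nothing behind
var_dump(countRows($connection, 'orders', 'id', 'order-1'));
var_dump(countRows($connection, 'order_lines', 'order_id', 'order-1'));
var_dump(countRows($connection, 'orders', 'id', 'order-2'));
var_dump(countRows($connection, 'order_lines', 'order_id', 'order-2'));
```

The trade-off is the one described above: this test catches orphaned child rows that getById() would never reveal, but it now knows table and column names and has to change whenever the storage layout does.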

Ports & adapters

If you write purely black box tests for your write model entity/aggregate repository (that is, for save(), getById() and delete()), the test cases themselves won't mention anything about the underlying storage mechanism of the repository. You won't find any SQL queries in your test. This means that you could rewrite the repository to use a completely different storage mechanism, and your test wouldn't need to be modified.

This amounts to the same thing as applying a Ports and Adapters architectural style, where you separate the port (i.e. "Persistence") from its specific adapter (i.e. a repository implementation that uses SQL). This is very useful, because it allows you to write fast acceptance tests for your application against code in the Application layer, and use stand-ins or "fakes" for your repositories. It also helps you decouple domain logic from infrastructure concerns, allowing you to replace that infrastructure code and migrate to other libraries or frameworks whenever you want to.
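
A minimal sketch of what such a port and a fake adapter could look like, assuming a hypothetical Product entity and ProductRepository interface (neither name comes from the article or from a library). The check at the end only talks to the interface, so the same function could be pointed at an SQL-backed implementation as well.

```php
<?php
declare(strict_types=1);

// All names below are hypothetical and only illustrate the port/adapter split
final class EntityNotFound extends RuntimeException
{
}

final class Product
{
    private string $id;
    private string $name;

    public function __construct(string $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function entityId(): string
    {
        return $this->id;
    }
}

// The port: what the application needs from persistence
interface ProductRepository
{
    public function save(Product $product): void;

    public function getById(string $id): Product;

    public function delete(Product $product): void;
}

// A fake adapter for fast tests; an SQL adapter would implement the same interface
final class InMemoryProductRepository implements ProductRepository
{
    /** @var array<string, Product> */
    private array $products = [];

    public function save(Product $product): void
    {
        // Store a copy, so later changes to the caller's object require another save()
        $this->products[$product->entityId()] = clone $product;
    }

    public function getById(string $id): Product
    {
        if (!isset($this->products[$id])) {
            throw new EntityNotFound('Product not found: ' . $id);
        }

        return clone $this->products[$id];
    }

    public function delete(Product $product): void
    {
        unset($this->products[$product->entityId()]);
    }
}

// A black box check that only knows about the port, not about any adapter
function canSaveRetrieveAndDelete(ProductRepository $repository): bool
{
    $originalEntity = new Product('product-1', 'Keyboard');
    $repository->save($originalEntity);
    if ($originalEntity != $repository->getById($originalEntity->entityId())) {
        return false;
    }

    $repository->delete($originalEntity);
    try {
        $repository->getById($originalEntity->entityId());
    } catch (EntityNotFound $exception) {
        return true;
    }

    return false;
}

var_dump(canSaveRetrieveAndDelete(new InMemoryProductRepository()));
```

Cloning on save() and getById() is a design choice for the fake: it mimics the way a real database hands back a fresh object, so a test cannot pass by accident just because the repository and the test share one object instance.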

Conclusion

By writing all these repository tests you specify what the repository should be capable of, from the perspective of its users. Specifying and verifying different use cases proves that the repository is indeed capable of storing, retrieving and deleting its entities correctly. Black box testing is most effective for this purpose: you make sure the repository test is completely decoupled from the repository's underlying storage mechanism. If you can accomplish this, you can rewrite your repository using a different storage mechanism, and prove that everything still works afterwards.

Comments
Marcelo Chiaradía
This is an interesting approach. Something to note is that this also requires the equality method to be implemented and tested. Also, for entities, equality is defined by their identity (and not their attributes), so I wonder how this approach would work in such cases.
Matthias Noback

Interesting point. Normally, you're right about not comparing entities by values but by ID. In this case however, we do compare by value, to somehow show that no data was lost. Another way is to look at it in a more behavioral way, and test that the entity behaves the same before and after storing. But that seems rather elaborate.

Ondřej Frei

@matthiasnoback, thanks for this article!

I have a question about the sentence "Sometimes it makes sense to add intermediate save() and getById() calls, e.g.": what is the advantage of the example below it over the one above it? They seem to cover pretty much the same thing.

Matthias Noback

Well, I like to sometimes see that intermediate steps also get "reloaded" correctly. Otherwise, I may be looking at something that is correct at the object level, but isn't reflected in the database.

Melyou

Nice article, Matthias.
Could you please explain, from your point of view, the pros and cons of testing repositories (this could be a different blog post)?

For example, the Symfony team does not recommend testing repositories (from the Symfony documentation: "Unit testing Doctrine repositories in a Symfony project is not recommended" - https://symfony.com/doc/cur...), but they don't explain why.

Matthias Noback

That is in fact a good recommendation. It doesn't make sense to test the repository without the actual database that's going to be used.

Melyou

I probably misunderstood that line. So if I am using Symfony to create my shiny, beautiful & awesome website, doesn't that count as "a Symfony project"?

Well, I think I got it.
Testing Symfony repositories should be done with a functional test and not a unit test (sorry, I am just starting to learn about the world of TDD).

Thank you, Matthias.

Matthias Noback

That's correct. It's indeed about terminology: a unit test is a test that doesn't use I/O; no database calls, in fact, no network calls whatsoever, no filesystem calls, etc. So a unit test isn't much of a help for a repository test.

Kristof Van Cauwenbergh

When testing the delete, you’re not actually testing if the entity is removed from the database, but you’re testing if it didn’t delete another entity. Which is great, but shouldn’t you also assert if the deleted entity is actually deleted?

Matthias Noback

Oops, that should've been another example - testing for a RuntimeException ("not found").

Александр Оганов

Thanks for writing. But this time there is little useful information.

Matthias Noback

Haha, you're right. I realize that now :) Will blog more about this soon, when I'm releasing TalisORM, which combines some of these insights.

Timothy Rourke

Do you eschew seeding the database in a test setup step to avoid leaking implementation details about the underlying storage, or is it acceptable to limit this kind of SQL to test code?

I'm expecting some degree of seeding would be important for testing complex query methods on your read model repository.

On a related note, I'm also wondering whether you think it is good practice to test-drive repositories against the exact database you intend to use in production, or whether the performance benefits of using something like an in-memory SQLite database outweigh the risks of not testing SQL code against the exact target syntax you plan to use in production. The goal of not leaking implementation details about the backing storage is good, but isn't part of what must be tested quite tightly coupled to how a given database works? Or do the infrastructure tests go somewhere else, whereas the higher-level integration tests exist to test the interface only?

Matthias Noback

Maybe you've read it (https://matthiasnoback.nl/2...), but I think ideally you would use only the normal way of creating objects to get them in your database. For the current project however, we're just using SQL, since we explicitly want to test for cases that aren't even covered by the regular execution path (because our application isn't the only one manipulating that same database). So in that case, writing SQL for fixtures isn't a problem.

I prefer writing repository tests using a real database, and also the one you're going to use in production. It's so easy to make mistakes with it. I also wonder how bad it is really, for performance. You just shouldn't load the entire schema, and all the data you can think of.

Timothy Rourke

Thanks!!
