Inject a repository instead of an entity manager

Posted on by Matthias Noback

It appears that I didn't make myself clear while writing about entity managers and manager registries yesterday. People were quick to reply that instead you should inject entity repositories. However, I wasn't talking about entity repositories here. I was talking about classes that get an EntityManager injected because they want to call persist() or flush(). The point of my previous post was that in those cases you should inject the manager registry, because you don't know beforehand which entity manager manages the entities you are trying to persist. By injecting a manager registry you also make your code useful in contexts where another Doctrine persistence library is used.

However, most classes, especially those close to the business of the application, are not in need of an entity manager, but of an entity repository. Those classes use an injected entity manager only to retrieve a repository:

use Doctrine\ORM\EntityManager;

class SomeCustomerRelatedClass
{
    private $entityManager;

    public function __construct(EntityManager $entityManager)
    {
        $this->entityManager = $entityManager;
    }

    public function someMethod()
    {
        $customer = $this
            ->entityManager
            ->getRepository('Acme\CustomerBundle\Entity\Customer')
            ->findOneBy(...);

        ...
    }
}

The Law of Demeter

The entity manager is only injected to retrieve some other object (the repository) on which actually useful methods are called. This is a violation of the Law of Demeter, which requires to only call methods one level deeper than the current level (i.e. we should not go further than calling methods on the injected EntityManager). This becomes especially problematic when we write unit tests for this class, because we would have to create nested mock objects.

Factory service

When you notice that a class only uses an entity manager to get a repository from it, you'd better refactor it to get the repository itself injected, instead of the entity manager. With Symfony you can define a service for the repository like this:

services:
    acme_customer.customer_repository:
        class: Doctrine\ORM\EntityRepository
        factory: ['@doctrine.orm.default_entity_manager', getRepository]
        arguments:
            - Acme\CustomerBundle\Entity\Customer

The class is a required value, though it doesn't really matter here. What matters is the factory key. It instructs the service container to call the getRepository() method on the entity manager service with the entity class name as the first argument. The result will be returned as the repository service. Read more about service factories in the official documentation of the Dependency Injection component.

Once you have defined such a service for the repository that you need in your class, you can inject the repository service itself, instead of the entity manager. This will help you follow the Law of Demeter, which makes testing easier and will make your code a bit simpler:

use Doctrine\ORM\EntityRepository;

class SomeCustomerRelatedClass
{
    private $customerRepository;

    public function __construct(EntityRepository $customerRepository)
    {
        $this->customerRepository = $customerRepository;
    }

    public function someMethod()
    {
        $customer = $this->customerRepository->findOneBy(...);

        ...
    }
}

Custom repository classes

There are some remaining design issues. The most important one is: the class above is not entirely honest about its dependencies. It does not need just an EntityRepository, it needs the entity repository for Customer entities. So you can make your code much more expressive by making use of custom repository classes. You need to create a repository class which extends EntityRepository and then you need to mention that class in the metadata of your entity:

namespace Acme\CustomerBundle\Entity;

use Doctrine\ORM\EntityRepository;

/**
 * @ORM\Entity(repositoryClass="Acme\CustomerBundle\Entity\CustomerRepository")
 */
class Customer
{
    ...
}

class CustomerRepository extends EntityRepository
{
    ...
}

Now you can inject the custom repository, which makes your intentions much clearer and the code more expressive:

use Acme\CustomerBundle\Entity\CustomerRepository;

class SomeCustomerRelatedClass
{
    private $customerRepository;

    public function __construct(CustomerRepository $customerRepository)
    {
        $this->customerRepository = $customerRepository;
    }

    ...
}

The next step would be to refactor any call to the generic find(), findBy() and findOneBy() methods, including any call that relies on the magic __call() method of the EntityRepository class. Consider this call to the CustomerRepository class:

$this->customerRepository->findBy(array(
    'city' => 'Amsterdam',
    'confirmed' -> true
));

Such a query couples your class to the actual mapping of the Customer entity, because if the mapping changes (e.g. you rename the field confirmed to isConfirmed), the code in this class needs to be modified too. You can easily prevent this coupling by introducing a specific method in your custom repository class:

class CustomerRepository extends EntityRepository
{
    public function findByCity($city, $confirmed)
    {
        return $this->customerRepository->findBy(array(
            'city' => $city,
            'confirmed' => $confirmed
        ));
    }
}

The good thing is that any mapping changes will not ripple through to other parts of the application. They will be limited to the entity class itself and its repository class.

Repository interfaces

One other coupling-related design problem is the fact that the CustomerRepository is being injected, which extends EntityRepository. This couples a class using the CustomerRepository to Doctrine ORM: you would not be able to replace the CustomerRepository with a repository that extends Doctrine\MongoDB\ODM\DocumentRepository and works with MongoDB documents, because it already extends Doctrine\ORM\EntityRepository.

The solution would be to apply the Dependency Inversion Principle here (the "D" of the SOLID design principles). It tells us that we should depend on abstractions, not on concretions. In this case we depend on a concrete thing, namely an entity repository. Instead we should depend on an abstract object repository which returns objects, whether they are persisted as documents, entities, or not persisted at all.

So for every entity repository we inject, we need to define an interface, which abstracts the actual database queries. For example:

interface CustomerRepository
{
    /**
     * @param string $city
     * @param boolean $confirmed
     * @return Customer[]
     */
    public function findByCity($city, $confirmed);
}

We don't promise to return an array, nor an ArrayCollection; we promise to return something traversable (i.e. an array or an object that implements \Traversable) the values of which are instances of Customer.

Now we rename our existing CustomerRepository class to DoctrineORMCustomerRepository or CustomerEntityRepository and we make sure it implements the new repository interface:

class DoctrineORMCustomerRepository extends EntityRepository implements CustomerRepository
{
    public function findByCity($city, $confirmed)
    {
        ...
    }
}

This allows you (or others) to define other customer repositories which don't extend EntityRepository, but use some other way of persisting and retrieving Customer objects.

You may be tempted to make the CustomerRepository interface extend the ObjectRepository interface from the Doctrine Common persistence sub-library, but I'd recommend against it, because it reintroduces coupling to Doctrine.

Something to be aware of: resetting closed entity managers

As Christophe Coevoet pointed out in a comment about a related article by Wojciech Sznapka, there is one disadvantage to immediately injecting repositories in your services. In some situations, especially in long running parallel processes, an entity manager sometimes commits a database transaction which causes an exception to be thrown, which finally causes the entity manager to close itself. The assumption then is that there probably is some discrepancy between the data in the database and the entities that are kept in memory. In order to prevent accidental damage being made to the data persisted in the database, a closed entity manager refuses to perform any new operation to the database, and whatever method you call on it, you will always get this exception:

[Doctrine\ORM\ORMException] The EntityManager is closed.

In these situations, when the entity manager is closed, you can reset the entity manager by calling resetManager() on the ManagerRegistry (e.g. the doctrine service). This will cause the reference to the existing entity manager object to be set to null. The next time an entity manager is requested, a new one will be instantiated, which does not rely on entities currently being kept in memory.

The problem is that repositories that were already injected in services like described above will not be refreshed automatically. They still rely on the closed entity manager. This would be a good case for injecting ManagerRegistry instances everywhere. However, since most applications don't suffer from the "closed entity manager" problem, I would not advise to apply this by default. But it's good to know when you run into this difficult situation.

Other interesting material: criteria

Traditionally a repository is meant to accept "criteria" objects, based on which actual database queries can be made. I read about this also on Benjamen Eberlei's blog, but I never used this in practice (I will investigate this option of course). I saw there is also native support for this in Doctrine ORM, by means of the matching() method of the EntityRepository class. Official documentation on this is missing so if you have suggestions, please let me know.

Suggestion: add a save function to your repository

One last suggestion I'd like to point out here (as mentioned by Wojciech Sznapka in a comment on my previous post) is that you can skip injecting any entity manager at all if you add a save() method on your repository, which performs persist() and flush() operation on the entity manager itself:

class DoctrineORMCustomerRepository extends EntityRepository implements CustomerRepository
{
    public function save(Customer $customer)
    {
        $this->_em->persist($customer);
        $this->_em->flush();
    }
}

You rarely need an entity manager for something else, so this really settles most issues.

PHP Symfony Doctrine ORM persistence reuse
Comments
This website uses MailComments: you can send your comments to this post by email. Read more about MailComments, including suggestions for writing your comments (in HTML or Markdown).
Ben ADAR

Thanks for the article and here's my question and this is more Doctrine 2 related.
We have some Report page that requires some ReportingCustomRepository to fetch some complex massaged data. However, Report entity does not exist, neither needs to be as it's not a table.
How from Controller you would call methods in ReportingCustomRepository?
Note: this is legacy app. - not Symphony based. It just has a doctrine 2.*

Matthias Noback

A repository can return any object (it doesn't have to extend EntittRepository or anything). Maybe https://matthiasnoback.nl/2... has some clues.

Ben ADAR

Yep. Nice design. I realize that in my case it should be some sort of ServiceLike class with DI of entity manager. Currently my design decoupled and ServiceLike class does not extends EntityRepository as you mentioned . But link you provided brings me to think that implementing Interface may be beneficial. Have to think of it. Thanks again.

Matthias Noback

Cool, thanks for letting me know, and good luck!

Geoff Maddock

Thanks for this article (and related article). This helped me resolve an issue with a project I'm building in Symfony 4.x!

vijay

@matthiasnoback:disqus
Currently we are using symfony 2.7 and 1 year back we created some services using factory like

fa.oldsay.search:
class: Doctrine\ORM\EntityRepository
factory_service: doctrine.orm.entity_manager
factory_method: getRepository
arguments: [FaOldSayBundle:OldTalk]

No issues with the above code and now also it's working fine in production server.
Now we are planning to upgrade symfony version. So we started fixing deprecated function like
factory_service and factory_method as per new symfony document. We changed above service like

fa.oldsay.search:
class: Doctrine\ORM\EntityRepository
factory: [doctrine.orm.entity_manager, getRepository]
arguments: [FaOldSayBundle:OldTalk]

Now we getting error like
RuntimeException in PhpDumper.php line 1390:
Cannot dump definition because of invalid class name ('doctrine.orm.entity_manager')
Please can you help on this issue.

Matthias Noback

I have no idea, sorry.

weaverryan

Btw, I don't believe the "entity manager is closed" is an issue anymore in Symfony 3.3, as explained by Stof himself: https://github.com/doctrine...

I just wanted to drop that info here, for clarification for anyone reading this now :).

Cheers!

Matthias Noback

Thanks for adding that here!

runawaycoin

Hey, why not make Repository classes public services, so can dependency inject them into your Controller, if you add constructor like this to your Repository, so no extra special config needed in your services.yml:

class AlbumRepository extends EntityRepository
{
public function __construct(EntityManagerInterface $em)
{
parent::__construct($em, new ClassMetadata(Album::class));
}
}

Venimus

the proper way is parent::__construct($em, $em->getClassMetadata(Album::class));

Wittestier

Simple and powerful. Thanks.

Matthias Noback

I don't want clients of my repository to have access to all Doctrine's methods. In case a client needs a new way of accessing the data, I first add a method to my own interface, then implement it in the Doctrine-specific way.

blaise

class DoctrineORMCustomerRepository extends EntityRepository implements CustomerRepository
{
public function save(Customer $customer)
{
$this->_em->persist($customer);
$this->_em->flush();
}
}
This "_em" was that I was looking for!

Matthias Noback

Although these days I definitely recommend not extending from EntityRepository but only letting it implement your own interface, then simply injecting the EntityManager.

y0da

@matthiasnoback:disqus Could you elaborate? Thx

Matthias Noback

Sure, I mean, you define your repository interface first, with very specific methods (e.g. not all the ones on Doctrine's `EntityRepository`). You will type-hint for the interface everywhere.
Then you write a class, implementing this interface, which you will use at runtime. This class gets the `EntityManager` injected. The class should have a special name, indicating that it's a Doctrine-specific implementation of your interface.

Jorge S.

Great post. Thanks

Tanguy Royere

And what if I want to persist 3 entities ? Will I inject 3 repositories just to use their save() method ? Hell no. This is not a proper solution, except for very simple cases.

webDEVILopers

In a DDD world example following the SRP and using CQRS you normally have one Service (Handler) that saves one Aggregate Root (Entity). In that fires events or other Handlers that take care of other Entites. So if you want to save 3 Aggregate Roots you have 3 Handlers, connected through events. If your other 2 Entities are part of your 1st Entity (simply "Aggregate") they should be "handled" inside your Aggregate Root and then you only persist the 1st Entity. In any case you normally inject 1 repository and only call save for 1 Entity.

Feel free to join or open your own discussion on Domain-Driven Design in PHP concentrating on Symfony Support:
https://github.com/webdevil...

mothproofsugar

Absolutely, easier to layer another mock object in your test than to inject 3 repos

Ricardo Vigatti Coelho

In your SomeCustomerRelatedClass, the variable $customerRepository is defined as private. If it was defined as public, will be violating the Law of Demeter??

I'm asking this cause i'm having to implement "duplicated" methods between my services (domain classes) and my repositories. See below:

// at the controller

$this->service->getById($id);

// at the service

$this->repository->findById($id);

Instead, i could do:

// at the controller

$this->service->repository->findById($id);

But i don't know if it's right according to the patterns...

David Sappico

In my opinion, you should go with "duplicated' methods. Because your controller should be stupid and let the service decide what to call. And maybe in the future you want too add some business logic on the call getById or improve it.

Ricardo Vigatti Coelho

Ok, i'll maintain that way. Thank you

J-P Stacey

A great roundup, although it might be worth an edit, given the factory* YAML syntax has changed in Symfony 3.x (which confuesed us yesterday for several hours.)

No longer is there separate factory_service: and factory_method; instead, there's just a single line:

factory: ['@doctrine.orm.default_entity_manager', getRepository]

See the Symfony docs for more information: http://symfony.com/doc/curr...

"Demeter, Queen of fruit and ear, bless O bless this post;
Grant our fact'ries fruitful be, that toil therein might boast.
Grip tight your calls, good objects all, or passerby will say
‘These be lines of Coldfusion; more cycles thrown away.’

Wirone

There's even newer and simpler syntax:


app.newsletter_manager:
# new syntax
factory: 'app.newsletter_manager_factory:createNewsletterManager'
# old syntax
factory: ['@app.newsletter_manager_factory', createNewsletterManager]
Enrico Stahn

Symfony 2.4:

@=service('doctrine.orm.default_entity_manager').getRepository('MycompanyMainBundle:Image')

Symfony 2.7:

@=service('doctrine').getRepository('AppBundle:EntityX')
Wirone

Your solution is great, as long as you don't need to inject something into your repository. In this case there's no risk in using "different" repositories (as far as I remember object initialized as service is not the same as repo retrieved from entity manager), which can lead to errors. When service is required (additional injections, DI tag bindings), it should be used along whole application, not mixed with $em->getRepository().

mothproofsugar

Definitely this, saves a service definition

Great articles! I've thought about the below statement myself and have debated if having a dependence on doctrine's common library is all that bad. Curious if anyone has anything to add.

> You may be tempted to make the CustomerRepository interface extend theObjectRepository interface from the Doctrine Common persistence sub-library, but I'd recommend against it, because it reintroduces coupling to Doctrine.

John

What about changing class: `Doctrine\ORM\EntityRepository` to `AcmeBundle\Entity\Repository\DoctrineORMCustomerRepository`?

Jocsonme

It;s a good method, but I have a better way to inject container to a customed Class for all Entity Repository Class

Rajesh

Pls post your sample code

craig heydenburg

I did all this but it isn't working. If I use a custom Repository, how do I change the Factory Service config? If I don't the type-hint in the constructor doesn't work.

craig heydenburg

I figured out that I had really bad XML service definitions ;-) Also, the class argument needs to be the custom repo class.

webDEVILopers

Having a similar issue here with XML service definitions. YAML works fine while XML throws "Using $this when not in object context". Any idea?

https://gist.github.com/web...

Jiri Fornous

Also a bit late, but...

I have switched to doctrine from pretty complex custom made domain model using mappers, repositories, entities with well designed aggregate boundaries. There were some pluses and also some minuses. The main reason to switch was simply the development time I've spent doing all the CUD operations. Since I used your CommandBus/EventBus for organizing the domain code, I injected necessary repositories inside the command handler to get the job done. It was pretty straight forward.

In case of Doctrine I tend to avoid traversing object net using entity methods and rather use DQL to get proper set of objects with only one join SQL query. The DQL gives me also better means to express my needs.

So I understand there are simple cases, when injecting repository instead of entity manager is cleaner solution for testability and The law of Demeter, but if you dive deeper into complexity, I prefer the entity manager injection.

Frank de Jonge

A bit of a late comment here ... but. I tried this out, it works wonderful. One correction however. I would set the class value to the actual class being returned instead of the EntityRepository just so IDE's don't slam you with warnings left and right. The value itself doesn't matter for Symfony, but it does for the Symfony2 plugin in PHPStorm.

Christian Flothmann

I would even go one step further and simply set the class to the CustomerRepository repository interface since you actually simply want to use method defined in that interface and nothing else to be able to swap implementations.

Frank de Jonge

I wouldn't go this far because the DI doesn't supply interfaces, it supplies a concrete type. Within your code you'd hint on the interfaces. Within the DI you could still have two concrete implementations of this interface. Especially with the IDE plugin, if at a later time the interface changes or the concrete repository no longer implements a certain interface, you'll get notified. Which is one of the benefits of using an IDE. If you say you're creating an interface, you'll basically instructing the DI to lie on your behalf.

Alexey Sitka

Great article and advices, thanks!

For those, who comes here and inspired - this is my version of repository.

use Doctrine\ORM\EntityRepository;

class Repository extends EntityRepository
{
    public function save($data, $flush = true)
    {
        $this->proceedData($data, 'persist', $flush);

        return $this;
    }

    public function remove($data, $flush = true)
    {
        $this->proceedData($data, 'remove', $flush);

        return $this;
    }

    public function refresh($data)
    {
        $this->proceedData($data, 'refresh', false);

        return $this;
    }

    protected function proceedData($data, $method, $flush)
    {
        if ($this->isCollection($data)) {
            foreach ($data as $entity) {
                $this->proceedEntity($entity, $method, $flush);
            }
        } else {
            $this->proceedEntity($data, $method, $flush);
        }

        return $this;
    }

    protected function proceedEntity($entity, $method, $flush)
    {
        $this->checkEntityClass($entity);
        $this->_em->$method($entity);
        if ($flush) {
            $this->_em->flush($entity);
        }

        return $this;
    }

    protected function checkEntityClass($entity)
    {
        if (!($entity instanceof $this->_entityName)) {
            throw new \InvalidArgumentException(sprintf(
                'Entity repository can not proceed with entity of other type. "%s" entity is given, but only "%s" can be proceeded.',
                get_class($entity), $this->_entityName
            ));
        }
    }

    protected function isCollection($data)
    {
        return (is_array($data) || $data instanceof \Traversable);
    }
}

Silent Works

In the example below, how are your referencing $this->customerRepository inside the CustomerRepository class? Can you provide a full class example of this please?

class CustomerRepository extends EntityRepository
{
public function findByCity($city, $confirmed)
{
return $this->customerRepository->findBy(array(
'city' => $city,
'confirmed' => $confirmed
));
}
}

I think this should be changed to:

return $this->findBy(array(...));

since it extends the EntityRepository.

Matthias Noback

A bit late but yes, you're right, you should remove the "->customerRepository" part from that statement.

(deleted)
Oleg Abrazhaev

I think it's better to add DomainSession object which will be wrapper around em and add save() method into it. And use it separately.

Matthias Noback

The "flush" part may be debatable, but I think a save() method is not bad, but its name should actually be "add()" because "save" is more of a persistence/database-related word. The repository itself makes no promises to its clients with regard to actual persistence. See also http://martinfowler.com/eaa... : "Objects can be added to and removed
from the Repository, as they can from a simple collection of objects,
and the mapping code encapsulated by the Repository will carry out the
appropriate operations behind the scenes."

Syam

A methode like Save($entity, $sync = true) should resolve this issue. Calling the Flush() method if $sync = true.

(deleted)
Matthias Noback

Thanks for mentioning the error. As I see it, the repository helps to hide the actual implementation of your storage facility.

(deleted)
Matthias Noback

You're right. And this nicely brings us back to the other discussion going on about auto-flushing.

Daniel Ribeiro

On the custom repository interfaces, I would also point out that it's interesting to keep the repository implementations in different and specific namespaces. So, instead of having a DoctrineORMCustomerRepository, you should have a Repository\Doctrine\ORM\CustomerRepository.

cordoval

i usually plug a save method with an argument for flush on my repositories, is this a bad practice? i mean i end up with both flexibilities to flush whenever I want and of course only when it is needed. Notice the save is just a method so I inject the gateway layer and proxy it to actually save with whichever persistence layer i am using. How do you see this?

Matthias Noback

Definitely, you could make the flush optional with an extra argument.

Reinier Pelayo

I know this is diverging a little from the point of your talks, but how do you feel about all those "best practices" that said to not "flush" your Managers outside the Controller? It seems that you are OK with doing it inside your "CustomerManager" class.

Matthias Noback

That's interesting. I saw Kris Wallsmith delegate the flush call to a kernel.terminate event listener, which might make sense, as long as you don't need the id's of the objects you are creating. I've never used this in practice though.

Marc Morera Merino

IMHO this is not the best approach, since you should flush all available EntityManagers... Am I wrong? Automatically flush all managers even without changes could be a problem...

And that way of flushing all "non-flushed" entities does not mean that you cannot flush an entity inside a service if you really need, for example, the entity id, right?

I also saw the Kris talk in Warsaw and, even is an option, not my favourite option :)

Matthias Noback

Well, it can be okay if 1) you don't depend on the database for the id creation (you use a UUID, which you can generate when you construct the entity) and 2) you fire your domain events *after* flush (like described here: http://www.whitewashing.de/....

Also I think you should only auto-flush when the HTTP method is non-safe (e.g. POST, PUT, DELETE). Since it would happen after the response was sent to the user, the user will not know if their action was a success. But if it is not a success, it will show up in your (error) log.
All in all, it may be interesting to find out if this approach makes sense. If you find any other pros/cons please let me know.

(deleted)
Matthias Noback

Thanks, it is really interesting to have some more points of view about this subject.

Some more philosophical questions: if Doctrine wasn't used, when would you actually persist your data? Is it essential that data is being flushed and stored in a database? If all repositories were just in-memory repositories, no flush was needed, which at least points to a violation of Liskov Substitution Principle.

The way I see it, flushing objects is actually part of the infrastructure of the application. In the domain code you only use the repository to fetch objects and the same repository to add new objects to. From within domain code it does not make sense to call flush. Calling add() on the repository is already quite sufficient to communicate the intention: keeping things in the repository for later use.

The fact that Doctrine requires a flush for this should mean that calling flush() should happen as far from the domain code as possible. E.g. outside the domain code, in the application code. Several options: 1. in the controller, like you suggest or 2. in a command wrapper. When you use a command bus, you can let it handle all commands within their own transaction. A flush at the end will store all modified entities after the command has been executed. If anything goes wrong which causes entities to be in a bad state, an exception should have been raised anyway, which would have skipped any actual persistence. After each command, the entity manager should be cleared, so entities don't end up being flushed after other commands have been executed.

Well, as you can see, your messages have sparked some fire in me. In the end, all I'm saying is, I changed my mind and I think flush should definitely not be an application-wide event (like on kernel.terminate). But it does not have to be an a controller too.

Marc Morera Merino

The point is that injecting Repositories like that way, you are assuming that Acme\CustomerBundle\Entity\Customer will be managed by default EntityManager (doctrine.orm.default_entity_manager), so... back to one of your last posts, this is not the best assumption if you are focused on any open source project.

Is there any other way to inject an specific Repository having in a variable the namaspace of the entity?

Matthias Noback

Excellent! This was a thought that was hanging around in my head too. You could modify the service definition to instead use the "getRepository" function of the manager registry (https://github.com/doctrine..., but this is a bit of a strange method, since it has an optional argument for the name of the entity manager (which doesn't make sense, since that already depends on the entity class itself). So in fact you should really do managerRegistry->getManagerForClass(...)->getRepository(...), but then you need two levels of factory services for each repository service (which is no problem I think, but still...).

Marc Morera Merino

Exactly. In fact there is an opened issue in Elcodi : https://github.com/elcodi/e... related to this problem.

IMO, the best approach to give the maximum OpenSourceability, is to inject as you are commenting previously.

This part is only definition, so should not be a problem...

Matthias Noback

Cool, that looks great.

Matt Robinson

Nice article :) Documentation for filtering with criteria isn't *entirely* missing, it's just well-hidden, somewhat short and refers only to collections. Still, there's enough to be useful: http://doctrine-orm.readthe...

Matthias Noback

Thank you, excellent. I will definitely dive into this.