Doctrine ORM and DDD aggregates

Posted on by Matthias Noback

I'd like to start this article with a quote from Ross Tuck's article "Persisting Value Objects in Doctrine". He describes different ways of persisting value objects when using Doctrine ORM. At the end of the page he gives us the following option - the "nuclear" one:

[...] Doctrine is great for the vast majority of applications but if you’ve got edge cases that are making your entity code messy, don’t be afraid to toss Doctrine out. Setup an interface for your repositories and create an alternate implementation where you do the querying or mapping by hand. It might be a PITA but it might also be less frustration in the long run.

As I discovered recently, you don't need an edge case to drop Doctrine ORM altogether. But since there are lots of projects using Doctrine ORM, with developers working on them who would like to apply DDD patterns to it, I realized there is probably an audience for a few practical suggestions on storing aggregates (entities and value objects) with Doctrine ORM.

Designing without the ORM in mind

When you (re)learn how to design domain objects using Domain-Driven Design patterns, you first need to get rid of the idea that the objects you're designing are ever going to be persisted. It's important to stay real about your domain model though; its state definitely needs to be persisted some day, or else the application won't meet its acceptance criteria. But while designing, you should not let the fact that you're using a relational database get in the way. Design the objects in such a way that they are useful, that you can do meaningful things with them, and that they are trustworthy; you should never encounter incomplete or inconsistent domain objects.

Still, at some point you're going to have to consider how to store the state of your domain objects (after all, your application at one point is going to shut down and when it comes up, it needs to have access to the same data as before it was restarted). I find that, when designing aggregates, it would be best to act as if they are going to be stored in a document database. The aggregate and all of its parts wouldn't need to be distributed across several tables in a relational database; the aggregate could just be persisted as one whole thing, filed under the ID of the aggregate's root entity.

More common however is the choice for a relational database, and in most projects such a database comes with an ORM. So then, after you've carefully designed your aggregate "the right way", the question is: how do we store this thing in our tables? A common solution is to dissect the aggregate along the lines of its root entity and optionally its child entities. Consider an example from a recent workshop: we have a purchase order and this order has a number of lines. The PurchaseOrder is the root entity of the aggregate with the same name. The Line objects are the child entities (i.e. they have an identity - a line number - which is only unique within the aggregate itself). PurchaseOrder and Line all have value objects describing parts or aspects of these entities (i.e. the product ID that was ordered, the quantity that was ordered, the supplier from whom it was ordered, and so on). This would be a simplified version of PurchaseOrder and Line:

<?php

final class PurchaseOrder
{
    /**
     * @var PurchaseOrderId
     */
    private $id;

    /**
     * @var SupplierId
     */
    private $supplierId;

    /**
     * @var Line[]
     */
    private $lines = [];

    private function __construct(
        PurchaseOrderId $purchaseOrderId, 
        SupplierId $supplierId
    ) {
        $this->id = $purchaseOrderId;
        $this->supplierId = $supplierId;
    }

    public static function create(
        PurchaseOrderId $purchaseOrderId, 
        SupplierId $supplierId
    ): PurchaseOrder 
    {
        return new self($purchaseOrderId, $supplierId);
    }

    public function addLine(
        ProductId $productId, 
        OrderedQuantity $quantity
    ): void
    {
        $lineNumber = count($this->lines) + 1;

        $this->lines[] = new Line($lineNumber, $productId, $quantity);
    }

    public function purchaseOrderId(): PurchaseOrderId
    {
        return $this->id;
    }

    // ...
}

final class Line
{
    /**
     * @var int
     */
    private $lineNumber;

    /**
     * @var ProductId
     */
    private $productId;

    /**
     * @var OrderedQuantity
     */
    private $quantity;

    public function __construct(
        int $lineNumber, 
        ProductId $productId, 
        OrderedQuantity $quantity
    ) {
        $this->lineNumber = $lineNumber;
        $this->productId = $productId;
        $this->quantity = $quantity;
    }

    // ...
}

To turn the PurchaseOrder into something manageable by Doctrine, we need to do several things:

  1. Mark the entity class as a Doctrine entity.
  2. Map the class attributes to database columns, providing the correct types for them.
  3. Use Doctrine Collections instead of arrays for one-to-many relations.

We can accomplish step 1. by using just one annotation:

/**
 * @ORM\Entity()
 */
final class PurchaseOrder
{

But when mapping the entity's attributes to database columns, we already get into trouble because, for instance, the $id attribute contains a value of type PurchaseOrderId. Doctrine doesn't know what to do with it. The first try might be to set up custom DBAL types to handle the conversion from and to the PurchaseOrderId type. This leads to a lot of boilerplate code for every custom attribute type we have, while there's a much simpler solution at hand. We just need to apply the following rule: only keep primitive-type values (string, integer, boolean, float) in the attributes of a Doctrine entity. That way, we don't need to do any complicated mapping, and we can just use Doctrine's basic types, string, integer, etc. Whenever you still need the value object, you just create it again (see supplierId() below):

/**
 * @ORM\Column(type="string")
 * @var string
 */
private $supplierId;

private function __construct(
    PurchaseOrderId $purchaseOrderId, 
    SupplierId $supplierId
) {
    // ...

    // convert to a string
    $this->supplierId = $supplierId->asString();
}

public function supplierId(): SupplierId
{
    // and back to an object, if necessary
    return SupplierId::fromString($this->supplierId);
}

When it comes to the identifier: in this case we define the identity ourselves (we don't wait for the database to return an auto-incremented integer), and we should tell Doctrine about that:

/**
 * @ORM\Id()
 * @ORM\GeneratedValue(strategy="NONE")
 * @ORM\Column(type="string")
 * @var string
 */
private $id;

private function __construct(
    PurchaseOrderId $purchaseOrderId, 
    SupplierId $supplierId
) {
    $this->id = $purchaseOrderId->asString();
    // ...
}

Finally, we need to configure the one-to-many relationship between order and lines. Doctrine requires us to use a Collection for the lines, and it also requires the owning side of the relation (which is the "many" part), to carry a reference to the inverse side (the "one" part). In our case, the Line needs to have a reference to the PurchaseOrder it belongs to. This means we need to modify the addLine() method a bit too:


/** * @ORM\OneToMany( * targetEntity="Line", * mappedBy="purchaseOrder", * cascade={"PERSIST"} * ) * @var Collection|Line[] */ private $lines; private function __construct( PurchaseOrderId $purchaseOrderId, SupplierId $supplierId ) { // ... $this->lines = new ArrayCollection(); } public function addLine( ProductId $productId, OrderedQuantity $quantity ): void { $lineNumber = \count($this->lines) + 1; // we also pass $this (the PurchaseOrder) to the Line: $this->lines[] = new Line($this, $lineNumber, $productId, $quantity); }

Note that, although we have to use a Doctrine Collection internally, we can easily hide that fact and still return an array to clients of PurchaseOrder:

public function lines(): array
{
    return $this->lines->toArray();
}

For the Line class we need to perform more or less the same steps. However, there are some interesting things to note:

  1. In our model we don't need or want Line to have its own ID, but Doctrine requires it to have one. So we just add it as a private field (it's not a problem to make this one an auto-incremented integer), and we never expose it as part of the object's API.
  2. In our model, Line is a child entity of PurchaseOrder. For Doctrine, it's like any other entity; you could fetch separate Line objects from an EntityRepository if you like. We can make it clear that this should not happen by only defining interfaces and implementations for aggregate repositories, so we'll have a PurchaseOrderRepository interface, but not a LineRepository interface. Still, you could always talk to the EntityManager directly and retrieve separate Line objects from it, but... you just shouldn't do that.

The Line class, when it has been converted to a Doctrine entity, looks like this:

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity()
 */
final class Line
{
    /**
     * @ORM\Id()
     * @ORM\GeneratedValue(strategy="AUTO")
     * @ORM\Column(type="integer")
     * @var int
     */
    private $id;

    /**
     * @ORM\ManyToOne(targetEntity="PurchaseOrder")
     * @var PurchaseOrder
     */
    private $purchaseOrder;

    /**
     * @ORM\Column(type="integer")
     * @var int
     */
    private $lineNumber;

    /**
     * @ORM\Column(type="string")
     * @var string
     */
    private $productId;

    public function __construct(
        PurchaseOrder $purchaseOrder, 
        int $lineNumber, 
        ProductId $productId, 
        OrderedQuantity $quantity
    ) {
        $this->purchaseOrder = $purchaseOrder;
        $this->lineNumber = $lineNumber;
        $this->productId = $productId->asString();
        $this->quantity = $quantity->asFloat();
    }

    // ...
}

And all of this works well; you can store these objects without any trouble in a relational database. Summarizing:

  • Attributes only contain primitive-type values. If a client needs the value objects, reconstruct them when needed.
  • Child entities need their own ID; just give it to them.
  • The owning side in one-to-many relations needs to carry a reference to the other side.

What about embeddables, custom DBAL types, life-cycle event subscribers?

There are plenty of almost-solutions to the make-DDD-work-with-Doctrine problem, but in my experience they complicate things a lot. Look at the above examples; everything is so simple; hooking into Doctrine ORM or DBAL internals isn't going to make it much better. All you need is short, one-line transformations between your rich domain objects and the simpler value types that Doctrine understands, and you're done.

Doctrine wants whole objects, not just IDs

As you know, Doctrine makes relations between entities by passing around entire object references (not just IDs). But, as you may also know, one of the aggregate design rules is: Reference Other Aggregates By Identity:

Prefer references to external aggregates only by their globally unique identity, not by holding a direct object reference (or “pointer”).

Vaughn Vernon, "Effective Aggregate Design, Part II: Making Aggregates Work Together" (PDF)

How could this ever work with Doctrine ORM? Well, it Just Works. You need to follow the advice, and you're done already. The example above shows how it works: you pass in the ID to an external aggregate (in this case SupplierId), and set it - that's it.

What about foreign constraints then?

Without mapping the relationships using the real objects, we don't have the automatic convenience of Doctrine creating foreign key constraints for those relations. So, what about them?

Here's a challenging idea: I don't think you need foreign key constraints at all.

But if you want to, you can still enforce them of course. Just do a manual schema migration to add the index. You probably need an index anyway, to speed things up on the query-side (you can add them through Doctrine's mapping configuration too if you like).

Annotations - they pollute my clean domain model!

"I have this nice and clean domain model, no sign of external dependencies, no sign of persistence. And now we have these ugly annotations. Can we please get rid of them?"

Yes, of course, you could move the mapping configuration to some place else. To a Yaml file for example. Although the Doctrine team has decided to drop support for Yaml. But there's always XML.

Anyway, I don't think you should feel too bad about these annotations. It's very useful that they are right next to the things they configure (the attributes, and how they map to database columns). But also, you need to adapt your domain model to your ORM anyway (e.g. use Collections, pass the entire root entity into its child entities, etc.). So a few annotations seems to me like quite an innocent addition to this list of necessary changes.

"DDD" says: one transaction should persist only changes made to one aggregate

From DDD we know that every transaction may contain only the changes made to one aggregate:

A properly designed aggregate is one that can be modified in any way required by the business with its invariants completely consistent within a single transaction. And a properly designed bounded context modifies only one aggregate instance per transaction in all cases.

Vaughn Vernon, "Effective Aggregate Design, Part I: Modeling a Single Aggregate" (PDF)

This is a bit of an issue, when we compare this to how Doctrine ORM works:

The state of persistent entities is synchronized with the database on flush of an EntityManager which commits the underlying UnitOfWork. The synchronization involves writing any updates to persistent entities and their relationships to the database.

A flush will persist all changes to any entity that was created and persist()-ed, removed or modified, up to this point in time. So you need a way to overcome this problem.

There are usually two kinds of solutions: technical solutions and people solutions. In the first category you may consider using dedicated EntityManagers for each aggregate type. This may be a bit too much though, although it's quite doable, but hard to maintain. Another technical solution would be to at least always call EntityManager::clear() after a flush(), which would force the entity manager to let go of any entities it would otherwise revisit upon the next flush().

The better solution is to consider the people solution: just don't persist changes to multiple aggregates at once. This requires discipline, and maybe a bit of redesign every now and then. But your domain model will end up in a better shape (if you ask the DDDeity, that is). Also, since it's a people solution now, you can deviate from it, and still persist multiple aggregates in one go (when you're updating in batches, or as part of truly shady operations).

What about collections of value objects?

Both embeddables and DBAL types are famous for their inability to easily store collections of value objects. And Doctrine is not to blame for that - it's the relational database itself that doesn't allow multi-dimensional columns. So, if you're facing this situation, promote those value objects to child entities. This is the only slightly larger sacrifice that you may have to make, in the name of Doctrine.

PHP DDD value objects entity identity Comments

Road to dependency injection

Posted on by Matthias Noback

Statically fetching dependencies

I've worked with several code bases that were littered with calls to Zend_Registry::get(), sfContext::getInstance(), etc. to fetch a dependency when needed. I'm a little afraid to mention façades here, but they also belong in this list. The point of this article is not to bash a certain framework (they are all lovely), but to show how to get rid of these "centralized dependency managers" when you need to. The characteristics of these things are:

  • They offer global access; they are not internal to classes, but external, and thereby reachable from any scope.
  • They offer static access; they don't require you to have an instance of the object.

This means they can and will be called all over the place. If someone feels the need to use the translator service in a domain model, they can simply call Zend_Registry::get('Zend_Translate'). If they need to send an e-mail in a JSON mapper, there's sfContext::getInstance()->getMailer().

Part of the issue is that these centralized dependency managers provide access to other things than services, like user session data, request parameters, etc. (see also my previous article about this topic - "Context passing"). But when it comes to fetching services from these global service registries/locators, the main issue is that we're coupling all our code to this specific singleton thing.

Fetching the service locator

When trying to get rid of that-framework-that-was-totally-it-in-2012, we are in big trouble because of all that coupling. The setup of our services is not completely under our control. So, when migrating to a more recent service container implementation, like Symfony's dependency injection container, the first migration step might be to reuse the old mechanism for retrieving dependencies, making the new service container available to all the old code:

/*
 * At some point, we'll build and boot the Symfony service container.
 * We then inject it right into the old `Zend_Registry`.
 */
$container = ...;
Zend_Registry::set('container', $container);

/*
 * Anywhere else, we can retrieve the container and get a service from it.
 */
$container = Zend_Registry::get('container');
$translator = $serviceLocator->get('translator');

Since we can now define the translator service in any way that the dependency injection component of our choice supports, we can now migrate all existing uses of Zend_Registry::get('Zend_Translate') to the slightly better Zend_Registry::get('container')->get('translator').

"Service container" vs "service locator"

It's good to point out that the code in the example above uses the service container as a service locator. This is by far inferior, since we're still fetching services manually. This requires us to know the service ID and be aware of the service retrieval mechanism itself. Dependency injection is more useful if you leave all the instantiation details and dependency resolving to the service container (also known as "dependency injection container") itself.

Injecting the service locator

Ideally, the framework should be able to route a request to a controller, defined as a service in the service container (or dependency injection container). I know that Symfony is capable of that. It would retrieve the fully instantiated controller, and no other object would ever need the service locator anymore. But this isn't how most frameworks work (at least the older ones). So we need to be a bit more realistic. As an example, most applications still have multiple actions per controller class. Since all these actions use a different set of dependencies, the controller gets the service locator injected, and the actions can all take out the services they need.

So, as a compromise, where should we allow the use of a service locator?

  • Inside controllers for (web or CLI)
  • Inside controller helpers (in the case of Zend Framework), since they are still close to the infrastructure/outer boundary of the application.

Definitely not in any other service, only as an intermediate step on the road to dependency injection.

Performance

The question always seems to be: if we inject everything upfront, this means we also need to instantiate everything upfront. Some dependency injection components allow lazy loading for this. What gets injected is a proxy of the real object, that still satisfies any type-hints used. Once a method gets called on the proxy, it will instantiate the real service, including its real dependencies. However, if you would set up controllers-as-a-service, and would have only one action per controller, it would not be a problem to instantiate everything upfront. There's no performance penalty, since everything that gets instantiated, will in fact be used in the current request.

Pretending dependency injection

The solution I picked for some services in a recent migration was - contrary to the rule stated above - to inject the container itself, but the constructor still sets up the services as if they were injected, like this:


final class SomeService { private $translator; public function __construct(Container $container) { $this->translator = $container->get('translator'); } }

Service locator as a transitional phase

Once you're able to fetch dependencies in the constructor of a service, you're still fetching dependencies, but you're already preparing it for dependency injection. Once you are in a place where you can create the service with all its dependencies in one go, you can "externalize" the setup code:

final class Container
{
    public function getSomeService(): SomeService
    {
        return new SomeService(new Translator(...));
    }
}

final class SomeService
{
    private $translator;

    public function __construct(Translator $translator)
    {
        /*
         * No longer inject the container, but expect a translator to be 
         * provided.
         */

        $this->translator = $translator;
    }
}

At this point, SomeService doesn't use the service container as a service locator anymore. It simply accepts all its dependencies, unaware of where they come from.

When constructor injection isn't possible

In fact, when you're in complete control of how an object (service) gets instantiated, you shouldn't inject the whole container/service locator, but only the dependencies that are relevant to this service. Consider a counter-example, where, like in Zend Framework 1, framework-related objects (but also controllers) are almost always constructed like this:

$object = new $className();

You get no chance to provide constructor arguments, so you need to settle with inferior ways of injecting dependencies, e.g. setter injection, or property injection.

I recently applied both solutions to get the service container where I wanted it to be. The following example uses setter injection:

final class SomeService
{
    private static $serviceLocator;

    public static function setServiceLocator(ServiceLocator $locator)
    {
        self::$serviceLocator = $locator;
    }

    private function get(string $serviceId)
    {
        if (!self::$serviceLocator instanceof ServiceLocator) {
            throw new \LogicException(
                'Call setServiceLocator() first'
            );
        }

        return self::$serviceLocator->get($serviceId);
    }

    public function someMethod(): void
    {
        // ...

        $this->translator()->translate(...);
    }

    private function translator(): Translator
    {
        return $this->get('translator');
    }
}

// don't forget to do this first, before using an instance of SomeService
SomeService::setServiceLocator(...);

Introducing the static setter, but also a private non-static getter, gives us a false sense of encapsulation. But it also makes the translator feel like an injected dependency. If we're ever able to have proper dependency injection, we can just replace $this->translator() with $this->translator.

A more crude, but also more honest version of this style of "injection" could be to use (static) field injection:

final class SomeService
{
    public static $serviceLocator;
}

// don't forget to do this first, before using an instance of SomeService
SomeService::$serviceLocator = ...;

The advantage of the first approach (at least for PHP) is that you can only call setServiceLocator() with an object of the proper type (ServiceLocator). If you use the second approach, PHP allows you to assign any type of value to the public static attribute. Also, when using the private getter to locate a specific service, we get a convenient LogicException telling us that we should first call setServiceLocator().

Finally, you could also decide to inject specific services, like only the translator:

final class SomeService
{
    private static $translator;

    public static function setTranslator(Translator $translator)
    {
        self::$translator = $translator;
    }

    // ...
}

// don't forget to do this first, before using an instance of SomeService
SomeService::setTranslator(...);

I found that for a Symfony application, any bundle's boot() method is a good place to inject the service container statically into some classes.

final class AppBundle extends Bundle
{
    // ...

    public function boot()
    {
        SomeService::setServiceLocator($this->container);

        // or

        SomeService::setTranslator($this->container->get('translator'));
    }
}

Is setter/field injection better than retrieving the service locator?

There is a big difference between this approach of injecting the service locator using a public setter or field, and the approach we saw earlier where we fetch the service container from Zend_Registry for example. If we inject the service locator, at least we're not coupled to the global static centralized dependency manager anymore (be it Zend_Registry, sfContext or anything else). This is a big win in terms of flexibility and your ability to migrate the code base to another framework, dependency injection component, etc.

Temporal coupling and how to deal with it

The biggest issue with setter or field injection is in fact displayed by this annoying LogicException: we need to first make a certain call (setServiceLocator()), before it's safe to use the service. That's why earlier I called these types of injection "inferior".

When using setter or field injection, the code in the service class now has to deal with possible null values. The service can be instantiated but it's not yet safe to use. This will very likely lead to bugs and code that is hard to understand.

The name for "call this method first, than that one" is "temporal coupling". It should always be prevented. You should only be able to instantiate an object in a way that makes sense. The object, once it exists, should always be ready for use. That's why injecting a service locator like this should always be a transitional thing only.

Doing things in the wrong place

Earlier I made a little joke about sending an email from a domain object. If you have a global static mechanism for retrieving services it's possible. I don't think it's good design though. In terms of architecture it makes no sense that something in the core of the application would reach out and get in touch with "the world outside", without any indirection or abstraction.

I recently experienced again how migrating from global/static mechanisms of dependency retrieval to injecting dependencies, reveals how some objects have been doing things they really shouldn't be doing. Inside an object that looked like an active record entity, I encountered email sending functionality, but also repository-like loading of more data from a database, touching files, etc. It will be a lot of fun to figure out how to fix these classes on the way to dependency injection.

PHP dependency injection legacy code quality Comments

Deliberate coding

Posted on by Matthias Noback

I wanted to share an important lesson I learned from my colleague Ramon de la Fuente. I was explaining to him how I made a big effort to preserve some existing behavior, when he said something along the lines of: the people who wrote this code, may or may not have known what they were doing. So don't worry too much about preserving old stuff.

These wise words eventually connected to other things I've learned about programming, and I wanted to combine them under the umbrella of a blog post titled "Deliberate coding". Doing something "deliberately" means to do it consciously and intentionally. It turns out that not everyone writes code deliberately, and at the very least, not everyone does it all the time.

Decisions

It turns out that, when I look at a piece of legacy code, I'm making a few assumptions:

  • Every line of code is there for a reason, it has been explicitly decided that it should be this line of code. E.g. it embodies a piece of domain knowledge the developer had gained, it represents an exact requirement as provided by the product owner, it's a fix for a bug in some exotic situation, etc.
  • Every aspect of the behavior emerging from the code is relied upon by at least one user. Whenever I change something, it should be refactoring in the true sense of the word: changing the structure of the code, without changing its behavior, so the user isn't put off the track.

As I write this out, these are clearly not such good assumptions. Whether or not we can trust the code to be meaningful, important, to have been deliberately written down - that depends on a lot of things:

  • How were the developers "managed"?
  • Did they actually speak with domain experts?
  • Did they use some kind of test-first approach? Did they think about testability?
  • How well did they know the programming language itself?
  • Did they have a step-debugger installed?
  • And so on...

This could be a very long list actually.

Documenting the intention behind the code

I remembered something Cyrille Martraire wrote in his exhaustive treatise about the topic of "Living Documentation":

Building software is about continuous decision-making. Big decisions usually get a lot of attention, with dedicated meetings and several written documents, while other "less important" decisions are somewhat neglected. The problem is that many of these decisions end up being arbitrary rather than well-thought about, and their accumulated effect (perhaps even a compounding effect) is what will make the source code hostile to work with.

Maybe over time, the decisions that the programmers made were not always made consciously. Maybe, they were not based on domain knowledge, or actual requirements. Possibly, a line of code is just there:

  • Because adding that line seemed to make it work,
  • because the same line was also there in a similar situation,
  • because a manager told them to add it,
  • because it was a copy-paste error, or an accidental Duplicate line keystroke (remember goto fail;?)
  • because it was a leftover of the old situation, or maybe
  • because a manager didn't tell them to remove it.

This is why the intention behind the code needs to be documented. The code itself could be its documentation, and so could the history of the code, as it is recorded by your version control software. But tests/specifications are the best things to happen to code. Without tests, lines can be added, modified or removed and the developer who later reads the code, won't know if the code (and all of the code) is there for a good reason.

Writing code consciously

So I learned not to trust every line of code I encounter, and not to preserve all of its little (weird) behaviors. But, how to make sure that people who come after me can trust my code? Or, generalized: what can any developer do to write code consciously?

Well, the answer is a big one. Follow clean code rules, adopt a test-first approach, apply Domain-Driven Design, with related practices like Living Documentation. Anyway, it'll be impossible to cover all of these rules in this article. I thought it would be interesting though to briefly discuss some programming practices I noticed over the past few years, that stand in the way of deliberate coding.

Examples of non-deliberate practices

If you're looking for things that stand in the way of deliberate coding, keep an eye on the things you do without thinking. Find the advice you're following, without making a conscious trade-off. Look for code that you didn't write yourself (that had to be copied, or auto-generated).

We generate code, boilerplate, doc blocks

I find that I'm often looking for ways to let my IDE do things for me. Things that I usually don't (have to) write myself anymore, because my IDE has shortcuts for them, or offers me to add them, are:

  • Constructors
  • Getters
  • Doc blocks
  • Closing HTML tags
  • Test classes

If you use tools to generate controllers, console commands, or even entities, you're producing even more code that was not written by yourself. This usually means you'll end up with lines of code or comments in your files that are not there on purpose. Sometimes you don't even know why they're there or what they achieve, and sometimes you just forget to clean them up.

My advice: consider the trade-off. These convenient shortcuts and code generation tools will save you a bit of typing, but may bring code into your project that you don't really "own" (i.e. need, understand, maintain, etc.). They are okay to use, if you later revise the generated or copied code.

Also, keep an eye on how your IDE evolves. At some point I noticed that auto-generated doc blocks also contained exceptions that were thrown in the method. It should be a very conscious decision by you and your team, whether or not you want @throws annotations in your doc blocks. At the time of writing, my opinion is that for the most part it doesn't make sense to add them. But in some cases, it does, e.g. in interfaces. Explicitly mentioning which exceptions can be expected by the client will make them part of the interface's published API.

We auto-fix IDE inspections, etc.

I often find myself "fixing" code because the IDE tells me to. My IDE and the inspection plugins I've got installed warn me for all kinds of issues that I would've never been aware of if I'd still been writing code in Notepad (or Dreamweaver, like I did in 2004). I think this is quite a dangerous thing. When my IDE started adding @throws annotations to my doc blocks, it also started warning me about missing @throws annotations.

The idea may have come from an IDE developer with a Java background (nothing wrong with that of course!) - and they must have missed it in PHP, or at least found that the IDE should promote the Java practice of explicitly mentioning exceptions. It turns out there are good reasons for this practice. But we should not let an IDE take over existing programming principles just like that.

Of course, it's really great that IDEs help you write better code from the start. And I love most of what they do. But I also notice how following its suggestions, trying to get rid of smaller or bigger warnings, leads to code that you no longer deliberately write. So my advise on this: if your IDE suggests a change, think about it and discuss with your team whether or not you want to follow this particular type of suggestion.

A nice example is when I write:

mkdir(__DIR__ . '/' . uniqid('', true));

And PhpStorm (in fact, the PhpInspections plugin) tells me to change it to:

if (!mkdir(__DIR__ . '/' . uniqid('', true)) && !is_dir(__DIR__ . '/' . uniqid('', true))) {
    throw new \RuntimeException(sprintf('Directory "%s" was not created', __DIR__ . '/' . uniqid('', true)));
}

If I press Alt+Enter, it happens automatically. The change should help prevent race conditions. If this is your bug, it'll be an interesting one to fix. But the inspection plugin doesn't know about the context of this line of code. Maybe I'm writing the mkdir() statement in a script of which I'll never run more than one instance, so this concurrency issue may never occur. But more particular: in this case, I'm using a unique directory name, so I better hope that no other script is trying to create the same directory at the same time. Also, take a look at the replaced code. It won't work at all, since the directory name will be different every time uniqid() gets called.

What I'm saying is: think before letting your IDE write code for you.

We collapse code

One thing that annoys me is automatic collapsing of code. It basically says: these lines are not interesting, so I don't want to see them. Still, these lines are part of the repository, you preserve these lines, you may need to maintain these lines (like: read them, update them when necessary, clean them up, etc.). If you don't see these lines, you won't maintain them. Common examples of collapsed code are doc blocks and import/use statements.

Besides being a thing you have to maintain, you should actually keep an eye on your import statements, since they will be telling you a lot, like:

  • Is this code coupled to the framework?
  • Is this code adhering to the Dependency rule?
  • Does this code know about too many domain concepts?

So definitely don't collapse code (or only temporarily, to de-clutter the view on a piece of code you're working on).

Conclusion

In the bigger picture, my suggestions above about using your IDE well (and when to ignore it) may be a bit unimportant. In the end, it's about making an effort to code deliberately in every way possible. But there's an even higher level of coding deliberately that I think is worth considering: the ethical dimension of writing code.

Every line of code represents an ethical and moral decision.

Grady Booch

I just don't have a lot of advice on this topic yet... Let me know if you have good pointers about the topic!

PHP clean code deliberate coding Comments