Objects should be constructed in one go

Matthias Noback

July 17, 2018

Consider the following rule:

When you create an object, it should be complete, consistent and valid in one go.

It is derived from the more general principle that it should not be possible for an object to exist in an inconsistent state. I think this is a very important rule, one that will gradually lead everyone out of the swamps of those dreaded “anemic” domain models. However, the question still remains: what does all of this mean?

Well, for example, we should not be able to construct a Geolocation object with only a latitude:

final class Geolocation
{
    private $latitude;
    private $longitude;
    
    public function __construct()
    {
    }
    
    public function setLatitude(float $latitude): void
    {
        $this->latitude = $latitude;
    }
    
    public function setLongitude(float $longitude): void
    {
        $this->longitude = $longitude;
    }
}

$location = new Geolocation();
// $location is in an invalid state!

$location->setLatitude(-20.0);
// $location is still in an invalid state!

It shouldn’t be possible to leave it in this state. It shouldn’t even be possible to construct it with no data in the first place, because having a specific value for latitude and longitude is one of the core aspects of a geolocation. These values belong together, and a geolocation “can’t live” without them. Basically, the whole concept of a geolocation would become meaningless if this were possible.

An object usually requires some data to fulfill a meaningful role. But it also imposes certain limitations on what kind of data it accepts, and on which specific subset of all possible values in the universe is allowed. This is where, as part of the object design phase, you’ll start looking for domain invariants. What do we know from the relevant domain that would help us define a meaningful model for the concept of a geolocation? Well, one of these things is that latitude and longitude should be within a certain range of values, i.e. -90 to 90 inclusive and -180 to 180 inclusive, respectively. It would definitely not make sense to allow any other value to be used. It would render all modelled behavior regarding geolocations useless.

Taking all of this into consideration, you may end up with a class that forms a sound model of the geolocation concept:

final class Geolocation
{
    private $latitude;
    private $longitude;
    
    public function __construct(
        float $latitude,
        float $longitude
    ) {
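        // Assertion is assumed to come from an assertion library,
        // e.g. Assert\Assertion from beberlei/assert
        Assertion::between($latitude, -90, 90);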
        $this->latitude = $latitude;
        
        Assertion::between($longitude, -180, 180);
        $this->longitude = $longitude;
    }
}

$location = new Geolocation(-20.0, 100.0);

This effectively protects the geolocation’s domain invariants, making it impossible to construct an invalid, incomplete or useless Geolocation object. Whenever you encounter such an object in your application, you can be sure that it’s safe to use. No need to use a validator of some sort to validate it first! This is why that rule about not allowing objects to exist in an inconsistent state is wonderful. My not-to-be-nuanced advice is to apply it everywhere.
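
To see the guarantee in action, here is a minimal, self-contained sketch. It assumes plain InvalidArgumentException checks instead of the Assertion calls above, and canCreateGeolocation() is a hypothetical helper that exists only for this illustration:

```php
<?php
declare(strict_types=1);

final class Geolocation
{
    private $latitude;
    private $longitude;

    public function __construct(float $latitude, float $longitude)
    {
        // Plain exceptions stand in for the Assertion calls used above.
        if ($latitude < -90.0 || $latitude > 90.0) {
            throw new InvalidArgumentException(
                'Latitude should be between -90 and 90'
            );
        }
        if ($longitude < -180.0 || $longitude > 180.0) {
            throw new InvalidArgumentException(
                'Longitude should be between -180 and 180'
            );
        }

        $this->latitude = $latitude;
        $this->longitude = $longitude;
    }
}

// Hypothetical helper for this sketch: reports whether construction succeeds.
function canCreateGeolocation(float $latitude, float $longitude): bool
{
    try {
        new Geolocation($latitude, $longitude);
        return true;
    } catch (InvalidArgumentException $exception) {
        return false;
    }
}

var_dump(canCreateGeolocation(-20.0, 100.0)); // bool(true)
var_dump(canCreateGeolocation(-20.0, 200.0)); // bool(false)
```

Either the constructor returns a complete, valid object, or it throws and no object exists at all. There is no in-between state for client code to stumble over.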

An aggregate with child entities

The rule isn’t without issues though. For example, I’ve been struggling to apply it to an aggregate with child entities, in particular, when I was working on modelling a so-called “purchase order”. It’s used to send to a supplier and ask for some goods (these “goods” are specific quantities of a certain product). The domain expert talks about this as “a header with lines”, or “a document with lines”. I decided to call the aggregate root “Purchase Order” (a class named PurchaseOrder) and to call the child entities representing the ordered goods “Lines” (in fact, every line is an instance of Line).

An important domain invariant to consider is: “every purchase order has at least one line”. After all, it just doesn’t make sense for an order to have no lines. When trying to apply this design rule, my first instinct was to provide the list of lines as a constructor argument. A simplified implementation (note that I don’t use proper value objects in these examples!) would look like this:

final class PurchaseOrder
{
    private $lines;
    
    /**
     * @param Line[] $lines
     */
    public function __construct(array $lines)
    {
        Assertion::greaterThan(count($lines), 0,
            'A purchase order should have at least one line');
            
        $this->lines = $lines;
    }
}

final class Line
{
    private $lineNumber;
    private $productId;
    private $quantity;
    
    public function __construct(
        int $lineNumber, 
        int $productId, 
        int $quantity
    ) {
        $this->lineNumber = $lineNumber;
        $this->productId = $productId;
        $this->quantity = $quantity;
    }
}

// inside the application service:
$purchaseOrder = new PurchaseOrder(
    [
        new Line(...), 
        new Line(...)
    ]
);

As you can see, this design makes the construction of the Line child entities a responsibility of the application service which creates the PurchaseOrder aggregate. One of the issues with that is that lines need to have an identity which is relative to the aggregate. So, when constructing these Line entities, the application service should provide each of them with an ID too:

// inside the application service:
$lines = [];

foreach (... as $lineNumber => ...) {
    $lines[] = new Line($lineNumber, ...);
}

$purchaseOrder = new PurchaseOrder($lines);

It would be cool if we didn’t have to determine the “next identity” of a child entity outside of the aggregate. The aggregate could happily do this work for us anyway. However, that would require a loop inside the constructor of PurchaseOrder, in which we call setLineNumber() on each Line object:

public function __construct(array $lines)
{
    foreach (array_values($lines) as $index => $line) {
        // line numbers will be 1, 2, 3, ...
        $line->setLineNumber($index + 1);
    }
    
    $this->lines = $lines;
}

That’s not a nice solution, because now a Line can exist in an invalid (because incomplete) state: without a line number.

So instead, we should let the PurchaseOrder create those Line entities itself. We’d only need to provide the raw data (product ID, quantity) as constructor arguments, e.g.

public function __construct(array $linesData)
{
    foreach (array_values($linesData) as $index => [$productId, $quantity]) {
        $this->lines[] = new Line(
            $index + 1,
            $productId,
            $quantity
        );
    }
}

However, I’m not really happy with $linesData being just some anonymous data structure. We could introduce something like a proper type for that - LineWithoutLineNumber, but that would be even more silly.

Instead, we should use a “DDD trick”, that is to leave the creation of the child entity to the PurchaseOrder. We can do this using something that resembles a factory method. The difference is that this method doesn’t (have to) return the created object (a Line instance), and that it also makes a state change to the PurchaseOrder. For example:

final class PurchaseOrder
{
    private $lines = [];
    
    public function __construct()
    {
        // no lines!
    }
    
    public function addLine(int $productId, int $quantity): void
    {
        $this->lines[] = new Line(
            count($this->lines) + 1,
            $productId,
            $quantity
        );
    }
}

// in the application service:
$purchaseOrder = new PurchaseOrder();
$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);

This looks great. But in the process of removing the $lines parameter from the constructor, we’ve unfortunately also lost our ability to protect the one domain invariant we cared about. Using this design, I can’t think of a reasonable way to verify that there is at least one line. I mean, if we do that right after creating the PurchaseOrder it would be too soon. And adding this assertion to addLine() wouldn’t make sense at all. The only moment at which we can reasonably verify it is just before we persist the PurchaseOrder. In fact, we could move the validation to the repository. However, having a validate() method on PurchaseOrder wouldn’t look great, and delegating the responsibility to protect domain invariants to the repository doesn’t feel like a safe option at all. We’d basically be back at: throw a bunch of data at this object and validate it afterwards…
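
To make the objection concrete, here is a rough sketch of what that rejected option could look like. Everything beyond PurchaseOrder and addLine() is an assumption for illustration: the lineCount() accessor, the in-memory PurchaseOrderRepository, and the use of plain arrays instead of Line entities.

```php
<?php
declare(strict_types=1);

final class PurchaseOrder
{
    private $lines = [];

    public function addLine(int $productId, int $quantity): void
    {
        // Simplified: a line is a plain array here instead of a Line entity.
        $this->lines[] = [count($this->lines) + 1, $productId, $quantity];
    }

    // Hypothetical accessor, needed only so the repository can inspect the order.
    public function lineCount(): int
    {
        return count($this->lines);
    }
}

// Hypothetical in-memory repository that checks the invariant on save.
final class PurchaseOrderRepository
{
    private $orders = [];

    public function save(PurchaseOrder $purchaseOrder): void
    {
        if ($purchaseOrder->lineCount() === 0) {
            throw new LogicException(
                'A purchase order should have at least one line'
            );
        }

        $this->orders[] = $purchaseOrder;
    }
}

$purchaseOrder = new PurchaseOrder();
// Until save() is called, nothing stops other code from using this empty order.

$repository = new PurchaseOrderRepository();

try {
    $repository->save($purchaseOrder);
} catch (LogicException $exception) {
    echo $exception->getMessage(), PHP_EOL;
}
```

The invariant is only enforced if every code path remembers to go through save(), and the aggregate had to expose its internals to make the check possible.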

I’m tempted to go back to the first solution. But then we’d be running around in circles. Remember, if you get stuck, take a step back. We set out trying to accomplish everything in one go so we could protect this one very important domain invariant. We wouldn’t even allow a nice and convenient method such as addLine() to exist, just because that allows the PurchaseOrder to exist in an incomplete, hence inconsistent state.

Being in this situation reminds me of a tweet by Alberto Brandolini:

Every modeling tool will have blind spots. Whenever the discussion around a model turns sterile, choose a different tool to challenge it.

Alberto Brandolini

We’re discussing things endlessly, and we get stuck because of an object-oriented programming design principle we apply without thinking. We should try to look at this PurchaseOrder thing from a different perspective. Because, if we think about it from a business perspective: a purchase order without any lines is actually fine, until it gets sent to the supplier.

A paper metaphor

As a “modelling tool” I find it very useful to imagine what dealing with a “paper purchase order” would look like. It’s not at all far-fetched in this case, because even the domain experts speak of “documents”. So consider how we would deal with a paper purchase order document. We’d take an empty sheet, notice some dotted lines to fill in the basics (supplier name, address, etc.). And then we see some blank space where we can write lines for every product we want to order. We could leave this paper “incomplete” for a quick bathroom stop. We can take it with us and discuss something about it with a co-worker, after which we make some corrections. Or we can even throw it away and start all over. But at some point, we’re going to actually send it over to the supplier. And before we do, we give it one more look to verify that everything is there.

Translating the metaphor back to the code, we might realize that there are really two different “phases” in the life-cycle of the PurchaseOrder, which come with their own invariants. When the order is in its “draft” phase, we need to supply basic information, but we’re allowed to add (and maybe remove) lines at will. Once we “finalize” the purchase order, we’re claiming that it’s ready to be sent, and at that point we could protect some other invariants.

We would only need to add a method to PurchaseOrder that would “finalize” it. Trying to be DDD-compliant, we look for a word that our domain experts use. This word turns out to be “place” - we’re gradually filling in all the details and then we place the purchase order. So, in code it could look something like this:

final class PurchaseOrder
{
    private $lines = [];
    private $isPlaced = false;
    
    public function __construct(...)
    {
        // ...
    }
    
    public function addLine(int $productId, int $quantity): void
    {
        if ($this->isPlaced) {
            throw new \LogicException(
                'You cannot add a line to an order that was already placed'
            );
        }
        
        $this->lines[] = new Line(
            count($this->lines) + 1,
            $productId,
            $quantity
        );
    }
    
    public function place(): void
    {
        Assertion::greaterThan(count($this->lines), 0,
            'A purchase order should have at least one line');
            
        $this->isPlaced = true;
    }
}

// in the application service:
$purchaseOrder = new PurchaseOrder();

$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);

$purchaseOrder->place();

Note that, besides adding a place() method, we also modified addLine() to prevent new lines from being added after the order was placed. In the paper metaphor this wouldn’t be allowed either, since the document has been sent to the supplier, so it would be very confusing if lines got added to our local version of the purchase order.
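
Both guards can be exercised with a small, self-contained sketch. It assumes plain LogicException checks instead of the Assertion library, leaves out the constructor arguments, and uses a hypothetical failureMessage() helper plus made-up product IDs and quantities:

```php
<?php
declare(strict_types=1);

final class Line
{
    private $lineNumber;
    private $productId;
    private $quantity;

    public function __construct(int $lineNumber, int $productId, int $quantity)
    {
        $this->lineNumber = $lineNumber;
        $this->productId = $productId;
        $this->quantity = $quantity;
    }
}

final class PurchaseOrder
{
    private $lines = [];
    private $isPlaced = false;

    public function addLine(int $productId, int $quantity): void
    {
        if ($this->isPlaced) {
            throw new LogicException(
                'You cannot add a line to an order that was already placed'
            );
        }

        $this->lines[] = new Line(count($this->lines) + 1, $productId, $quantity);
    }

    public function place(): void
    {
        // A plain exception stands in for the Assertion call used above.
        if (count($this->lines) === 0) {
            throw new LogicException(
                'A purchase order should have at least one line'
            );
        }

        $this->isPlaced = true;
    }
}

// Hypothetical helper for this sketch: runs an action and returns the
// exception message, or null when nothing was thrown.
function failureMessage(callable $action): ?string
{
    try {
        $action();
        return null;
    } catch (LogicException $exception) {
        return $exception->getMessage();
    }
}

$purchaseOrder = new PurchaseOrder();

// Placing a draft without lines is rejected.
echo failureMessage(function () use ($purchaseOrder) {
    $purchaseOrder->place();
}), PHP_EOL;

$purchaseOrder->addLine(1001, 5); // illustrative product ID and quantity
$purchaseOrder->place();

// Adding a line after placing is rejected.
echo failureMessage(function () use ($purchaseOrder) {
    $purchaseOrder->addLine(1002, 1);
}), PHP_EOL;
```

The empty draft is a perfectly valid object while it is being filled in. The “at least one line” rule only kicks in at the moment the order is placed.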

Also note that the place() method brings the aggregate root into a certain state, after which not everything is possible anymore. This might remind you of the concept of a state machine. I actually find that entities are often much like state machines. Given a certain state, operations on it are limited. And state transitions are limited too. For example, before placing the order, it would be possible to cancel it without any consequences, but after placing it, the system needs to take all kinds of compensating actions (send a message to the supplier that the order has been cancelled, etc.).
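
As a rough sketch of that state-machine view, the order could track an explicit state and let cancel() behave differently depending on it. The state constants, the cancel() method and the supplierShouldBeNotified() flag are all assumptions made up for this illustration, and the line handling is left out to keep the focus on the transitions:

```php
<?php
declare(strict_types=1);

final class PurchaseOrder
{
    private const DRAFT = 'draft';
    private const PLACED = 'placed';
    private const CANCELLED = 'cancelled';

    private $state = self::DRAFT;
    private $supplierShouldBeNotified = false;

    public function place(): void
    {
        // Only the transition draft -> placed is allowed.
        if ($this->state !== self::DRAFT) {
            throw new LogicException('Only a draft order can be placed');
        }

        $this->state = self::PLACED;
    }

    public function cancel(): void
    {
        if ($this->state === self::CANCELLED) {
            throw new LogicException('This order was already cancelled');
        }

        // Cancelling a draft has no consequences; cancelling a placed
        // order requires a compensating action towards the supplier.
        $this->supplierShouldBeNotified = ($this->state === self::PLACED);
        $this->state = self::CANCELLED;
    }

    public function supplierShouldBeNotified(): bool
    {
        return $this->supplierShouldBeNotified;
    }
}

$draft = new PurchaseOrder();
$draft->cancel();
var_dump($draft->supplierShouldBeNotified()); // bool(false)

$placed = new PurchaseOrder();
$placed->place();
$placed->cancel();
var_dump($placed->supplierShouldBeNotified()); // bool(true)
```

How the compensating action is actually carried out (a domain event, a message on a queue, a direct call) is a separate decision. The sketch only records that one is needed.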

Conclusion

I find that leaving the exact type and construction details of nested objects to their parent object leads to a more “supple” design. In a sense, this is “old OOP knowledge” - we hide the implementation details of how exactly the PurchaseOrder deals with the lines (e.g. does it use a plain old array, or a collection object, do we need a Line class at all, etc.). We thereby allow refactoring of the PurchaseOrder aggregate without having to update all its clients across the code base.

This is part of what’s meant by the traditional DDD advice to make the aggregate root the only entry point to interaction with any aggregate part:

Choose one ENTITY to be the root of each AGGREGATE, and control all access to the objects inside the boundary through the root.

Eric Evans, “Domain-Driven Design”, Part II, Chapter Six: “The Life Cycle of a Domain Object”

No client should be able to make use of or create new Line objects directly. Unless, I have to add, there is a very specific use case for that. For instance, you should not expose all those internals for the sake of unit testing (see also a previous article - “Testing actual behavior”).

So even though we ended up with a better design, we had to reconsider options for protecting the “an order should have at least one line” domain invariant. We discussed reaching for other modelling tools, like working out a “paper metaphor”. Of course there are other modelling tools, but this one was effective for making us realize there are actually two distinct phases in the life-cycle of the purchase order.

The general advice is: if you find yourself stuck with a modelling question, look for ways to change your perspective. Even look for (unconsciously) applied (programming) rules that keep you from reaching the best solution. Alberto adds another useful suggestion to that:

…and you’re maybe looking for “the best” solution. When all you need is “a solution”.