Objects should be constructed in one go

Posted by Matthias Noback

Consider the following rule:

When you create an object, it should be complete, consistent and valid in one go.

It is derived from the more general principle that it should not be possible for an object to exist in an inconsistent state. I think this is a very important rule, one that will gradually lead everyone out of the swamps of those dreaded "anemic" domain models. However, the question still remains: what does all of this mean?

Well, for example, we should not be able to construct a Geolocation object with only a latitude:

final class Geolocation
{
    private $latitude;
    private $longitude;

    public function __construct()
    {
    }

    public function setLatitude(float $latitude): void
    {
        $this->latitude = $latitude;
    }

    public function setLongitude(float $longitude): void
    {
        $this->longitude = $longitude;
    }
}

$location = new Geolocation();
// $location is in an invalid state!

$location->setLatitude(-20.0);
// $location is still in an invalid state!

It shouldn't be possible to leave it in this state. It shouldn't even be possible to construct it with no data in the first place, because having a specific value for latitude and longitude is one of the core aspects of a geolocation. These values belong together, and a geolocation "can't live" without them. Basically, the whole concept of a geolocation would become meaningless if this were possible.

An object usually requires some data to fulfill a meaningful role. But it also imposes certain limitations on what kind of data it accepts, and on which specific subset of all possible values in the universe is allowed. This is where, as part of the object design phase, you'll start looking for domain invariants. What do we know from the relevant domain that would help us define a meaningful model for the concept of a geolocation? Well, one of these things is that latitude and longitude should be within a certain range of values, i.e. -90 to 90 inclusive and -180 to 180 inclusive, respectively. It would definitely not make sense to allow any other value to be used. It would render all modelled behavior regarding geolocations useless.

Taking all of this into consideration, you may end up with a class that forms a sound model of the geolocation concept:

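// Assuming Assertion is the Assert\Assertion class from the beberlei/assert library
use Assert\Assertion;

final class Geolocation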
{
    private $latitude;
    private $longitude;

    public function __construct(
        float $latitude,
        float $longitude
    ) {
        Assertion::between($latitude, -90, 90);
        $this->latitude = $latitude;

        Assertion::between($longitude, -180, 180);
        $this->longitude = $longitude;
    }
}

$location = new Geolocation(-20.0, 100.0);

This effectively protects geolocation's domain invariants, making it impossible to construct an invalid, incomplete or useless Geolocation object. Whenever you encounter such an object in your application, you can be sure that it's safe to use. No need to use a validator of some sorts to validate it first! This is why that rule about not allowing objects to exist in an inconsistent state is wonderful. My not-to-be-nuanced advice is to apply it everywhere.
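To see the effect of this, here is a minimal, self-contained sketch. It assumes plain range checks that throw an InvalidArgumentException instead of the Assertion class, and canConstructGeolocation() is a hypothetical helper that only exists to show the outcome of calling the constructor.

```php
<?php

final class Geolocation
{
    private $latitude;
    private $longitude;

    public function __construct(float $latitude, float $longitude)
    {
        // Plain range checks stand in for the Assertion calls (an assumption of this sketch).
        if ($latitude < -90 || $latitude > 90) {
            throw new InvalidArgumentException('Latitude should be between -90 and 90');
        }
        if ($longitude < -180 || $longitude > 180) {
            throw new InvalidArgumentException('Longitude should be between -180 and 180');
        }

        $this->latitude = $latitude;
        $this->longitude = $longitude;
    }
}

// Hypothetical helper: reports whether construction succeeds for the given values.
function canConstructGeolocation(float $latitude, float $longitude): bool
{
    try {
        new Geolocation($latitude, $longitude);

        return true;
    } catch (InvalidArgumentException $exception) {
        return false;
    }
}

var_dump(canConstructGeolocation(-20.0, 100.0)); // both values are within range
var_dump(canConstructGeolocation(-20.0, 200.0)); // longitude is out of range
```

Because the checks live in the constructor, there is no code path that yields a Geolocation instance holding out-of-range values.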

An aggregate with child entities

The rule isn't without issues though. For example, I've been struggling to apply it to an aggregate with child entities, in particular when I was working on modelling a so-called "purchase order". A purchase order is sent to a supplier to ask for some goods (these "goods" are specific quantities of a certain product). The domain expert talks about this as "a header with lines", or "a document with lines". I decided to call the aggregate root "Purchase Order" (a class named PurchaseOrder) and to call the child entities representing the ordered goods "Lines" (in fact, every line is an instance of Line).

An important domain invariant to consider is: "every purchase order has at least one line". After all, it just doesn't make sense for an order to have no lines. When trying to apply this design rule, my first instinct was to provide the list of lines as a constructor argument. A simplified implementation (note that I don't use proper value objects in these examples!) would look like this:

final class PurchaseOrder
{
    private $lines;

    /**
     * @param Line[] $lines
     */
    public function __construct(array $lines)
    {
        Assertion::greaterThan(count($lines), 0,
            'A purchase order should have at least one line');

        $this->lines = $lines;
    }
}

final class Line
{
    private $lineNumber;
    private $productId;
    private $quantity;

    public function __construct(
        int $lineNumber, 
        int $productId, 
        int $quantity
    ) {
        $this->lineNumber = $lineNumber;
        $this->productId = $productId;
        $this->quantity = $quantity;
    }
}

// inside the application service:
$purchaseOrder = new PurchaseOrder(
    [
        new Line(...), 
        new Line(...)
    ]
); 

As you can see, this design makes the construction of the Line child entities a responsibility of the application service which creates the PurchaseOrder aggregate. One of the issues with that is that lines need to have an identity which is relative to the aggregate. So, when constructing these Line entities, the application service should provide each of them with an ID too:

// inside the application service:
$lines = [];

foreach (... as $lineNumber => ...) {
    $lines[] = new Line($lineNumber, ...);
}

$purchaseOrder = new PurchaseOrder($lines); 

It would be cool if we didn't have to determine the "next identity" of a child entity outside of the aggregate. The aggregate could happily do this work for us anyway. However, that would require a loop inside the constructor of PurchaseOrder, in which we call setLineNumber() on each Line object:

public function __construct(array $lines)
{
    foreach (array_values($lines) as $index => $line) {
        // line numbers will be 1, 2, 3, ...
        $line->setLineNumber($index + 1);
    }

    $this->lines = $lines;
}

That's not a nice solution, because now a Line can exist in an invalid (because incomplete) state: without a line number.

So instead, we should let the PurchaseOrder create those Line entities itself. We'd only need to provide the raw data (product ID, quantity) as constructor arguments, e.g.

public function __construct(array $linesData)
{
    foreach (array_values($linesData) as $index => [$productId, $quantity]) {
        $this->lines[] = new Line(
            $index + 1,
            $productId,
            $quantity
        );
    }
}

However, I'm not really happy with $linesData being just some anonymous data structure. We could introduce something like a proper type for that - LineWithoutLineNumber, but that would be even more silly.

Instead, we should use a "DDD trick", that is to leave the creation of the child entity to the PurchaseOrder. We can do this using something that resembles a factory method. The difference being that this method doesn't (have to) return the created object (a Line instance), and that it also makes a state change to the PurchaseOrder. For example:

final class PurchaseOrder
{
    private $lines = [];

    public function __construct()
    {
        // no lines!
    }

    public function addLine(int $productId, int $quantity): void
    {
        $this->lines[] = new Line(
            count($this->lines) + 1,
            $productId,
            $quantity
        );
    }
}

// in the application service:
$purchaseOrder = new PurchaseOrder();
$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);

This looks great. But in the process of removing the $lines parameter from the constructor, we've unfortunately also lost our ability to protect the one domain invariant we cared about. Using this design, I can't think of a reasonable way to verify that there is at least one line. I mean, if we do that right after creating the PurchaseOrder it would be too soon. And adding this assertion to addLine() wouldn't make sense at all. The only moment at which we can reasonably verify it is just before we persist the PurchaseOrder. In fact, we could move the validation to the repository. However, having a validate() method on PurchaseOrder wouldn't look great, and delegating the responsibility to protect domain invariants to the repository doesn't at all feel like a safe option. We'd basically be back at: throw a bunch of data at this object and validate it afterwards...

I'm tempted to go back to the first solution. But then we'd be running around in circles. Remember, if you get stuck, take a step back. We set out trying to accomplish everything in one go so we could protect this one very important domain invariant. We wouldn't even allow a nice and convenient method such as addLine() to exist, just because that allows the PurchaseOrder to exist in an incomplete, hence inconsistent state.

Being in this situation reminds me of a tweet by Alberto Brandolini:

Every modeling tool will have blind spots. Whenever the discussion around a model turns sterile, choose a different tool to challenge it.

Alberto Brandolini

We're discussing things endlessly, and we get stuck because of an object oriented programming design principle we apply without thinking. We should try to look at this PurchaseOrder thing from a different perspective. Because, if we think about it from a business perspective: a purchase order without any lines is actually fine, until it gets sent to the supplier.

A paper metaphor

As a "modelling tool" I find it very useful to imagine what dealing with a "paper purchase order" would look like. It's not at all far-fetched in this case, because even the domain experts speak of "documents". So consider how we would deal with a paper purchase order document. We'd take an empty sheet, notice some dotted lines to fill in the basics (supplier name, address, etc.). And then we see some blank space where we can write lines for every product we want to order. We could leave this paper "incomplete" for a quick bathroom stop. We can take it with us and discuss something about it with a co-worker, after which we make some corrections. Or we can even throw it away and start all over. But at some point, we're going to actually send it over to the supplier. And before we do, we give it one more look to verify that everything is there.

Translating the metaphor back to the code, we might realize that there are really two different "phases" in the life-cycle of the PurchaseOrder, which come with their own invariants. When the order is in its "draft" phase, we need to supply basic information, but we're allowed to add (and maybe remove) lines at will. Once we "finalize" the purchase order, we're claiming that it's ready to be sent, and at that point we could protect some other invariants.

We would only need to add a method to PurchaseOrder that would "finalize" it. Trying to be DDD-compliant, we look for a word that our domain experts use. This word turns out to be "place" - we're gradually filling in all the details and then we place the purchase order. So, in code it could look something like this:

final class PurchaseOrder
{
    private $lines = [];
    private $isPlaced = false;

    public function __construct(...)
    {
        // ...
    }

    public function addLine(int $productId, int $quantity): void
    {
        if ($this->isPlaced) {
            throw new \LogicException(
                'You cannot add a line to an order that was already placed'
            );
        }

        $this->lines[] = new Line(
            count($this->lines) + 1,
            $productId,
            $quantity
        );
    }

    public function place(): void
    {
        Assertion::greaterThan(count($this->lines), 0,
            'A purchase order should have at least one line');

        $this->isPlaced = true;
    }
}

// in the application service:
$purchaseOrder = new PurchaseOrder();

$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);
$purchaseOrder->addLine(...);

$purchaseOrder->place();

Note that, besides adding a place() method, we also modified addLine() to prevent new lines from being added after the order was placed. In the paper metaphor this wouldn't be allowed either, since the document has been sent to the supplier, so it would be very confusing if lines got added to our local version of the purchase order.

Also note that the place() method brings the aggregate root into a certain state, after which not everything is possible anymore. This might remind you of the concept of a state machine. I actually find that entities are often much like state machines. Given a certain state, operations on it are limited. And state transitions are limited too. For example, before placing the order, it would be possible to cancel it without any consequences, but after placing it, the system needs to take all kinds of compensating actions (send a message to the supplier that the order has been cancelled, etc.).
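The cancellation example could be sketched roughly as follows. This is only an illustration under a few assumptions: the status constants, the cancel() and pendingActions() methods, and the 'notify_supplier_of_cancellation' label are all hypothetical, lines are kept as plain arrays to keep the sketch short, and plain exceptions replace the Assertion calls.

```php
<?php

final class PurchaseOrder
{
    // Hypothetical states; the article itself only uses an $isPlaced flag.
    private const DRAFT = 'draft';
    private const PLACED = 'placed';
    private const CANCELLED = 'cancelled';

    private $status = self::DRAFT;
    private $lines = [];
    private $pendingActions = [];

    public function addLine(int $productId, int $quantity): void
    {
        if ($this->status !== self::DRAFT) {
            throw new LogicException('Lines can only be added to a draft order');
        }

        // Simplified: a line is a plain array here instead of a Line object.
        $this->lines[] = [count($this->lines) + 1, $productId, $quantity];
    }

    public function place(): void
    {
        if ($this->status !== self::DRAFT) {
            throw new LogicException('Only a draft order can be placed');
        }
        if (count($this->lines) === 0) {
            throw new LogicException('A purchase order should have at least one line');
        }

        $this->status = self::PLACED;
    }

    public function cancel(): void
    {
        if ($this->status === self::CANCELLED) {
            throw new LogicException('This order was already cancelled');
        }
        if ($this->status === self::PLACED) {
            // The supplier already has the order, so a compensating action is recorded.
            $this->pendingActions[] = 'notify_supplier_of_cancellation';
        }

        $this->status = self::CANCELLED;
    }

    public function pendingActions(): array
    {
        return $this->pendingActions;
    }
}

// Cancelling a draft has no consequences.
$draftOrder = new PurchaseOrder();
$draftOrder->cancel();

// Cancelling a placed order records a compensating action.
$placedOrder = new PurchaseOrder();
$placedOrder->addLine(1, 10);
$placedOrder->place();
$placedOrder->cancel();
```

Each method first checks the current state and only then performs the transition, which is what makes the entity behave like a small state machine.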

Conclusion

I find that leaving the exact type and construction details of nested objects to their parent object leads to a more "supple" design. In a sense, this is "old OOP knowledge" - we hide the implementation details of how exactly the PurchaseOrder deals with the lines (e.g. does it use a plain old array, or a collection object, do we need a Line class at all, etc.). We thereby allow refactoring of the PurchaseOrder aggregate without having to update all its clients across the code base.

This is part of what's meant by the traditional DDD advice to make the aggregate root the only entry point to interaction with any aggregate part:

Choose one ENTITY to be the root of each AGGREGATE, and control all access to the objects inside the boundary through the root.

Eric Evans, "Domain-Driven Design", Part II, Chapter Six: "The Life Cycle of a Domain Object"

No client should be able to make use of or create new Line objects directly. Unless, I have to add, there is a very specific use case for that. For instance, you should not expose all those internals for the sake of unit testing (see also a previous article - "Testing actual behavior").

So even though we ended up with a better design, we had to reconsider options for protecting the "an order should have at least one line" domain invariant. We discussed reaching for other modelling tools, like working out a "paper metaphor". Of course there are other modelling tools, but this one was effective for making us realize there are actually two distinct phases in the life-cycle of the purchase order.

The general advice being: if you find yourself stuck with a modelling question, look for ways to change your perspective. Even look for (unconsciously) applied (programming) rules that keep you from reaching the best solution. Alberto adds another useful suggestion to that:

...and you’re maybe looking for "the best" solution. When all you need is "a solution".

Tags: PHP, Domain-Driven Design, OOP, object design
Comments
slifin

Pattern matching and first class types make this much easier:
https://repl.it/@slifin/Hus...

Matthias Noback

Thanks for sharing this here; I see what you mean. However, I'm not convinced that the matching isn't some "if" in disguise.

yourwebmaker

I've faced the same problem some years ago.

To solve this problem I split the PurchaseOrder (I'm trying to match your case) into a new class: Cart.

I did something like:

$cart = Cart::start();
$cart->addLine($itemId, $quantity);
$cart->addLine($itemId, $quantity);
$cart->addLine($itemId, $quantity);
$order = $cart->checkout($customerData); // returns the Order

Here is the checkout method:

public function checkout(CustomerData $customerData) : Order
{
    return Order::create($customerData, $this->lines());
}

.....

Thinking about Bounded Contexts, it's 'ok' for an Order (or Cart) to have no items in the Shopping context, but not in the Shipping context.

I think you struggled because you tried to use the same model for different contexts.

Sławek Grochowski

What about using 'new' in PurchaseOrder class?

'The point of dependency injection is to push the dependencies in from the outside, thereby revealing the dependencies in our classes. Using a new keyword inside a class is in opposition to that idea, so we need to work through the codebase to remove that keyword from our non-Factory classes.' (Paul M. Jones, Modernizing Legacy Applications in PHP, 2014)

Matthias Noback

That's taking it way too far. new is not a dirty thing itself (just like static isn't). It's just that if you don't want to depend on something concrete, you define something more abstract, and let a concrete instance of it be injected.

Paul Jones (I'm guessing) talks about instantiating services. This shouldn't happen on the spot, but by a service/dependency injection container. It will make the application as a whole much more flexible, leaving you with the future possibility of replacing entire framework components, without even touching your code.

When it comes to domain objects, you should definitely use new, and use all the concrete classes. This is your domain model, you're not going to replace parts of it. Besides, Line isn't a "dependency" of Order, it's a "part" of Order.

Mario T. Lanza

You've probably heard of Functional Core, Imperative Shell. Well let me expound with an odd notion.

If you create a thing as an immutable data structure (such as a purchase order), and you want a functional core, what you do is make it so that all commands (functions that would otherwise mutate) instead return a copy of the original with the modifications grafted on top (structural sharing). Then, you can effectively update your object with only queries that simulate commands (e.g. addLine returns a new copy of the purchase order with an extra line) by simply updating the variable.

po = po.addLine(productId, qty);

Some might use an observable/atom for the variable but any reference container is viable.

When you do it this way, you effectively have something akin to a builder that is not a builder (I've never heard anyone refer to a Clojure persistent vector as a builder, but that's effectively what it is).

The purchase order when adding a line effectively returns a new purchase order object. Now what if that query (addLine) were to return a different type based on its stage in a purchase order lifecycle (e.g. within the state machine)? Normally, the persistent data structure when updated returns another persistent data structure of the same type (e.g. you have a vector in Clojure and you add to it a new item, you get a new vector), but there's no rule that says it must. You could return a DraftPurchaseOrder to represent a purchase order in an incomplete state and a PurchaseOrder when it's complete. In theory, using the functional core, imperative shell approach, you have your commands (again, they're actually queries) return a type based on conditions present in the data.

At this point we could say that a purchase order is a sum type (it's the sum of all the potential states in the state machine). We would expect a similar interface/protocol among all types that make up the sum type to keep the api consistent.

For example, we might be able to call IPurchaseOrder.place on DraftPurchaseOrder and PurchaseOrder but would expect different results.

We would construct things with factory functions rather than concrete types:

var po = purchaseOrder(); /* returning our initial DraftPurchaseOrder state */
po = po.addLine("George Foreman Grill", 1); /* now we have a PurchaseOrder */
po = po.removeLines(); /* back to a DraftPurchaseOrder */
po = po.place(db); /* Draft Purchase Order Exception! */
var po2 = purchaseOrder([["Ginsu Knife", 3]]); /* a PurchaseOrder outright */
po = po2.place(db); /* Success */

I haven't actually implemented this approach, though I like it in theory.

Alberto Brandolini

Thank you for quoting me. :-) In your particular example it looks like you've bumped into a draft model / executable model dichotomy, where you split the entire Bounded Context in two. In some domains you would have chosen a Cart -> Order modelling style: the cart would have some relaxed constraints (can we have an empty cart? -> yes, can we have an empty order? -> no, can we change quantities in the cart? -> sure! Can we change them in the order? -> I guess not). You end up with two models, but with greater flexibility (with an explicit transition from one model to the other) and with different strengths of the applied constraints: a draft model (the Cart or the unsigned PurchaseOrder) can be in a loosely consistent state ("the delivery address is still missing") that will prevent the order from being placed (thus transitioning to the executable model of the PurchaseOrder), but not from being persisted in the sales context.

Matthias Noback

Thanks for taking the time to comment on this. It makes sense to experiment with such an approach. I'm curious also about how one would determine this separation to be needed/successful/etc. If the concepts are relatively far apart from each other, like Cart and Order, it's clear that it's helpful. They will be more like things you hand over from context to context. Within the same context, it may be more troublesome I think, and maybe even hard to draw a line; we could end up having a CancelledOrder, EmailedOrder, ShippedOrder, etc. ;) In fact, this would bring us close to using events as the underlying drivers of state change.

Daniel S

Good article, thanks for that. I see another problem. An order line is then invalid if it does not belong to any order. So you would have to specify the order in the constructor of OrderLine. In this example, however, we create an OrderLine without reference to the order.

I suspect that there is no perfect solution here.

Eduardo Dobay

I didn't understand why you say that the belonging to an Order must be encoded explicitly in a Line. The Order already carries the Lines belonging to it. It seems to me that being in that context is enough to qualify the Line as valid. Or maybe you had something else in mind?

If we had some logic to test in the Line, we could do it (in principle) in isolation from an Order. Maybe a dependency from Line to Order (when Order already depends on Lines) would be an indication of unnecessary coupling?

Daniel S

A Line _has_ to refer to its order. For example, take a simple query to find all order lines > $100 (for statistics purposes). Now you want to fetch the related orders for those lines. Do you notice the required coupling to the order (and vice versa, too)?

Eduardo Dobay

Oh, I understand that. From the point of view of your query it makes total sense that the reference to Order is necessary.

What I had in mind is that Lines as business concepts are not so much "independent" - for me it would make more sense to ask "find all orders which have lines > $100" than "find all order lines > $100". Because the collection of all order lines is not very meaningful in the business, whereas the collection of all orders is.

Matthias Noback

In most cases I'd agree with that, although to be honest, we should never dismiss a modelling solution because we don't *think* it's a good idea. It may very well be a good idea in another situation.
However, in general, I also like the "parent" object or aggregate root to be the main entry point, not the "child" (e.g. the line). Besides, I like to apply a separation between read and write model as well, in which case the read model can be completely different from the write model, and in any particular read model, the line itself may be the most important thing, not the order.

Eduardo Dobay

Oh, I hadn't thought about that, but I agree with you, we shouldn't dismiss a solution in that way, and I'm happy to have heard that :)

Matthias Noback

Right, the thing is, a line doesn't need a reference to its order, unless for example your ORM needs it to have one.

Benoît Burnichon

What about returning a new PlacedOrder($lines) in PurchaseOrder::place()? This way PlacedOrder would always be valid and no checks would be needed about whether the actual order has been placed yet or not.
The object would not change its behavior when changing its state.
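A rough sketch of this suggestion, under some assumptions: PlacedOrder is the name proposed in the comment, but its constructor check and the numberOfLines() method are hypothetical, and a plain exception replaces the Assertion call from the article.

```php
<?php

final class Line
{
    private $lineNumber;
    private $productId;
    private $quantity;

    public function __construct(int $lineNumber, int $productId, int $quantity)
    {
        $this->lineNumber = $lineNumber;
        $this->productId = $productId;
        $this->quantity = $quantity;
    }
}

final class PlacedOrder
{
    private $lines;

    /**
     * @param Line[] $lines
     */
    public function __construct(array $lines)
    {
        // The "at least one line" invariant is protected at construction time.
        if (count($lines) === 0) {
            throw new InvalidArgumentException('A placed order should have at least one line');
        }

        $this->lines = $lines;
    }

    // Hypothetical accessor, only added to make the sketch observable.
    public function numberOfLines(): int
    {
        return count($this->lines);
    }
}

final class PurchaseOrder
{
    private $lines = [];

    public function addLine(int $productId, int $quantity): void
    {
        $this->lines[] = new Line(count($this->lines) + 1, $productId, $quantity);
    }

    public function place(): PlacedOrder
    {
        // No status flag needed: the draft hands its lines over to a new object.
        return new PlacedOrder($this->lines);
    }
}

$purchaseOrder = new PurchaseOrder();
$purchaseOrder->addLine(1, 10);
$placedOrder = $purchaseOrder->place();
```

In this variant the draft PurchaseOrder never needs to ask itself whether it has been placed, since a placed order is a different type that is valid from the moment it is constructed.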

Matthias Noback

Yes, I think that's a viable solution indeed. You could still have the same persistence mechanism behind them if you like.

Stefan Gehrig

Instead of protecting the PurchaseOrder aggregate with an internal status flag to enforce domain constraints, my first thought was to use a PurchaseOrderBuilder to support the creation of a PurchaseOrder aggregate and to enforce the domain constraint of "at least one line item". Do you see any pros and cons compared to the idea you showed above? Or would that just be "another solution" to the problem in your eyes?

Matthias Noback

Good question; I honestly think a builder is not the right solution for this kind of problem. A builder is usually very intimate with the internals of the object it's building, which often breaks its encapsulation. Like I mentioned in the article, I would want the Line objects and the construction of them to remain an implementation detail of PurchaseOrder, while with a builder, there will now be another thing that has knowledge about Line objects.

Also, for this specific problem I wanted to persist the "incomplete" order, so that a user can continue working on it later. So that means the builder should be persistable, which itself would be yet another problem to solve ;)

Sascha Schimke

I do use the PurchaseOrderBuilder idea in our project. There is a method PurchaseOrderBuilder::addLine(int $productId, int $quantity). The PurchaseOrderBuilder::build() would perform the validation. The constructor PurchaseOrder::__construct(array $linesData) would take the raw data, but the constructor is marked @internal to "thwart" the direct usage of the new operator by clients.

The construction of Line objects would remain in the Order.

And yes, Builder and Order are very intimate.
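The builder described in this comment might look roughly like the sketch below. The method signatures are taken from the comment; everything else is an assumption made for illustration: the shape of $linesData (a list of [productId, quantity] pairs), the exception type, and the numberOfLines() method.

```php
<?php

final class Line
{
    private $lineNumber;
    private $productId;
    private $quantity;

    public function __construct(int $lineNumber, int $productId, int $quantity)
    {
        $this->lineNumber = $lineNumber;
        $this->productId = $productId;
        $this->quantity = $quantity;
    }
}

final class PurchaseOrder
{
    private $lines = [];

    /**
     * @internal Clients are expected to go through PurchaseOrderBuilder
     *
     * @param array $linesData List of [productId, quantity] pairs (assumed shape)
     */
    public function __construct(array $linesData)
    {
        // The construction of Line objects remains inside the order.
        foreach (array_values($linesData) as $index => [$productId, $quantity]) {
            $this->lines[] = new Line($index + 1, $productId, $quantity);
        }
    }

    // Hypothetical accessor, only added to make the sketch observable.
    public function numberOfLines(): int
    {
        return count($this->lines);
    }
}

final class PurchaseOrderBuilder
{
    private $linesData = [];

    public function addLine(int $productId, int $quantity): void
    {
        $this->linesData[] = [$productId, $quantity];
    }

    public function build(): PurchaseOrder
    {
        // The builder performs the validation before the order gets created.
        if (count($this->linesData) === 0) {
            throw new LogicException('A purchase order should have at least one line');
        }

        return new PurchaseOrder($this->linesData);
    }
}

$builder = new PurchaseOrderBuilder();
$builder->addLine(1, 10);
$builder->addLine(2, 5);
$purchaseOrder = $builder->build();
```

Note that @internal is only an annotation: nothing stops a client from calling new PurchaseOrder([]) directly, which is the intimacy between builder and order that the comments mention.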

Stefan Gehrig

I see your point on persisting "incomplete" orders. That truly makes sense but it wasn't clear from reading your blog post. I wouldn't persist builders either. Thanks for sharing your thoughts!

Matthias Noback

I see, yeah, that wasn't a requirement I explicitly stated :)

JosefCech

"No need to use a validator of some sorts to validate it first!"

What does your validation look like then? In the sense of validation for data coming from outside the application (HTML forms etc.) - for that you need a way to answer back whether the provided state is valid. So you can either:

1) Trust that some types of exceptions are always of merit to the user (by convention), catch them and show the user their message.
2) Use DTOs (or whatever is easiest with your framework) and have duplicate validations there (trust you are able to maintain validations across layers).
3) Validate external data, trust your already validated data.

How do you approach this in DDD? Do you have something in between? For example some types of validations are only on DTO, not on domain entity itself.

Matthias Noback

I like to validate the DTO, then let the aggregate protect its own state, which is indeed not the same thing, and would even allow for applying different rules, and a different style of user interaction.

You can reuse some validation logic between DTO validation and domain invariant protection if you want. You'll always have the domain invariant protection as a fallback anyway.
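As a hedged illustration of this two-step approach: the AddLine DTO, its validationErrors() method, the addLineToOrder() function and the "quantity should be greater than 0" rule are all hypothetical, invented for this sketch. The DTO produces user-facing messages for raw form input, while the aggregate protects its own state with exceptions as a fallback.

```php
<?php

// Hypothetical DTO holding raw, unvalidated form input.
final class AddLine
{
    public $productId;
    public $quantity;

    public function __construct($productId, $quantity)
    {
        $this->productId = $productId;
        $this->quantity = $quantity;
    }

    /**
     * @return string[] User-facing messages, empty if the input is valid
     */
    public function validationErrors(): array
    {
        $errors = [];

        if (filter_var($this->productId, FILTER_VALIDATE_INT) === false) {
            $errors[] = 'Please select a product';
        }
        if (filter_var($this->quantity, FILTER_VALIDATE_INT) === false || (int) $this->quantity < 1) {
            $errors[] = 'Quantity should be a whole number greater than 0';
        }

        return $errors;
    }
}

final class PurchaseOrder
{
    private $lines = [];

    public function addLine(int $productId, int $quantity): void
    {
        // Assumed domain invariant, protected no matter what the DTO validation did.
        if ($quantity < 1) {
            throw new InvalidArgumentException('Quantity should be greater than 0');
        }

        // Simplified: a line is a plain array here instead of a Line object.
        $this->lines[] = [count($this->lines) + 1, $productId, $quantity];
    }

    public function numberOfLines(): int
    {
        return count($this->lines);
    }
}

// Hypothetical application service: validate the DTO first, then call the aggregate.
function addLineToOrder(PurchaseOrder $order, AddLine $dto): array
{
    $errors = $dto->validationErrors();
    if (count($errors) > 0) {
        return $errors; // to be shown to the user
    }

    $order->addLine((int) $dto->productId, (int) $dto->quantity);

    return [];
}
```

The two checks on quantity overlap on purpose: the first one exists for the sake of user interaction, the second one keeps the aggregate consistent even when it's called from code that skipped validation.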

SebastianBrandner

Assertion::between($latitude, -90, 90);
$this->latitude = $latitude;

Assertion::between($latitude, -180, 180);
$this->longitude = $longitude

both checks are against $latitude

Matthias Noback

Thanks, fixing it now.

Nahun O

Assertion::greaterThan(count($lines), 1, 'A purchase order should have at least one line');

Also, isn't this enforcing 2 lines per order?

Matthias Noback

Thanks, fixing it now.