A common misunderstanding in my workshops (well, whose fault is it then? ;)) is about the distinction between a DTO and a value object. So I've been looking for a way to tell these two types of objects apart unambiguously.
What's a DTO and how do you recognize it?
A DTO is an object that holds primitive data (strings, integers, booleans, floats, nulls, arrays of these things). It defines the schema of this data by explicitly declaring the names of the fields and their types. It can only guarantee that all the data is there, simply by relying on the strictness of the programming language: if a constructor has a required parameter of type string, you have to pass a string, or you can't even instantiate the object. However, a DTO does not provide any guarantee that the values actually make sense from a business perspective. Strings could be empty, integers could be negative, etc.
There are different flavours of the class design for DTOs:
/**
 * @object-type DTO
 *
 * Using a constructor and public readonly properties:
 */
final class AnExample
{
    public function __construct(
        public readonly string $field,
        // ...
    ) {
    }
}
/**
 * @object-type DTO
 *
 * Using a constructor with private readonly properties
 * and public getters:
 */
final class AnotherExample
{
    public function __construct(
        private readonly string $field,
        // ...
    ) {
    }

    public function field(): string
    {
        return $this->field;
    }
}
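To make the "schema, but no meaning" point concrete, here is a minimal sketch. The RegisterUser class and its fields are hypothetical, invented only for this illustration. It assumes PHP 8.1+ with strict types enabled.

```php
<?php

declare(strict_types=1);

/**
 * @object-type DTO
 *
 * Hypothetical example, not taken from a real application.
 */
final class RegisterUser
{
    public function __construct(
        public readonly string $emailAddress,
        public readonly int $age,
    ) {
    }
}

// The schema is enforced: with strict types, passing an int where
// a string is required results in a TypeError.
try {
    new RegisterUser(123, 30);
    $typeErrorThrown = false;
} catch (TypeError $error) {
    $typeErrorThrown = true;
}

// The meaning is not enforced: an empty string and a negative
// integer are accepted without complaint.
$dto = new RegisterUser('', -5);
```

The DTO tells us that an email address and an age were provided, but not that either of them is usable.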
Regarding the naming of a DTO: I recommend not adding "DTO" to the name itself. If you want to make it clear what the type is, add a comment, or an invented annotation (or attribute) like @object-type. This will be very useful for developers that are not aware of these object types. It may trigger them to look up an article about what it means (this article, maybe :)).
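If you prefer a native PHP attribute over a docblock annotation, it could look something like the sketch below. Both the ObjectType attribute and the CreateOrder class are made up for this example; no library ships them, so you would have to define the attribute yourself.

```php
<?php

declare(strict_types=1);

// Hypothetical marker attribute, invented for this sketch.
#[Attribute(Attribute::TARGET_CLASS)]
final class ObjectType
{
    public function __construct(
        public readonly string $type,
    ) {
    }
}

// The class name stays free of a "DTO" suffix; the attribute
// documents what kind of object this is.
#[ObjectType('DTO')]
final class CreateOrder
{
    public function __construct(
        public readonly string $productId,
        public readonly int $quantity,
    ) {
    }
}

// Tools (or curious developers) can read the marker via reflection.
$attributes = (new ReflectionClass(CreateOrder::class))
    ->getAttributes(ObjectType::class);
$objectType = $attributes[0]->newInstance()->type;
```

An advantage over a comment is that the attribute can be inspected by tooling, although a plain docblock annotation communicates the same thing to a human reader.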
What's a value object and how do you recognize it?
A value object is an object that wraps one or more values or value objects. It guarantees that all the data is there, and also that the values make sense from a domain perspective. Strings will no longer be empty, numbers will be verified to be in the correct range. A value object can offer these guarantees by throwing exceptions inside the constructor, which is private, forcing the client to use one of the static, named constructors. This makes a value object easy to recognize, and clearly distinguishable from a DTO:
final class AnExample
{
    private function __construct(
        private string $value
    ) {
    }

    public static function fromValue(
        string $value
    ): self {
        /*
         * Throw an exception when the value doesn't
         * match all the expectations.
         */
        return new self($value);
    }
}
While a DTO just holds some data for you and provides a clear schema for this data, a value object also holds some data, but offers evidence that the data matches the expectations. When the value object's class is used as a parameter, property, or return type, you know that you are dealing with a correct value.
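As a concrete illustration of "the type is the evidence", consider the following sketch. Percentage and applyDiscount() are hypothetical names, and the allowed range of 0 to 100 is an assumption made for the sake of the example.

```php
<?php

declare(strict_types=1);

/**
 * @object-type Value object
 *
 * Hypothetical example: a percentage between 0 and 100.
 */
final class Percentage
{
    private function __construct(
        private int $value
    ) {
    }

    public static function fromInt(int $value): self
    {
        if ($value < 0 || $value > 100) {
            throw new InvalidArgumentException(
                sprintf('A percentage should be between 0 and 100, got %d', $value)
            );
        }

        return new self($value);
    }

    public function asInt(): int
    {
        return $this->value;
    }
}

// The parameter type is the evidence: no range check is needed here,
// because an invalid Percentage can't be instantiated.
function applyDiscount(int $priceInCents, Percentage $discount): int
{
    return intdiv($priceInCents * (100 - $discount->asInt()), 100);
}

$discounted = applyDiscount(2000, Percentage::fromInt(25));
```

If applyDiscount() accepted a plain int for the discount, it would have to repeat the range check itself, and so would every other function that works with a discount.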
How should we use these object types?
Meaning is defined by use. If we are using "DTO" and "value object" in the wrong way, their names will eventually get a different meaning. This might be how the confusion between the two terms arises in the first place.
DTOs
A DTO should only be used in two places: where data enters the application or where it leaves the application. Some examples:
- When a controller receives an HTTP POST request, the request data may have any shape. We need to go from shapeless data to data with a schema (verified keys and types). We can use a DTO for this. A form library may be able to populate this DTO based on submitted form data, or we can use a serializer to convert the plain-text request body to a populated DTO.
- When we make an HTTP POST request to a web service, we may collect the input data in a DTO first, and then serialize it to a request body that our HTTP client can send to the service.
- For queries the situation is similar. Here we can use a DTO to represent the query result. For example, we can pass a DTO to a template to render a view based on it. Or we can serialize the DTO to JSON and send it back as an API response.
- When we send an HTTP GET request to a web service, we may deserialize the API response into a DTO first, so we can apply a known schema to it instead of just accessing array keys and guessing the types. API client packages usually offer DTOs for requests and responses.
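The first and third scenario could be sketched as follows, without any framework or serializer library. PlaceOrder, OrderPlaced and placeOrderFromJson() are hypothetical names; in a real project a form or serializer component would probably do this work for you.

```php
<?php

declare(strict_types=1);

/**
 * @object-type DTO
 *
 * Hypothetical DTO for incoming request data.
 */
final class PlaceOrder
{
    public function __construct(
        public readonly string $productId,
        public readonly int $quantity,
    ) {
    }
}

/**
 * @object-type DTO
 *
 * Hypothetical DTO for outgoing response data.
 */
final class OrderPlaced
{
    public function __construct(
        public readonly string $orderId,
    ) {
    }
}

// Going from shapeless data to data with a schema:
// we only verify that the keys exist and the types are correct.
function placeOrderFromJson(string $json): PlaceOrder
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data)
        || !is_string($data['productId'] ?? null)
        || !is_int($data['quantity'] ?? null)
    ) {
        throw new InvalidArgumentException(
            'The request body does not match the expected schema'
        );
    }

    return new PlaceOrder($data['productId'], $data['quantity']);
}

$request = placeOrderFromJson('{"productId":"abc","quantity":-3}');

// Leaving the application: json_encode() serializes the public properties.
$responseBody = json_encode(new OrderPlaced('order-1'), JSON_THROW_ON_ERROR);
```

Note that a quantity of -3 passes without problems. The DTO only gives us a schema; deciding whether a negative quantity makes sense is not its job.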
Value objects
A value object is used wherever we want to verify that a value matches our expectations, and we don't want to verify it again. We also use it to accumulate behavior related to a particular value. E.g. if we have an EmailAddress value object, we know that the value has been verified to look like a valid email address, so we don't have to check it again in other places. We can also add methods to the object that extract, for instance, the username or the hostname from the email address.
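Such an EmailAddress value object might look like the sketch below. The method names (fromString(), username(), hostname()) are my own choice for this example, and using filter_var() with FILTER_VALIDATE_EMAIL is just one possible way to define "looks like a valid email address".

```php
<?php

declare(strict_types=1);

/**
 * @object-type Value object
 */
final class EmailAddress
{
    private function __construct(
        private string $value
    ) {
    }

    public static function fromString(string $value): self
    {
        if (filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException(
                sprintf('"%s" does not look like a valid email address', $value)
            );
        }

        return new self($value);
    }

    // Behavior related to the value lives next to the value. After
    // validation the string is guaranteed to contain an "@".
    public function username(): string
    {
        return substr($this->value, 0, (int) strrpos($this->value, '@'));
    }

    public function hostname(): string
    {
        return substr($this->value, (int) strrpos($this->value, '@') + 1);
    }

    public function asString(): string
    {
        return $this->value;
    }
}

$emailAddress = EmailAddress::fromString('user@example.com');
$username = $emailAddress->username();
$hostname = $emailAddress->hostname();
```

Because the check happens once, in the named constructor, username() and hostname() can rely on the value having the expected shape.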
Value objects are often used in domain models because guarantees, or invariants, are an important part of their business. But they can be used anywhere in an application, since every part of the application will need ways to centralize some rules, provide evidence of correctness, and accumulate related behavior.
Conclusion
There's much more to say about value objects, but that was not the point of this article (if you want to read more, check out my book Object Design Style Guide, or Implementing Domain-Driven Design by Vaughn Vernon). The goal was to show as clearly as possible the difference between DTOs and value objects, and I hope they will no longer be confused. Here's a summary:
A DTO:
- Declares and enforces a schema for data: names and types
- Offers no guarantees about correctness of values
A value object:
- Wraps one or more values or value objects
- Provides evidence of the correctness of these values
Can you please explain why you advise against adding "DTO" to a DTO's name?
Can these objects have methods that encapsulate complex logic like calculations, or must this logic be run in the constructor and its result stored in a property?
There is another article that compares value objects, DTOs and entities. Your Customer looks like an entity.
A value object with more than one value could be a geo marker with two values, latitude and longitude, which both require validation beyond names and types.
Value objects do not represent something bigger than their value.
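The geo marker mentioned above could be sketched like this. The class name GeoMarker and its method names are hypothetical; the ranges are the usual ones for degrees of latitude (-90 to 90) and longitude (-180 to 180).

```php
<?php

declare(strict_types=1);

/**
 * @object-type Value object
 */
final class GeoMarker
{
    private function __construct(
        private float $latitude,
        private float $longitude
    ) {
    }

    public static function fromLatitudeAndLongitude(
        float $latitude,
        float $longitude
    ): self {
        // Written as a negated "in range" check so that NAN,
        // for which every comparison is false, is rejected too.
        if (!($latitude >= -90.0 && $latitude <= 90.0)) {
            throw new InvalidArgumentException(
                'Latitude should be between -90 and 90 degrees'
            );
        }

        if (!($longitude >= -180.0 && $longitude <= 180.0)) {
            throw new InvalidArgumentException(
                'Longitude should be between -180 and 180 degrees'
            );
        }

        return new self($latitude, $longitude);
    }

    public function latitude(): float
    {
        return $this->latitude;
    }

    public function longitude(): float
    {
        return $this->longitude;
    }
}

$marker = GeoMarker::fromLatitudeAndLongitude(52.09, 5.12);
```

A DTO with two float properties would accept a latitude of 500.0; this object can't be created with one.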
One thing I seem to find frequently in blog posts dealing with this topic is the abuse of validation enforcement when the values for a DTO/value object are coming from within the application itself. E.g. the object might be built in code, with fixed values - in which case they are valid by definition, or might be built with data from a db query - in which case I'd rather have the validation be done when it is inserted in the database. In such scenarios it feels wasteful to execute data-validation checks. And it also feels somehow wrong from a "philosophical" pov - as it means muddying the distinction between trusted and untrusted.
Another often overlooked issue which has a deep impact on the topic of validation is how error messages are presented to the end user. This is true, to a lesser degree, even for apps exposing only APIs, as the response error by those will have to be understood by human developers in the end. Often a constraint can be enforced at different places within the app layers, but the error messages which are generated will be wildly different. No-one wants to code the same data validation multiple times in multiple places. And enforcing an integer value to be non-negative when defining a db table column is both easy to do and a good safety practice, as it gives true trust in correctness, regardless of how many apps share the same data. But the user experience is pretty bad when receiving an error such as "query error: invalid value for table.column".
How do you handle data that needs to be sanitized?
For example: I got this IBAN number “BE88 9544 3714 9541” but I want to have it in my database without spaces.
So, when I make a VO of it, do I sanitize it in the constructor of my VO? Or do I sanitize it outside my VO and pass the correct value to the VO?
I personally have my sanitizers in separate classes and like to avoid using services in my VOs. But if the input comes from a lot of different places, sanitizing in the VO saves some lines of code.
> How do you handle data that needs to be sanitized?
> So, when I make a VO of it, do I sanitize it in the constructor of my VO? Or do I sanitize it outside my VO and pass the correct value to the VO?
I guess, as most often, there is no one true answer to that.
I tend to answer: if you don't have a better place for it anyway, I'd sanitize it in the constructor of the VO. This way you have one class that is concerned with the validity of a given thing.
This sanitization logic, though, could be held in another, dedicated class, e.g., "ThingSanitization", and could be called by the VO construction. This could lead to better separation of concerns (value encapsulation and value sanitization).
For my code, I try to convert unstructured data into VOs as early as possible and keep using the VO. So it would be quite early, anyway.
> For example: I got this IBAN number “BE88 9544 3714 9541” but I want to have it in my database without spaces.
In this case though, I'm not sure if this is actually a real case of sanitization. Depending on the usage, “BE88 9544 3714 9541” could be seen as valid IBAN representation.
Nonetheless, I would have an IBAN VO that has a named constructor "fromUnformattedIban()" or "fromUnformattedString()" that would then sanitize and (if necessary) reformat the IBAN for internal storage in the VO.
Also, I'd have different formatting methods on the VO. E.g. "toHumanReadable()" for human readability purposes, which would likely be “BE88 9544 3714 9541” again, and "toStorable()" for storage of any kind, like a DB.
Again, the formatting logic _could_ live in another, dedicated formatting class.
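A minimal sketch of what I mean, assuming the method names suggested above. The validation here is deliberately simplified to a rough structural check; real IBAN validation also involves country-specific lengths and a mod-97 checksum, which I left out.

```php
<?php

declare(strict_types=1);

/**
 * @object-type Value object
 */
final class Iban
{
    private function __construct(
        private string $value
    ) {
    }

    public static function fromUnformattedString(string $input): self
    {
        // Sanitize: remove all whitespace and normalize to uppercase.
        $normalized = strtoupper((string) preg_replace('/\s+/', '', $input));

        // Simplified check (an assumption for this sketch): country code,
        // two check digits, then 11 to 30 alphanumeric characters.
        if (preg_match('/^[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}$/', $normalized) !== 1) {
            throw new InvalidArgumentException(
                sprintf('"%s" does not look like an IBAN', $input)
            );
        }

        return new self($normalized);
    }

    // Groups of four characters, separated by spaces.
    public function toHumanReadable(): string
    {
        return implode(' ', str_split($this->value, 4));
    }

    // Compact form without spaces, e.g. for a database column.
    public function toStorable(): string
    {
        return $this->value;
    }
}

$iban = Iban::fromUnformattedString('BE88 9544 3714 9541');
$storable = $iban->toStorable();
$readable = $iban->toHumanReadable();
```

This way the calling code never has to know whether the input contained spaces, and the VO always holds one canonical representation.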