Refactoring without tests should be fine

Refactoring without tests should be fine. Why is it not? When could it be safe?

From the cover of “Refactoring” by Martin Fowler:

Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which “too small to be worth doing”. However the cumulative effect of each of these transformations is quite significant.

Good design means it's easy-to-change

Software development seems to be about change: the business changes and we need to reflect those changes, so the requirements or specifications change, frameworks and libraries change, so we have to change our integrations with them, etc. Changing the code base accordingly is often quite painful, because we made it resistant to change in many ways.

Code that resists change

I find that not every developer notices the “pain level” of a change. As an example, I consider it very painful if I can’t rename a class, or change its namespace. One reason could be that some classes aren’t auto-loaded with Composer, but are still manually loaded with require statements. Another reason could be that the framework expects the class to have a certain name, be in a certain namespace, and so on. This may be something you personally don’t consider painful, since you can avert the pain by simply never considering renaming or moving a class.

Can we consider DateTimeImmutable a primitive type?

During a workshop we were discussing the concept of a Data Transfer Object (DTO). The main characteristic of a DTO is that it holds only primitive-type values (strings, integers, booleans), lists or maps of these values, and “nested” DTOs. I'm not sure who came up with this idea, but I’m using it because it ensures that the DTO becomes a data structure that only enforces a schema (field names, the expected types, required fields, and optional fields), but doesn’t enforce semantics for any value put into it. That way it can be created from any data source, like submitted form values, CLI arguments, JSON, XML, YAML, and so on. Using primitive values in a DTO makes it clear that the values are not validated. The DTO is just used to transfer or carry data from one layer to the next. A question that popped up during the workshop: can we consider DateTimeImmutable a primitive-type value too? If so, can we use this type inside DTOs?

Is it a DTO or a Value Object?

A common misunderstanding in my workshops (well, whose fault is it then? ;)), is about the distinction between a DTO and a value object. And so I’ve been looking for a way to categorize these objects without mistake.

What’s a DTO and how do you recognize it?

A DTO is an object that holds primitive data (strings, booleans, floats, nulls, arrays of these things). It defines the schema of this data by explicitly declaring the names of the fields and their types. It can only guarantee that all the data is there, simply by relying on the strictness of the programming language: if a constructor has a required parameter of type string, you have to pass a string, or you can’t even instantiate the object. However, a DTO does not provide any guarantee that the values actually make sense from a business perspective. Strings could be empty, integers could be negative, etc.
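A minimal sketch of such a DTO, assuming a hypothetical CreateUser class with invented field names (none of them come from the article) and PHP 7.4 or later for typed properties. The constructor enforces the schema only, so an empty e-mail address and a negative age are accepted:

```php
<?php

// Hypothetical DTO: the class and field names are invented for illustration.
final class CreateUser
{
    public string $emailAddress;
    public int $age;
    public bool $subscribeToNewsletter;

    public function __construct(string $emailAddress, int $age, bool $subscribeToNewsletter)
    {
        // No validation here: the declared types are the only guarantee.
        $this->emailAddress = $emailAddress;
        $this->age = $age;
        $this->subscribeToNewsletter = $subscribeToNewsletter;
    }
}

// The schema is satisfied, even though the values make no sense for the business.
$dto = new CreateUser('', -5, true);
```

Leaving out an argument or passing a value of an incompatible type (an array for $emailAddress, say) fails with an error, which is all the protection a DTO offers.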

Simple Solutions 1 - Active Record versus Data Mapper

Having discussed different aspects of simplicity in programming solutions, let’s start with the first topic that should be scrutinized for its simplicity: persisting model objects. As you may know, we have competing solutions which fall into two categories: they will follow either the Active Record (AR) or the Data Mapper pattern (DM) (as described in Martin Fowler’s “Patterns of Enterprise Application Architecture”, abbrev. PoEAA).

Active record

How do we recognize the AR pattern? It’s when you instantiate a model object and then call save() on it:
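The article’s own listing isn’t part of this excerpt. As a rough sketch of the pattern, assuming a hypothetical Member class that persists itself to a static array standing in for a database table:

```php
<?php

// Hypothetical Active Record-style model, invented for illustration.
// The static array stands in for a database table.
final class Member
{
    /** @var array<int, array{name: string}> */
    private static array $table = [];

    public ?int $id = null;
    public string $name = '';

    public function save(): void
    {
        if ($this->id === null) {
            // "Insert": assign the next available ID.
            $this->id = count(self::$table) + 1;
        }

        // The object writes its own state to storage.
        self::$table[$this->id] = ['name' => $this->name];
    }

    public static function find(int $id): ?self
    {
        if (!isset(self::$table[$id])) {
            return null;
        }

        $member = new self();
        $member->id = $id;
        $member->name = self::$table[$id]['name'];

        return $member;
    }
}

// Instantiate a model object, then call save() on it.
$member = new Member();
$member->name = 'Alice';
$member->save();
```

The model object knows how to store and load itself, which is what sets this pattern apart from Data Mapper.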

What's a simple solution?

“As I’m becoming a more experienced programmer, I tend to prefer simple solutions.” Or something similar. As is the case with many programming-related quotes, this is somewhat of a blanket statement because who doesn’t prefer simple solutions? To make it a powerful statement again, you’d have to explain what a simple solution is, and how you distinguish it from not-so-simple solutions. So the million-dollar question is “What is a simple solution?”, and I’ll answer it now.

When to use a trait?

When to use a trait? Never.

Well, a trait could be considered to have a few benefits:

Benefits

If you want to reuse some code between multiple classes, using a trait is an alternative to extending a class. In that case the trait may be the better option because it doesn’t become part of the type hierarchy, i.e. a class that uses a trait isn’t “an instance of that trait”.
A trait can save you some manual copy/paste-ing by offering compile-time copy/paste-ing instead.

Downsides

On the other hand, there are several problems with traits. For instance:

Decoupling your security user from your user model

This article shows an example of framework decoupling. You’ll find a more elaborate discussion in my latest book, Recipes for Decoupling.

Why would it be nice to decouple your user model from the framework’s security user or authentication model?

Reason 1: Hexagonal architecture

I like to use hexagonal architecture in my applications, which means among other things that the entities from my domain model stay behind a port. They are never exposed to, for instance, a controller, or a template. Whenever I want to show anything to the user, I create a dedicated view model for it.

Effective immutability with PHPStan

This article is about a topic that didn’t make the cut for my latest book, Recipes for Decoupling. It still contains some useful ideas I think, so here it is anyway!

DateTimeImmutable is mutable

I don’t know where I first heard it, but PHP’s DateTimeImmutable is not immutable:

<?php

$dt = new DateTimeImmutable('now');
echo $dt->getTimestamp() . "\n";

$dt->__construct('tomorrow');
echo $dt->getTimestamp() . "\n";

The result:

1656927919
1656972000

Indeed, DateTimeImmutable is not really immutable because its internal state can be modified after instantiation. After calling __construct() again, any existing object reference will have the modified state as well. But an even bigger surprise might be that if a constructor is public, it’s just another method that you can call. You’re not supposed to, but you can. Which is another reason to make the constructor of non-service objects private and add a public static constructor to a class that you want to be immutable:
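A minimal sketch of that idea, assuming a hypothetical Money class (the name and its fields are illustrative only). With a private constructor, calling __construct() again from outside the class fails with an Error:

```php
<?php

// Hypothetical value object, used only to illustrate the private constructor.
final class Money
{
    private int $amountInCents;
    private string $currency;

    private function __construct(int $amountInCents, string $currency)
    {
        $this->amountInCents = $amountInCents;
        $this->currency = $currency;
    }

    // Public static constructor: the only way to create an instance from outside.
    public static function fromCents(int $amountInCents, string $currency): self
    {
        return new self($amountInCents, $currency);
    }

    public function amountInCents(): int
    {
        return $this->amountInCents;
    }
}

$price = Money::fromCents(1000, 'EUR');

try {
    // Not allowed from outside the class: the constructor is private.
    $price->__construct(2000, 'EUR');
} catch (Error $error) {
    // The state of $price is unchanged.
}
```

Code inside the class could still call the constructor a second time, so this narrows the problem rather than eliminating it.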

DDD entities and ORM entities

I was tweeting something about having separate “DDD” and “ORM” entities in a project, and that I don’t understand this. There were some great comments and questions, thanks a lot for that! To be honest, I understand more about it now. In this article I’ll try to provide some more information about this.

I’m glad many developers want to use input from the book “Domain-Driven Design” by Eric Evans to improve their domain model. I recommend reading this book and getting your information from the source, because unfortunately the internet, tweets, and e-books, including my own books, aren’t able to reflect a full or a correct view of everything there is to find out about this topic. All too often DDD is completely misinterpreted to be “an elitist, dogmatic approach to programming, where we use DTOs, layers, and CQRS”…

Too much magic?

Years ago my co-worker Maurits introduced me to the term “magic” in programming. He also provided the valuable dichotomy of convention and configuration (or in fact, he’d choose configuration over convention…). I think this distinction could be very helpful in psychological research, figuring out why some people prefer framework X over framework Y. One requires the developer to spell out everything they want in elaborate configuration files, the other relies on convention: placing certain files with certain names and certain methods in certain places will make everything work “magically”.

The Dependency Injection Paradigm

Paradigm; a nice word that means “a theory or a group of ideas about how something should be done, made, or thought about” (Merriam-Webster). In software development we have them too. From the philosophy and history of science courses I’ve followed, I remember that scientists working with different paradigms have great difficulty understanding each other, and appreciating each other’s work.

Paradigm Shifts

An example of a paradigm is the theory that the sun revolves around the earth. To a certain extent this is a fruitful theory, and it has been used for thousands of years. There’s of course another paradigm: the theory that the earth revolves around the sun. This is also a fruitful theory, and it can be used to explain a lot of observations, more than the previous theory. Still, people got angry with each other for moving the earth out of the center of the universe. Paradigm changes, or shifts, occur when the old theory has been stretched too much. It becomes impossible to hold on to it. Then some people start to experiment with a completely different paradigm, one that sounds totally weird, but in the end proves to have more power.

Quick Testing Tips: One Class, One Test?

I’ve mentioned this several times without explaining: the rule that every class should have a test, or that every class method should have a test, does not make sense at all. Still, it’s a rule that many teams follow. Why? Maybe they used to have a #NoTest culture and they never want to go back to it, so they establish a rule that is easy to enforce. When reviewing you only have to check: does the class have a test? Okay, great. It’s a bad test? No problem, it is a test. I already explained why I think you need to make an effort to not just write any test, but to write good tests (see also: Testing Anything; Better Than Testing Nothing?). In this article I’d like to look closer at the numbers: one class - one test.

Quick Testing Tips: Write Unit Tests Like Scenarios

I’m a big fan of the BDD Books by Gáspár Nagy and Seb Rose, and I’ve read a lot about writing and improving scenarios, like Specification by Example by Gojko Adzic and Writing Great Specifications by Kamil Nicieja. I can recommend reading anything from Liz Keogh as well. Trying to apply their suggestions in my development work, I realized: specifications benefit from good writing. Writing benefits from good thinking. And so does design. Better writing, thinking, designing: this will make us do a better job at programming. Any effort put into these activities has a positive impact on the other areas, even on the code itself.

Where do types come from?

In essence, everything is a string.

Well, you can always go one layer deeper and find out what a string really is, but for web apps I work on, both input data and output data are strings. The input is an HTTP request, which is a plain-text message that gets passed to the web server, the PHP server, the framework, and finally a user-land controller. The output is an HTTP response, which is also a plain-text message that gets passed to the client. If my app needs the database to load or store some data, that data too is in its initial form a string. It needs to be deserialized into objects to do something and later be serialized into strings so we can store the results.

Quick Testing Tips: Testing Anything; Better Than Testing Nothing?

“Yes, I know. Our tests aren’t perfect, but it’s better to test anything than to test nothing at all, right?”

Let’s look into that for a bit. We’ll try the “Fowler Heuristic” first:

One of my favourite (of the many) things I learned from consulting with Martin Fowler is that he would often ask “Compared to what?”

Agile helps you ship faster!

Compared to what?

[…]

Often there is no baseline.

Quick Testing Tips: Self-Contained Tests

Whenever I read a test method I want to understand it without having to jump around in the test class (or worse, in dependencies). If I want to know more, I should be able to “click” on one of the method calls and find out more.

I’ll explain later why I want this, but first I’ll show you how to get to this point.

As an example, here is a test I encountered recently:

Don't test constructors

@ediar asked me on Twitter if I still think a constructor should not be tested. It depends on the type of object you’re working with, so I think it’ll be useful to elaborate here.

Would you test the constructor of a service that just gets some dependencies injected? No. You’ll test the behavior of the service by calling one of its public methods. The injected dependencies are collaborating services and the service as a whole won’t work if anything went wrong in the constructor.

Do tests need static analysis level max?

I recently heard this interesting question: if your project uses a static analysis tool like PHPStan or Psalm (as it should), should the tests be analysed too?

The first thing to consider: what are potential reasons for not analysing your test code?

Why not?

1. Tests are messy in terms of types

Tests may use mocks, which can be confusing for the static analyser:

$object = $this->createMock(MockedInterface::class);

The actual type of $object is an intersection type, i.e. $object is both a MockObject and a MockedInterface, but the analyser only recognizes MockObject. You may not like all those warnings about “unknown method” calls on MockObject $object so you exclude test code from the analysis.

Book excerpt - Decoupling from infrastructure, Conclusion

This article is an excerpt from my book Advanced Web Application Architecture. It contains a couple of sections from the conclusion of Part I: Decoupling from infrastructure.

This chapter covers:

A deeper discussion on the distinction between core and infrastructure code
A summary of the strategy for pushing infrastructure to the sides
A recommendation for using a domain- and test-first approach to software development
A closer look at the concept of “pure” object-oriented programming

Core code and infrastructure code

In Chapter 1 we’ve looked at definitions for the terms core code and infrastructure code. What I personally find useful about these definitions is that you can look at a piece of code and find out if the definitions apply to it. You can then decide if it’s either core or infrastructure code. But there are other ways of applying these terms to software. One way is to consider the bigger picture of the application and its interactions with actors. You’ll find the term actor in books about user stories and use cases by authors like Ivar Jacobson and Alistair Cockburn, who make a distinction between:

Testing your controllers when you have a decoupled core

A lot can happen in 9 years. Back then I was still advocating that you should unit-test your controllers and that setter injection is very helpful when replacing controller dependencies with test doubles. I’ve changed my mind: constructor injection is the right way for any service object, including controllers. And controllers shouldn’t be unit tested, because:

Those unit tests tend to be a one-to-one copy of the controller code itself. There is no healthy distance between the test and the implementation.
Controllers need some form of integrated testing, because by zooming in on the class-level, you don’t know if the controller will behave well when the application is actually used. Is the routing configuration correct? Can the framework resolve all of the controller’s arguments? Will dependencies be injected properly? And so on.

The alternative I mentioned in 2012 is to write functional tests for your controller. But this is not preferable in the end. These tests are slow and fragile, because you end up invoking much more code than just the domain logic.

Does it belong in the application or domain layer?

Where should it go?

If you’re one of those people who make a separation between an application and a domain layer in their code base (like I do), then a question you’ll often have is: does this service go in the application or in the domain layer? It sometimes makes you wonder if the distinction between these layers is superficial after all. I’m not going to write again about what the layers mean, but here is how I decide if a service goes into Application or Domain:

Should we use a framework?

Since I’ve been writing a lot about decoupled application development it made sense that one of my readers asked the following question: “Why should we use a framework?” The quick answer is: because you need it. A summary of the reasons:

It would be too much work to replace all the work that the framework does for you with code written by yourself. Software development is too costly for this.
Framework maintainers have fixed many issues before you even encountered them. They have done everything to make the code secure, and when a new security issue pops up, they fix it so you can just pull the latest version of the framework.
By not using a framework you will be decoupled from Symfony, Laravel, etc. but you will be coupled to Your Own Framework, which is a bigger problem since you’re the maintainer and it’s likely that you won’t actually maintain it (in my experience, this is what often happens to projects that use their own home-grown framework).

So, yes, you/we need a framework. At the same time you may want to write framework-decoupled code whenever possible.

A simple recipe for framework decoupling

If you want to write applications that are maintainable in the long run, you have to decouple from your framework, ORM, HTTP client, etc. because your application will outlive all of them.

Three simple rules

To accomplish framework decoupling you only have to follow these simple rules:

All services should get all their dependencies and configuration values injected as constructor arguments. When a dependency uses IO, you have to introduce an abstraction for it.
Other types of objects shouldn’t have service responsibilities.
Contextual information should always be passed as method arguments.

Explanations

Rule 1

Following rule 1 ensures that you’ll never fetch a service ad hoc, e.g. by using Container::get(UserRepository::class). This is needed for framework decoupling because the global static facility that returns the service for you is by definition framework-specific. The same is true for fetching configuration values (e.g. Config::get('email.default_sender')).
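As a sketch of what rule 1 looks like in code, assuming a hypothetical WelcomeEmailComposer service. The UserRepository methods, the in-memory implementation, and the e-mail addresses are all invented for illustration:

```php
<?php

// Hypothetical abstraction for a dependency that uses IO (a database).
interface UserRepository
{
    public function emailAddressOf(int $userId): string;
}

// In-memory implementation, so this sketch runs without a database.
final class InMemoryUserRepository implements UserRepository
{
    /** @var array<int, string> */
    private array $emailAddresses;

    public function __construct(array $emailAddresses)
    {
        $this->emailAddresses = $emailAddresses;
    }

    public function emailAddressOf(int $userId): string
    {
        return $this->emailAddresses[$userId];
    }
}

final class WelcomeEmailComposer
{
    private UserRepository $userRepository;
    private string $defaultSender;

    // Dependencies and configuration values arrive as constructor arguments,
    // instead of Container::get(...) or Config::get(...) calls inside a method.
    public function __construct(UserRepository $userRepository, string $defaultSender)
    {
        $this->userRepository = $userRepository;
        $this->defaultSender = $defaultSender;
    }

    /** @return array{from: string, to: string} */
    public function compose(int $userId): array
    {
        return [
            'from' => $this->defaultSender,
            'to' => $this->userRepository->emailAddressOf($userId),
        ];
    }
}

$composer = new WelcomeEmailComposer(
    new InMemoryUserRepository([1 => 'alice@example.com']),
    'noreply@example.com'
);
$headers = $composer->compose(1);
```

Nothing in WelcomeEmailComposer refers to a framework class, so it can be instantiated anywhere, including in a test.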

Violating the Dependency rule

I write about design rules a lot, but I sometimes forget to:

Mention that these rules can’t always be applied,
Describe when that would be the case, and
Add examples of situations where the rule really doesn’t matter.

The rules should work in most cases, but sometimes need to be “violated”. Which is too strong a word anyway. When someone points out to me that I violated a rule, I’m like: Wow! I violated the rule? I’m so sorry! Let’s fix this immediately. Whereas in practice it should be more like: Yeah, I know that rule, but it makes more sense to follow that other rule here, because […]. In other words, pointing out that a certain rule has been violated should not be a sufficient reason to adhere to that rule. My favorite example is “But that violates SRP!” (Single Responsibility Principle). Whoops, I wouldn’t want to do that! Or would I?

Relying on the database to validate your data

One of my pet peeves is using the database schema to validate data.

Several ways in which this normally happens:

Specifying a column as “required”, e.g. email VARCHAR(255) NOT NULL
Adding an index to force column values to be unique (e.g. CREATE UNIQUE INDEX email_idx ON users(email))
Adding an index for foreign key integrity, including cascading deletes, etc.

Yes, I want data integrity too. No, I don’t want to rely on the database for that.

Free book chapter: Key design patterns

I wanted to share with you a free chapter from my latest book, “Advanced Web Application Architecture”. I’ve picked Chapter 11, which gives a compact overview of all the design patterns that are useful for structuring your web application in a way that will (almost) automatically make it independent of surrounding infrastructure, including the web framework you use.

Chapter 11 is the first chapter of Part II of the book. In Part I we’ve been discovering these design patterns by refactoring different areas of a simple web application. Part II provides some higher-level concepts that can help you structure your application. Besides design patterns, it covers architectural layering, and hexagonal architecture (ports & adapters). It also includes a chapter on testing decoupled applications.

DDD and your database

The introduction of Domain-Driven Design (DDD) to a larger audience has led to a few really damaging ideas among developers, like this one (maybe it’s more a sentiment than an idea):

Data is bad, behavior is good. The domain model is great, the database awful.

(We’re not even discussing CRUD in this article, which apparently is the worst of the worst.)

By now many of us feel ashamed of using an ORM alongside a “DDD domain model”, putting some mapping configuration in it, doing things inside your entities (or do you call them aggregates?) just to make them easily serializable to the database.

Is all code in vendor infrastructure code?

During a recent run of my Advanced Web Application Architecture training, we discussed the distinction between infrastructure code and non-infrastructure code, which I usually call core code. One of the participants summarized the difference between the two as: “everything in your vendor directory is infrastructure code”. I don’t agree with that, and I will explain why in this article.

Not all code in vendor is infrastructure code

Admittedly, it’s easy for anyone to not agree with a statement like this, because you can simply make up your own definitions of “infrastructure” that turn the statement false. As a matter of fact, I’m currently working on my next book (which has the same title as the training), and I’m working on a memorable definition that covers all the cases. I’ll share with you the current version of that definition, which consists of two rules defining core code. Any code that doesn’t follow both these rules at the same time, should be considered infrastructure code.

Rules for working with dynamic arrays and custom collection classes

Here are some rules I use for working with dynamic arrays. It’s pretty much a Style Guide for Array Design, but it didn’t feel right to add it to the Object Design Style Guide, because not every object-oriented language has dynamic arrays. The examples in this post are written in PHP, because PHP is pretty much Java (which might be familiar), but with dynamic arrays instead of built-in collection classes and interfaces.

Using phploc for a quick code quality estimation - Part 2

In part 1 of this series we discussed the size and complexity metrics calculated by phploc. We continue with a discussion about dependencies and structure.

Structure

This section gives us an idea of how many things there are of a certain type. These can be useful indicators too. For example, if the number of Namespaces is low, there may be a lack of grouping, which is bad for discoverability. It’ll be hard to find out where a certain piece of logic can be found in the code. Too many namespaces, relative to the number of Classes/Interfaces/Traits, is not a good sign either. I would expect every namespace to have a couple of classes that naturally belong together.

Using phploc for a quick code quality estimation - Part 1

When I want to get a very rough idea of the quality of both code and structure of a PHP code base, I like to run phploc on it. This is a tool created by Sebastian Bergmann for (I assume) exactly this purpose. It produces the following kind of output:

Directories                                          3
Files                                               10

Size
  Lines of Code (LOC)                             1882
  Comment Lines of Code (CLOC)                     255 (13.55%)
  Non-Comment Lines of Code (NCLOC)               1627 (86.45%)
  Logical Lines of Code (LLOC)                     377 (20.03%)
    Classes                                        351 (93.10%)
      Average Class Length                          35
        Minimum Class Length                         0
        Maximum Class Length                       172
      Average Method Length                          2
        Minimum Method Length                        1
        Maximum Method Length                      117
    Functions                                        0 (0.00%)
      Average Function Length                        0
    Not in classes or functions                     26 (6.90%)

Cyclomatic Complexity
  Average Complexity per LLOC                     0.49
  Average Complexity per Class                   19.60
    Minimum Class Complexity                      1.00
    Maximum Class Complexity                    139.00
  Average Complexity per Method                   2.43
    Minimum Method Complexity                     1.00
    Maximum Method Complexity                    96.00

Dependencies
  Global Accesses                                    0
    Global Constants                                 0 (0.00%)
    Global Variables                                 0 (0.00%)
    Super-Global Variables                           0 (0.00%)
  Attribute Accesses                                85
    Non-Static                                      85 (100.00%)
    Static                                           0 (0.00%)
  Method Calls                                     280
    Non-Static                                     276 (98.57%)
    Static                                           4 (1.43%)

Structure
  Namespaces                                         3
  Interfaces                                         1
  Traits                                             0
  Classes                                            9
    Abstract Classes                                 0 (0.00%)
    Concrete Classes                                 9 (100.00%)
  Methods                                          130
    Scope
      Non-Static Methods                           130 (100.00%)
      Static Methods                                 0 (0.00%)
    Visibility
      Public Methods                               103 (79.23%)
      Non-Public Methods                            27 (20.77%)
  Functions                                          0
    Named Functions                                  0 (0.00%)
    Anonymous Functions                              0 (0.00%)
  Constants                                          0
    Global Constants                                 0 (0.00%)
    Class Constants                                  0 (0.00%)

These numbers are statistics, and as you know, they can be used to tell the biggest lies. Without the context of the actual code, you can interpret them in any way you like. So you need to be careful when drawing conclusions from them. However, in my experience these numbers are often quite good indicators of the overall code quality as well as the structural quality of a project.

Dividing responsibilities - Part 2

This is another excerpt from my latest book, which is currently part of Manning’s Early Access Program. Take 37% off Object Design Style Guide by entering fccnoback into the discount code box at checkout at manning.com.

Make sure to read part 1 first.

Create read models directly from their data source

Instead of creating a StockReport model from PurchaseOrderForStock objects, we could go directly to the source of the data, that is, the database where the application stores its purchase orders. If this is a relational database, there might be a table called purchase_orders, with columns for purchase_order_id, product_id, ordered_quantity, and was_received. If that’s the case, then StockReportRepository wouldn’t have to load any other object before it could build a StockReport object; it could make a single SQL query and use it to create the StockReport, as shown in Listing 11.

Dividing responsibilities - Part 1

I’m happy to share with you an excerpt of my latest book, which is currently part of Manning’s Early Access Program. Take 37% off Object Design Style Guide by entering fccnoback into the discount code box at checkout at manning.com.

Chapter 7: Dividing responsibilities

We’ve looked at how objects can be used to retrieve information, or perform tasks. The methods for retrieving information are called query methods, the ones that perform tasks are command methods. Service objects may combine both of these responsibilities. For instance, a repository (like the one in Listing 1) could perform the task of saving an entity to the database, and at the same time it would also be capable of retrieving an entity from the database.

Hand-written service containers

You say "convention over configuration;" I hear "ambient information stuck in someone's head." You say "configuration over hardcoding;" I hear "information in a different language that must be parsed, can be malformed, or not exist."
— Paul Snively (@paul_snively) March 2, 2019

Dependency injection is very important. Dependency injection containers are too. The trouble is with the tools, that let us define services in a meta-language, and rely on conventions to work well. This extra layer requires the “ambient information” Paul speaks about in his tweet, and easily lets us make mistakes that we wouldn’t make if we’d just write out the code for instantiating our services.
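A rough sketch of what “just writing out the code” could look like, assuming hypothetical Clock and ReportGenerator services (invented for this example) and PHP 7.4 or later for typed properties and the ??= operator:

```php
<?php

// Hypothetical services, invented for this sketch.
final class Clock
{
    public function now(): DateTimeImmutable
    {
        return new DateTimeImmutable('now');
    }
}

final class ReportGenerator
{
    private Clock $clock;

    public function __construct(Clock $clock)
    {
        $this->clock = $clock;
    }

    public function clock(): Clock
    {
        return $this->clock;
    }
}

// Hand-written container: plain PHP code that instantiates the services.
final class ServiceContainer
{
    private ?Clock $clock = null;
    private ?ReportGenerator $reportGenerator = null;

    // Public method: an entry point that the rest of the application may ask for.
    public function reportGenerator(): ReportGenerator
    {
        // Create the service once and reuse it afterwards.
        return $this->reportGenerator ??= new ReportGenerator($this->clock());
    }

    // Private method: a service that is only used as a dependency.
    private function clock(): Clock
    {
        return $this->clock ??= new Clock();
    }
}

$container = new ServiceContainer();
$generator = $container->reportGenerator();
```

Because this is ordinary code, a misspelled class name or a missing constructor argument shows up in the IDE or the static analyser, with no meta-language in between.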

Assertions and assertion libraries

When you’re looking at a function (an actual function or a method), you can usually identify several blocks of code in there. There are pre-conditions, there’s the function body, and there may be post-conditions. The pre-conditions are there to verify that the function can safely proceed to do its real job. Post-conditions may be there to verify that you’re going to give something back to the caller that will make sense to them.
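To make these three blocks concrete, here is a small sketch using a hypothetical splitAmount() function (invented for this example) that divides an amount in cents over a number of parts:

```php
<?php

// Hypothetical function, invented to show the three blocks.
function splitAmount(int $totalInCents, int $numberOfParts): array
{
    // Pre-conditions: verify that the function can safely do its job.
    if ($totalInCents < 0) {
        throw new InvalidArgumentException('The total should not be negative');
    }
    if ($numberOfParts < 1) {
        throw new InvalidArgumentException('Expected at least one part');
    }

    // Function body: divide as evenly as possible; the first parts get the remainder.
    $base = intdiv($totalInCents, $numberOfParts);
    $remainder = $totalInCents % $numberOfParts;
    $parts = [];
    for ($i = 0; $i < $numberOfParts; $i++) {
        $parts[] = $base + ($i < $remainder ? 1 : 0);
    }

    // Post-condition: verify that the result will make sense to the caller.
    if (array_sum($parts) !== $totalInCents) {
        throw new LogicException('The parts should add up to the total');
    }

    return $parts;
}

$parts = splitAmount(100, 3); // [34, 33, 33]
```

A failing pre-condition points at a mistake made by the caller, whereas a failing post-condition points at a bug in the function itself, which is why the sketch uses two different exception types.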

Final classes by default, why?

I recently wrote about when to add an interface to a class. After explaining good reasons for adding an interface, I claim that if none of those reasons apply in your situation, you should just use a class and declare it “final”.

PHP 5 introduces the final keyword, which prevents child classes from overriding a method by prefixing the definition with final. If the class itself is being defined final then it cannot be extended.

Reusing domain code

Last week I wrote about when to add an interface to a class. The article finishes with the claim that classes from the application’s domain don’t usually need an interface. The reason is that domain code isn’t going to be swapped out with something else. This code is the result of careful modelling work that’s done based on the business domain that you work with. And even if you’d work on, say, two financial software projects in a row, you’ll find that the models you produce for each of them will be different in many subtle (if not radical) ways. Paradoxically you’ll find that in practice a domain model can sometimes be reused after all. There are some great examples out there. In this article I explain different scenarios of where and how reuse could work.

When to add an interface to a class

I’m currently revising my book “Principles of Package Design”. It covers lots of design principles, like the SOLID principles and the lesser known Package (or Component) Design Principles. When discussing these principles in the book, I regularly encourage the reader to add more interfaces to their classes, to make the overall design of the package or application more flexible. However, not every class needs an interface, and not every interface makes sense. I thought it would be useful to enumerate some good reasons for adding an interface to a class. At the end of this post I’ll make sure to mention a few good reasons for not adding an interface too.

More code comments

Recently I read a comment on Twitter by Nikola Poša:

I guess the discussion on my thread is going in the wrong direction because I left out a crucial hashtag: #NoCommentsInCode - avoid comments in code, write descriptive classes, methods, variables. https://t.co/MuHoOFXCvV
— Nikola Poša (@nikolaposa) July 13, 2018

He was providing us with a useful suggestion, one that I myself have been following ever since reading “Clean Code” by Robert Martin. The paraphrased suggestion in that book, as well as in the tweet, is to consider a comment to be a naming issue in disguise, and to solve that issue, instead of keeping the comment. By the way, the book has some very nice examples of how comments should and should not be used.

Negative architecture, and assumptions about code

In “Negative Architecture”, Michael Feathers speaks about certain aspects of software architecture leading to some kind of negative architecture. Feathers mentions the IO Monad from Haskell (functional programming) as an example, but there are parallel examples in object-oriented programming. For example, by using layers and the dependency inversion principle you can “guarantee” that a certain class, in a certain layer (e.g. domain, application) won’t do any IO - no talking to a database, no HTTP requests to some remote service, etc.

Objects should be constructed in one go

Consider the following rule:

When you create an object, it should be complete, consistent and valid in one go.

It is derived from the more general principle that it should not be possible for an object to exist in an inconsistent state. I think this is a very important rule, one that will gradually lead everyone from the swamps of those dreaded “anemic” domain models. However, the question still remains: what does all of this mean?
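One possible reading of the rule, as a minimal sketch: the constructor asks for everything the object needs and rejects invalid data immediately. The Meetup class and its validation rule are assumptions made for this example.

```php
<?php

// Hypothetical entity, only here to illustrate the rule.
final class Meetup
{
    private $title;
    private $scheduledFor;

    public function __construct(string $title, DateTimeImmutable $scheduledFor)
    {
        // Reject invalid data right away, so an invalid Meetup can never exist
        if (trim($title) === '') {
            throw new InvalidArgumentException('The title should not be empty');
        }

        $this->title = $title;
        $this->scheduledFor = $scheduledFor;
    }

    public function title(): string
    {
        return $this->title;
    }

    public function scheduledFor(): DateTimeImmutable
    {
        return $this->scheduledFor;
    }
}

// Complete, consistent and valid after a single call:
$meetup = new Meetup('Refactoring without tests', new DateTimeImmutable('2030-01-15 19:00'));
```

There are no setters to call afterwards, so there is no moment at which the object is only half-initialized.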

Doctrine ORM and DDD aggregates

I’d like to start this article with a quote from Ross Tuck’s article “Persisting Value Objects in Doctrine”. He describes different ways of persisting value objects when using Doctrine ORM. At the end of the page he gives us the following option - the “nuclear” one:

[…] Doctrine is great for the vast majority of applications but if you’ve got edge cases that are making your entity code messy, don’t be afraid to toss Doctrine out. Setup an interface for your repositories and create an alternate implementation where you do the querying or mapping by hand. It might be a PITA but it might also be less frustration in the long run.

Road to dependency injection

Statically fetching dependencies

I’ve worked with several code bases that were littered with calls to Zend_Registry::get(), sfContext::getInstance(), etc. to fetch a dependency when needed. I’m a little afraid to mention façades here, but they also belong in this list. The point of this article is not to bash a certain framework (they are all lovely), but to show how to get rid of these “centralized dependency managers” when you need to. The characteristics of these things are:

Deliberate coding

I wanted to share an important lesson I learned from my colleague Ramon de la Fuente. I was explaining to him how I made a big effort to preserve some existing behavior, when he said something along the lines of: the people who wrote this code, may or may not have known what they were doing. So don’t worry too much about preserving old stuff.

These wise words eventually connected to other things I’ve learned about programming, and I wanted to combine them under the umbrella of a blog post titled “Deliberate coding”. Doing something “deliberately” means to do it consciously and intentionally. It turns out that not everyone writes code deliberately, and at the very least, not everyone does it all the time.

When and where to determine the ID of an entity

This is a question that always pops up during my workshops: when and where to determine the ID of an entity? There are different answers, no best answer. Well, there are two best answers, but they apply to two different situations.

Auto-incrementing IDs, by the database

Traditionally, all you need for an entity to have an ID is to designate one integer column in the database as the primary key, and mark it as “auto-incrementing”. So, once a new entity gets persisted as a record in the database (using your favorite ORM), it will get an ID. That is, the entity has no identity until it has been persisted. Even though this happens everywhere, and almost always, it’s a bit weird, because:

Context passing

I’m working on another “multi-tenant” PHP web application project and I noticed an interesting series of events. It felt like a natural progression and by means of a bit of dangerous induction, I’m posing the hypothesis that this is how things are just bound to happen in such projects.

In the beginning we start out with a framework that has some authentication functionality built-in. We can get the “current user” from the session, or from some other session-based object. We’ll also need the “current company” (or the “current organization”) of which the current user is a member.

Exceptions and talking back to the user

Exceptions - for exceptional situations?

From the Domain-Driven Design movement we’ve learned to go somewhat back to the roots of object-oriented design. Designing domain objects is all about offering meaningful behavior and insights through a carefully designed API. We know now that domain objects with setters for every attribute will allow for the objects to be in an inconsistent state. By removing setters and replacing them with methods which only modify the object’s state in valid ways, we can protect an object from violating domain invariants.
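As a rough sketch of what replacing setters with behavior could look like: the hypothetical Subscription class below has no setActive() or setCancelledAt() methods, only a cancel() method that changes both fields together. The class and its invariant are assumptions made for this example.

```php
<?php

// Hypothetical domain object without setters.
final class Subscription
{
    private $active = true;
    private $cancelledAt = null;

    public function cancel(DateTimeImmutable $now): void
    {
        // Domain invariant: a subscription can only be cancelled once
        if (!$this->active) {
            throw new LogicException('This subscription has already been cancelled');
        }

        // Both fields change together, so they can't contradict each other
        $this->active = false;
        $this->cancelledAt = $now;
    }

    public function isActive(): bool
    {
        return $this->active;
    }

    public function cancelledAt(): ?DateTimeImmutable
    {
        return $this->cancelledAt;
    }
}
```

With setters, a client could mark the subscription as inactive without providing a cancellation date. Here that state is simply unreachable.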

Modelling quantities - an exercise in designing value objects

I recently came across two interesting methods that were part of a bigger class that I had to redesign:

class OrderLine
{
    /**
     * @var float
     */
    private $quantityOrdered;
    
    // ...

    /**
     * @param float $quantity
     */
    public function processDelivery($quantity)
    {
        $this->quantityDelivered += $quantity;
        
        $this->quantityOpen = $this->quantityOrdered - $quantity;
        if ($this->quantityOpen < 0) {
            $this->quantityOpen = 0;
        }
    }

    /**
     * @param float $quantity
     */
    public function undoDelivery($quantity)
    {
        $this->quantityDelivered -= $quantity;
        if ($this->quantityDelivered < 0) {
            $this->quantityDelivered = 0;
        }
        
        $this->quantityOpen += $quantity;
        if ($this->quantityOpen > $this->quantityOrdered) {
            $this->quantityOpen = $this->quantityOrdered;
        }
    }
}

Of course, I’ve already cleaned up the code here to allow you to better understand it.
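To give an idea of where a redesign like this could go, here is a minimal sketch of a quantity value object. It is an assumption about one possible direction, not necessarily the design the article arrives at; the Quantity class and its method names are made up.

```php
<?php

// Hypothetical value object wrapping the float quantities used above.
final class Quantity
{
    private $amount;

    public function __construct(float $amount)
    {
        if ($amount < 0) {
            throw new InvalidArgumentException('A quantity can not be negative');
        }

        $this->amount = $amount;
    }

    public function add(Quantity $other): Quantity
    {
        return new self($this->amount + $other->amount);
    }

    // Mirrors the "never below zero" clamping of the original methods
    public function subtract(Quantity $other): Quantity
    {
        return new self(max(0.0, $this->amount - $other->amount));
    }

    // Mirrors the "never more than the quantity ordered" clamping
    public function atMost(Quantity $limit): Quantity
    {
        return new self(min($this->amount, $limit->amount));
    }

    public function asFloat(): float
    {
        return $this->amount;
    }
}
```

Whether silently clamping a value is the right behavior is a domain question. The sketch only moves the existing rules into one place, so they no longer have to be repeated in every method of OrderLine.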

ORMless; a Memento-like pattern for object persistence

Something that always bothers me: persistence (the user interface too, but that’s a different topic ;)). Having objects in memory is nice, but when the application shuts down (and for PHP this is after every request-response cycle), you have to persist them somehow. By the way, I think we’ve all forever been annoyed by persistence, since there’s an awful lot of software solutions related to object persistence: different types of databases, different types of ORMs, etc.

Mocking at architectural boundaries: the filesystem and randomness

In a previous article, we discussed “persistence” and “time” as boundary concepts that need mocking by means of dependency inversion: define your own interface, then provide an implementation for it. There were three other topics left to cover: the filesystem, the network and randomness.
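For “time”, the "define your own interface, then provide an implementation for it" approach could be sketched as follows. The names Clock, SystemClock, FixedClock and Coupon are assumptions made for this example.

```php
<?php

// Our own interface for the boundary concept "time"
interface Clock
{
    public function currentTime(): DateTimeImmutable;
}

// Production implementation: asks the system for the actual time
final class SystemClock implements Clock
{
    public function currentTime(): DateTimeImmutable
    {
        return new DateTimeImmutable('now');
    }
}

// Test implementation: always returns the time it was given
final class FixedClock implements Clock
{
    private $now;

    public function __construct(DateTimeImmutable $now)
    {
        $this->now = $now;
    }

    public function currentTime(): DateTimeImmutable
    {
        return $this->now;
    }
}

// Hypothetical client code that only depends on the interface
final class Coupon
{
    private $expiresAt;

    public function __construct(DateTimeImmutable $expiresAt)
    {
        $this->expiresAt = $expiresAt;
    }

    public function isExpired(Clock $clock): bool
    {
        return $clock->currentTime() >= $this->expiresAt;
    }
}
```

In a test we can pass a FixedClock and get the same outcome on every run, without generating a mock for a class we don't own.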

Mocking the filesystem

We already covered “persistence”, but only in the sense that we sometimes need a way to make in-memory objects persistent. After a restart of the application we should be able to bring back those objects and continue to use them as if nothing happened.

Lasagna code - too many layers?

I read this tweet:

"The object-oriented version of spaghetti code is, of course, 'lasagna code'. Too many layers." - Roberto Waltman
— Programming Wisdom (@CodeWisdom) February 24, 2018

Jokes taken as advice

It’s not the first time I’ve heard this quote. Somehow it annoys me, not just this one joke, but many jokes like this one. I know I should be laughing, but I’m always worried that jokes like this are going to be interpreted as advice, in its most extreme form. E.g. the advice distilled from this tweet could be: “Layers? Don’t go there. Before you know it, you have lasagna code…”

Mocking at architectural boundaries: persistence and time

More and more I’ve come to realize that I’ve been mocking less and less.

The thing is, creating test doubles is a very dangerous activity. For example, what I often see is something like this:

$entityManager = $this->createMock(EntityManager::class);

$entityManager->expects($this->once())
    ->method('persist')
    ->with($object);
$entityManager->expects($this->once())
    ->method('flush')
    ->with($object);

Or, what appears to be better, since we’d be mocking an interface instead of a concrete class:

$entityManager = $this->createMock(ObjectManagerInterface::class);
// ...

To be very honest, there isn’t a big difference between these two examples. If this code is in, for example, a unit test for a repository class, we’re not testing many of the aspects of the code that should have been tested instead.

Reducing call sites with dependency injection and context passing

This article continues where Unary call sites and intention-revealing interfaces ended.

While reading David West’s excellent book “Object Thinking”, I stumbled across an interesting quote from David Parnas on the programming method that most of us use by default:

The easiest way to describe the programming method used in most projects today was given to me by a teacher who was explaining how he teaches programming. “Think like a computer,” he said. He instructed his students to begin by thinking about what the computer had to do first and to write that down. They would then think about what the computer had to do next and continue in that way until they had described the last thing the computer would do… […]

Unary call sites and intention-revealing interfaces

Call sites

One of the features I love most about my IDE is the button “Find Usages”. It is invaluable when improving a legacy code base. When used on a class it will show you where this class is used (as a parameter type, in an import statement, etc.). When used on a method, it will show you where this method gets called. Users of a method are often called “clients”, but when we use “Find Usages”, we might as well use the more generic term “call sites”.

Simple CQRS - reduce coupling, allow the model(s) to evolve

CQRS - not a complicated thing

CQRS has some reputation issues. Mainly, people will feel that it’s too complicated to apply in their current projects. It will often be considered over-engineering. I think CQRS is simply misunderstood, which is the reason many people will not choose it as a design technique. One of the common misconceptions is that CQRS always goes together with event sourcing, which is indeed more costly and risky to implement.

Layers, ports & adapters - Part 3, Ports & Adapters

In the previous article we discussed a sensible layer system, consisting of three layers:

Domain
Application
Infrastructure

Infrastructure

The infrastructure layer, containing everything that connects the application’s use cases to “the world outside” (like users, hardware, other applications), can become quite large. As I already remarked, a lot of our software consists of infrastructure code, since that’s the realm of things complicated and prone to break. Infrastructure code connects our precious clean code to:

Layers, ports & adapters - Part 2, Layers

The first key concept of what I think is a very simple, at the very least “clean” architecture, is the concept of a layer. A layer itself is actually nothing, if you think about it. It’s simply determined by how it’s used. Let’s stay a bit philosophical, before we dive into some concrete architectural advice.

Qualities of layers

A layer in software serves the following (more or less abstract) purposes:

Layers, ports & adapters - Part 1, Foreword

Looking back at my old blog posts, I think it’s good to write down a more balanced view on application architecture than the one that speaks from some of the older posts from 2013 and 2014. Before I do, I allow myself a quick self-centered trip down memory lane.

Why Symfony? Seven facts

The archive tells an interesting story about how my thoughts about software development changed over time. It all started with my first post in October 2011, How to make your service use tags. Back in those days, version 2 of the Symfony framework was still quite young. I had been working with symfony (version 1, small caps) for a couple of years and all of a sudden I was using Symfony2, and I absolutely loved this new shiny framework. I learned as much as I could about it (just as I had done with version 1), and eventually I became a Certified Symfony Developer, at the first round of exams held in Paris, during a Symfony Live conference. During those years I blogged and spoke a lot about Symfony, contributed to the documentation, and produced several open source bundles. I also wrote a book about it: A Year With Symfony.

Designing a JSON serializer

Workshop utilities

For the workshops that I organize, I often need some “utilities” that will do the job, but are as simple as possible. Examples of such utilities are:

Something to dispatch events with.
Something to serialize and deserialize objects with.
Something to store and load objects with.
Something to use with event sourcing (an event store, an event-sourced repository).

I put several of these tools in a code base called php-workshop-tools. These utilities should be small, and very easy to understand and use. They will be different from the frameworks and libraries most workshop participants usually use, but they should offer more or less the same functionality. While designing these tools, I was constantly looking for the golden mean:

The case for singleton objects, façades, and helper functions

Last year I took several Scala courses on Coursera. It was an interesting experience and it has brought me a lot of new ideas. One of these is the idea of a singleton object (as opposed to a class). It has the following characteristics:

There is only one instance of it (hence it’s called a “singleton”, but isn’t an implementation of the Singleton design pattern).
It doesn’t need to be explicitly instantiated (it doesn’t have the traditional static getInstance() method). In fact, an instance already exists when you first want to use it.
There doesn’t have to be built-in protection against multiple instantiations (as there is and can only be one instance, by definition).

Converting this notion to PHP is impossible, but if it was possible, you could do something like this:

Duck-typing in PHP

For quite some time now the PHP community has been becoming more and more professional. “More professional” in part means that we use more types in our PHP code. Though it took years to introduce more or less decent types in the programming language itself, it took some more time to really appreciate the fact that by adding parameter and return types to our code, we can verify its correctness in better ways than we could before. And although all the type checks still happen at runtime, it feels as if those type checks already happen at compile time, because our editor validates most of our code before actually running it.

Refactoring the Cat API client - Part III

In the first and second part of this series we’ve been doing quite a bit of work to separate concerns that were once all combined into one function.

The major “players” in the field have been identified: there is an HttpClient and a Cache, used by different implementations of CatApi to form a testable, performing client of The Cat Api.

Representing data

We have been looking at behavior, and the overall structure of the code. But we didn’t yet look at the data that is being passed around. Currently everything is a string, including the return value of CatApi::getRandomImage(). When calling the method on an instance we are “guaranteed” to retrieve a string. I say “guaranteed” since PHP would allow anything to be returned, an object, a resource, an array, etc. Though in the case of RealCatApi::getRandomImage() we can be sure that it is a string, because we explicitly cast the return value to a string, we can’t be sure it will be useful to the caller of this function. It might be an empty string, or a string that doesn’t contain a URL, like 'I am not a URL'.
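One way to get a stronger guarantee than "it is a string" is to wrap the value in a small object that validates it. The sketch below is an assumption about how that could look: the Url class and its fromString() constructor are hypothetical, and the validation relies on PHP's built-in filter_var() with FILTER_VALIDATE_URL.

```php
<?php

// Hypothetical value object: if you have a Url, you know it contains a valid URL.
final class Url
{
    private $url;

    private function __construct(string $url)
    {
        $this->url = $url;
    }

    public static function fromString(string $url): Url
    {
        // filter_var() returns false for anything that doesn't look like a URL
        if (filter_var($url, FILTER_VALIDATE_URL) === false) {
            throw new InvalidArgumentException('Not a valid URL: ' . $url);
        }

        return new self($url);
    }

    public function __toString(): string
    {
        return $this->url;
    }
}
```

A method returning a Url instead of a string can no longer hand an empty string, or 'I am not a URL', to its caller.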

Refactoring the Cat API client - Part II

The world is not a safe thing to depend upon

When you’re running unit tests, you don’t want the world itself to be involved. Executing actual database queries, making actual HTTP requests, writing actual files, none of this is desirable: it will make your tests very slow as well as unpredictable. If the server you’re sending requests to is down, or responds in unexpected ways, your unit test will fail for the wrong reasons. A unit test should only fail if your code doesn’t do what it’s supposed to do.
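A minimal sketch of keeping the world out of a unit test: the code under test talks to an interface, and the test provides an implementation that returns canned responses. The shape of HttpClient shown here is an assumption (the real interface may differ), and FakeHttpClient, RandomImageFetcher, the endpoint and the JSON format are all made up.

```php
<?php

// Assumed shape of the HTTP abstraction
interface HttpClient
{
    public function get(string $url): string;
}

// Test double that never touches the network
final class FakeHttpClient implements HttpClient
{
    private $responsesByUrl;

    public function __construct(array $responsesByUrl)
    {
        $this->responsesByUrl = $responsesByUrl;
    }

    public function get(string $url): string
    {
        if (!isset($this->responsesByUrl[$url])) {
            throw new RuntimeException('No canned response for ' . $url);
        }

        return $this->responsesByUrl[$url];
    }
}

// Hypothetical code under test, with a made-up endpoint and response format
final class RandomImageFetcher
{
    const ENDPOINT = 'https://api.example.com/images/random';

    private $httpClient;

    public function __construct(HttpClient $httpClient)
    {
        $this->httpClient = $httpClient;
    }

    public function fetchUrl(): string
    {
        $data = json_decode($this->httpClient->get(self::ENDPOINT), true);
        if (!is_array($data) || !isset($data['url']) || !is_string($data['url'])) {
            throw new RuntimeException('Unexpected response from the server');
        }

        return $data['url'];
    }
}
```

With the fake in place, a test can also simulate a server that "responds in unexpected ways", and it will only fail if our own code handles that response incorrectly.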

Refactoring the Cat API client - Part I

Some time ago I tweeted this:

I didn't mention this yet, but I'm working on a series of videos about the subject matter of my Principles of Package Design book.
— Matthias Noback (@matthiasnoback) May 14, 2015

It turned out, creating a video tutorial isn’t working well for me. I really like writing, and speaking in public, but I’m not very happy about recording videos. I almost never watch videos myself either, so… the video tutorial I was talking about won’t be there. Sorry!

Collecting events and the event dispatching command bus

It was quite a ride so far. We have seen commands, command buses, events and event buses. We distilled some more knowledge about them while formulating answers to some interesting questions from readers.

Why you should not dispatch events while handling a command

In a previous post we discussed a sample event (the UserSignedUp event):

class UserSignedUp implements Event
{
    private $userId;

    public function name()
    {
        return 'user_signed_up';
    }

    public function __construct($userId)
    {
        $this->userId = $userId;
    }

    public function userId()
    {
        return $this->userId;
    }
}

An instance of such an event can be handed over to the event bus. It will look for any number of event handlers that want to be notified about the event. In the case of the UserSignedUp event, one of the interested event handlers is the SendWelcomeMailWhenUserSignedUp handler:

Some questions about the command bus

So far we’ve had three posts in this series about commands, events and their corresponding buses and handlers:

Now I’d like to take the time to answer some of the very interesting questions that were asked by readers.

The difference between commands and events

Robert asked:

[…], could you possibly explain what are the main differences between a command bus and an event dispatcher?

From commands to events

In the previous posts we looked at commands and the command bus. Commands are simple objects which express a user’s intention to change something. Internally, the command object is handed over to the command bus, which performs the change that has been requested. While it eventually delegates this task to a dedicated command handler, it also takes care of several other things, like wrapping the command execution in a database transaction and protecting the original order of commands.

Responsibilities of the command bus

In the previous post we looked at commands and how you can use them to separate technical aspects of the input, from the actual behavior of your application. Commands are simple objects, handed over to the command bus, which performs the change that is needed.

As we learned, the command bus eventually calls the command handler which corresponds to the given command object. For example when a SignUp command is provided, the SignUpHandler will be asked to handle the command. So the command bus contains some kind of a lookup mechanism to match commands with their handlers. Some command bus libraries use a naming convention here (e.g. handler name = command name + “Handler”), some use a kind of service locator, etc.
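The naming-convention variant of such a lookup mechanism could be sketched like this. SignUp and SignUpHandler are the names used above, but their contents, as well as the NamingConventionCommandBus class, are assumptions made for this example. It is not how any particular command bus library works.

```php
<?php

// Command: a simple object expressing the user's intention
final class SignUp
{
    private $emailAddress;

    public function __construct(string $emailAddress)
    {
        $this->emailAddress = $emailAddress;
    }

    public function emailAddress(): string
    {
        return $this->emailAddress;
    }
}

// Handler: named after the command, plus "Handler"
final class SignUpHandler
{
    private $signedUp = [];

    public function handle(SignUp $command): void
    {
        // A real handler would create and store a user here
        $this->signedUp[] = $command->emailAddress();
    }

    public function signedUpEmailAddresses(): array
    {
        return $this->signedUp;
    }
}

// Hypothetical bus that finds handlers by the naming convention
final class NamingConventionCommandBus
{
    private $handlersByClass = [];

    public function __construct(array $handlers)
    {
        foreach ($handlers as $handler) {
            $this->handlersByClass[get_class($handler)] = $handler;
        }
    }

    public function handle($command): void
    {
        // handler name = command name + "Handler"
        $handlerClass = get_class($command) . 'Handler';
        if (!isset($this->handlersByClass[$handlerClass])) {
            throw new LogicException('No handler registered for ' . get_class($command));
        }

        $this->handlersByClass[$handlerClass]->handle($command);
    }
}
```

The sketch only covers the lookup. Other responsibilities of a command bus, like wrapping the handling of a command in a database transaction, are left out.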

A wave of command buses

Recently many people in the PHP community have been discussing a thing called the “command bus”. The Laravel framework nowadays contains an implementation of a command bus and people have been talking about it in several vodcasts.

My interest was sparked too. Last year I experimented with LiteCQRS but in the end I developed a collection of PHP packages known as SimpleBus which supports the use of commands and events in any kind of PHP application (there is a Symfony bridge too, if you like that framework). I also cover the subject of commands, events and their corresponding buses extensively during my Hexagonal Architecture workshop.

Decoupling from a service locator

Decoupling from a service locator - shouldn’t that be: don’t use a service locator?

Well, not really, since there are lots of valid use cases for using a service locator. The main use case is for making things lazy-loading (yes, you can also use some kind of proxy mechanism for that, but let’s assume you need something simpler). Say we have this EventDispatcher class:

class EventDispatcher
{
    private $listeners;

    public function __construct(array $listeners)
    {
        $this->listeners = $listeners;
    }

    public function dispatch($event)
    {
        foreach ($this->listeners[$event] as $listener) {
            $listener->notify();
        }
    }
}

Now it appears that some event listeners are rather expensive to instantiate. And even though a listener may never be notified (because it listens to a rare event), we need to instantiate it in order to register it:

Dependency injection smells

The Symfony2 DependencyInjection Component has made my life and work as a developer a lot easier. Choosing the right way to use it, however, can be a bit difficult sometimes. Knowing what the service container can do helps a lot, and thinking about how you would do it with just PHP can also put you back on track. To be able to recognize some problems related to dependency injection in your own code, I will describe a few “dependency injection smells” below (a term derived from “code smells”, used by Kent Beck, Martin Fowler and the like).