Meeting the Broadway team - talking DDD, CQRS and event sourcing

Matthias Noback

July 23, 2015

Visiting Qandidate in Rotterdam

Last Thursday I had the honor of meeting (part of the) Broadway team: three very smart developers working for Qandidate in central Rotterdam: Gediminas Šedbaras, Willem-Jan Zijderveld and Fritsjan. A short walk from the train station brought me to their office, where we talked about CQRS, event sourcing and some of the practical aspects of using the Broadway framework.

As you may have read, I’ve been experimenting with Broadway extensively during these last weeks. Since I already knew a lot about the theory behind it, it was really nice to see how smoothly everything worked. I had some questions, however, which have now been answered (thank you guys for taking the time to do this!).

Snapshotting

For example, I wanted to know about snapshotting and what they thought of it. As I saw in the list of open issues for Broadway, some people are interested in this. When you’re doing event sourcing, before an aggregate can undergo any changes, it needs to be reconstituted from the previous events, which are all stored in the event store. When the event store contains many, many events for a given aggregate, this process can become too slow for use in a production environment. Snapshotting solves the issue by storing the current state of an aggregate and only replaying events that occurred after the snapshot was taken.
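To make the idea concrete, here is a minimal sketch in Python. It is not Broadway code (Broadway is PHP and, as mentioned, has no snapshotting yet); the `Counter` aggregate, the `load` function and the event format are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Counter:
    """Toy aggregate (hypothetical): its state is just a running total."""
    total: int = 0
    version: int = 0  # number of events applied so far

    def apply(self, event: dict) -> None:
        self.total += event["amount"]
        self.version += 1


def load(events: list, snapshot: Optional[Counter] = None) -> Counter:
    """Rebuild the aggregate, starting from a snapshot when one exists."""
    if snapshot is not None:
        # Copy the snapshot so that replaying never modifies the stored one.
        aggregate = Counter(snapshot.total, snapshot.version)
    else:
        aggregate = Counter()
    # Only events recorded after the snapshot still need to be replayed.
    for event in events[aggregate.version:]:
        aggregate.apply(event)
    return aggregate


events = [{"amount": n} for n in (5, 3, 7, 1)]
snapshot = load(events[:3])    # state after the first three events
full = load(events)            # replays all four events
fast = load(events, snapshot)  # replays only the fourth event
```

Both `full` and `fast` end up in the same state; the snapshot only saves replay work.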

At Qandidate they haven’t experienced this problem in a production application yet. They do feel that snapshotting is the proper solution for a real problem. Before resorting to it (because it may complicate maintenance a bit, see the section “Correcting mistakes” below), you should reconsider your design first. When your aggregates undergo so many changes that the need for snapshotting arises, you might have another problem. Possibly, an important concept is missing from your domain model (this is what DDD people like to say a lot ;)). Or you may be solving a “read” issue on the “write” side of your application. Anyway, in some cases the need for snapshotting is “legitimate”, so Broadway will probably provide an off-the-shelf solution for snapshotting at some point (there is an open issue discussing this feature).

Open source vs company work

This brought me to the question of how the Qandidate developers manage to divide their attention between “regular” work and “open source” work. Broadway is not just a small library to maintain, it’s an actual framework. It doesn’t have an awful lot of users yet, but it nevertheless takes a serious amount of time to deal with issues and pull requests for it. Currently, the team is allowed to spend some time on this during working hours, which is really great. Adding new features often doesn’t have a high priority, though, since these tend to be features requested by the community for which the company itself has no urgent need. Judging by the way some of the current pull requests are being handled, I personally feel that the current situation is just fine.

Event store management

Currently, several features related to “event store management” are being worked on. The current version of Broadway’s event store can’t be queried for events of a certain type (the class of an event). Being able to do so would be nice, since it allows you to replay just certain events and let read model projectors process just a slice of all the events. Some people want to take this even further and be able to query for data inside stored events. This requires query/indexing capabilities within JSON blobs (event objects are always serialized to simple arrays, then persisted as JSON strings). This is impossible when using a MySQL database. But it was suggested that it might be possible if the events were stored in a PostgreSQL database instead of a MySQL database, as they currently are. The Broadway team itself is not fully convinced that querying the event store should be very convenient, but they imagine it can be useful in some high-performance environments.
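The difference between the two kinds of query can be sketched in a few lines of Python. This is only an illustration: the row layout (a `type` column next to a JSON `payload`) and the function names are assumptions, not Broadway’s actual schema or API.

```python
import json

# Hypothetical stored rows: the event class name plus a serialized JSON payload.
stored = [
    {"type": "UserRegistered", "payload": json.dumps({"user": "ann"})},
    {"type": "OrderPlaced", "payload": json.dumps({"user": "ann", "total": 40})},
    {"type": "OrderPlaced", "payload": json.dumps({"user": "bob", "total": 15})},
]


def events_of_type(rows, event_type):
    """Select by type: only a plain column is inspected before decoding."""
    return [json.loads(row["payload"]) for row in rows if row["type"] == event_type]


def events_where(rows, key, value):
    """Select by event data: every JSON blob has to be decoded first."""
    decoded = (json.loads(row["payload"]) for row in rows)
    return [payload for payload in decoded if payload.get(key) == value]


orders = events_of_type(stored, "OrderPlaced")
anns_events = events_where(stored, "user", "ann")
```

Filtering by type only needs an ordinary column, whereas filtering on event data means looking inside each blob, which is why database-level JSON querying came up in the discussion.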

Replaying events

Event stores are very powerful once they are combined with projectors. A projector can subscribe to domain events and produce read models based on what’s changed in the write model. In general, each use case (a list of things, a detailed view of a thing, a list of the same things but this time just for administrators, etc.) requires a new read model projector as well as a new read model. The general recommendation by the Broadway team is to not always follow this advice blindly. If read model A has just one extra field compared to read model B, you might as well combine them and adjust the query a bit.
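A projector in this sense can be sketched as follows. The sketch is in Python and does not mirror Broadway’s projector classes; `UserListProjector`, the event names and the dictionary-based read model are all hypothetical.

```python
class UserListProjector:
    """Hypothetical projector that maintains a 'list of users' read model."""

    def __init__(self):
        self.read_model = {}  # user id -> row as shown in the list view

    def handle(self, event):
        # Dispatch on the event type; events without a matching method are ignored.
        handler = getattr(self, "apply_" + event["type"], None)
        if handler is not None:
            handler(event)

    def apply_user_registered(self, event):
        self.read_model[event["user_id"]] = {"name": event["name"], "active": True}

    def apply_user_deactivated(self, event):
        self.read_model[event["user_id"]]["active"] = False


projector = UserListProjector()
for event in [
    {"type": "user_registered", "user_id": 1, "name": "Ann"},
    {"type": "order_placed", "user_id": 1},  # not relevant for this read model
    {"type": "user_deactivated", "user_id": 1},
]:
    projector.handle(event)
```

An administrators’ view that needs just one extra field could reuse this read model with an adjusted query rather than get a projector of its own.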

Combining read models

I always assumed that each read model should correspond to one use case and that a read model query should return all the data required for the view, no more no less. I asked the people from Qandidate about this and it turns out this isn’t always feasible. For example, projector A which updates read model A, might need to use read model repository B to incorporate data from it into read model A. By doing this, read model A becomes sensitive to changes in read model B, which are not automatically reflected in A. So you have to enhance the projector of A to subscribe to the same events as B was already subscribed to. This duplicates some of the effort as well as some write-to-read translation knowledge. In reality you may need to run separate queries, for example in your controller, in order to be able to combine and provide the right data for your views (e.g. templates).

Correcting mistakes

My next question was: using event sourcing, is it easy to recover from a mistake? The first part of the answer focussed on the read side: when you accidentally destroy some read model, or all read models (it happens to everyone, right?), it’s extremely easy to reconstruct them by simply loading all events from the event store and letting the read model projectors do their work again. Reprocessing the entire history of your application might be a matter of minutes (if you’re lucky of course).
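In code, rebuilding a read model is nothing more than running the projection over the full history again. Here is a small Python sketch with an invented “stock level” read model; the event names and the `project` function are assumptions for illustration, not Broadway’s replay tooling.

```python
def project(events):
    """Fold the full event history into a fresh 'stock level' read model."""
    stock = {}
    for event in events:
        if event["type"] == "stock_added":
            delta = event["quantity"]
        elif event["type"] == "stock_removed":
            delta = -event["quantity"]
        else:
            continue  # this read model ignores all other events
        stock[event["sku"]] = stock.get(event["sku"], 0) + delta
    return stock


event_store = [
    {"type": "stock_added", "sku": "A", "quantity": 10},
    {"type": "stock_removed", "sku": "A", "quantity": 3},
    {"type": "stock_added", "sku": "B", "quantity": 5},
]

read_model = project(event_store)
read_model.clear()               # the read model is accidentally destroyed
rebuilt = project(event_store)   # replaying the events restores it
```

The event store is the only source of truth here, so the read model can always be thrown away and regenerated.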

Looking at the write side, things are a bit more difficult. It’s hard to correct a mistake that you made in the design of your model, for instance when aggregate boundaries need to lie somewhere else, events need different data, events were not generated when they should have been, etc. As the Broadway team told me: it’s easy to fix these mistakes if the code hasn’t been released yet, but once it’s running in production, it’s a different story.

When the internal state of an aggregate needs to change (i.e. the values of its properties), it’s not that bad, since that state is reconstituted from the event store for each change anyway. Once snapshotting (see above) is a supported feature, it may be harder to change the internals of an aggregate, since a snapshot contains the exact internal state of an aggregate at some point in time. So once you have stored a snapshot of an aggregate, changing one of its property names isn’t that simple anymore. The solution would be to remove existing snapshots, then generate them again.

Upcasting

When events themselves need to change because a design mistake has been made, or a new feature has to be implemented, it might become a bit harder. The event store will be filled with serialized “version 1” events. Newly created events will be “version 2” events. A technique which the Broadway team suggested, and something they are still working on, is called upcasting. A PR is open for this. Upcasting can be described as a way to migrate your events to newer versions. Each time you want to change part of an event, you write a bit of code that is able to convert an older serialized event to the format of newer events. The “upcasted” event is never stored in the event store, since the event store is truly append-only and should always only contain the events as they happened back then.
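As a rough illustration of the idea (in Python, and not based on the open Broadway PR), an upcaster chain could look like this. The `version` field, the `UPCASTERS` registry and the name-splitting change are invented for the example.

```python
import copy


def upcast_v1_to_v2(event):
    """Hypothetical change: version 2 splits 'name' into first and last name."""
    first, _, last = event["payload"]["name"].partition(" ")
    return {
        "type": event["type"],
        "version": 2,
        "payload": {"first_name": first, "last_name": last},
    }


UPCASTERS = {1: upcast_v1_to_v2}  # source version -> converter to the next version
CURRENT_VERSION = 2


def upcast(event):
    """Return the event in the newest format; the stored event stays untouched."""
    event = copy.deepcopy(event)
    while event["version"] < CURRENT_VERSION:
        event = UPCASTERS[event["version"]](event)
    return event


stored = {"type": "user_registered", "version": 1, "payload": {"name": "Ada Lovelace"}}
loaded = upcast(stored)  # converted on the way out of the event store
```

The conversion happens only when an event is read; `stored` keeps its original “version 1” shape, in line with the append-only nature of the event store.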

You can read more about what the Broadway team envisions in the documentation of Axon, a CQRS framework which served as a source of inspiration for Broadway itself.

Although I personally love the ideas behind CQRS and event sourcing, looking at the project I’ve been creating with Broadway I can imagine it would be hard to step in as a new developer who is used to doing (Symfony) projects in a more traditional way. I asked the Qandidate developers about this and they could imagine it as well. There are a lot of “moving parts”. There is a pretty large amount of code to be understood. There are commands, events, entities, value objects, projectors, read models, processors, etc. It can be hard to get (and keep!) a full understanding of the flow of an application. A recommendation is to create and maintain some large visual overviews of the application flow. These are useful when you need to explain to a new team member what’s going on, and they also serve as an aid in discussions about parts that have to be changed.

Testing things

Working with Broadway, I wasn’t much bothered by the number of classes I had to produce. They are small and focussed anyway. I noticed that they are often quite simple as well, probably because Commands and Queries, Writes and Reads are completely separate. It turns out that these classes are much easier to unit test as well, especially because Broadway comes with a lot of base classes for PHPUnit which offer scenario-style testing (given-when-then).
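The given-when-then style boils down to: given some past events, when a command is executed, then certain new events should have been recorded. The Python sketch below shows the shape of such a test; it does not reproduce Broadway’s PHPUnit base classes, and the `Basket` aggregate and `Scenario` helper are hypothetical.

```python
class Basket:
    """Toy event-sourced aggregate (hypothetical)."""

    def __init__(self):
        self.items = set()
        self.recorded = []  # new events produced while handling commands

    def apply(self, event):
        kind, item = event
        if kind == "item_added":
            self.items.add(item)
        elif kind == "item_removed":
            self.items.discard(item)

    def remove_item(self, item):
        # Only record an event when the command actually changes something.
        if item in self.items:
            event = ("item_removed", item)
            self.apply(event)
            self.recorded.append(event)


class Scenario:
    """Minimal given-when-then helper for the aggregate above."""

    def given(self, events):
        self.aggregate = Basket()
        for event in events:
            self.aggregate.apply(event)  # history is replayed, not recorded
        return self

    def when(self, command):
        command(self.aggregate)
        return self

    def then(self, expected_events):
        assert self.aggregate.recorded == expected_events
        return self


(Scenario()
    .given([("item_added", "book")])
    .when(lambda basket: basket.remove_item("book"))
    .then([("item_removed", "book")]))

# Removing something that was never added should record nothing.
Scenario().given([]).when(lambda basket: basket.remove_item("book")).then([])
```

Because the test only talks about events going in and events coming out, it stays independent of how the aggregate stores its internal state.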

All of this allows you to unit test everything without too much effort. Most of the actual effort of developing an event-sourced application goes to where it’s required the most: the domain model. However, I also noticed that unit testing isn’t sufficient if you want to trust your application to function correctly: a lot can go wrong at the configuration level. Command handlers, event subscribers, metadata enrichers, etc. - everything has to be defined as a service and registered properly. I also noticed that my read models sometimes contained mistakes because I paid too little attention to their quality. My read model unit tests sometimes didn’t catch these mistakes.

So, I started feeling the need for functional tests or acceptance tests. The Qandidate developers themselves like to use the modelling by example approach where you run acceptance tests twice: with and without the infrastructure layer “enabled”. This turned out to be a very useful approach for them (and I agree: it’s a great idea).

Asynchronous command/event handling

For SimpleBus I’ve created a RabbitMQ integration, which can be used when you like to send commands to a messaging server and let them be handled by another process. It can also be used to broadcast events to other applications, again, by sending them to a messaging server.

Broadway doesn’t come with built-in support for asynchronous operations. But at Qandidate they have created a custom event processor which does exactly this: it sends events to RabbitMQ, thereby allowing “offline” processes to do some heavy work in the background. The Broadway team is a bit hesitant to also process commands asynchronously, even though they have strictly kept to CQRS, so it shouldn’t be a big problem. However, it would require a lot more work on the UI side, which so far they don’t think is worth it.

CRUD vs event sourcing

When I recreated (part of) an existing CRUD-style application using Broadway, I could immediately solve about five severe problems which the original version had, just because I applied CQRS and used event sourcing. It occurred to me that CQRS/event sourcing might be something which, if you try it once, you never want to let go. When asked about this, the Qandidate developers agreed: it’s hard to get back to CRUD afterwards. On the other hand, for some types of applications event sourcing doesn’t make sense and the CRUD style might suffice or be even better. Some clients require this as well - it’s more of an Excel-based, data-driven approach to software.

Shortcomings, problems, etc.

I asked the Broadway team what they think is Broadway’s biggest shortcoming. It turned out that their main concern was with the tools related to the read model. They think that Broadway’s job should end with calling the read model projector. After that, it’s all up to the implementer: Broadway users are free to use whatever type of database (MySQL, MongoDB, ElasticSearch, etc.) to store their read data. Since Broadway itself comes bundled with just an ElasticSearch (and an in-memory) adapter for read model repositories, this may not be clear to users. They may be stuck with ElasticSearch and have to work around some of its issues. For example, using the “factory” settings, an ElasticSearch query returns only 500 results.

For me personally, this wouldn’t really count as a shortcoming of Broadway itself. Its users just need to know something about the technology stack they’re using and shouldn’t count on everything to “just work”. It reminds me of the law of leaky abstractions. No matter how nice your abstractions are, at some point, you need to deal with the underlying, low-level details.

Conclusion

I had a great time meeting the guys at Qandidate. Again, a big thank you for taking the time to answer my questions and a very interesting conversation in general. I hope Broadway itself will gain some more interest in the community. It really deserves to get a lot of attention.

If you’d like to find out more about Broadway, check out the Qandidate labs blog and, for starters, the article “Bringing CQRS and Event Sourcing to PHP. Open sourcing Broadway!”