Below you will find pages that utilize the taxonomy term “Legacy Code”
New edition for the Rector Book

A couple of weeks ago, Tomas Votruba emailed me saying that he just realized that we hadn’t published an update of the book we wrote together since December 2021. The book I’m talking about is “Rector - The Power of Automated Refactoring”. Two years have passed since we published the current version. Of course, we’re all very busy, but no time for excuses - this is a book about keeping projects up-to-date with almost no effort… We are meant to set an example here!
Refactoring without tests should be fine
Refactoring without tests should be fine. Why is it not? When could it be safe?
From the cover of “Refactoring” by Martin Fowler:
Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which “too small to be worth doing”. However the cumulative effect of each of these transformations is quite significant.
Good design means it's easy-to-change
Software development seems to be about change: the business changes and we need to reflect those changes, so the requirements or specifications change, frameworks and libraries change, so we have to change our integrations with them, etc. Changing the code base accordingly is often quite painful, because we made it resistant to change in many ways.
Code that resists change
I find that not every developer notices the “pain level” of a change. As an example, I consider it very painful if I can’t rename a class, or change its namespace. One reason could be that some classes aren’t auto-loaded with Composer, but are still manually loaded with require
statements. Another reason could be that the framework expects the class to have a certain name, be in a certain namespace, and so on. This may be something you personally don’t consider painful, since you can avert the pain by simply not considering to rename or move classes.
Do you have an exit strategy?
It’s an extremely common problem in legacy code bases: a new way of doing things was introduced before the team decided on a way to get the old thing out.
Famous examples are:
- Introducing Doctrine ORM next to Propel
- Introducing Symfony FrameworkBundle while still using Zend controllers
- Introducing Twig for the new templates, while using Smarty for the old ones
- Introducing a Makefile while the rest of the project still uses Phing
And so on… I’m sure you also have plenty examples to add here!
Release of the Rector book
TLDR;
Rector - The Power of Automated Refactoring is now 100% completed
Tomas Votruba and I first met a couple of years ago at one of my favorite conferences; the Dutch PHP Conference in Amsterdam (so actually, we’re very close to our anniversary, Tomas!). He presented Rector there and it was really inspiring. A year later I was working on a legacy migration problem: our team wanted to migrate from Doctrine ORM to “ORM-less”, with handwritten mapping code, etc. I first tried Laminas Code, a code generation tool, but it lacked many features, and also the precision that I needed. Suddenly I recalled Rector, and decided to give it a try. After some experimenting, everything worked and I learned that this tool really is amazingly powerful!
Early release of Rector - The power of automated refactoring
In October 2020 I asked Tomáš Votruba, the mastermind behind Rector, if we could have a little chat about this tool. I wanted to learn more about it and had spent a couple of days experimenting with it. Tomáš answered all my questions, which was tremendously valuable to me personally. When this happens I normally feel the need to share: there should be some kind of artefact that can be published, so others can also learn about Rector and how to extend it based on your own refactoring needs.
Successful refactoring projects - The Mikado Method
You’ve picked a good refactoring goal. You are prepared to stop the project at anytime. Now how to determine the steps that lead to the goal?
Bottom-up development
There is an interesting similarity between refactoring projects, and regular projects, where the goal is to add some new feature to the application. When working on a feature, I’m always happy to jump right in and think about what value objects, entities, controllers, etc. I need to build. Once I’ve written all that code and I’m ready to connect the dots, I often realize that I have created building blocks that I don’t even need, or that don’t offer a convenient API. This is the downside of what’s commonly called “bottom-up development”. Starting to build the low-level stuff, you can’t be certain if you’re contributing to the higher-level goal you have in mind.
Successful refactoring projects - Set the right goal
Refactoring is often mentioned in the context of working with legacy code. Maybe you like to define legacy code as code without tests, or code you don’t understand, or even as code you didn’t write. Very often, legacy code is code you just don’t like, whether you wrote it, or someone else did. Since the code was written the team has introduced new and better ways of doing things. Unfortunately, half of the code base still uses the old and deprecated way…
Successful refactoring projects - Prepare to stop at any time
Refactoring projects
A common case of refactoring-gone-wrong is when refactoring becomes a large project in a branch that can never be merged because the refactoring project is never completed. The refactoring project is considered a separate project, and soon starts to feel like “The Big Rewrite That Always Fails” from programming literature.
The work happens in a branch because people actually fear the change. They want to see it before they believe it, and review every single part of it before it can be merged. This process may take months. Meanwhile, other developers keep making changes to the main branch, so merging the refactoring branch is going to be a very tedious, if not dangerous thing to do. A task that, on its own, can cause the failure of the refactoring project itself.
Road to dependency injection
Statically fetching dependencies
I’ve worked with several code bases that were littered with calls to Zend_Registry::get()
, sfContext::getInstance()
, etc. to fetch a dependency when needed. I’m a little afraid to mention façades here, but they also belong in this list. The point of this article is not to bash a certain framework (they are all lovely), but to show how to get rid of these “centralized dependency managers” when you need to. The characteristics of these things are:
Combing legacy code string by string
I find it very curious that legacy (PHP) code often has the following characteristics:
- Classes with the name of a central domain concept have grown too large.
- Methods in these classes have become very generic.
Classes grow too large
I think the following happened:
The original developers tried to capture the domain logic in these classes. They implemented it based on what they knew at the time. Other developers, who worked on the code later, had to implement new features, or modify domain logic, because, well, things change. Also, because we need more things.
Reducing call sites with dependency injection and context passing
This article continues where Unary call sites and intention-revealing interfaces ended.
While reading David West’s excellent book “Object Thinking”, I stumbled across an interesting quote from David Parnas on the programming method that most of us use by default:
The easiest way to describe the programming method used in most projects today was given to me by a teacher who was explaining how he teaches programming. “Think like a computer,” he said. He instructed his students to begin by thinking about what the computer had to do first and to write that down. They would then think about what the computer had to do next and continue in that way until they had described the last thing the computer would do… […]
Unary call sites and intention-revealing interfaces
Call sites
One of the features I love most about my IDE is the button “Find Usages”. It is invaluable when improving a legacy code base. When used on a class it will show you where this class is used (as a parameter type, in an import statement, etc.). When used on a method, it will show you where this method gets called. Users of a method are often called “clients”, but when we use “Find Usages”, we might as well use the more generic term “call sites”.
Keep an eye on the churn; finding legacy code monsters
Setting the stage: Code complexity
Code complexity often gets measured by calculating the Cyclomatic Complexity per unit of code. The number can be calculated by taking all the branches of the code into consideration.
Code complexity is an indicator for several things:
- How hard it is to understand a piece of code; a high number indicates many branches in the code. When reading the code, a programmer has to keep track of all those branches, in order to understand all the different ways in which the code can work.
- How hard it is to test that piece of code; a high number indicates many branches in the code, and in order to fully test the piece of code, all those branches need to be covered separately.
In both cases, high code complexity is a really bad thing. So, in general, we always strive for low code complexity. Unfortunately, many projects that you’ll inherit (“legacy projects”), will contain code that has high code complexity, and no tests. A common hypothesis is that a high code complexity arises from a lack of tests. At the same time, it’s really hard to write tests for code with high complexity, so this is a situation that is really hard to get out.
Simple CQRS - reduce coupling, allow the model(s) to evolve
CQRS - not a complicated thing
CQRS has some reputation issues. Mainly, people will feel that it’s too complicated to apply in their current projects. It will often be considered over-engineering. I think CQRS is simply misunderstood, which is the reason many people will not choose it as a design technique. One of the common misconceptions is that CQRS always goes together with event sourcing, which is indeed more costly and risky to implement.
Behind the scenes at Coolblue
Leaving Qandidate, off to Coolblue
After I had a very interesting conversation with the developers behind the Broadway framework for CQRS and event sourcing the day wasn’t over for me yet. I walked about one kilometer to the north to meet Paul de Raaij, who is a senior developer at Coolblue, a company which sells and delivers all kinds of - mostly - electrical consumer devices. Their headquarters are very close to the new and shiny Rotterdam Central station. The company itself is doing quite well. With 1000+ employees they keep needing more office space.
Book review: Modernizing Legacy Applications in PHP
Legacy code
I’m happy that I’ve discovered the work of P.M. Jones recently. His and mine interests seem to align at several interesting points. Though I don’t personally enjoy “putting .308 holes in targets at 400 yards” (as quoted by Phil Sturgeon), I do care a great deal about package coupling (and cohesion for that matter) and I’m also lightly inflammable when it comes to the use of service locators. It also appears that Paul, just like myself and many others in this business, has felt the pain of working on a legacy PHP application, trying to add features to it, or change existing behavior. This is a particularly hard thing to do and because so many incompetent developers have been creating PHP applications since the dawn of PHP, chances are that a big part of the job of competent PHP developers nowadays consists of maintaining these dumpsites of include statements and superglobals.
Silex: Using HttpFoundation and Doctrine DBAL in a Legacy PHP Application
In my previous post, I wrote about wrapping a legacy application in Silex, using output buffering and Twig. Finally, to allow for better decoupling as well as lazy loading of services, we passed the actual Silex\Application
instance as the first argument of legacy controllers.
The first and quite easy way we can enhance our legacy application, is to make use of the request
service (which contains all the details about the current request, wrapped inside the Symfony HttpFoundation’s Request
class). So, instead of reading directly from $_GET
and $_POST
, we can change the edit_category()
controller into the following:
Let Silex Wrap Your Legacy PHP Application (and add Twig for templating)
Ever since I am using the Symfony Framework (be it version 1 or 2), I tend to describe every other project I’ve done (including those that were built on top of some third party “framework” like Joomla or WordPress) as a “legacy project”. Though this has sometimes felt like treason, I still keep doing it: the quality of applications written using Symfony is usually so much higher in terms of maintainability, security and code cleanliness, that even a project done last year using “only PHP” looks like a mess and seems to be no good software at all. So I feel the strong urge to rebuild everything I have in portfolio (as do many other developers), but “this time, I will do it the right way”.