The Importance of Refactoring

“If it’s not broken, don’t try to fix it,” the old adage goes, but when it comes to software engineering, at least, this is poor advice at best.

Anyone who has worked more than a few years in software development knows that one of the hardest sells is convincing your boss to allow time for refactoring and cleaning up the code. On the surface it may seem like a reasonable stance — after all, no project asked for the change, no customer is paying for it — so why would you waste time on it?

Why indeed should you refactor, if leaving the code as-is doesn’t cost you anything? Because the code that’s “not broken” is costing you; you just haven’t noticed it yet.

The truth is that just like things in the physical world, code also needs to be maintained. It may not degrade and wear as physical objects do, but hidden in any piece of code are bugs no one’s found, bad designs no one’s bothered to correct, and potential improvements just waiting to be discovered.

I’ve lost count of how many times I’ve been introduced to codebases that for the most part have remained untouched for years or even decades. The old faux-truth of not fixing what’s not broken is embedded deep, and hard to shift.

I think a large part of the blame lies with the all too common project-financing model, where all resources for development are allotted to projects. Projects have very specific deliverables and stakeholders and are rarely interested in paying for anything outside their scope, and so refactoring is left by the wayside.

But doing it this way means that code will only be fixed or improved upon if there’s an explicit requirement for it in a project, or a bug is discovered. It’s genuinely surprising how often companies neglect to allocate any resources at all for general development and maintenance outside of projects.

So, what happens when you leave code unattended and unloved? You build up technical debt, and to exemplify this, here are some typical issues you will find in poorly maintained code and the costs associated with them:

    • PUBAR or Patched-Up-Beyond-All-Recognition
      This is code that was written a very long time ago and has since been patched repeatedly by a succession of developers making quick fixes for whatever errors were detected. What characterizes this type of code is that each patch was made only to fix an immediate issue, with little or no consideration for side effects or long-term impact (“there’s no time for a proper fix, just patch it”). The code tends to have large, complex methods spanning hundreds of lines, and deeply nested if-statements. This is the kind of code that breaks any time there’s a major update of the product because it relies on internal dependencies or hard-coded assumptions. This type of code is costly for several reasons: it breaks when least expected after you change something else, causing unanticipated work (and therefore delays in the project); it’s often difficult to understand due to the lack of a coherent design and implementation, so it takes more time to get into and fix (again causing delays in the project); and it tends to be brittle and cause issues at the customer’s site that require support effort (causing delays in the project and ill will with the customer).
    • Holy code
      This was written at the dawn of time and hasn’t been changed since. The person who wrote it left the company ages ago, there’s zero documentation, and no one currently really understands how it works. It’s often full of commented-out code (with no explanation) but no comments explaining the actual intent behind the code. If you change any of it, it usually breaks. The cost of this type of code is a bit insidious; it will keep working for years on end, until that one day when it suddenly doesn’t and it breaks down completely. The only option then is to re-write it from scratch, at great cost, and often it will also require other changes in the system (causing big delays in the project).
    • “We’ll do it properly in the next release”
      This was written as a temporary hack years ago with the intention of doing a proper implementation later in the project. This never happened, and temporary became permanent. What characterizes this type of code is that it barely works, is held together with spit and shoestring (or just plain luck), and is fertile ground for new and interesting bugs. As with holy code, this code tends to break at the worst possible moment, with a similar fallout in terms of cost.
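
To make the “deeply nested if-statements” of PUBAR code a bit more concrete, here is a small sketch in Python. Everything in it is made up for illustration: the shipping rules, the order fields, the amounts and the function names are assumptions, not taken from any real codebase. The first function is the patched-up version; the second is one possible refactoring that uses guard clauses and named values. The loop at the end checks that both return the same result for a handful of sample orders, which is the whole point of a refactoring: the behavior stays the same while the code gets easier to read.

```python
# Hypothetical example: all rules, fields and amounts are invented.

def shipping_cost_patched(order):
    # "Before": every quick fix added another level of nesting.
    if order is not None:
        if order["items"]:
            if order["country"] == "SE":
                if order["total"] >= 500:
                    return 0
                else:
                    return 49
            else:
                if order["total"] >= 1000:
                    return 0
                else:
                    return 99
        else:
            return 0
    else:
        return 0


HOME_COUNTRY = "SE"  # assumed home market, named instead of hard-coded inline


def shipping_cost(order):
    # "After": guard clause first, then the actual rule in plain sight.
    if order is None or not order["items"]:
        return 0
    is_domestic = order["country"] == HOME_COUNTRY
    free_from = 500 if is_domestic else 1000  # free-shipping threshold
    fee = 49 if is_domestic else 99           # flat fee below the threshold
    return 0 if order["total"] >= free_from else fee


# A few sample orders to confirm the refactoring did not change behavior.
SAMPLE_ORDERS = [
    None,
    {"items": [], "country": "SE", "total": 900},
    {"items": ["book"], "country": "SE", "total": 499},
    {"items": ["book"], "country": "SE", "total": 500},
    {"items": ["book"], "country": "DE", "total": 999},
    {"items": ["book"], "country": "DE", "total": 1000},
]

for sample in SAMPLE_ORDERS:
    assert shipping_cost(sample) == shipping_cost_patched(sample)
```

In a real codebase you would want checks like these as proper unit tests before touching the old function, so that the next refactoring slot starts from a known-good baseline.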

There are of course many flavors of bad code, and I could write a whole series of articles on that topic alone, but the above is where your code typically ends up if you ignore your code-smells. As you can see, just because your code isn’t broken doesn’t mean it isn’t costing you. Skipping regular refactoring is like skipping regular service of your car. Sure, it will run fine for a while, but then problems will start cropping up and slowly begin affecting performance. And when it inevitably does break down, it will be very expensive.

So, how do you introduce refactoring in your organization? This will largely depend on your organization. Most developers understand the value of refactoring, so usually, the ones needing to be convinced are management. After all, they are the ones having to pay for this. The important thing about refactoring, however, is that it’s done regularly.

Here are some suggestions:

    • On a regular basis (every one or two sprints if you are using Agile, or at least once a month), select one component or module per developer for refactoring. Set aside at least a whole day for this. If it’s the first time you’re doing refactoring, it’s a good idea to spend this first instance just going through the whole component and noting obvious problem areas. Then, in the next slot, you can start making improvements.
    • Assign components to developers who haven’t worked on them before (or at least not much). This way you will both get fresh eyes on the code and spread code knowledge across your team.
    • For large and complex components it may be a good idea to refactor using pair-programming or hold team brainstorming meetings to hash out ideas.

It’s important to note that just like code reviewing, refactoring is not a mud-slinging contest. The objective is not to talk down other people’s code, but to see if you can find improvements. For this reason it’s important that the team realizes they own all the code together as a team. If the code is crap, then the team needs to own that, and fix the code. Blaming one individual is not going to be helpful, even if they are the culprit. Instead, take the opportunity to teach good programming practices as well as fixing the issue. If it’s done in a constructive spirit, no one needs to feel their toes were stepped on, and your team will be all the better for it.

The fact of the matter is, the team owns the code, and if you want really good code you need to care about the code. Not just the product itself or the functionality it provides, but the actual code. When developers care about the code and feel they own it, they will produce better code. And better code provides better functionality and fewer bugs.

It’s also important to understand that refactoring is not bug fixing. Bug fixing is reactive — you find a bug, you fix the bug. Refactoring is proactive — you try to improve the code just to make it better, to avoid getting bugs in the first place, which in the long run is much, much cheaper. Refactoring also tends to look at the bigger picture, not just fixing specific flaws, but re-evaluating design and implementation, to see if you can do better.

The thing about writing software is this: the moment you’ve finished developing a component, you know how you should have done it in the first place. Every programmer recognizes the feeling of “if I only knew then what I know now, I would have done it like this instead”. We’ve experienced it many times.

In writing, this is a well-known concept. You do a first draft to get your ideas down, and then you revise, often multiple times, to get to the finished product. As writer Neil Gaiman puts it:

“The process of doing your second draft is a process of making it look like you knew what you were doing all along.”

Sadly, we rarely get that opportunity as developers; we are typically forced to push our first draft out as the final product. And that’s why refactoring is so important. It gives you an opportunity to revise your implementation with the wisdom of hindsight.

To do that second draft. To do it right.