I just ran across a really interesting blog post by Joel Spolsky from last April: Things You Should Never Do, Part 1. Actually. the post pissed me off. This is one of those hot-button topics that I have had to deal with several times in my career, and have had to manage in the face of entrenched beliefs. His statement is t hat you should never rewrite a code base from scratch. The reasoning is “No major firm has ever successfully survived a product rewrite. Just look at Netscape … ” Whatever.

I am a fixer. I was the guy who was able to make code reliable. I was the guy who found and fixed the obscure bugs. As I progressed in my career and started to manage teams of developers, more often than not I was handed the really crummy re-engineering projects because I could fix the problems and make customers happy. Sometimes success is its own penalty.

I have inherited code so bad that bug fixes cost 4x in time and usually created new bugs in the process. I have inherited huge bodies of Java code written entirely as if Java were a 3G procedural language – ignoring the object-oriented paradigm completely. I have been tasked with fixing code that – for a simple true/false comparison – made 12 comparisons, 8 database, insertions and 7 deletions – causing an 180x performance penalty. I have inherited code so bad it broke the compiler. I have inherited code so bad that you could not change a back-end database query without breaking the GUI! It takes a real gift for bad programming to do these things.

There are times when the existing code – all or part – simply needs to be thrown away. There are times that code is so tightly intertwined that you cannot simply fix one piece at a time. And in some cases there are really good business reasons, like your major customers say your code is crap and needs to be thrown away. Bad code can bleed a company to death with lost sales, brand impairment, demoralization, and employee turnover.

That said, I agree with Joel’s basic premise that re-writing your product can kill your company. And I even agree about a lot of the social behaviors he describes that create failure. There is absolutely no reason to believe that the people who developed bad code the first time will not do the same thing the next time. But I don’t agree that you should never rewrite. I don’t agree that it has never been done successfully. I know because I have done it successfully. Twice. Out of three attempts, but hey, I got the important projects right.

We tend not to hear about successful rewrites because the companies that carried it off really don’t want everyone knowing that previous versions were terrible. They would rather focus on happy customers and competitive products. It’s very likely that companies who need to rewrite code will screw up a second time. Honestly, there are a lot more historic rewrite flameouts than success stories. Companies know what they want to fix in the code, but they don’t understand what they need to fix in the company. I contend this is because there are company behaviors that promote failure, and if they did it once, they are likely to do it again. And again. Until, mercifully, the company goes down in flames. There are a lot of reasons why re-architecture and re-implementations projects fail. In no particular order …

  1. Big eyes: You are the chief developer and you hate your current product. You have catalogued everything that is wrong with it and how you would fix it. You have extensive lists of features you would like to implement. You have a grand vision of how this product should function, how it should be architected, and how it will be implemented. This causes your re-engineering effort to fail because you think that you are going to build perfect software, tackle every problem, and build every feature, in the first revision. And you commit to do so, just to get the project green-lighted.
  2. Resources: You current product sucks. It really sucks. It has atrocious quality and low performance, and is miserable to manage. It’s so freaking bad that customers ask for their money back, and sales falter. This causes your re-engineering effort to fail because there is simply not enough time, and not enough revenue to pay for your rebuild. Not with customers breathing down management’s neck, and investors looking for the quick “liquidity event”. So marketing keeps on marketing, sales keeps on selling, and you keep on supporting the old mess you have.
  3. Bad blood: When you car gets old and dies, you don’t expect someone to give you a new one for free. When your crappy old code no longer supports your customers, in essence you need to pay for new code. Yes, it is unfortunate that you bought a lemon last time, but you need to make additional investments in time and development resources, and fix the problems that led you down the wrong path. Your project fails because management is so bitter about the failure that they muck around with development practices, apply more pressure and try to get more involved with day-to-day development, when the opposite is needed.
  4. Expectations: Not only is the development team excited at not having to work on the atrocious code you have now, but they are really looking forward to working on a product that has semi-modern design. The whole department is buzzing, and so is management! This causes your re-engineering effort to fail because the Chickens think that no only are you going to deliver perfect software, but you are going to deliver every feature and function of the old crappy product, as well as a handful of new and extraordinary features as well. And it’s unlikely that management will let you adjust the ship date to accommodate the new demands. If they do, the temptation is to keep working until it’s perfect, but nothing is perfect – this is a good way to keep coding while Rome burns.
  5. Sales: In all the excitement, Sales bragged to a couple major customers about what an amazing new product the development team is building, and it solves all the problems you have today. The customers think this is great, and say “Call us when it’s ready.” Sales grind to a halt. This causes your re-engineering effort to fail because you now have two months to develop what you estimated would take 18.
  6. People: The people who were terrible coders are still on the team because nobody wanted to fire them. The managers who forced releases out the door early, before implementation and QA were completed are still with the company. The executive who threatens employees’ jobs if they fail to deliver on time are still with the company. The Product Manager who fails to do market research to validate bright ideas is still at the company. The engineering ‘leaders’ with no clue about process or leadership skills are still leading when they should be coding. The effort failed before it began.

Re-engineering efforts can fail for a whole new set of reasons, in addition to whatever wrecked the initial project. And unfortunately rewrites always begin at a disadvantage, because management is already miffed that the last development project failed. Building software is risky, but re-engineering can work. If you want to get it right the second time, you need to perform a same critical evaluation of people and processes, just as you hopefully did with technology. You will end up overhauling much of the organization, including management, to avoid the technical and leadership failures of the past. If all of this has not scared you off, consider code re-engineering.