Fluid Code
A blog about refactoring, .Net and all things agile by Danijel Arsenovski

Dealing with legacy code

February 6th, 2009 by admin

Recently, someone in an online group asked for ideas and suggestions on refactoring legacy code. Not your usual day-to-day refactoring, but a project whose whole purpose is refactoring legacy code. I guess the first thing you might ask at this point is: “what is legacy code anyway?”
According to some TDD purists, legacy code is any code that is not covered by a comprehensive set of unit tests. More generally, legacy code is code in an advanced phase of its lifecycle, burdened by heavy technical debt, sprinkled with smells like long methods and large classes, and often implemented in some older technology, language or language version.
Here is what I replied:

Hi,

I’d say you are in for a ride.
Here are a few thoughts based on my experience.

First of all, do not rule out rewriting the application from scratch without giving it really serious consideration. In my experience, it is often much easier to develop an application from zero than to refactor the legacy version, and refactoring legacy code is much more difficult than it might look at first sight. If your team doesn’t have prior experience with refactoring legacy code, it’s all too easy to underestimate the task. Do an experiment: choose a component or a class and see how long it takes to refactor it. On one occasion, I performed one such experiment. I was able to reduce a 4000 LOC component to 2000 LOC in two weeks (eliminating dead code, duplication etc.). The 2000 LOC that were left were still very far from being in any decent shape. We decided to develop the project from zero and to discard the old code altogether.

On another occasion, I had to deal with some really old Java code. The project contained a proprietary implementation of an XML parser; there was not much open source around in the days when the application was created. Honestly, it is not the kind of code you wish to look at when maintaining a typical enterprise application. The solution proved to be relatively easy: although the XML in use was not 100% well-formed, it was possible to replace the proprietary parser with a 3rd party XML parser.
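As a rough illustration of that kind of swap, here is a minimal sketch in Python (the project in question was Java). The names `sanitize` and `parse_document` and the sample input are hypothetical, and the sketch assumes that the only well-formedness problem is bare `&` characters. That is an assumption made for the example, not a description of what was actually wrong with the XML in that application.

```python
import re
import xml.etree.ElementTree as ET

# Matches an ampersand that does not start an entity reference
# such as &amp; or &#38; (assumed to be the only defect in the input).
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def sanitize(raw: str) -> str:
    """Escape bare ampersands so that a standard parser accepts the text."""
    return _BARE_AMP.sub("&amp;", raw)


def parse_document(raw: str) -> ET.Element:
    """Single entry point that callers use instead of the home-grown parser."""
    return ET.fromstring(sanitize(raw))


# Hypothetical legacy input: not well-formed because of the bare '&'.
legacy_xml = "<order><customer>Smith & Sons</customer><total>120</total></order>"
root = parse_document(legacy_xml)
customer = root.findtext("customer")
```

Keeping the clean-up step separate from the parsing step means the standard parser stays untouched, and the clean-up can be dropped once the producers of the XML are fixed.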

This leads us to my next thought. Start by refactoring at the architectural level. Replace whole layers of the application with 3rd party solutions when possible. For example, ORM tools and frameworks have only recently hit the mainstream, so it is not very likely that your application uses one. If possible, replace the persistence layer of hand-coded SQL with some ORM tool.
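Before a whole layer can be replaced, the rest of the code has to stop talking to it directly. One possible way to prepare for that, sketched here in Python with the standard `sqlite3` module, is to hide the hand-coded SQL behind a small interface. All names (`Customer`, `CustomerRepository`, `SqlCustomerRepository`, `greeting`) and the table layout are hypothetical, and this is not an ORM itself, only the seam where an ORM-backed implementation could later be plugged in.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Customer:
    id: int
    name: str


class CustomerRepository(Protocol):
    """What the rest of the application is allowed to know about persistence."""

    def find(self, customer_id: int) -> Optional[Customer]: ...


class SqlCustomerRepository:
    """Wraps the existing hand-written SQL behind the interface."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find(self, customer_id: int) -> Optional[Customer]:
        row = self._conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(*row) if row else None


def greeting(repo: CustomerRepository, customer_id: int) -> str:
    """Business code depends on the interface, not on SQL."""
    customer = repo.find(customer_id)
    return f"Hello, {customer.name}" if customer else "Unknown customer"


# In-memory database standing in for the legacy schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
repo = SqlCustomerRepository(conn)
```

Once callers such as `greeting` only see `CustomerRepository`, an implementation built on an ORM of your choice can replace `SqlCustomerRepository` without touching them.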

The third suggestion: know your problem domain, but don’t use your legacy code base for the purpose of understanding it. Reverse-engineering a legacy codebase into domain knowledge is a futile task; legacy code is often littered with duplication, dead code, poorly named elements etc. Try to understand the domain by speaking to a domain expert or the customer. Without domain knowledge you will find yourself in complete darkness: faced with some odd piece of behavior, you cannot tell whether it is a feature, a bug or dead code altogether. Take a look at this story:
Refactoring Finds Dead Code

Finally, build up a reliable test harness. Start by creating functional/integration tests and then add unit tests wherever possible.
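One common way to get such a harness started is a so-called characterization test: record what the code does today and pin it down, without yet judging whether that behavior is correct. The sketch below uses Python's `unittest`; `legacy_shipping_cost` is a made-up stand-in for a piece of legacy code, and the recorded values are simply what that stand-in returns.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for untouched legacy code.
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost - 3  # undocumented rule: feature or bug? Ask the domain expert.
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down what the code does today, not what it should do."""

    def test_recorded_behaviour(self):
        # Inputs paired with the outputs observed from the current code.
        recorded = [
            ((1, False), 7),
            ((1, True), 10.5),
            ((25, False), 52),
            ((25, True), 79.5),
        ]
        for args, expected in recorded:
            with self.subTest(args=args):
                self.assertEqual(legacy_shipping_cost(*args), expected)
```

Saved in a file, this can be run with `python -m unittest`. If a later refactoring changes any of the recorded results, the test fails and you get to decide, together with someone who knows the domain, whether the change is acceptable.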

You might wish to take a look at this article of mine. It talks about legacy VB apps, but many of its points are valid for any legacy code:
Moving up the technology stack: VB6 migration reality check

Good luck!


Posted in Refactoring
