Wednesday, September 24, 2008

Locking down legacy code.

So let’s look at the situation where we have a block of code that runs but isn’t unit tested. Many real examples look like this:

someBlock()
{
...
o = calculateSomething(a,b);
....
}

Now let’s say we want to refactor calculateSomething. That’s dangerous, since we can’t be sure the existing tests will catch a change in behavior. Even if the code coverage is 100%, that doesn’t guarantee that the asserts cover 100% of the behavior. For example, the test

testSomeBlock()
{
new SomeObject().someBlock();
}

would yield 100% coverage of someBlock, yet provide no asserts against change.

So let’s look at a locking technique for this code.
1) ‘Peel’ the function: move the existing body into calculateSomething1(a,b) and make calculateSomething(a,b){ return calculateSomething1(a,b); }
2) Clone the function: copy calculateSomething1 to make calculateSomething2(a,b).
3) Add an assert in the production code:

calculateSomething(a,b)
{
result1 = calculateSomething1(a,b);
result2 = calculateSomething2(a,b);
assertEquals(result1, result2);
return result2;
}

4) Now you can refactor calculateSomething2 safely, even if the execute-only test above is your only test.
5) Clean up: once you feel safe with the refactoring, remove calculateSomething1 and inline calculateSomething2.
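
Here is a minimal Java sketch of steps 1 to 3, just to make the shape concrete. The class name LockedCalculator and the sum-of-squares arithmetic are made up for illustration; only the calculateSomething / calculateSomething1 / calculateSomething2 structure comes from the steps above. It throws an AssertionError by hand instead of calling a test library's assertEquals, which is my assumption about what you would want inside production code.

```java
// Hypothetical example class; the arithmetic is a stand-in for real legacy logic.
final class LockedCalculator {

    // Step 1: the original body, moved here unchanged.
    static int calculateSomething1(int a, int b) {
        int result = a * a;
        result = result + b * b;
        return result;
    }

    // Step 2: the clone. This is the only copy that gets refactored (step 4).
    static int calculateSomething2(int a, int b) {
        return a * a + b * b;
    }

    // Step 3: the peeled entry point runs both versions and compares them.
    static int calculateSomething(int a, int b) {
        int result1 = calculateSomething1(a, b);
        int result2 = calculateSomething2(a, b);
        if (result1 != result2) {
            throw new AssertionError("calculateSomething mismatch for a=" + a
                    + ", b=" + b + ": old=" + result1 + ", new=" + result2);
        }
        return result2;
    }
}
```

Every caller of calculateSomething now checks the refactoring for free, so even the execute-only test turns into a real regression test.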

Of course, calculateSomething(a,b) needs a few properties for this to work: it has a return value, it doesn’t cause problems when run twice, etc.
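
To see why the "run twice" property matters, here is a small hypothetical counter-example in Java. InvoiceNumbers and its methods are invented for illustration. Both the original and the clone advance the same counter, so the lock reports a mismatch even though the two bodies are equivalent.

```java
// Hypothetical class whose method has a side effect: it advances a counter.
final class InvoiceNumbers {
    private int last = 0;

    // Original body.
    int nextNumber1() {
        last = last + 1;
        return last;
    }

    // Clone, lightly refactored but equivalent in isolation.
    int nextNumber2() {
        return ++last;
    }

    // Locked wrapper: the second call sees state changed by the first,
    // so the two results never agree.
    int nextNumber() {
        int result1 = nextNumber1();
        int result2 = nextNumber2();
        if (result1 != result2) {
            throw new IllegalStateException(
                    "mismatch: old=" + result1 + ", new=" + result2);
        }
        return result2;
    }
}
```

A function like this has to be made side-effect free (or given a copy of its state) before the technique applies.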

Now this brings some added bonuses:
Case 1. Imagine you have a big block of code. You can apply an automatic refactoring, like ‘extract method’, and then apply this technique to the extracted method.

Case 2. Theory-based testing. The interesting thing about calculateSomething2(a,b) == calculateSomething1(a,b) is that you don’t care what a and b are, nor do you need to know the return value in advance. This means you can write a random generator of a,b and blast calculateSomething with thousands of cases.

For example, I have a triangle class that calculates the angles given 3 points. Then if I have a generator of point(x,y) I can write
for(int i = 0; i < 10000; i++)
{
Triangle t = new Triangle(generatePoint(),generatePoint(), generatePoint());
assertEquals(t.getAngles1(), t.getAngles2());
}
and WOW, I have massively locked down code in a matter of seconds.
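
For anyone who wants to try the loop, here is one way it could be fleshed out in Java. Everything beyond the names Triangle, generatePoint, getAngles1 and getAngles2 is an assumption on my part: points are integer grid coordinates, getAngles1 stands in for the old code using the law of cosines, getAngles2 stands in for the refactored clone using atan2, degenerate (zero-area) triangles are skipped, and angles are compared with a small tolerance because they are floating point. A fixed seed keeps any failure reproducible.

```java
import java.util.Random;

// Hypothetical Triangle over integer grid points; both methods return the
// interior angles (in radians) at the three vertices.
final class Triangle {
    private final long[] x = new long[3];
    private final long[] y = new long[3];

    Triangle(long[] a, long[] b, long[] c) {
        x[0] = a[0]; y[0] = a[1];
        x[1] = b[0]; y[1] = b[1];
        x[2] = c[0]; y[2] = c[1];
    }

    // Twice the signed area; zero means the points are collinear or coincide.
    long doubledArea() {
        return (x[1] - x[0]) * (y[2] - y[0]) - (x[2] - x[0]) * (y[1] - y[0]);
    }

    // Stand-in for the old implementation: law of cosines.
    double[] getAngles1() {
        double[] angles = new double[3];
        for (int i = 0; i < 3; i++) {
            int j = (i + 1) % 3, k = (i + 2) % 3;
            double ij = squaredDistance(i, j);
            double ik = squaredDistance(i, k);
            double jk = squaredDistance(j, k);
            double cos = (ij + ik - jk) / (2 * Math.sqrt(ij) * Math.sqrt(ik));
            angles[i] = Math.acos(Math.max(-1.0, Math.min(1.0, cos)));
        }
        return angles;
    }

    // Stand-in for the refactored clone: atan2 of cross and dot products.
    double[] getAngles2() {
        double[] angles = new double[3];
        for (int i = 0; i < 3; i++) {
            int j = (i + 1) % 3, k = (i + 2) % 3;
            long ux = x[j] - x[i], uy = y[j] - y[i];
            long vx = x[k] - x[i], vy = y[k] - y[i];
            angles[i] = Math.atan2(Math.abs(ux * vy - uy * vx), ux * vx + uy * vy);
        }
        return angles;
    }

    private double squaredDistance(int i, int j) {
        long dx = x[i] - x[j], dy = y[i] - y[j];
        return dx * dx + dy * dy;
    }
}

final class TriangleLockTest {
    // Random point with both coordinates in [-100, 100].
    static long[] generatePoint(Random random) {
        return new long[] { random.nextInt(201) - 100, random.nextInt(201) - 100 };
    }

    // Compares old and new angles on random triangles; returns how many were compared.
    static int run(long seed, int cases) {
        Random random = new Random(seed);
        int compared = 0;
        for (int i = 0; i < cases; i++) {
            Triangle t = new Triangle(generatePoint(random), generatePoint(random),
                    generatePoint(random));
            if (t.doubledArea() == 0) {
                continue; // skip degenerate triangles
            }
            double[] a1 = t.getAngles1();
            double[] a2 = t.getAngles2();
            for (int v = 0; v < 3; v++) {
                if (Math.abs(a1[v] - a2[v]) > 1e-8) {
                    throw new AssertionError("angle mismatch at vertex " + v
                            + " in case " + i + " (seed " + seed + ")");
                }
            }
            compared++;
        }
        return compared;
    }
}
```

Calling TriangleLockTest.run(42L, 10000) runs the same loop as above with a reproducible seed.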

Case 3. Code in production. So let’s say I have a total mess of legacy code in production. I can use this technique, but with a ‘soft’ logging fail: refactor, wait a week, and if nothing went wrong, remove the old method. It’s not ideal, but ‘total mess of legacy code’ took us away from ideal in the beginning.
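
A ‘soft’ fail could look something like the following Java sketch. The class name, the mismatch counter, and the deliberately buggy calculateSomething2 are all invented for illustration. One design choice here is mine and not part of the technique as described: the wrapper returns the old result while the refactored version is still on trial, so a bad refactoring shows up in the log without changing production behavior.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical production class using a logging ("soft") lock.
final class SoftLockedCalculator {
    private static final Logger LOG =
            Logger.getLogger(SoftLockedCalculator.class.getName());

    // Hypothetical counter so mismatches can also be watched by monitoring.
    static final AtomicInteger mismatches = new AtomicInteger();

    // Old implementation, kept unchanged.
    static int calculateSomething1(int a, int b) {
        return a * a + b * b;
    }

    // Refactored clone with a deliberate bug for negative b, to show the log firing.
    static int calculateSomething2(int a, int b) {
        return a * a + b * Math.abs(b);
    }

    static int calculateSomething(int a, int b) {
        int oldResult = calculateSomething1(a, b);
        try {
            int newResult = calculateSomething2(a, b);
            if (newResult != oldResult) {
                mismatches.incrementAndGet();
                LOG.log(Level.WARNING,
                        "calculateSomething mismatch for a={0}, b={1}: old={2}, new={3}",
                        new Object[] { a, b, oldResult, newResult });
            }
        } catch (RuntimeException e) {
            // The new code must never take production down while on trial.
            mismatches.incrementAndGet();
            LOG.log(Level.WARNING, "calculateSomething2 threw", e);
        }
        return oldResult; // assumption: trust the old code until the week is up
    }
}
```

After the waiting period, a clean log (or a zero mismatch count) is the signal to remove calculateSomething1.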