Wednesday, March 25, 2020

Recovering a lost code base

Twice in my career I have realized that the code that was in production at a client did not match the code that was in source control. Of course if you have any type of automated deploy process this will not be the case, but if you have a manual deploy it can be.

Yesterday, Simon Cropp told me of an amazing technique to recover from this. I am writing this in the hope that it will never be of any use to anyone, ever :-)


First, let's go thru my history of what I did.

The First Time: we found the code in the closet.


The first time this happen was over a decade ago. We were fortunate that the company was being sued and a court order forbid the destruction of data. The employee in question had been let go a month or two before but their computer was in lockdown storage. We were able to find the code and check it in to source code control. This is the ideal solution. We get everything we needed. The original source.
We were lucky. Very lucky. The code hadn't been modified, since they weren't there to modify it. There was only 1 place the code could be because that module was only worked on by them. We didn't have to guess between versions. And, we didn't delete the code.

The Second Time: we decompiled the code

The next time we weren't as lucky. This was a VB.NET project and the latest in source control didn't match what was in production. Was it very different? Who knows. The employee in question had also been let go and when we realized this we ended up just decompiling the code that was in production. This creates very bad Visual Basic code, so we took advantage of it to create reasonably bad C# and move languages. This was nice, but we lost a lot of intent that was in the original source code. It would have been better if we could have gotten it back. However, we cut our losses and moved on.

The Next Time: decompiled to detect and recover the source

Here is a better way. It's a mix between the first two.
  1. Decompile the production source
  2. Also, compile and decompile the possible sources you have found.
  3. Compare the decompiled sources with source control.
    1. Maybe there's a match? 
      1. Yay! Then you have got the original source or it's equivalent 
    2. Maybe you have partial matches?
      1. Create a new code base of patches from the matching places.
      2. Repeat
    3. Maybe you have a section with no matches
      1. You can try to reproduce the code manually
      2. or, you can copy and paste those sections into the new code base and small sections of decompiled code in large sections of original code.
      3. repeat