Tough Bug

I was assigned a trouble ticket that proved extremely difficult to diagnose. The problem was left over from the previous contractor that maintained the software. They were unable to resolve it, so now we were on the hook to fix it. As with most problems, I attacked this one head on by checking the production audits of the problem items. At first I could not make heads or tails of the situation.

In general, I refuse to let any problem get the better of me. So I started looking more broadly at related items that we audit in production. Then I found something of interest. The user appeared to have made an unexpected change to some of the data right before the problem happened. This was exciting. It seemed like this was the source of the problem. I tried to duplicate the problem in the development environment. I was disappointed to find that I could not make it happen. Still, I thought I was on to something.

Sometimes you need to try out a couple of things to get to the bottom of the matter. I decided to install the version of the application that the users were running. However, I pointed it at my development database instead. That’s when I first made the problem happen myself. Usually this is the point where a fix comes quickly. But I was still perplexed as to why I could not make the problem happen with my debug version. Oh well. I tried a release version that I built. I still could not reproduce the problem on anything other than our official release.

This troubled me. However, like I said before, I am not a quitter. So I set up my virtual machine to be a build machine. I did a build just like our configuration management team does. Now I was going crazy, because my build would not make the problem happen either. This is where I broke down and asked the configuration management team if I could borrow their build machine. Wouldn’t you know it? On their machine, the application uses a second copy of our code, which is ever so slightly modified. The small modification was causing the problem.
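
In hindsight, comparing the two copies of the code directly would have found the difference much sooner. Here is a minimal sketch of that idea in Python. It is not what I actually ran. The function names are made up for illustration, and it assumes both copies of the source are available as ordinary directories on one machine.

```python
import hashlib
import os


def file_digests(root):
    """Map each file's path (relative to root) to a SHA-256 digest of its bytes."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root)
            with open(full_path, "rb") as handle:
                digests[rel_path] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def compare_trees(tree_a, tree_b):
    """List files that differ between two trees or exist in only one of them."""
    a = file_digests(tree_a)
    b = file_digests(tree_b)
    report = []
    for rel_path in sorted(set(a) | set(b)):
        if rel_path not in b:
            report.append((rel_path, "only in first"))
        elif rel_path not in a:
            report.append((rel_path, "only in second"))
        elif a[rel_path] != b[rel_path]:
            report.append((rel_path, "differs"))
    return report
```

Calling compare_trees with my working copy and the build machine's copy (both paths hypothetical here) would list every file that does not match byte for byte. A regular diff of each listed file then shows the actual change.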

Development is certainly partly responsible for this. Why do we have two copies of the same code, with subtle differences that cause bugs? I plan to get to the bottom of this. However, I am now at the point where the fix is trivial. That is a bonus, because this was causing me to lose sleep this weekend. I crush bugs. They don’t crush me.