Role of Detective

Today we got a trouble ticket about some data that just would not behave in our application. The data had ended up in a state that was not supposed to be possible. At first I assigned this to a junior developer. Then I checked in with the developer. It seems he was still stuck on the last problem I assigned him. So I took over today's problem myself.

Luckily, we audit a lot of things in our system. So I checked the audit data in the Production database. Indeed, this data had a strange history. After calling a couple of users, I at least figured out the final state that the data should be in. So the first order of business was to write an Oracle PL/SQL script to fix the data. No problem.

The more difficult task was to determine why this problem happened in the first place. This was not a common occurrence. Otherwise we would have heard about it by now. I asked a system administrator to retrieve the log file that our application writes to the user's workstation. I knew which user was working with the data when it went haywire (the database audit told me this). When I got the user's workstation log, it did not match up with the database audit. Something was really fishy. At this point a junior developer, or even a senior one without domain experience, would not have been able to progress any further.

I pushed this problem to the back of my mind. I did not even try to figure it out right away. And sure enough, a theory popped into my head. Suppose this user was working on two different workstations on the day the problem occurred. After making a couple of calls, I found out that this was the case. This does not solve the problem. I still need to get my hands on the workstation log from the other machine that the user worked on. But I now have a better idea of what the problem may have been. This is a perfect example of how maintenance developers need strong detective skills to resolve all but the most trivial of problems.