How Slow Can You Go?

Our users had complained that the system sometimes slowed to a halt. The problem was assigned to a developer on my team. Unfortunately, the developer was out of the office for a few days last week, and tomorrow I need to report on the progress made on this trouble ticket. So I went to work with the developer personally to get the research moving faster.

The developer got some feedback from a sys admin that 3 users were driving the CPU to 100% utilization. The developer was then trying to determine what these users were doing, so he sent an e-mail to the customer asking what one of them had been doing on the day the system stopped working. I immediately saw the problem with this tactic: the user in question was actually a developer from another company. I told my developer we did not need to go through the customer. We could go straight to that other developer.

First, though, I went back to the facts of the matter. Maybe the 3 users identified by system administration were involved with the problem, and maybe they were not. I showed the developer some other event logs in which other users had recurring problems when the slowdown occurred. Earlier I had e-mailed the developer about this, but he just responded with some questions. Today I sat the developer down and showed him how I wanted him to generate test data to replicate the recurring problem the users encountered. When we had the data ready, we found that when certain users went to some specific screens in the app, all users in that part of the app were frozen.

I am not sure if this is the cause of the problem, but we ran some good tests. The next step is to get sys admin to monitor the system while we duplicate the problem. We also went and talked with the other developer who, according to sys admin, was involved with the 100% CPU utilization. We worked out a plan to have him try his work in development to see if we can replicate the problem. In the maintenance world, I have found that most developers need a lot of prodding before they make significant progress on trouble tickets. I am only sorry that I am the one who needs to hold the prod.