Saturday, April 24, 2010

Stories from my life

A guy with whom I'm playing an online computer game commented to me: "As a programmer you already know, there is no such thing as a 'random' glitch."

There may be a cause, but things can look pretty random. I had to fly down to North Carolina once because a system we had written for a large company (controlling vending machines that took credit cards) started randomly charging customers $0 for some of the purchases (which were NOT $0).

I remember the flight well, because I had a cold and the cabin was unpressurized, and I arrived weak and pale with blood trickling out of my ears after some of the worst pain I've ever experienced. The guy I was rushing to see, who was in charge of the computer running my program, was unable to see me at first because an eBay auction on some collectible was in its final stages, but finally I got to see the system.

After hours of comparing my input logs to the output we were generating, and trying to figure out HOW my program could possibly produce such nonsense - and running the program in a test mode with the same input to try to reproduce the problem - I was ready to tear my hair out.

I wrote a quick test program that did simple arithmetic and logged the output. It worked fine. I went to my hotel room in despair and left it running.

The next day, I checked the output log of the test program and found that for several minute-long periods the computer would add 1 and 1 and arrive at 0.
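A minimal sketch of that kind of test, in Python rather than whatever the original was written in. The table of sums, the function name, and the log format are all made up for illustration; the idea is only to compare each result against an answer worked out by hand and log every mismatch with a timestamp.

```python
import time

# Hypothetical table of sums with answers worked out by hand,
# so the check does not depend on the machine being tested.
KNOWN_SUMS = [(1, 1, 2), (2, 3, 5), (7, 8, 15)]


def check_arithmetic(rounds, add=lambda a, b: a + b, clock=time.time):
    """Repeat the known sums `rounds` times; return one log line per wrong answer."""
    failures = []
    for _ in range(rounds):
        for a, b, expected in KNOWN_SUMS:
            got = add(a, b)
            if got != expected:
                # Timestamp each failure so bad periods can be lined up with other logs.
                failures.append(f"{clock():.0f} {a}+{b} gave {got}, expected {expected}")
    return failures


if __name__ == "__main__":
    for line in check_arithmetic(rounds=1000):
        print(line)
```

On healthy hardware this prints nothing. The `add` and `clock` parameters exist only so the check itself can be exercised with a deliberately broken adder.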

The computer they were using to run my program was really a large collection of processing units, and any process could be run on any one of them. When my process ran on a defective unit, it produced defective results.

When I asked them how this could possibly have been going on undetected, they explained that the system ran diagnostics continuously, but the results were sent to a display that was itself no longer working, so nobody had checked them in months. Probably higher-priority eBay auctions.

Our president explained to me that it would be bad for company relations if I killed anybody. In retrospect, it might have been worth it.
