Friday, July 11, 2008

Patching is painful

We make ourselves patch our Unix and Linux servers twice a year. Otherwise it just wouldn't happen. It seems to take a long time to get ready for patching: gathering what needs to be applied, downloading the patches, testing things, doing the patching. Last night I patched the stage servers. Every time we patch and reboot, we seem to find some server that has a problem booting up. Last night, with about 16 stage servers, I had one that developed a fan problem and keeps shutting down, one that took over an hour to boot up because of memory errors, and one that simply would not boot at all.

I was on the phone with HP for quite a while last night. The offline diagnostics were not conclusive about the reason for the crash, so they decided to send down a field engineer. It usually takes them at least 2 hours to get down there, so I went in to try to get the server running. I eventually isolated the problem and got the server booted up. I sent the online diagnostics to HP, and about an hour later they figured out that the system needed a backplane replacement. This is a really big machine, and a backplane replacement is a 2-hour minimum task. (It's currently scheduled for Sunday morning since I got the box running.)

None of that would have happened had I not patched. Hopefully all the problems for this patch cycle are taken care of now, and we won't have any problems when we go to patch production on the 21st...
