Why Should I Patch My Server?
Wait, what?
Why should I patch my server?
Because you won’t get bug fixes.
But my server works just fine. If it’s not broken, don’t fix it, right?
What about security fixes?
My server isn’t accessible from the internet. I’m not worried about it getting hacked.
But it’s still possible, right? Do you really want to take that chance?
Yeah, patching requires time and money. It could even require downtime. It’s more important for us to keep the server going than to worry about that stuff. We’re safe.
Over months, memory was slowly drained by a rogue application that never released it. A month before the crash, some users were experiencing slowness, but nothing too bad. The week before, a ticket was opened with the helpdesk. It wasn’t really a major issue. It was probably their laptops, not the server. And then it happens: the ticket goes from P3 to P1. The server is down and nobody knows why. Nobody can log in. The ticket is escalated to L2 and then to L3 support. When someone opens a remote terminal to the server, they see OOM-Killer alerts and then nothing. There is no response on the console or over SSH. The server is rebooted. The applications are running again and nobody really knows what happened. They open a ticket with the OS vendor to find out why their server crashed.
The logs show nothing. The journalctl output shows business as usual and then a reboot. There are no OOM-Killer alerts, no memory errors, and no kernel core dump. There’s nothing in the application logs. The OS vendor reports that the cause is inconclusive, except to note that the server was never patched. In the years since the server was set up, hundreds of patches have been released. The issue could come from any number of places. To paraphrase Arthur Conan Doyle, in order to get to the truth, you must eliminate the impossible. The easiest way to do that is to make sure that the systems are patched against known issues, so that when a new issue comes along it is easier to diagnose and repair.
If the company had applied all of these patches, could this still have happened? Yes, of course. Bug fixes can only fix issues that have already been found and reported. Another part of good system hygiene is simply keeping an eye on the server. Of course, you don’t stay logged into a few hundred servers, watching them at all times. You set up a monitoring system like Nagios or one of a dozen others. If CPU usage, memory or swap usage, or network latency goes up, you have a chance to do something about it before the system crashes. That’s when you want to start your investigation, not after you’ve experienced downtime. A monitoring system isn’t terribly costly, but it’s not completely free either. It takes time to set up and it takes resources to host it.
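To make the idea concrete, here is a minimal sketch of the kind of memory and swap check a monitoring system runs on a schedule. It is not how Nagios or any particular product implements it. It assumes a Linux host whose /proc/meminfo exposes the MemAvailable field (present on reasonably recent kernels), and the function names and the 90% and 50% thresholds are hypothetical values chosen for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_percent(total_kb, free_kb):
    """Return the percentage in use; 0.0 when there is nothing to measure."""
    if total_kb <= 0:
        return 0.0
    return 100.0 * (total_kb - free_kb) / total_kb


def check_memory(info, mem_limit=90.0, swap_limit=50.0):
    """Return a list of warnings. The limits are illustrative, not recommendations."""
    warnings = []
    mem = used_percent(info["MemTotal"], info["MemAvailable"])
    swap = used_percent(info.get("SwapTotal", 0), info.get("SwapFree", 0))
    if mem >= mem_limit:
        warnings.append(f"memory usage at {mem:.1f}%")
    if swap >= swap_limit:
        warnings.append(f"swap usage at {swap:.1f}%")
    return warnings


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (assumes a Linux host)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Calling check_memory(read_meminfo()) from a scheduled job and alerting on a non-empty result would have flagged the slow leak in the story weeks before the crash. A real monitoring system adds history, trending, and notification on top of a check like this.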
To answer the original question, you patch your server to save money in the long run by protecting against system and application crashes and against security breaches. If you don’t, you’re doing yourself a disservice, and the loss in revenue could ultimately cost significantly more than the time spent patching.