Experiences with (very) rare Linux crashing, upgrades

By apexwm, 15 December, 2010 04:07

One of the things I really love about Linux is that most mainstream distributions can be upgraded in place from one version to the next. Ideally, this means a system can be kept up to date over time while preserving its settings.

I've heard of mixed results in the real world with version-to-version upgrades. Personally, though, I've had pretty good luck, mainly upgrading Fedora from one release to the next. I love the fact that the binaries are upgraded while config files, settings, user data, etc. are maintained.

Recently, I had a very rare case where a CentOS 5.4 server was crashing. Some binaries started to show segmentation faults (first yum, then logrotate and certwatch). Eventually the server would run for a few days and then the kernel would simply crash dump (similar to the blue screen of death that we are all too familiar with in Windows). I've only seen Linux do this a few times in many years of experience. In previous cases, it was mostly due to hardware problems. One time it was due to a faulty kernel on one particular machine (a Dell Vostro laptop).

In this case, though, it seems the cause was a software problem of some sort. After upgrading CentOS from version 5.4 to 5.5, the problems went away completely. So it seems that something software-wise was awry (I suspect there was something up with the Python libraries).

What amazed me is how every single setting, configuration file, and piece of user data was completely preserved from the old version of CentOS to the upgraded one. The CentOS upgrade process refreshed all of the binaries on the system, including the kernel, and the system booted and was up and running in no time. No tweaking or adjustments were needed at all. One of the most amazing things about Linux is that the upgrade process can be so straightforward and so effective. Great stuff. I am so used to seeing upgrades in Windows and other software fail miserably.

Talkback

I've only been running Linux for 10 years now, and have had only one crash. I think it was Mandrake 8, not sure on the version. But when I upgraded to the next version all was okay. I ran a server for 3 years using Mandriva 10.2, and the only time it went down was in a power outage.
ator1940 15 December, 2010 14:14


ator1940: Thanks for sharing your experiences as well. Thankfully, Linux is not plagued by the software problems that Microsoft Windows has, which range from corrupted filesystems to corrupted configs and everything else that causes blue screens of death (BSOD). Linux has a track record of stability, and people are generally aware of this. That's why any sort of software corruption comes as a great surprise. Same with any sort of crashing, really.
apexwm 15 December, 2010 17:23


Track record of stability? Well, not really. I'm sure you haven't really done anything special with Linux all your years, since it's not that hard to see a crash.

Stable filesystems? Yeah, maybe if you use ext3. Which can still corrupt on power outages etc. You know, I've been in business longer than you and I quite often see machines with power outages etc. And you know what? Never has an NTFS filesystem been corrupted or needed checking. Not even if the machines were running database operations and writing files when the power went out. Linux on the other hand? Long checks and problems with files. And no, I'm not going to suggest that they run ext4 or some other testing filesystem.

And yes, it's their fault that they get power outages. But that's not the only time Linux crashes and causes problems. Fortunately many of them have been able to transfer to Windows Server, which runs nicely without problems on the same machine. So no hardware problem there.

Also what's nice is that you didn't care to even find out what the problem was. "Oh, there was a problem. I ran an upgrade, everything looks ok, so I don't care." Wouldn't want to use systems administered by you...
Getaclueapexwm 22 January, 2011 21:56


"Track record of stability? Well, not really. I'm sure you haven't really done anything special with Linux all your years, since it's not that hard to see a crash."

If you have experience with both operating systems (which doesn't seem to be the case), I'd love to hear your comparisons of the two, the environments used, and what happened. Since you didn't provide any examples, I'm assuming you have none.

"Never has an NTFS corrupted or needed a checking. Not even if the machines are running database operations and writing files while the power went out. "

Then I think you have been extremely lucky. I've seen countless times (as I mentioned) where the NTFS filesystem has become corrupted, and next thing we know the server needs rebooting to do a full filesystem check. And in most cases, it was NOT due to a power outage, but simply while the server was running. Meanwhile I've had Linux-based servers under higher load, running ext2 and ext3 with no corruption whatsoever. So there's the basis of my conclusions.

"But that's not the only time Linux crashes and causes problems."

I would love to hear of your experiences. In my case, I can count the times I've seen true Linux kernel crashes on one hand. With Windows I've lost count.
apexwm 25 January, 2011 18:33