Open Source Software vs. Commercial Software:
Migration from Windows to Linux
An IT Professional's Testimonial


Maintenance Headache of Windows

A Note on Reliability

In a typical company with numerous computers, the best practice for installing updates is to first download and install them in a test environment that closely resembles the environment used every day. Testing updates there first, before deploying them to real systems, greatly decreases the chances of downtime due to bad or broken updates. However, the more frequent and numerous the updates, the more work this requires. With updates released on average once per week, it is a fine balancing act of letting updates pile up and deciding when to test and deploy them. In Linux, updates are not released in batches; they are released immediately as they are developed. So, instead of installing each update as it is released, they should probably be installed in batches to mirror the Microsoft method. However, you will probably find that Linux updates are generally released to fix glitches rather than major security breaches. Therefore, they can usually be deployed less frequently than updates released by Microsoft. Again, this varies greatly.

Every week, Windows users are urged to apply all of the updates on their computers. Granted, they almost always fix the problem. But once in a while, an update will break something else. Everybody that uses Windows has probably witnessed this, or has heard of it happening. Why is this? My only thought is, again, the limited number of developers at Microsoft. On a well tuned Linux computer, you are actually more secure than you would be with Windows; it doesn't seem to have as many security problems or bugs, and when some are found they are fixed very quickly. With closed source software such as Windows, you risk running software that is vulnerable to a malicious attack while waiting for a patch to be released. I pointed this out earlier with the Internet Explorer vulnerability (KB961051), where Microsoft released a bulletin acknowledging a security issue but two days later still had no patch or fix for it, leaving its customers with insecure software while it scrambled to find a fix and release a patch. But, back to Linux: even better, updates can usually be installed without a reboot. Without a reboot, seriously? Yes. In fact, most software in Linux can be installed without rebooting. This is because Linux is very modular, made up of many components that all work together. The Linux kernel (the core executable of Linux) can stay running in memory while you fine tune other software and settings. We all know how many times we need to reboot Windows when installing patches or updates, pretty much with every single one. This is not a big deal for a regular home PC that is probably turned off at night anyway. But what about a critical server that is answering requests for business critical applications? The reboot may be quick, but it is downtime. I don't know about you, but I would prefer my servers to stay running as much as possible.

Linux is designed to stay up and running, and this is apparent in everyday use. Not only is the kernel very modular in design, as I just mentioned, but other things like system services and processes can be paused or reloaded instead of being completely restarted and disconnecting users. This is one thing that Windows flat out lacks. Windows system services have three options: start, stop, and restart. In Linux, however, you have an additional option for most services: start, stop, restart, and reload. The reload option is very useful, and is usually used for reloading the configuration of a service without actually restarting it, so that no interruptions take place. This is just plain smart, especially in server environments. So, in order to restart a service, an after-hours maintenance period would probably have to be scheduled for a Windows server, whereas with Linux and the option of reloading, maintenance can be done during the day while users are connected and utilizing the service.
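Under the hood, the "reload" action usually just sends the daemon a SIGHUP signal, which the daemon catches and responds to by re-reading its configuration. Here is a toy sketch of that mechanism in shell; the config file and the `greeting` setting are made up for illustration:

```shell
# Toy sketch of how "reload" works: many Linux daemons re-read their
# configuration when they receive SIGHUP, instead of exiting and restarting.
cfg=$(mktemp)
echo 'greeting=hello' > "$cfg"

reload() { . "$cfg"; }   # re-read the config file in place
trap reload HUP          # what a daemon's signal handler would do

. "$cfg"                          # initial configuration load
echo 'greeting=world' > "$cfg"    # admin edits the config while we "run"
kill -HUP $$                      # "service foo reload" sends this signal
echo "$greeting"                  # prints the new value; no restart happened
rm -f "$cfg"
```

A real-world equivalent is editing a service's config file and then running something like `service httpd reload` (or `kill -HUP <pid>`), with no client connections dropped.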

Automatic updates restart
Typical reboot prompt for Windows updates; this one is mandatory, so the user must stop all work and reboot before the counter runs out. Seriously?

Some may argue that two servers could be set up in a cluster, so that when one reboots the other stays up and continues to answer requests from users that need to connect. But this would require two servers, doubling the cost for hardware and software, so it is not a viable option for those trying to keep costs down. And in most cases when the servers reboot, somebody needs to check on them and make sure they come back up correctly. This can prove very inconvenient, especially for servers that are scheduled to reboot in the middle of the night. And what about desktop PCs? Well, this isn't as critical, as it only affects the one user currently at the PC. But some Windows updates actually make it mandatory that the PC be rebooted right away! Am I serious? Yes, this is definitely true. Now, Windows does offer several options, such as downloading all updates automatically and then prompting to install them. But the default setting is to have Windows download and install the updates automatically. In theory this sounds great: the user won't have to worry about anything and all of the updates will be installed in the background, right? Not quite. Unfortunately, Windows cannot apply most updates without restarting. See the example on the left with Windows XP, where the user is prompted to reboot and given no option but to stop all work and close applications within 5 minutes, or the system will force a reboot itself! Notice that the "Restart later" button is grayed out!

Not only does Windows need to be restarted for most updates released by Microsoft, but even a newer version of Windows such as Windows XP still has a Windows 98 feel in that it requires a reboot to simply add, install, or enable a hardware device. With Linux, a reboot is almost never needed: a module for the device in question is simply inserted on the fly, and the device is instantly activated. This is true for even the most complex devices, since every supported device is either built directly into the Linux kernel or compiled as a kernel module (which is inserted on the fly).
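As a rough illustration, loading a driver on a running Linux system is a one-line operation. These commands need root, and the module name is just an example:

```shell
# List some of the kernel modules currently loaded
lsmod | head -5

# Insert a driver module on the fly; the device becomes usable immediately
modprobe usbserial

# Remove the module again -- still no reboot involved
rmmod usbserial
```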

Windows restart prompt
Typical reboot prompt for Windows XP after installing a simple USB modem device.

And what's even more disconcerting is that even the latest and greatest version of Windows, Windows Vista, still requires frequent rebooting just to install minor patches from Microsoft. Again, Linux does not need to be restarted after installing software patches, since they are completely separate from the Linux kernel, which is always running. The exact reason Windows needs to be restarted even after the installation of the simplest of patches remains unknown to me. The screenshot at the right is the message that Vista users get when the patches have been downloaded but will not be completely installed until a reboot is done.

Vista restart prompt
Typical reboot prompt for Windows Vista updates and patches.

In Linux you can have your system automatically update itself (via the Red Hat Network daemon, or Yum in Fedora), but usually I review the update packages and, if necessary, install them manually. Manually installing the updates is fine for a home environment; in a corporate or enterprise environment, there are other packages that will allow centralized deployment of updates. Nothing is perfect, and even in Linux a system might need some attention or its settings checked after an upgrade. This is rare, but sometimes necessary. Luckily, security issues with Linux only come up on occasion, as opposed to Windows, where they are a regular occurrence.
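For reference, the manual review-and-install cycle on a Fedora or Red Hat system boils down to a couple of commands (illustrative; these need root and a network connection, and the package name is just an example):

```shell
# See what updates are available without installing anything
yum check-update

# Apply all pending updates; yum shows the transaction and asks to confirm
yum update

# Or update a single package by name
yum update openssh-server
```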

As mentioned already, Linux seldom requires a reboot when installing updates. This can keep server systems that use Linux running for quite a long time. It is definitely not unheard of to have a Linux server running for 1, 2, or more years without a reboot. Almost every time that I have had to reboot a Linux server, it was related to a long power outage where the battery backup unit simply ran out of power and couldn't continue running the servers. I have personally witnessed many Linux servers running for well over a year without a reboot. In fact, recently a company called "Ksplice" [1] has come up with a solution for Linux that allows kernel patching to take place in memory, allowing the kernel to be patched without a reboot! This means that you could have a Linux machine running indefinitely, provided that it had reliable power backup in case of a power failure.
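Checking how long a Linux box has been up takes no special privileges; the kernel exposes seconds-since-boot in /proc/uptime, which is the same data the uptime(1) command formats:

```shell
# Raw figure from the kernel, converted to whole days and hours
awk '{printf "up %d days, %d hours\n", $1/86400, ($1%86400)/3600}' /proc/uptime

# The conventional one-liner: current time, uptime, users, load averages
uptime
```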

In the world of Windows, having a server running this long is essentially unheard of. Why is this? As I have already mentioned, most Microsoft patches require a reboot. I honestly do not know the exact reason for this, but I can only speculate that either Windows just isn't modular or efficient enough for one component to be updated and restarted without affecting something else, or it's just too quirky to rely on unless the whole server is rebooted to ensure the updates are installed correctly. Whatever the reason, it is rare to have a Windows server up for more than 30 days straight, since Microsoft releases major updates every 30 days, and sometimes even more frequently if they are critical enough. It is good to know that updates are being released frequently; the part about requiring the server to be rebooted, however, is not so good. This can frustrate users who are cut off from the services that the server provides.

On top of Microsoft patches to Windows computers, there are also numerous cases where software installations conflict with each other, which requires additional rebooting as well. Oftentimes, two third-party products will conflict, requiring Windows to be rebooted before the next product can be installed. This is demonstrated in the screenshot below for Symantec Endpoint Protection. In this case, some sort of previous installation is pending, so the computer needs to be rebooted before the Symantec Endpoint Protection software can be installed. The error below is being reported by the Microsoft Installer itself.

msi installer reboot
A reboot prompt for Symantec Endpoint Protection antivirus software in Windows XP, where a reboot required by another installed product blocks the installation.

By contrast, as I have mentioned before, Fedora and Red Hat Linux use a software installation mechanism called RPM (Red Hat Package Manager). The RPM system is extremely efficient, and it never requires a reboot unless you are installing an update to the kernel package itself. Every other piece of software can be updated and put to work instantly. This is a huge contrast to the Microsoft Installer, which not only can require reboots, but can also suffer conflicts between different packages.
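A few illustrative RPM commands (the package and file names are examples; installing and upgrading need root):

```shell
# Query the installed version of a package -- read-only, safe to run
rpm -q kernel

# Upgrade (or install) from a package file; the new files are live immediately
rpm -Uvh some-package-1.0.i386.rpm

# Verify a package's installed files against the RPM database
rpm -V some-package
```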


mom errors from windows reboot
Typical list of MOM errors caused by server reboots from Windows updates.

OK, so we've acknowledged that rebooting is a nuisance, especially with servers. Not only do systems administrators need to monitor the servers after the reboots, but they also need to keep an eye on the patches that were installed. Patching Windows or Linux involves the same concept: software is being updated or upgraded, and that can have a negative impact on the server or computer. Patches can break things, and this can happen with any operating system. So when you look at Windows and Linux, both need to be patched, and I will not focus on that point. I would, however, like to touch on the rebooting process, especially on servers, and the monitoring and cleanup that is necessary when rebooting Windows servers. Normally, the rebooting must be done after hours so as not to interrupt normal operations. Since Linux does not have to be rebooted, it can normally be patched during the day, sometimes even while users are using it. But patching Windows after hours is no fun. This takes time away from the systems administrators doing the patching, in most cases personal time in the evening, to monitor the server reboots. Then there's the cleanup afterwards. Wait, what do I mean by cleanup; how can that be caused by a simple reboot? Well, in most enterprise environments, servers are closely monitored for failures. In this world, a simple reboot can throw up red flags as the server becomes temporarily unavailable. Some hardware/software monitoring products account for this, and some do not. I will focus on Microsoft's own monitoring system, Microsoft Operations Manager (or MOM). MOM is an extensive piece of software that will monitor all aspects of Windows servers, hardware and software alike. It is a good product, I admit, and does what it is designed to do. But there is one flaw with it: Microsoft forgot to put in any functionality to handle a server rebooting for updates.
So, you can have your server patch itself and reboot itself, but MOM will complain extensively as soon as the server reboots. This typically generates a whole slew of errors, leaving the system admin to sort through them one by one, weeding out any potential problems from those just caused by the reboot. This is a tedious and very time consuming task, and it must take place every time the server reboots automatically. A workaround is to put the server in maintenance mode so that no alerts are generated, but this is a manual process as well. So either way, the system admin must manually take some sort of steps.

Microsoft is very keen on rebooting. Even some errors in MOM suggest rebooting as a solution to the problem, as demonstrated in the screenshot below. This is actually Microsoft's own solution! This screenshot is of a memory error that seems to be common in Windows. I have seen it happen in both virtual and physical machines of different types, so I believe it is a software problem of some sort. As you can see, the resolution says to reboot in order to fix it.

mom 2005 reboot error
A non-paged pool memory error reported in Microsoft Operations Manager (MOM 2005). The Microsoft suggestion to fix the error is to reboot the server.

One thing that is fun to do is check what systems are used for some of your favorite websites, as well as their total uptimes (time since the last reboot). For instance, the Netcraft website has a neat script that will report what software is running any website that you choose. It also tries to query the server to get the uptime. So, go ahead and query your favorite websites and see what software they use, and how long they have been up since a reboot. One thing that I noticed right away when I checked sites that use Microsoft IIS is that they do not post their uptimes, even though Windows supports this query. Even Microsoft itself has its servers set so that they don't report an uptime, even though they could report that value if it were enabled. My only guess is that they are embarrassed to display a short uptime. In my findings I did see a couple of servers running Microsoft IIS that had been up for quite some time, but I can only imagine how vulnerable they would be from a security standpoint, as they would not have had any security patches applied since the last reboot! Luckily, Microsoft has taken the time to install enough servers in a cluster to ensure that end users don't see any down servers. This may be OK for a large company like Microsoft that has enough money to invest in the hardware to accomplish this, but for smaller companies it might be a waste, or even outside the budget, to purchase more hardware all because Windows needs to be rebooted so often. A company would be better off purchasing more hardware to guard against a catastrophic hardware failure, not because the operating system needs to reboot itself often. Spend some time on the Netcraft site for some fun checking out various web servers around the Internet, and see what kind of results you find. I will go into much more detail later on the hidden costs of this extra hardware and downtime with Microsoft Windows.

microsoft windows iis uptime
Uptime of Microsoft Windows 2003 servers running the Microsoft IIS web server at Microsoft itself, showing typical uptimes of less than 60 days at a time, indicating frequent rebooting. Graph courtesy of Netcraft.
linux apache uptime
Uptime of Linux servers running the Apache web server, showing typical uptimes of more than 170 days at a time, indicating very infrequent rebooting. Graph courtesy of Netcraft.

Anomalies Galore

I think we've all seen our Windows PCs lock up, reboot automatically, throw out strange unexplained errors, blue screen, etc. But why does this happen time and time again? It's happened as far back as I can remember, from Windows 95 all the way up to Windows Vista, and on known good hardware. And to boot, there are problems that have been around for years in previous versions of Windows that are still there in the latest patched version of Windows XP! Is it simply that Microsoft doesn't care to fix problems that exist, or can it not devote the resources to getting them fixed? This question will probably go unanswered indefinitely, as I don't picture Microsoft openly answering it. And it's not like any developer out there can just pick up the source code and fix the problem, as in the open source community, so problems can linger until Microsoft gets around to fixing them.

Fortunately, in the open source world of Linux there are really no such limitations, because of the vast depth of the open development community, so problems like this are fixed and do NOT get strung out over years of time. But I think we have all become accustomed to this type of behavior in Windows, and therefore we just expect that it is normal. But is it really? The answer should be no; it should not happen very frequently. Probably the most noticeable problems like this that I have seen lately are NTFS filesystem errors on servers (Windows 2003 Server) and desktop PCs (Windows XP) that cause the system itself to completely crash and data to be lost. NTFS is Microsoft's filesystem, used in all its server operating systems (Windows Server 2000, 2003 and 2008) and desktop operating systems as well (Windows 2000, XP and Vista). It has a history in the server arena because it can handle large files, has good performance, etc. Unfortunately, NTFS is at the center of all hard disk I/O, because it is what organizes the files on your disk. Time and time again I have seen this filesystem get totally corrupted with no rhyme or reason, leaving the server or desktop PC totally useless. Not only does this prevent the system from booting, but it is very painful, time consuming, and can be VERY expensive if you need to retain the data stored on the disk for any reason. From large servers hosting hundreds of mailboxes to simple desktop PCs, I have seen NTFS fail time and time again when there were no hardware problems. I have reinstalled everything on the same hardware and it would continue to run just fine for quite some time, telling me the problem was caused purely by software, at no fault of the hardware. Unfortunately, these types of anomalies can cause extreme amounts of downtime when the system crashes: basically, the time it takes to reinstall the operating system back to the state it was in before the crash.
Fortunately, we have tools such as Norton Ghost and others that can back up the entire C partition and restore it if the filesystem gets totally hosed. But, I don't think this should happen to begin with, if the hardware is working as it should.

Luckily, the filesystems used in Linux are extremely stable. In my 11 years of experience, I have not once seen a Linux system become totally unstable and fail to boot because of a strange anomaly like the filesystem becoming corrupt on its own. Also, the need to defragment is pretty much non-existent. Most of those that use Windows are probably used to running the defragment utility often to "optimize" performance. It's almost like your house: things need to be cleaned up once in a while to keep them in ship shape. But does it have to be this way? Not so with Linux. To date, I have never needed to run a defragment process on any of the filesystems that Linux uses. I have mainly used the ext2, ext3, and XFS filesystems, with very pleasing results. XFS has been my favorite since its early years of being ported over to Linux; I started using it around 2000. There is practically no file size limit, it is very efficient at storing lots of small files, and it is a journaling filesystem (changes are kept in a log so that no data is lost if the kernel cannot write to the disk, such as when the disk goes offline or spins down). I have put the XFS filesystem through stress tests over years and years of use, and am happy to say that I have had servers using it running for well over a year without a reboot. I have never witnessed any sort of file corruption in any way, shape, or form. In fact, the only times that the servers were rebooted were due to power failures. In the Windows world, this is plain unheard of. I have screenshots of two of these servers as an example of the uptime; note the total uptime in days. One of these is a web server running Apache and the other a mail server running Sendmail. These servers were eventually rebooted because of a widespread power outage that knocked out power for 2 days, and our battery backup ran out of power to run the servers any longer.


A Time Bomb?

Just today I received an article in my email inbox from my regular CNet newsletter describing faults with the Windows registry, focusing on problems that can be caused by the registry becoming "corrupted". Its focus was on why CD and DVD drives disappear in Windows, and it's filled with a list of other issues and some links to possible solutions. Which brings up a pretty good subject: the Windows registry. The purpose of the Windows registry is pretty straightforward: to house all of Windows' settings. It is a nice convenient way to keep everything in one place, that's for sure. So, what is all of the fuss about issues and corruption? Well, the whole problem with the registry is that just about every program and service that runs needs to access the registry for something or another. Because of this, the registry is VERY fragile and must be modified with great caution, since many things depend on it. This makes sense, as critical settings are housed here, and one slip can cause some serious damage. Which is probably why any article posted by Microsoft for a fix within the registry has disclaimers all over it about the danger of modifying the registry. There are countless articles posted on Microsoft's website that contain solutions to problems and direct the user to modify the registry, yet a few lines down exclaim how dangerous it can be to modify the registry and that Microsoft does not guarantee any results from doing the very modification that they are suggesting be done! One could say this is a little contradictory; however, I don't think Microsoft is going to try writing a patch for each and every fix that is encountered. It's much easier to post the change that must be made to the registry instead.

Let's face it, the registry is basically a single point of failure in Windows. Luckily, Linux has a much simpler design and does not have a single point of failure like the registry. Linux has a variety of ways that programs and the kernel keep their settings. Oftentimes, settings are stored in plain text files on the hard drive. Sometimes they are stored in the "proc filesystem" (a virtual filesystem mounted at /proc, generated in memory by the kernel rather than stored on disk). Either way, things are kept simple, and it is far harder to modify one of these files and end up with a system that won't boot. There must be a reason that Microsoft puts up all of the warnings, and this is probably because the registry is a single point of failure and is very fragile. Having the core of the operating system this fragile doesn't seem like a great recipe for a stable operating system.
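To make the contrast concrete, here is a toy sketch of the plain-text style of configuration: a setting is just a line in a file, so backing it up and editing it are trivial, ordinary file operations. The file and the `port` setting are invented for illustration:

```shell
# Create a fake config file with two settings
cfg=$(mktemp)
printf 'port=80\nworkers=4\n' > "$cfg"

# Backing up the whole "configuration database" is a single copy
cp "$cfg" "$cfg.bak"

# Change one setting with an ordinary text edit
sed 's/^port=.*/port=8080/' "$cfg" > "$cfg.new" && mv "$cfg.new" "$cfg"

grep '^port=' "$cfg"   # prints: port=8080
rm -f "$cfg" "$cfg.bak"
```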

Windows registry warning
A Microsoft posted article about how to edit the registry, advising of the danger in doing so.

A Touch on Stability

I would like to touch on the issue of stability, as I have started to get into already. One thing I have noticed over the years of using Linux and Windows side by side is that Windows has never been nearly as stable as Linux. This is not just my observation, either; doing some quick Google or Yahoo searches on this subject will yield some interesting results to back up this statement. It is comical to me that administrators of Windows machines will run across crashes, and usually the fix is to reboot, continue on their way, and not wonder what happened. It's working after a reboot, so why spend the time trying to figure out what caused the crash? This is the mentality of administrators of Windows machines, because it is the norm in that environment. But if you step back and consider that Linux does not behave like this, and that Linux crashes are usually caused by faulty hardware or badly programmed applications, a different mentality emerges. In 11 years of administering Linux machines, I have only witnessed one crash where the server completely froze with what is known as a "kernel panic". Because this type of crash is such a rare instance, I was able to dig in, discover the issue, and determine a solution. In this case, the stock kernel provided with Red Hat 7.1 for the DEC Alpha architecture (NOT the same as the Intel architecture) was compiled incorrectly. Locating and fixing the problem was by no means easy, but there WAS a fix. We were able to recompile the kernel, and the server was up and running, rock solid just like any other. But back in the world of Windows, crashes and the blue screen of death (BSOD) still occur, even in the latest Windows Server 2008 with all of the latest patches. To this day I still see this happen in Microsoft environments quite frequently. My theory is that this is because of the structure of Windows and its multiple points of failure.
And as this subject continues, I reiterate the point that this confuses users, costs companies and individuals money in downtime, and can lead to further problems and data corruption, which can cause a chain reaction of problems and further costs. This should not be the way to run the systems of a business, and should not be accepted as normal operation.

A Closer Look at Windows Anomalies

Unfortunately, having a vast number of anomalies makes things seem all the more fragile, knowing that something could suddenly break at any time. This simply cannot be avoided in the world of computers and servers, where things are always bound to go wrong. This is just what we call Murphy's Law, which states that if there's any chance that something can go wrong, it will. Murphy's Law is applied to the world of technology all of the time. But decreasing your chances of having to deal with these problems seems like a logical move to me. Yes, hardware will fail. But with modern servers, hardware is now redundant to the point where, if a component fails, the server can stay up and running, oftentimes while the technician makes a repair. But this only goes as far as the operating system: a server's components can be completely redundant, but if the operating system fails, the redundancy is no good and the server becomes inoperable. I have seen many times where servers with multiple hardware redundancies have crashed because of Windows problems. The anomalies with Microsoft products have been so numerous that I have compiled a list of ones I have encountered on multiple occasions, over multiple years and different versions of Windows, where I have ruled out faulty hardware as the cause, especially since these very same problems reoccurred in different environments, on completely different hardware, and at different times. To me, some of these should just not be tolerated in a production or business critical environment.

Top 10 Windows Anomalies

  1. NTFS corrupt Windows 2003 server
    A Windows event log entry showing a corrupted NTFS filesystem on a Windows 2003 R2 server. (Server name and volume name intentionally blurred.)
    Failing NTFS filesystem. Time and time again I have seen Windows servers completely crash because the NTFS filesystem has failed: everything from errors appearing while a server is running, causing a slow death or strange popups, to rebooting a server and having it never fully boot back up again. To this day, even with the latest Windows 2003 or XP operating systems with all of the latest patches, I still see this happen from time to time. I have seen it on the most basic desktop PCs and on servers housing hundreds of mailboxes or gigabytes of files. I have concluded that the issue is not related to failing hardware, since in these cases I have been able to re-install Windows over the same hardware and have it continue to run without problems, as well as run hardware diagnostics that pass all tests. The reasons for the corruption are still unknown to this day, other than that NTFS seems unstable. NTFS is a journaling filesystem, so in theory no data loss should be encountered; however, a server operating system needs to be more stable than this. Using the ext3 filesystem in Linux (as well as the xfs filesystem), I have NEVER witnessed any sort of file corruption to date. Also, as a side note, the ext3 and xfs filesystems do NOT need frequent check-disk or defragmentation operations performed.
  2. Temporary files cause strange behavior in many applications. Over time, applications will generate temporary files, and these files are usually left in place. This means that folders of temporary files build up and cause slews of strange problems. I have seen such issues as Microsoft Word locking up, Microsoft Outlook's window getting frozen and unable to be resized, problems opening attachments in Outlook, and a list of others. Commonly, the solution is to delete all temporary files in these folders, and the problems magically disappear. Linux has a service called "tmpwatch" that automatically purges temporary files by age, and also a service called "logrotate" that automatically rotates old log files by age.
  3. Printer Drivers crashing the Windows Print Spooler Service. I am sure that anybody that runs a Windows print server has seen this problem. It is usually caused by a flaky or unstable printer driver. Unfortunately, the Windows Event Viewer doesn't provide many helpful details on the cause of the service crashes. Through a process of elimination among the jobs that were printed just before the crashes, you can usually narrow down the cause somewhat. Since printer drivers are provided by third parties, tracking down the cause and a fix, if one is even available, can be frustrating.
  4. Memory Leak of Windows Processes. This is one of the more complicated issues to troubleshoot. In the past I have seen Windows processes eat up memory, as shown in Task Manager, for unknown reasons. The last incident I saw was on a file server where the WMI service ate up to 2 GB of memory and caused a blue screen of death, after which the server never booted again, requiring a complete re-install of Windows.
  5. Windows Reboots Itself Due to a "Bugcheck". The computer will reboot at random (even while programs are still open). The error in the Event Log shows "The computer has rebooted from a bugcheck. The bugcheck was: 0x000000e2 (0x00000000, 0x00000000, 0x00000000). A dump was saved in C:\WINDOWS\Minidump\....". The times that I have seen this, it appeared after a batch of Windows updates was installed, and the cause was never found. We did plenty of research, which turned up tons of posts around the Internet describing the very same error, but no solutions or causes posted or known. We ended up having to do a complete reinstall of Windows, which fixed the problem, and it never came back. This ruled out any possible problems related to hardware. See the example screenshot of this error in Windows XP.
  6. Windows locks up when disconnected from the network in a domain environment. Everything works great in a Windows domain between the servers and the client workstations, until the network connection is lost. Suddenly, the workstations lock up. Why? Unfortunately, I have never found the exact reason or put enough time into finding out. It seems to happen mostly when the client workstations have mapped drives. This has happened across multiple versions of Windows, and still does today with Windows XP. I have not tested Vista. Try it sometime, and you will see this result in all its ugliness. When it happens, users can wait and wait for Windows to recover itself, or simply reboot to continue working.
  7. A file is reported as "in use" or "access denied". Basically, a file is locked or in use, but finding out what is locking it can be quite the task. Files frequently get into this state on the Windows NTFS filesystem, and when the task holding the lock cannot be located, the computer needs to be rebooted in order to free up the file. With Linux, this problem is almost non-existent, due to the way Linux handles file locking. Linux is looser in this respect: files can be moved or modified on the fly, which at first sounds like a problem. In my experience, however, it frees up a LOT of wasted time, because there are simply no locked files to troubleshoot. For instance, I have seen cases on Microsoft web servers where a simple HTML file was locked by IIS, the Internet Information Services service (the web server service). In that case, the service that locked the file was known, but the reason was not. The only way to free up the file was to restart the IIS service, interrupting all website traffic to the server. The Apache web server on Linux, like Linux in general, is far superior in this regard: files are not locked this way, so the problem simply does not come up.
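The difference is easy to demonstrate from a shell. A minimal sketch (the file name is a throwaway example): on Linux, a file can be deleted even while another reader still has it open, because the open descriptor keeps the data reachable until it is closed.

```shell
# Demonstrate Linux file "locking" behavior: remove a file while it is
# still held open. The open descriptor keeps the contents readable
# until it is closed; nothing blocks the delete.
f=/tmp/demo_open_delete.$$
echo "still readable" > "$f"   # create a throwaway file
exec 3< "$f"                   # open it for reading on descriptor 3
rm "$f"                        # delete the name while it is still open
cat <&3                        # prints: still readable
exec 3<&-                      # close the descriptor; now the data is gone
```

Try the same thing on Windows, deleting a file that another program has open, and you get exactly the "in use" error described above.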
  8. The computer generally slows down over time. This one is very difficult to pinpoint and troubleshoot. In fact, it is commonly regarded as normal behavior for a computer, when it should not be. When an operating system like Windows is first installed and runs very fast, why does it slow down over time? Everything from boot time to the time it takes to open and close applications seems to take longer the more the computer is used. This is one mystery that may never get a sure answer. From my personal observations, Windows users can do some maintenance to try to alleviate these problems: cleaning out temporary folders, uninstalling programs that aren't needed, defragmenting the hard drive, and other common tasks. This usually helps, but only by a marginal amount. I have never seen a Windows system return to the speed it had when first installed. Sure, you can work around the problem by adding more memory. But this is like putting on a band-aid rather than fixing the root problem, and it adds cost where there shouldn't be any. Based on the workarounds and partial fixes, my best inkling is that things generally get clogged up over time, kind of like a house that needs some spring cleaning. My point in mentioning this is that I have not seen the same behavior with Linux. In fact, every Linux computer I have used runs just like it did the day I installed the operating system. The difference between Windows and Linux here is huge. Linux also has automatic mechanisms in place to clean out temporary files (in the /tmp folder) and to rotate system logs (in the /var/log folder, by the logrotate service). Windows flat out lacks these mechanisms and is very inefficient at keeping things neat and clean, so my theory is that the entire Windows operating system just gets clogged up over time.
As I mention in other areas of this article, this leads to more time and money spent maintaining Windows systems. Overall, Linux just seems to be much more efficient than Windows at a lot of things, and the reasons above demonstrate it.
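To give a feel for the log rotation mentioned above: logrotate is driven by small, declarative config files dropped into /etc/logrotate.d. A hedged sketch of what one might look like (the application name, log path, and schedule are hypothetical examples, not a stock distribution file):

```
# Hypothetical /etc/logrotate.d/myapp entry: keep four weekly,
# compressed generations of the application log, and don't
# complain if the log is missing or empty.
/var/log/myapp.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
```

Once a file like this is in place, the system handles the cleanup forever with no further attention, which is exactly the kind of built-in housekeeping Windows lacks.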
  9. Cannot log on to a domain laptop when connected to a network outside the domain. When taking a domain laptop to a non-domain network, the user cannot log in, due to an error that says "The system cannot log you on because the domain ... is not available". I have seen this one happen several times out of the blue, and frustratingly, it eventually goes away on its own, so I have never been able to find a known cause. The kicker is that the laptop is supposed to look for the domain controller, give up, and then allow the user to log on with cached credentials. This is by design, and it works 99.9% of the time. But the 0.1% of the time it does not work, it is a very serious problem, since it denies the user any use of the laptop with their own account. What makes matters worse, the only known fix is to connect the laptop back to the domain network, where the issue magically resolves itself. This can cause huge headaches for a laptop user who is traveling, far away from the domain network.
  10. USB flash drives do not show up with drive letters when other (network) drives are mapped to the same letters. If you map network drives to low letters like D, E, and F, and then insert a flash drive, Windows doesn't know to move on to the next available letter. Instead, it tries to use a letter that one of the network drives is already mapped to. This is a serious flaw, and it comes up all of the time. To correct it, the user must have administrative privileges to open the Disk Management console and manually assign the next available drive letter to the flash drive. With Linux, alphabet letters are not used for devices at all. Instead, the device's label is used, and the device is immediately available on the desktop or under the /mnt filesystem under its label name. This avoids mapping letters to devices, maintaining those mappings, and conflicting letters as in this example.
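Mounting by label is a one-line affair on Linux. A hedged sketch of a hypothetical /etc/fstab entry (the label "USBKEY" and mount point are examples, not defaults):

```
# Hypothetical /etc/fstab entry: mount the flash drive labeled "USBKEY"
# at /mnt/usbkey, no matter which device node the kernel assigns it.
LABEL=USBKEY  /mnt/usbkey  vfat  noauto,user  0  0
```

Because the mount is keyed to the label rather than a scarce letter, there is nothing to collide with, which is exactly why the conflict described above cannot occur.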




