Sun, 22 Feb 2004

One Bad Bit

I spent 9 hours this week helping a Windows user recover from a nasty crash. And it's not done yet. Windows XP Pro, Microsoft Office, and another critical application have been reinstalled, Windows and Office updates applied, and user data restored. There are still many applications to reinstall and configure and a lot of cleanup to do.

Sadly, the system would quite likely have been completely recoverable if I had been able to toggle just one bad bit. In the worst case, one disk sector was corrupted, so it could have been more than a single bit, but I like to think it was just one. An entire Windows system defeated by one bad bit.

Last Thursday, the user noticed that the system was responding very, very sluggishly. She did what all Windows users do when they're having system troubles: she rebooted.

On reboot, the system blue-screened. The error message was displayed so briefly that it couldn't be read. Then the system rebooted itself. This cycle repeated until the computer was powered off.

I was called to help. When I fired up the machine, it entered the same reboot loop. It exhibited the same symptoms when attempting to boot to Safe Mode and Safe Mode With Command Prompt. The only way I could gain access to the machine was by booting to the Recovery Console using the Windows XP Pro installation CD.

The user had forgotten the Administrator password, so to even boot into Recovery Console, I had to first reset the Administrator password. I did that with Petter Nordahl-Hagen's Offline NT Password & Registry Editor.

Recovery Console includes a pathetically weak set of tools for performing system recovery.

When I was unable to locate the source of the problem, we decided to pay Microsoft $35 for support. After attempting the initial, obvious solutions, the Microsoft tech suggested restoring the registry to an earlier version.

First, we renamed the registry hives (system, security, sam, and default) in the C:\Windows\System32\Config folder. Then we copied backups from C:\Windows\Repair to the Config folder. There was trouble, though. No backup copy of the System hive existed. And even if it had, the other hives were 11 months old. Restoring such an outdated registry would have been nearly as painful as, perhaps more painful than, completely reinstalling Windows.
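The rename-then-copy procedure above can be sketched as a small script. This is only an illustration in Python, not what was typed at the Recovery Console. The function name restore_hives, the ".old" suffix, and the decision to skip any hive that has no backup are all assumptions made for the sketch. The default hive list is the one from the text plus the Software hive.

```python
import shutil
from pathlib import Path

# Hive names from the text, plus "software". Adjust as needed.
HIVES = ("system", "software", "security", "sam", "default")


def restore_hives(config_dir, repair_dir, hives=HIVES, suffix=".old"):
    """Set each current hive aside, then copy its backup into place.

    A hive with no backup in repair_dir is left untouched and reported
    in the second return value.
    """
    config_dir, repair_dir = Path(config_dir), Path(repair_dir)
    restored, missing = [], []
    for name in hives:
        backup = repair_dir / name
        if not backup.is_file():
            missing.append(name)  # nothing to restore from
            continue
        current = config_dir / name
        if current.exists():
            # Keep the current hive under a new name instead of deleting it.
            # On Windows this raises if the ".old" file already exists.
            current.rename(current.with_name(name + suffix))
        shutil.copy2(backup, current)
        restored.append(name)
    return restored, missing
```

Keeping the renamed originals is what makes the later experiments possible: any hive can be put back by reversing the rename.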

The support tech indicated the only option left was reinstalling Windows.

Of course, that means reinstalling not only the OS but all the application software as well.

I didn't want to give up the fight just yet. So, I used the current System registry hive and backup copies of the other hives. That gave me a bootable system! But there were, not surprisingly, many problems. I did verify that the application data was intact. And that gave the user some hope.

Obviously, the problem was in the registry. So, I tried putting each of the current registry hives back in play, one at a time, to see which one was bad. That's when I discovered the real problem. The Software hive, the largest of them all, had a CRC error. It could not be read or edited.
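A quicker way to spot a file like that, given a working environment that can see the disk, is simply to try reading every hive from start to finish. The following is a hedged sketch in Python. The function name find_unreadable and the chunk size are made up for illustration, and it assumes the operating system reports the bad sector as an ordinary I/O error.

```python
from pathlib import Path


def find_unreadable(paths, chunk_size=64 * 1024):
    """Read each file to the end; report the ones that fail.

    Returns a dict mapping each failing path to a tuple of
    (bytes read before the failure, error text).
    """
    failures = {}
    for path in map(Path, paths):
        read_so_far = 0
        try:
            with open(path, "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break  # clean end of file
                    read_so_far += len(chunk)
        except OSError as exc:
            # Covers missing files as well as read errors from the disk.
            failures[path] = (read_so_far, str(exc))
    return failures
```

A readable file can still hold a logically damaged hive, so a clean result here rules out only the kind of failure described above.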

The best I was able to do was:

type software.old > software.new

That got me a 14MB file before failing with a CRC error. The original file was 18MB.
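The type command gives up at the first unreadable sector, so everything after the damage is lost too. A different approach, not the one used here, is to copy block by block and write zeros wherever a read fails, which at least preserves the file's length and the data beyond the bad spot. Below is a minimal sketch in Python. The function name salvage_copy and the 512-byte block size are assumptions, and it has not been tried against real failing media. A zero-filled hive may well still be rejected by Windows.

```python
import io


def salvage_copy(src, dst, block_size=512):
    """Copy src to dst block by block, zero-filling unreadable blocks.

    src and dst are binary file objects; src must be seekable.
    Returns the list of byte offsets where a read failed.
    """
    size = src.seek(0, io.SEEK_END)
    bad_offsets = []
    offset = 0
    while offset < size:
        want = min(block_size, size - offset)
        try:
            src.seek(offset)
            block = src.read(want)
        except OSError:
            # Unreadable block: remember where it was and carry on.
            block = b""
            bad_offsets.append(offset)
        # Pad failed (or short) reads so later data keeps its offset.
        dst.write(block.ljust(want, b"\x00"))
        offset += want
    return bad_offsets
```

Called with the damaged file opened for reading and a new file opened for writing, both in binary mode, it returns the offsets that had to be zero-filled, which shows how much of the file was actually lost.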

Attempting to run with the truncated Software registry hive failed. No surprise there. More disconcerting was the fact that I could no longer boot the system into Recovery Console!

I was able to remedy that situation by booting from Petter's Offline NT Password Boot Disk (mentioned earlier) and doing the necessary copies and renames to get a working registry again.

In the end, I reinstalled Windows.

The registry is the Achilles' heel of Windows systems. It is an endless source of trouble. It grows and collects cruft until it collapses in failure. It makes migrating from one system to another nearly impossible. It makes backing up or migrating a single application with its user and state data nearly impossible.

Yes, it makes some interesting things possible, but in my opinion, it was a disastrous design error. I'll happily stick with Linux for my own use.


About this weblog

This site is the personal weblog of Marc Mims. You can contact Marc by sending e-mail to:
marc@questright.com.

Marc writes here about cycling, programming, Linux, and other items of personal interest.
