My Windows Home Server died just before Thanksgiving. Read on for the saga, and to learn a bit more about Windows Home Server’s robust Server Reinstallation process (and how it works in the real world).
Last Wednesday, I was doing some regression testing for Microsoft on a Windows Home Server (WHS) bug. When I flipped the power switch on my power supply, the power supply exploded… literally. While the smoke was clearing, I shut off power to my server room, and let things air out.
Fast forward to today. I had decided to let the server just sit: I needed a new power supply, didn’t have time, and figured that between Black Friday and Cyber Monday there would be at least one power supply deal. I ordered the Thermaltake 430W dual fan power supply (now $39.99 from Amazon). After all, this was a home server, and cool winds prevail over quietness in that environment.
Long story short, the new power supply didn’t fix things. My fears were confirmed when my motherboard would do nothing but beep at me. Four years for a PC is a lifetime in my world, and so my Celeron D-powered WHS box was officially dead. Thankfully, I had a backup plan: my Dell Inspiron 530n. I took the hard drives out of the old system, plugged them into my Inspiron, and inserted the Windows Home Server install disc.
Yes, I know, the Inspiron 530n is typically considered overkill in the Home Server environment. However, I’d like to change conventional wisdom on that. See, GPU folding can now be accomplished with a $50 video card. And, if you’re going to have a PC running 24/7/365, it’s better for the planet to donate that system’s idle cycles.
Donating idle computer resources to research is a lot more “green” than falsely-titled “green” PCs (which often cut back on performance dramatically, in order to save pennies on your power bill each year… literally). Put your power to good use, and keep that in mind when buying your (next) Home Server.
WHS includes an option for a Server Reinstallation. This process wipes out the system partition (where Windows and Program Files reside), but retains system backups and the file storage matrix. Server Reinstallation was a breeze: no errors at all.
(Well, a breeze in terms of the walls of Windows… I had to run chkdsk more than a couple of times to fix post-install file system errors).
From left to right: Dell Inspiron 530n, Power Mac G4, and (dead) Celeron D Home Server
Next, I immediately installed drivers and ran Windows Update to get to WHS Power Pack 1. I wanted to make sure that was one of the first things to be applied, since my WHS database, storage matrix, and files were all made by PP1… and I had no idea how the original WHS would handle them.
Then, after running Microsoft Update and AutoPatcher until there was nothing left to update, I ran the WHS Console… and held my breath. The result was neither great nor terrible. WHS identified my files, my drives, and my PC backups. But the backup service was offline.
The problem stemmed from the fact that one of my hard drives was Parallel ATA (PATA), and the Inspiron 530n only has Serial ATA (SATA) ports. I attempted to remove the drive from the storage matrix. That was the one bug that I ran into with the whole process. Normally, the Remove Drive feature is supposed to give you a manifest of what files and backups would be lost when you remove a hard drive. It couldn’t give me a list. Worse, it said that a “file conflict” prevented removing the drive!
So, rather than press the issue, I dug up a USB 2.0 enclosure for 3.5-inch PATA hard drives. I plugged it in, and held my breath, hoping that WHS would recognize the drive, even though it was now a USB 2.0 drive. USB 2.0 hard drives carry a different device identifier, so this wasn’t a sure thing. Thankfully, WHS looked past that, and properly identified the drive as my old 250 GB PATA. My guess is the WHS team at Microsoft smartly placed database serial numbers on each drive, ensuring that if a drive changed types (like, from PATA to USB), the drive wouldn’t require reformatting.
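To illustrate that guess, here is a rough Python sketch of how a storage pool could recognize a drive by an ID written onto the disk itself rather than by the bus it hangs off. This is not how WHS actually works as far as I know; `Drive`, `pool_id`, and `match_pool_members` are all hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    bus: str          # e.g. "PATA", "SATA", "USB"
    device_path: str  # changes when the drive moves to another bus
    pool_id: str      # hypothetical ID stored on the drive itself


def match_pool_members(known_pool_ids, attached):
    """Match attached drives to the pool by on-disk ID, ignoring the bus type."""
    by_id = {d.pool_id: d for d in attached}
    found = {pid: by_id[pid] for pid in known_pool_ids if pid in by_id}
    missing = [pid for pid in known_pool_ids if pid not in by_id]
    return found, missing


# The pool remembers two drives; one of them now shows up over USB.
known = ["disk-A", "disk-B"]
attached = [
    Drive("SATA", "sata0", "disk-A"),
    Drive("USB", "usb3", "disk-B"),  # formerly a PATA drive
]
found, missing = match_pool_members(known, attached)
print(sorted(found), missing)
```

Because the lookup key is the ID on the disk, the move from PATA to USB changes `bus` and `device_path` but not the match.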
A quick system reboot had the backup service running again. But that wasn’t the end of the trouble. The WHS services were still crashing in an infinite loop (I’d click to send the error report off, and the service would re-launch… only to crash again). I’ve seen this in the past, but never since updating to Power Pack 1. Worse, the backups still weren’t showing up.
Chkdsk couldn’t find anything wrong with the DATA matrix of drives (where the backups and files are stored). Next, I ran Windows Home Server’s repair backup database wizard… that finally fixed the backup problem.
Or, not. It fixed things on the WHS-side of my system, but when I attempted to mount a backup… you know, make sure the backups actually were still there… nope. WHS Console insisted that the blasted backup service still wasn’t running.
That took a few hours to solve. Basically, WHS did not recompile the user accounts. The only way to do that is to re-run the WHS Console installer (the connector software that you put onto PCs). Of course, there’s no documentation from Microsoft saying you have to do that. I guess they assume that you’ll either wipe the home server in frustration, or pull your hair out until you think to try that.
Either way, after re-running the WHS Console installer on one PC, WHS began working properly on every PC, complete with system backups and restore ability.
Debriefing, Suggestions, Conclusions
First, Microsoft, please fix the Server Reinstallation bug I just mentioned. I know you’re going to read this, and this is a bug that is just too obvious for me even to write up. Anyone that has done a full WHS reinstall must run into this… Yes, I know some of you will cry out that since I ran into the problem, it’s my job to file the report… I will at some point.
Now, on to some more constructive, product-wide lessons learned.
WHS Console should automatically offer to run chkdsk and Repair the Backup Database if backups aren’t available… even if the backup service is running. Most people don’t know how to solve Backup Database/Service problems, and likely will just wait for the backups to start reporting that they haven’t been done in 7, 10, 15, 30… days.
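As a minimal sketch of the check I have in mind (purely illustrative; `should_offer_repair` and its parameters are my own invention, not anything in WHS, and the 7-day threshold is just an assumption):

```python
from datetime import datetime, timedelta


def should_offer_repair(backups_visible, service_running, last_backup, now,
                        max_age_days=7):
    """Decide whether the console should offer chkdsk + database repair."""
    # Offer repair whenever backups can't be seen, even if the service is up.
    if not backups_visible:
        return True
    # A stopped service is an obvious reason to offer help too.
    if not service_running:
        return True
    # No backup on record, or the newest one is older than the threshold.
    if last_backup is None:
        return True
    return (now - last_backup) > timedelta(days=max_age_days)


now = datetime(2024, 1, 15)  # arbitrary date for the example
# Service is running and a backup ran yesterday, but backups aren't visible.
print(should_offer_repair(False, True, now - timedelta(days=1), now))
```

The point is that the prompt keys off whether backups are actually usable, not off whether the service reports itself as running.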
In addition, I’d suggest that when a Server Reinstallation takes place, Microsoft auto-launch a Server Reinstallation Wizard. This would be a task presented to the user that walks through connecting any missing hard drives, runs chkdsk (yes, again for good measure) on all the drives, and evaluates all WHS databases for errors. It would also be a good time to let users batch reinstall missing add-ins and other settings that may need to be reset, and to check that the user accounts were rebuilt, as I noted above.
As to my personal setup, all of the above only took a few hours… five or six at the most (well, aside from the hair-pulling user profile issue). I now have a much more powerful Home Server, ready for pulling double duty as a server and a dedicated GPU folding station. All that I have left to do is reinstall a few programs and add-ins, restore web services, and swap that USB 2.0 hard drive for an eSATA hard drive. Microsoft did an okay job with Windows Home Server in terms of recovery; it even somewhat rebounded from its own errors when I told it to.