Lee H. Marzke on 8 Dec 2011 21:13:41 -0800
Re: [PLUG] Vmware oops
ESXi is now 100% a custom VMware kernel. The Linux part was removed, and it
only has a small BusyBox module to run basic Linux-type commands. So I would
not expect a Linux recovery tool to do anything but damage.

In my recent PLUG ZFS talk, I discussed a known issue with RAID-5 called
the 'write hole' [1]. If you lose power to the array in the middle of a
write, you can wind up with silent corruption where the array still provides
correct data but the parity no longer matches it. When you later lose a
drive and reconstruct the array, the rebuild silently replaces your data
with junk.

So the moral is:

- Never use RAID-5 without NVRAM backup (especially for random writes
  on ESX).
- Back up your VMs. This is really very easy with VMware VDR backup, now
  included free with entry-level VMware Acceleration Kits.
- Use NVRAM battery-backed arrays (such as NetApp) to minimize issues
  with losing power.
- Use ZFS, with the additional feature of background scrubbing to fix
  any silent data corruption.

If you read up more on this problem, you will find it is expected that
RAID-5 arrays that lose power may fail to rebuild when a disk actually
fails, and this may be what happened to your system. The only things done
wrong in that case are 1) use of RAID-5, and 2) not having backups or
replicas.

If the sysadmins didn't have control of that, perhaps management should be
the ones to leave, for not spending the money on good arrays like NetApp,
or on COW-based filesystems like NetApp's or ZFS (Nexenta), and for not
using the free VMware backup software included with vSphere.

Lee

[1] http://blogs.oracle.com/bonwick/entry/raid_z

----- Original Message -----
> From: "jeff" <jeffv@op.net>
> To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
> Sent: Thursday, 8 December, 2011 10:37:12 PM
> Subject: [PLUG] Vmware oops
>
> `Something happened' to RAID array on an ESXi 4 server [RAID5 4
> drives].
>
> Drives were reinitialized, server boots, ESXi comes up.
> We can find everything except the vmdk we're looking for.
> Recovery software [run from its own OS] finds tons of files that were
> inside the vmdk, but most are trashed.
>
> Vmware consultants stated that this happens a lot - when storage fails,
> vmdk's get hosed.
>
> Of course there's no backup.
>
> Do I even try booting with linux and running any of our tools or should
> I advise some people to start updating their resumes? I don't believe
> there's anything that runs inside the ESXi shell [ctl-alt-f1], is there?
>
> I love hearing coworkers say, "UGH. This is LINUX. How do I navigate?"
>
>
> Thanks,
> --Perplexed in PA
>
>
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
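
Illustrative sketch of the RAID-5 write hole discussed in the reply above.
This is not how any real controller is implemented; it is a toy model of a
single stripe on a hypothetical 4-drive array (three data blocks plus one
XOR parity block), with made-up block contents and a helper name
(xor_blocks) chosen only for this example. It shows that a write
interrupted between the data update and the parity update leaves reads
looking fine, while a later rebuild of a different disk produces junk.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID-5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks and their parity (contents are made up).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# Sanity check: with consistent parity, a lost block can be recovered.
assert xor_blocks(data[0], data[2], parity) == b"BBBB"

# Interrupted write: the new block reaches disk 0, but power is lost
# before the matching parity update is written. Parity is now stale.
data[0] = b"ZZZZ"

# Normal reads do not consult parity, so the array still looks healthy.
assert data[0] == b"ZZZZ"

# Later, disk 1 fails and is rebuilt from the survivors plus stale parity.
rebuilt_disk1 = xor_blocks(data[0], data[2], parity)

print("expected:", b"BBBB")
print("rebuilt: ", rebuilt_disk1)  # differs from the original block
```

Note that the damaged block is on a disk that was never written during the
power loss, which is why the corruption goes unnoticed until a rebuild. A
scrub that verifies each stripe against checksums, as ZFS does, is one way
to catch the mismatch before a drive fails.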