Rich Freeman via plug on 7 Nov 2022 13:30:32 -0800
Re: [PLUG] Box won't boot after RAID drive swap
On Mon, Nov 7, 2022 at 11:29 AM Keith via plug <plug@lists.phillylinux.org> wrote:
>
> if you can't remove a drive from RAID 1 and replace it with a new
> drive then LVM is the clear winner.

Of course you can do this. Nobody would use it otherwise.

If a drive fails during operation, the array becomes degraded and keeps operating. If a drive is missing at boot, the default is to start the array degraded, but there is a command line option to tell mdadm to refuse to assemble a degraded array. There are reasons you might want that, but it isn't the default. Of course a distro could make it their default by putting that option in their startup scripts.

If you install a new drive, mdadm won't just wipe it out and use it; you need to add it to the array first. You can pre-add drives to an array as spares, in which case if a drive fails a spare will be selected and added, and the array will immediately begin to rebuild. You can also replace a drive in an array while it is still present: in this mode mdadm adds the new drive as an additional mirror, and once it has fully rebuilt it automatically removes the drive that is being replaced. Basically it does the right thing in most circumstances.

Hard to be certain what is going on here, but it could be that the distro has overridden the behavior and is preventing a degraded array from starting, or the array just isn't finding any drives. Keep in mind this is a computer that can't even reliably boot to firmware, so this is getting beyond the scope of RAID.

I'm definitely interested in how the data fares at the end of everything, though it is worth mentioning that mdadm has no protection against silent corruption (i.e. changes to data on disk that do not trigger the drive to report a read error - that could be due to bit flips, or to corrupted writes caused by hardware issues). If you want protection against silent corruption without adopting zfs/btrfs, check out dm_integrity. I'm not sure offhand whether it can easily be incorporated into LVM, but it is a device mapper layer that turns silent corruption into read errors, which will trigger the appropriate recovery in your RAID software as long as it sits at a lower layer than the RAID.

None of this is a comment on LVM vs mdadm RAID. I haven't looked into the differences closely enough to weigh in on that, and I've only used mdadm (and btrfs/zfs) for this. Just wanted to say that normally mdadm handles disk failures the way you'd expect from any RAID. A few rough command sketches for the scenarios above are below my sig.

--
Rich
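
For the degraded-boot behavior, mdadm's assemble mode has a --no-degraded flag that does the refusing; whether anything like it runs at boot is up to the distro's initramfs/startup scripts. A minimal sketch (the /dev/md0 name is just an example):

  # Assemble everything listed in mdadm.conf / found by scanning, but
  # refuse to start any array that is missing members:
  mdadm --assemble --scan --no-degraded

  # See what state the arrays actually ended up in:
  cat /proc/mdstat
  mdadm --detail /dev/md0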
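
For the drive-swap scenarios, the manage-mode commands look roughly like this, with /dev/md0, /dev/sdb1 and /dev/sdc1 standing in for your actual array and partitions:

  # Kick out a dying member:
  mdadm /dev/md0 --fail /dev/sdb1
  mdadm /dev/md0 --remove /dev/sdb1

  # A new disk isn't touched until you add it. On a degraded array this
  # starts a rebuild; on a healthy array it just sits there as a hot spare:
  mdadm /dev/md0 --add /dev/sdc1

  # Replace a member that is still readable: add the new disk as a spare,
  # then rebuild onto it; the old member is dropped automatically once
  # the copy finishes:
  mdadm /dev/md0 --add /dev/sdc1
  mdadm /dev/md0 --replace /dev/sdb1 --with /dev/sdc1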
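
And for dm_integrity, the standalone (non-encrypted) setup goes through integritysetup from the cryptsetup project. This is only a sketch with made-up device names, and formatting wipes whatever is on the device:

  # Put an integrity layer under each would-be RAID member:
  integritysetup format /dev/sdb1
  integritysetup open /dev/sdb1 int-sdb1
  integritysetup format /dev/sdc1
  integritysetup open /dev/sdc1 int-sdc1

  # Build the mirror on top of the integrity mappings, so a checksum
  # mismatch shows up to md as a read error it can repair from the
  # other mirror:
  mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      /dev/mapper/int-sdb1 /dev/mapper/int-sdc1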