Rich Freeman via plug on 28 Oct 2022 06:21:41 -0700
Re: [PLUG] RAID1 impending failure questions
On Fri, Oct 28, 2022 at 8:45 AM Walt Mankowski via plug <plug@lists.phillylinux.org> wrote:
>
> First question -- The two devices in the array are /dev/sdc1 and
> /dev/sdd1. The alert says the questionable drive is sdc1. When I open
> things up, is there any easy way to tell which drive is which? Will
> the serial number be printed on the drive?

Yes, but of course it won't have "sdc" on it. smartctl will output the serial number and you can find that on the drive.

One trick I use is to put a label with the serial number on the side of each drive next to the cables. I have a fair number and I don't want to jostle half a dozen connectors fishing for the drive I want to pull.

> Second question -- Let's say I remove the old drive, install the new
> drive and it's sde1. Will the system think it's a RAID1 with one drive
> and just use that until I add sde1 to the array?

Assuming you're talking about mdadm, then yes. It will boot as a degraded array. Obviously I can't make promises but this is how every other RAID system I've encountered works. Of course you can set some up to refuse to start the array in a degraded mode for safety, but the most typical option is to operate degraded and scream for help, since the main point of RAID is to avoid downtime.

If you're going to go this route a cleaner approach would be to fail and remove the old drive, so then it isn't seen as missing, though obviously you are not redundant.

> Third question -- As long as everything is working, and assuming I've
> got the slots, power, cables, etc, would it make sense to add the new
> drive as a third drive in the array, let it sync, then remove the old
> drive from the array?

This is definitely the safest option and what I always do unless I'm tight for interfaces. I believe mdadm has a --replace option that will do it as one step, but you could make the array triple-redundant first and then remove the old one.
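For reference, here is a rough sketch of the serial-number lookup and of the two replacement routes mentioned above, written as Python that only builds and prints the commands and never runs them. It makes several assumptions that are not in the thread: the array is called /dev/md0, the new partition shows up as /dev/sde1, and smartctl prints a "Serial Number:" line in its -i output. The function names are made up for illustration. Check everything against your own mdadm and smartctl man pages before running anything as root.

```python
import re
import shlex

ARRAY = "/dev/md0"  # assumption: the thread never names the md device


def parse_serial(smartctl_info):
    """Extract the serial from the text printed by `smartctl -i /dev/sdX`."""
    match = re.search(r"^Serial Number:[ \t]*(\S+)", smartctl_info,
                      re.MULTILINE | re.IGNORECASE)
    return match.group(1) if match else None


def replace_steps(old, new, array=ARRAY):
    """Route 1: add the new disk as a spare, then hot-replace the old one."""
    return [
        ["mdadm", array, "--add", new],
        ["mdadm", array, "--replace", old, "--with", new],
        # Only after the copy finishes (watch /proc/mdstat) is the old
        # member marked faulty and safe to remove.
        ["mdadm", array, "--remove", old],
    ]


def three_way_steps(old, new, array=ARRAY):
    """Route 2: grow to a 3-way mirror, sync, then drop the old disk."""
    return [
        ["mdadm", array, "--add", new],
        ["mdadm", "--grow", array, "--raid-devices=3"],
        # Wait for the resync to complete before the next steps.
        ["mdadm", array, "--fail", old],
        ["mdadm", array, "--remove", old],
        ["mdadm", "--grow", array, "--raid-devices=2"],
    ]


if __name__ == "__main__":
    for step in replace_steps("/dev/sdc1", "/dev/sde1"):
        print(shlex.join(step))
```

Note that smartctl is pointed at the whole disk (for example /dev/sdc), while the array members here are partitions (/dev/sdc1). The three-way route needs a spare port and cable for the duration of the sync, while the --replace route keeps the device count at two throughout.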
mdadm does this all online, so the system is usable while this is going on, and it checkpoints, so you can actually shut down or reboot at any point and it will just resume where it left off. Of course you can't actually remove drives until they're replicated (at least not without ending up degraded).

I haven't used mdadm in a few years, but most RAID-like software implementations have similar features. They degrade if a drive is lost, and they usually have a way to cleanly replace a drive without the array becoming degraded.

It sounds like your old drive is well-behaved and just giving some errors. If you get a drive that is misbehaving and disrupting other drives through some kind of interface issue, then it is probably safest to just fail and disconnect it and operate degraded.

--
Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug