Rich Freeman on 5 Jan 2011 06:51:38 -0800
Re: [PLUG] Software mirror (RAID1)

On Wed, Jan 5, 2011 at 1:49 AM, JP Vossen <jp@jpsdomain.org> wrote:
> I'm going off on a tangent to Rich's RAID5 question. I don't have a better
> answer for him, and he's already covered my objections to abusing RAID5.

Tangents are always welcome - usually I'm the one starting them, however...

> NOTE: RAID is **not** a backup!!! RAID & backups are different things and
> using RAID does not allow you not to back up!

Couldn't agree more on this. I use a script that employs sarab (itself a script for dar with rotation logic), gpg, and s3cmd to get encrypted backups of important stuff onto Amazon S3 reduced-redundancy storage (with multiple copies of the key stashed away in secure offsite locations).

I do depend on my RAID as a form of backup for less-important data, so I wouldn't be able to safely use raid1 to the extent that you do. I probably will tolerate some loss of redundancy during my data migrations to save myself the expense of an additional controller card, but I wouldn't want to routinely degrade my array. Stuff like DVR recordings and various reproducible junk isn't worth 10 cents/month/GB to back up (plus transfer costs), but I'd still regret losing it.

> RAID1 Cons:
> * Arguably slower than other solutions

I think this one is oversold from what I've read. It REALLY depends on your use case.

Striped RAIDs of any kind (including raid5) are really great for high-bandwidth streaming of large files (either reads or writes). They're no better/worse than standalone drives for random read seeks, and with parity (raid5) they're worse than standalone drives for random writes of small amounts of data (the old data and parity in the stripe must be read back before the new parity can be written), except for COW implementations (only ZFS that I'm aware of - btrfs doesn't support raid5 yet).

RAID1 implementations are generally no better/worse than standalone drives for high-bandwidth streaming of large files (either reads or writes). They're also the same as standalone drives for random writes of small amounts of data.
They're up to double the performance of standalone drives for random read seeks (with a two-way mirror).

So, RAID5 performs the same as or better than RAID1 for streaming, and loses on random reads of small amounts of data (and, as noted above, on small random writes unless the implementation is COW). The problem is that random reads of small amounts of data are probably 95% of what most hard drives end up doing, and this is why SSDs do so well.

Some might take issue with my assertion that writes on RAID1 are the same as a standard hard drive, since RAID1 is usually cited as having a write penalty. I think this is oversold, but I'm open to argument here. It is true that a write must tie up all drives, but that only deprives you of the ability to do parallel reads on the additional drives, which is something that standalone drives never had in the first place. It also isn't a differential penalty of RAID1 since RAID5 has to do the same. The main difference between RAID1 and RAID5 in this regard is that since RAID5 drives inevitably end up doing all seeks in parallel, the heads of all drives are always in the same place anyway, so there is no additional seeking for a write. RAID1 tends to have more independent drive operation due to parallel reads, and only on writes do the drives need to sync up.

Note also that the benefits of RAID5 for high-bandwidth reads only apply if:

1. You really are sustaining high bandwidth. Simply reading a big file slowly (most media playback) doesn't get you anything since any RAID configuration can sustain this.

2. Your bus/arch/software/cpu/etc can actually handle the bandwidth. You might have 10 drives striped with 6Gb/s SATA, but I doubt that any of the busses on your motherboard can really deal with 60Gb/s of data running around, and unless you have 500GB of RAM you're not going to be buffering it even if your memory had the bandwidth for it.

Note also that clever use of software/implementations/etc can probably get very good bandwidth out of RAID1 - you just need to seek two different parts of the same file and read them in parallel.
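
A minimal sketch of that last idea, assuming Python on a POSIX system: split one file into contiguous ranges and read them concurrently with positional reads. The names read_range and parallel_read are made up for illustration, and whether the streams actually get served by different mirrors is entirely up to the RAID layer's read balancing, so treat this as something to experiment with rather than a guaranteed speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_range(path, offset, length):
    """Read up to `length` bytes starting at `offset`, using a private descriptor."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        remaining = length
        while remaining > 0:
            # pread takes an explicit offset, so threads never share a file position.
            chunk = os.pread(fd, remaining, offset)
            if not chunk:  # reached end of file early
                break
            chunks.append(chunk)
            offset += len(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def parallel_read(path, workers=2):
    """Split the file into `workers` contiguous ranges and read them concurrently."""
    size = os.path.getsize(path)
    if size == 0:
        return b""
    step = -(-size // workers)  # ceiling division
    ranges = [(off, min(step, size - off)) for off in range(0, size, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the parts join correctly.
        parts = pool.map(lambda r: read_range(path, *r), ranges)
        return b"".join(parts)
```

With workers=2 this issues two independent read streams against the same file, one starting at the beginning and one starting at the midpoint, which is the access pattern described above.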
Also, while most implementations of RAID1 limit you to a single mirror, there is no reason you couldn't have 10 mirrors and 10x the read performance.

>> Expanding a RAID1 is not practical, but expanding a RAID5 is trivial.
>
> Depending on how you define it, expanding RAID1 is easy, but a bit time
> consuming. As noted above, you can expand onto bigger hard drives pretty
> easily without any reinstalls. No, not the same as or quite as easy as
> expanding a RAID5, but not impractical either.

So, my thinking in this regard was that with a working RAID5 adding a 1TB disk gets you 1TB of additional usable space. With a working RAID1 you need to add 2TB of disk to get you 1TB of additional usable space. You keep paying that N/2 vs N-1 penalty over and over again.

On the other hand, that only applies with like-sized disks. You're only going to add identical disks if you expand storage not long after initial setup. Just look at my case - I had a 120GB drive fail. Now, I could replace that with a 120GB drive for $40, and get the same amount of space. Or, I can spend $120 and get 750GB of additional usable space even after retiring the failed and two additional old 120GB hard drives (with a power savings).

Unless you're constantly expanding, new drive replacements will be so much larger than existing drives that you're going to end up creating new arrays anyway, which negates the RAID5 advantage here.

That's why I ended up rethinking this and went with conservative RAID1. I can convert later if it makes sense, but most likely if I add another drive it will be a 10TB drive or something ridiculous like that anyway. Maybe by then I can just buy 1, format it with btrfs, copy the data over, and then mirror it across all my old drives combined.
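
To put numbers on the N/2 vs N-1 point for like-sized disks, here is a rough sketch in Python. The function names are hypothetical, and it assumes plain two-way mirrored pairs for RAID1 and one disk's worth of parity overhead for RAID5.

```python
def raid1_usable(disks, size):
    """Usable space of two-way mirrored pairs built from `disks` equal disks (N/2)."""
    if disks < 2 or disks % 2:
        raise ValueError("two-way mirroring needs an even number of disks, at least 2")
    return (disks // 2) * size


def raid5_usable(disks, size):
    """Usable space of a RAID5 built from `disks` equal disks (N-1)."""
    if disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disks - 1) * size


size_tb = 1
# Growing a 4-disk RAID5 by one 1 TB disk adds 1 TB of usable space.
raid5_gain = raid5_usable(5, size_tb) - raid5_usable(4, size_tb)
# Growing a 4-disk RAID1 set takes two more 1 TB disks to add that same 1 TB.
raid1_gain = raid1_usable(6, size_tb) - raid1_usable(4, size_tb)
print(raid5_gain, raid1_gain)  # 1 1
```

Both gains come out to 1 TB, but the RAID1 case bought two disks to get there, which is the recurring penalty described above.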
(Btrfs supports mixed collections of drives - every file ends up on two independent drives, but if drive sizes are seriously mismatched you might not be able to use all the space on the largest drive since total space is limited to the combined total of the smallest N-1 drives I guess - or something like that.)

Thanks for the insightful post, as always!

Rich
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug