Rich Freeman via plug on 24 Apr 2020 18:42:14 -0700
Re: [PLUG] Gluster best practices?
On Fri, Apr 24, 2020 at 9:21 PM Will via plug <plug@lists.phillylinux.org> wrote:
>
> I'm glad the list picked up on my slacking. Finally people are listening to Keith about LizardFS after.... 3 years?

All of the distributed filesystems should handle this well, but recently I had an HBA fail on one of my LizardFS nodes. I was getting tons of errors on multiple drives, with zfs eventually failing several pools. The cluster just marked those chunks as endangered and began replicating them to nodes that had space. I ended up just removing that node and watching my data rebuild, and when I got a new HBA I first did some testing to make sure the drives were reliable, then put the node back into the cluster, and the data rebalanced. While I did end up with endangered data, the cluster was never offline/unresponsive/etc. Chunks set with 3x replication just became undergoal, dropping to 2x replication (there's a rough sketch of the goal/health commands at the end of this mail).

If I had been using classic RAID and lost the only HBA on a host I'd have just lost the array, possibly with some data corruption if I wasn't using zfs. Granted, if you had multiple HBAs in a host and carefully paired your drives you could endure something like this with traditional RAID, especially with zfs. However, with the distributed filesystems you have redundancy above the host level, so you can lose anything on a single host, or an entire host, and the cluster isn't interrupted.

The only gotcha with lizardfs is that the cluster uses a single active master server at any time, so that is a single point of failure. You can have other masters shadowing it so that the data on the master is replicated, and you can promote any of those to be active. The next version of lizardfs will include the high-availability features, which will automate this. With my setup I don't really need THAT much reliability, so I'm happy just to know that if my master fails I can ssh into my shadows, check the metadata version on each one, then promote the server with the newest data and tweak my DNS so that everything finds it (those steps are sketched at the end too). I've already moved my master server around by basically doing exactly this - granted with nothing mounted and the cluster idle.

And of course, with the exception of the master server, I can easily do upgrades and reboots of individual nodes while the cluster is online. When a node is down the cluster will start to replicate its data a bit, but that really isn't a big deal and it all gets cleaned up in the end.

Another gotcha is that the FUSE client seems to hog RAM sometimes. Not the end of the world, but not great either. The other thing that would be nice is if it supported reflinks - you can do them with mfsmakesnapshot at the command line, which is functionally equivalent, but you can't do a cp --reflink=auto/always to get a COW snapshot of a file (example at the end).

-- 
Rich
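P.S. The sketches mentioned above, all from memory against a LizardFS 3.x install, so treat command names, ports, and paths as approximate. Setting and checking chunk goals - the mount point and the master hostname here are placeholders:

    # ask for 3 copies of every chunk under a directory (recursive)
    lizardfs setgoal -r 3 /mnt/lizardfs/important

    # show where each copy of a file's chunks currently lives
    lizardfs fileinfo /mnt/lizardfs/important/somefile

    # cluster-wide report of ok/undergoal/endangered chunks
    lizardfs-admin chunks-health master.example.com 9421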
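Promoting a shadow by hand looks roughly like this; it assumes the stock config location (/etc/mfs/mfsmaster.cfg) and a systemd unit named lizardfs-master, both of which may differ on your distro, and the shadow hostnames are made up:

    # on each shadow, check the metadata version to find the freshest one
    lizardfs-admin metadataserver-status shadow1.example.com 9421
    lizardfs-admin metadataserver-status shadow2.example.com 9421

    # on the freshest shadow, flip its personality and restart it as master
    sed -i 's/^PERSONALITY *= *shadow/PERSONALITY = master/' /etc/mfs/mfsmaster.cfg
    systemctl restart lizardfs-master

    # then repoint the mfsmaster DNS name at the promoted host so the
    # chunkservers and clients reconnect to it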
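And the snapshot-vs-reflink difference, again with made-up paths on the FUSE mount:

    # COW copy done by the cluster itself - functionally a reflink
    mfsmakesnapshot /mnt/lizardfs/bigfile /mnt/lizardfs/bigfile.snap

    # what you can't do on the FUSE mount: --reflink=auto silently falls
    # back to a full copy, and --reflink=always just fails
    cp --reflink=always /mnt/lizardfs/bigfile /mnt/lizardfs/bigfile.copy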