Rich Freeman via plug on 24 Apr 2020 05:32:20 -0700


Re: [PLUG] Gluster best practices?


On Thu, Apr 23, 2020 at 11:31 PM JP Vossen via plug
<plug@lists.phillylinux.org> wrote:
>
> We are considering using Gluster for a project at work, ...
>
> I know Keith has talked a lot about LizardFS (project activity for which
> seems to have flip-flopped with MooseFS again) and I've suggested them,
> and ZFS, and someone else suggested DRBD.  We already rejected Ceph
> because it flat out wouldn't work on Oracle Linux.  I don't think we've
> even considered BTRFS, and I'm not 100% sure I'd go for ZFS on Linux.

So, I would really think twice about Gluster.  It was mainstream because
it was the first big option in this space.  However, everybody seems
to be moving away from it, so you'd basically be adopting a legacy
technology.

If you want the "nobody ever got fired for buying IBM" option then
Ceph is the most mainstream solution to your problem.  I'm not sure
why it wouldn't work on Oracle Linux.  Oracle has this page on its
website, but I didn't dig deeper:
https://docs.oracle.com/cd/E37670_01/E66514/html/ceph-ngv_vd1_dt.html

For an enterprise deployment of an actual cluster you want dedicated
hardware for the cluster itself, so it really doesn't matter what
distro you run on the cluster nodes; I'd pick whatever works best with
Ceph that you can live with.  Where you do need compatibility is on
the clients, since your application obviously needs to be able to
access Ceph, unless you put a shim layer of NFS/SMB/etc in front of
it, in which case the fileservers are what need to run the client.

The main downside to Ceph is that it is more complex, is said not to
work great on a small number of nodes (fewer than 5 or so), and needs
roughly 1GB of RAM per 1TB of storage for reliability, which adds up
really fast and can make it harder to run on consumer-grade hardware.
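
As a rough illustration of how that rule of thumb adds up (the 50TB
and 5-node figures below are hypothetical, not anything from the
original question):

    # Back-of-envelope Ceph RAM budget using the ~1GB RAM per 1TB of
    # storage rule of thumb mentioned above.  All numbers are made up.
    raw_storage_tb = 50      # total raw storage across the cluster
    osd_nodes = 5            # hosts actually holding the data
    gb_ram_per_tb = 1        # rule of thumb from above

    total_ram_gb = raw_storage_tb * gb_ram_per_tb
    print(f"~{total_ram_gb} GB of RAM cluster-wide, "
          f"~{total_ram_gb / osd_nodes:.0f} GB per node just for storage")

And that's before you budget anything for monitors, managers, or the
OS itself, which is why it gets awkward on consumer-grade boxes.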

The upside of Ceph is that it has a lot of support and you can
probably pay RedHat/etc to be 100% behind you if you use it.  It also
scales better than anything else I'm aware of so you could run a farm
of VMs/containers/whatever on top of it without a problem if you do it
right.

Honestly, in any large-company situation I'm not sure that I'd even
propose something other than Ceph backed by a distro that supports it
officially.  At least not in your typical conservative large-company
setting.

Now the alternatives:

ZFS - I love ZFS, but I don't think it really solves your problem any
better than whatever you're already doing.  ZFS is a single-host
solution.  If you want multiple hosts running ZFS then you're going to
have to export all that data via NFS/SMB/etc, you'll have a separate
filesystem on each host, and any above-host-level redundancy has to be
handled at the application layer.  In that respect it is no different
from just using ext4+lvm+mdadm.

LizardFS - I personally use this because I want to use commodity
hardware and not have to pay an extra $100 for RAM every time I buy a
$100 hard drive.  It works fine, and Keith summed up the current state
of affairs well.  You're not going to get formal support from somebody
like RHEL/Oracle for this.  You might be able to buy formal support
from LizardFS itself, and chances are that doing so makes the project
more likely to persist, and of course you can just talk to them about
its future.  Heck, if you're seriously considering this as an option
then just pick up the phone and talk to their sales team and you'll
know more than us.

For static data storage it should perform just fine; it will not scale
the way Ceph does if you need to do a lot of IOPS on it.  It also
consumes RAM on the metadata server per file, so this becomes more of
an issue if you're talking about 50TB of 10KB files vs 50TB of 2GB
files (see the rough numbers below).  Any server you want to use as a
metadata master or shadow will need enough RAM.  I have 47TB of
storage on my cluster and the metadata server is using about 180M for
the lizardfs-master process, and the whole container is using 255M, so
I can run it on an ARM SBC without any issues.  The chunkservers,
which are what actually store the data, use almost no resources.

Oh, and if you want to use LizardFS, please use a stable version.
When you look at people complaining on the lists, 99% of the time they
were running a non-stable version.  Don't do that with 20+TB of
storage...
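
To make the 10KB-vs-2GB point concrete, here's a rough sketch.  The
per-file RAM figure is my own assumption (a few hundred bytes of
master RAM per object is a commonly cited ballpark for MooseFS-family
filesystems, not a number from this thread):

    # Rough comparison of LizardFS master RAM needs for the same 50TB
    # stored as many small files vs. fewer large files.
    # ASSUMPTION: ~300 bytes of master RAM per file entry; ballpark
    # only, not an official figure.
    bytes_per_entry = 300
    total_bytes = 50 * 10**12          # 50 TB of data

    for label, file_size in [("10KB files", 10 * 10**3),
                             ("2GB files", 2 * 10**9)]:
        files = total_bytes // file_size
        ram_gb = files * bytes_per_entry / 10**9
        print(f"{label}: ~{files:,} files -> ~{ram_gb:.1f} GB of master RAM")

Either way the chunkservers are unaffected; the cost lands entirely on
the master and shadow hosts.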

I'm not as familiar with Gluster, but with any of the other solutions
you will not want to use hardware RAID or any other RAID at the host
level.  These solutions already provide redundancy above the host
level, so layering RAID underneath just doubles your disk requirements
for little real benefit.  If you really want more than 2x redundancy
then set that at the cluster level instead; it won't cost you any more
raw disk, and you'll get better performance and protection (see the
quick comparison below).
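
As a quick sanity check on the "doubling your disk requirements"
point, here's the raw-disk arithmetic for a few layouts (the 50TB of
usable data is a made-up figure, and the layouts are illustrative,
not a recommendation):

    # Raw disk needed to hold 50TB of usable data under a few layouts.
    usable_tb = 50

    layouts = {
        "cluster 2x replication, no host RAID": 2,
        "cluster 2x replication on top of RAID1": 2 * 2,
        "cluster 3x replication, no host RAID": 3,
    }

    for name, multiplier in layouts.items():
        print(f"{name}: {usable_tb * multiplier} TB raw")

The 3x-replication layout tolerates losing any two whole hosts, which
2x replication on top of RAID1 does not, and it still needs 50TB less
raw disk.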

Again, in any enterprise setting at this point I think you really have
to ask why you aren't using Ceph before you even consider anything
else.  I'm not saying there aren't valid reasons, but it really is
becoming the default.  If anything does go wrong, somebody will end up
asking you why you didn't use Ceph, so make sure you have an answer...

-- 
Rich