K.S. Bhaskar via plug on 24 Apr 2020 13:21:43 -0700

Re: [PLUG] Gluster best practices?

Unless you never need to go back and search through archived data (in which case, why not just store it for posterity in a salt dome somewhere in Nevada), it seems to me that data volumes like that belong in databases rather than file systems. Of course, databases reside in file systems so you would still need a large file system, but you wouldn't have to grep through the files for whatever you want to find…
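To make the "query instead of grep" idea concrete, here is a minimal sketch that walks an archive laid out as `CCYY/MM/DD/` and records each file in a SQLite index. The table layout and the function names `index_archive` and `files_between` are made up for illustration, not anything from the project under discussion, and a real deployment would likely store more metadata or the records themselves.

```python
import os
import sqlite3


def index_archive(root, conn):
    """Record every file under root/CCYY/MM/DD/ in a SQLite table.

    Hypothetical schema: one row per file with its day, path and size.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "day TEXT NOT NULL, path TEXT PRIMARY KEY, size INTEGER NOT NULL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = os.path.relpath(dirpath, root).split(os.sep)
        # Only index directories that look like CCYY/MM/DD.
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            continue
        day = "-".join(parts)  # e.g. "2020-04-23"; sorts correctly as text
        for name in filenames:
            path = os.path.join(dirpath, name)
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                (day, path, os.path.getsize(path)),
            )
    conn.commit()


def files_between(conn, first_day, last_day):
    """Return indexed paths whose day falls in [first_day, last_day]."""
    rows = conn.execute(
        "SELECT path FROM files WHERE day BETWEEN ? AND ? ORDER BY path",
        (first_day, last_day),
    )
    return [row[0] for row in rows]
```

Something like `files_between(conn, "2020-04-01", "2020-04-30")` then answers "what do we have for April" without touching the file tree at all.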

– Bhaskar

On Thu, Apr 23, 2020 at 11:31 PM JP Vossen via plug <plug@lists.phillylinux.org> wrote:
We are considering using Gluster for a project at work, because it's
relatively simple and seems to meet our needs.  I was wondering if
anyone has any experience with using it, best practices, things that
will bite us, etc.

The use case is pretty simple HCI for data retention.  The hardware we
use right now maxes out at about 27TB, and there may be cases where we
need more for a very simple flat-file data archive with a tree structure
of `CCYY/MM/DD/{daily files <= 1000}`.  Each node has hardware RAID,
though we'd consider a JBOD config if needed.  We do require some
resilience so that we can lose at least 1 node in the cluster.  We'd
also like the ability to add more nodes to grow it as needed, and
Gluster seems to require adding either 2 or 3 nodes at a time, which
makes sense, but I want to confirm we're not missing something obvious
that would let us grow 1 node at a time.
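
For what it's worth, the arithmetic behind growing in steps of 2 or 3 can be sketched as below. This assumes a distributed-replicated volume with one equally sized brick per node, where the brick count has to be a multiple of the replica count; the function name `usable_capacity_tb` is made up for illustration, the 27TB figure is just the per-node number above, and file system overhead is ignored.

```python
def usable_capacity_tb(nodes, brick_tb, replica):
    """Usable space of a replicated volume, assuming one brick per node.

    Every file is stored `replica` times, so only nodes / replica bricks'
    worth of space is usable.
    """
    if replica < 1 or nodes < replica:
        raise ValueError("need at least one full replica set")
    if nodes % replica != 0:
        raise ValueError("node count must be a multiple of the replica count")
    return (nodes // replica) * brick_tb


# With replica 3, each step up in capacity costs three more nodes.
for nodes in (3, 6, 9):
    print(nodes, "nodes ->", usable_capacity_tb(nodes, 27, 3), "TB usable")
```

Under those assumptions, each additional replica set adds one node's worth of usable space, which is why the growth unit is a whole set rather than a single node.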

We already have a simple 2 node hot/standby HA pair, but that obviously
doesn't scale beyond the capacity of a single machine (and around 27TB).
So this is the next step.  I'm on the edges of this one; it's not my
project, so I can present ideas or clues but not drive it.  Note the OS
is Oracle Linux 7.x for better or worse.  (It's worse, but we can't
change it.)

I know Keith has talked a lot about LizardFS (project activity for which
seems to have flip-flopped with MooseFS again) and I've suggested them,
and ZFS, and someone else suggested DRBD.  We already rejected Ceph
because it flat out wouldn't work on Oracle Linux.  I don't think we've
even considered BTRFS, and I'm not 100% sure I'd go for ZFS on Linux.
(ZFS is a no brainer on our FreeNAS, but that's not a fit here.)

Thoughts or clues?

--  -------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug