Will via plug on 23 Apr 2020 22:32:32 -0700



Re: [PLUG] Gluster best practices?


FYI, for a Gluster cluster you want 4 nodes minimum. Once you dip below 3 (which happens during patching in a 3-node cluster), the system goes read-only. With 4 nodes you can grow fairly easily. Gluster DOES have issues with extreme sensitivity to timing, so chrony is your friend here. Fixing split-brain and other issues has literally cost me months to a year of productivity. I strongly suggest considering Ceph, Lizard, or just about anything else, since Gluster is finicky, the same issues pop up repeatedly, and the expertise needed to fix a production system is considerable.
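
Concretely, the knobs I'm talking about look roughly like this (the volume name gv0 is just a placeholder -- double-check option names against your Gluster version):

  # keep clocks tight -- /etc/chrony.conf
  pool 2.pool.ntp.org iburst
  makestep 1.0 3

  # quorum options so the volume goes read-only instead of split-braining
  gluster volume set gv0 cluster.server-quorum-type server
  gluster volume set gv0 cluster.quorum-type auto

  # spotting split-brain files when it happens anyway
  gluster volume heal gv0 info split-brain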

-Will C

On Fri, Apr 24, 2020 at 1:12 AM Keith C. Perry via plug <plug@lists.phillylinux.org> wrote:
I was going to behave and not come in and ask if you looked at LizardFS, but since you mentioned me, I thought I would share that I did move my company storage to LFS in January (30TB across 3 servers for now, reallocating 21TB over the summer).  It's already been a major win.  I have a "sick" server that has been locking up, but the overall storage network has remained available.  There might have been some data corruption on the VM guests, but LFS fixed it before I had time to confirm a real issue.  Additionally, I was able to just restart the production VMs on other servers.  Nothing to manually move around, no manual procedures to consider if I had to rebuild an array.  It just works and keeps my data at goal.
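
For anyone unfamiliar, "goal" is LizardFS's per-file/per-directory replica count.  Setting and checking it looks roughly like this (the mount path and file names are just examples):

  # ask for 2 copies of everything under the VM image directory
  lizardfs setgoal -r 2 /mnt/lfs/vm-images
  # verify the goal and see where the chunks actually live
  lizardfs getgoal /mnt/lfs/vm-images/guest01.qcow2
  lizardfs fileinfo /mnt/lfs/vm-images/guest01.qcow2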

In regards to development: someone from the corporate sponsor's office contacted me awhile back about an old request I made.  I took the opportunity to voice my concerns.  The person let me know that while there have been resource issues over the last 2 years, they were fully staffed up again and were ramping up development.  Hopefully we'll see a new version this year, but either way, the indication I was given was that LizardFS was not abandoned.

As for Gluster, I would be curious to know what you or anyone else has come across in regards to working with Gluster when you completely lose a storage node.  In particular, how do you rebuild and re-balance data (I don't think it's automated)?
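
My rough understanding from the docs is that it's a manual replace-brick followed by a heal and a rebalance, something like the below, but I'd love to hear from someone who has actually done it on a live cluster (volume and host names are placeholders):

  # point the volume at a brick on the replacement node
  gluster volume replace-brick gv0 deadnode:/bricks/b1 newnode:/bricks/b1 commit force
  # re-replicate data onto the new brick
  gluster volume heal gv0 full
  # redistribute data across the distribute set
  gluster volume rebalance gv0 start
  gluster volume rebalance gv0 status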


~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Keith C. Perry, MS E.E.
Managing Member, DAO Technologies LLC
(O) +1.215.525.4165 x2033
(M) +1.215.432.5167
www.daotechnologies.com

----- Original Message -----
From: "JP Vossen via plug" <plug@lists.phillylinux.org>
To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
Sent: Thursday, April 23, 2020 11:31:41 PM
Subject: [PLUG] Gluster best practices?

We are considering using Gluster for a project at work, because it's
relatively simple and seems to meet our needs.  I was wondering if
anyone has any experience with using it, best practices, things that
will bite us, etc.

The use case is pretty simple HCI for data retention.  The hardware we
use right now maxes out at about 27TB, and there may be cases where we
need more for a very simple flat-file data archive with a tree structure
of `CCYY/MM/DD/{daily files <= 1000}`.  Each node has hardware RAID,
though we'd consider a JBOD config if needed.  We do require some
resilience so that we can lose at least 1 node in the cluster.  We'd
also like the ability to add more nodes to grow it as needed, and
Gluster seems to require adding either 2 or 3 nodes at a time, which
makes sense, but I want to confirm we're not missing something obvious
that would let us grow 1 node at a time.
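
For reference, here is what I mean by 2 or 3 at a time, sketched as commands (volume and host names are made up, and I haven't verified this on our setup):

  # a replica-3 volume wants bricks added in multiples of 3
  gluster volume add-brick gv0 replica 3 \
      node4:/bricks/b1 node5:/bricks/b1 node6:/bricks/b1
  # then spread existing data onto the new bricks
  gluster volume rebalance gv0 start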

We already have a simple 2 node hot/standby HA pair, but that obviously
doesn't scale beyond the capacity of a single machine (and around 27TB).
So this is the next step.  I'm on the edges of this one, it's not my
project, so I can present ideas or clues but not drive it.  Note the OS
is Oracle Linux 7.x for better or worse.  (It's worse, but we can't
change it.)

I know Keith has talked a lot about LizardFS (project activity for which
seems to have flip-flopped with MooseFS again) and I've suggested it,
along with ZFS, and someone else suggested DRBD.  We already rejected Ceph
because it flat out wouldn't work on Oracle Linux.  I don't think we've
even considered BTRFS, and I'm not 100% sure I'd go for ZFS on Linux.
(ZFS is a no-brainer on our FreeNAS, but that's not a fit here.)

Thoughts or clues?

TIA,
JP
--  -------------------------------------------------------------------
JP Vossen, CISSP | http://www.jpsdomain.org/ | http://bashcookbook.com/
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug