Keith C. Perry via plug on 23 Apr 2020 22:12:37 -0700


Re: [PLUG] Gluster best practices?

I was going to behave and not come in and ask if you looked at LizardFS, but since you mentioned me, I thought I would share that I did move my company storage to LFS in January (30Tb across 3 servers for now, reallocating 21Tb over the summer).  It's already been a major win.  I have a "sick" server that has been locking up, but the overall storage network has remained available.  There might have been some data corruption on the VM guests, but LFS fixed it before I had time to confirm a real issue.  Additionally, I was able to just restart the production VMs on other servers.  Nothing to manually move around, no manual procedures to consider if I had to rebuild an array.  It just works and keeps my data at goal.

In regards to development: someone from the corporate sponsor's office contacted me awhile back about an old request I made.  I took the opportunity to voice my concerns.  The person let me know that while there had been resourcing issues over the last 2 years, they were fully staffed up again and were ramping up development.  Hopefully we'll see a new version this year, but either way, the indication I was given was that LizardFS has not been abandoned.

As for Gluster, I would be curious to know what you or anyone else has come across in regards to working with Gluster when you completely lose a storage node.  In particular, how do you rebuild and re-balance data (I don't think it's automated).
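For what it's worth, my understanding of the usual recovery path on a replicated volume is roughly the sketch below.  This is not a tested procedure; the volume name "myvol", the hostnames, and the brick paths are all made up, and the exact steps differ depending on whether the replacement node reuses the dead node's hostname.

```shell
# Hypothetical replicated volume "myvol"; server3 died, server4 replaces it.

# 1. Bring the new node into the trusted pool.
gluster peer probe server4

# 2. Swap the dead brick for a brick on the new node.
gluster volume replace-brick myvol \
    server3:/bricks/brick1 server4:/bricks/brick1 commit force

# 3. Kick off a full self-heal so replicas get copied onto the new brick.
gluster volume heal myvol full

# 4. Watch progress; the heal is done when no entries remain.
gluster volume heal myvol info
```

Note that, as I understand it, the self-heal daemon repopulates the replacement brick from its replica peers; a rebalance is only involved when bricks are added to or removed from a distributed volume, not when a failed brick is replaced in place.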

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ 
Keith C. Perry, MS E.E. 
Managing Member, DAO Technologies LLC 
(O) +1.215.525.4165 x2033 
(M) +1.215.432.5167

----- Original Message -----
From: "JP Vossen via plug" <>
To: "Philadelphia Linux User's Group Discussion List" <>
Sent: Thursday, April 23, 2020 11:31:41 PM
Subject: [PLUG] Gluster best practices?

We are considering using Gluster for a project at work, because it's 
relatively simple and seems to meet our needs.  I was wondering if 
anyone has any experience with using it, best practices, things that 
will bite us, etc.

The use case is pretty simple HCI for data retention.    The hardware we 
use right now maxes out at about 27TB, and there may be cases where we 
need more for a very simple flat-file data archive with a tree structure 
of `CCYY/MM/DD/{daily files <= 1000}`.  Each node has hardware RAID, 
though we'd consider a JBOD config if needed.  We do require some 
resilience so that we can lose at least 1 node in the cluster.  We'd 
also like the ability to add more nodes to grow it as needed, and 
Gluster seems to require adding either 2 or 3 nodes at a time, which 
makes sense, but I want to confirm we're not missing something obvious 
that would let us grow 1 node at a time.
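As I understand it, the constraint comes from the replica count: bricks have to be added in multiples of the replica set.  A rough sketch of what that looks like, with made-up hostnames and brick paths:

```shell
# Hypothetical replica-3 volume across 3 nodes; names are placeholders.
gluster volume create archive replica 3 \
    node1:/bricks/b1 node2:/bricks/b1 node3:/bricks/b1
gluster volume start archive

# Growing the volume later means adding a whole new replica set
# (another 3 bricks), then rebalancing data across the sets.
gluster volume add-brick archive replica 3 \
    node4:/bricks/b1 node5:/bricks/b1 node6:/bricks/b1
gluster volume rebalance archive start
```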

We already have a simple 2 node hot/standby HA pair, but that obviously 
doesn't scale beyond the capacity of a single machine (and around 27TB). 
  So this is the next step.  I'm on the edges of this one, it's not my 
project, so I can present ideas or clues but not drive it.  Note the OS 
is Oracle Linux 7.x for better or worse.  (It's worse, but we can't 
change it.)

I know Keith has talked a lot about LizardFS (project activity for which 
seems to have flip-flopped with MooseFS again) and I've suggested them, 
and ZFS, and someone else suggested DRBD.  We already rejected Ceph 
because it flat out wouldn't work on Oracle Linux.  I don't think we've 
even considered BTRFS, and I'm not 100% sure I'd go for ZFS on Linux. 
(ZFS is a no brainer on our FreeNAS, but that's not a fit here.)

Thoughts or clues?

--  -------------------------------------------------------------------
JP Vossen, CISSP | |
Philadelphia Linux Users Group         --
Announcements -
General Discussion  --