Rich Freeman via plug on 4 Jan 2022 09:37:10 -0800


Re: [PLUG] Moving mdadm RAID-5 to a new PC

On Tue, Jan 4, 2022 at 12:13 PM Keith C. Perry via plug
<> wrote:
> For a NAS, Rich M's idea about an old PC works well because you can get a 4x1Gb NIC and a decent multi-port SATA or proper RAID card for not a lot of money.

No argument on the importance of network performance, especially when
you go to more scalable solutions like Ceph.  The Ceph docs basically
recommend running two 10Gbps networks in parallel: one for inter-node
cluster traffic and one for client-facing traffic.
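That split is just two settings in ceph.conf; a sketch of what it
looks like (the subnets here are placeholders, not anything from this
thread):

```ini
[global]
# client-facing traffic: mounts, RGW, librados clients
public_network  = 192.168.1.0/24
# replication, heartbeat, and recovery traffic between OSD nodes
cluster_network = 10.0.0.0/24
```

With that in place, rebuild/recovery traffic stays off the network
your clients are using.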

Note that a NIC that supports multiple ports at 1Gb+ is probably going
to need multiple PCIe lanes depending on the PCIe revision, and of
course HBAs have the same issue.  The big difference between
server-grade hardware and consumer-grade tends to revolve around IO
(and RAM, which I guess is another form of IO in practice).

For example, the latest Zen3 consumer CPUs have 24 PCIe v4 lanes,
while the EPYC versions of Zen3 have 128 lanes.  You can obviously
fit a lot more HBAs/NICs into a board that can give 4-16 lanes to
every slot, and they're PCIe v4 besides.
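Back-of-the-envelope, the lane math looks like this (per-lane
throughput numbers are approximate post-encoding figures, and the
helper is just for illustration):

```python
# Rough PCIe lane budget: does a card's traffic fit its slot?
# Approximate usable throughput per lane in Gbit/s, after encoding
# overhead (PCIe v3 ~985 MB/s/lane, v4 ~1969 MB/s/lane).
GBPS_PER_LANE = {3: 7.9, 4: 15.8}

def lanes_needed(ports, gbps_per_port, pcie_rev):
    """Minimum lanes to carry every port at line rate in one direction."""
    total = ports * gbps_per_port
    per_lane = GBPS_PER_LANE[pcie_rev]
    return -(-total // per_lane)  # ceiling division

# A 4x1Gb NIC fits comfortably in a single v3 lane...
print(lanes_needed(4, 1, 3))   # -> 1.0
# ...but a 4x10Gb NIC wants several lanes even at v4.
print(lanes_needed(4, 10, 4))  # -> 3.0
```

That is why the quad-port 10Gb cards want x8 slots, and why lane
count, not clock speed, is where server boards earn their price.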

Likewise if you're stacking 4 ports per node you're going to need a
big switch, and those aren't cheap either if you want it to be
managed.  If all your gear is next to each other then that is as far
as the problem goes, but if your stuff is scattered around, you now
need to run faster-than-gigabit links between locations too.

Ultimately what matters is how much IO you're actually doing.  If
you're just talking about static storage with only a few clients at
most accessing it, then you don't need much hardware at all.  As
Martin pointed out your biggest concern might be rebuild time if you
stick too many drives on one host.

If you're using distributed
storage then rebuild time can be less of an issue since the rebuild is
happening across multiple hosts, so if they're all on the same switch
they can all pass data in and out at 1Gbps between them, and the
drives within a host aren't actually going to get a lot of individual
IO.  That is part of why I can get away with Pis - I have half a dozen
nodes so if a node fails the remaining nodes only have to rebuild
1/5th of a host each between them, and most have multiple drives to
spread that IO across.
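The fan-out arithmetic behind that last point, as a quick sketch (the
even-spread assumption is mine, and the numbers just mirror the
half-dozen-node example):

```python
# How losing one node spreads rebuild work across the survivors.
def rebuild_share(total_nodes):
    """Fraction of the failed node's data each surviving node handles,
    assuming replicas (and the rebuild) spread evenly across survivors."""
    survivors = total_nodes - 1
    return 1 / survivors

# With half a dozen nodes, each survivor rebuilds ~1/5 of one host.
print(rebuild_share(6))  # -> 0.2
```

Double the node count and each survivor's share roughly halves, which
is why per-host disk bandwidth matters less as the cluster grows.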

Philadelphia Linux Users Group