Rich Freeman on 7 Dec 2018 11:02:32 -0800


Re: [PLUG] How to Store Video Files for 25 Years?


On Fri, Dec 7, 2018 at 1:31 PM Keith C. Perry
<kperry@daotechnologies.com> wrote:
> I strongly recommend RAID 10 because it's easy to upgrade mirror sets.  To be clear and avoid storage arguments, what I am recommending, in whatever you do, is mirroring so you have more than one complete set of your archives.

While RAID might be a component of this, I want to urge some caution
with relying solely on conventional RAID1/10 implementations.  Most do
not handle silent data corruptions, and if you're talking 25 years I
would not count on those not happening.

With mdadm or most hardware RAID implementations, and a lot of
distributed solutions, if the underlying layer returns an error, then
the RAID/distributed implementation will find an alternate copy of
that block and restore everything.  However, if the underlying layer
does not actually return an error, many of these solutions can't tell
if the data is valid or not.  They might offer a scrub feature, which
will detect a discrepancy, but in that case the solution may not be
able to tell which copy is the good one.
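
To make that concrete, here is a minimal sketch of what a scrub on a
two-way mirror without checksums amounts to.  The function name
scrub_mirror and the 4096-byte block size are my own assumptions for
illustration, not how mdadm or any particular controller does it.  It
can report which blocks disagree, but nothing in it can say which side
is right.

```python
def scrub_mirror(copy_a: bytes, copy_b: bytes, block_size: int = 4096) -> list[int]:
    """Return the indexes of blocks where the two mirror copies disagree.

    Hypothetical illustration only: with nothing but the two copies to
    compare, a mismatch can be detected but not attributed to either side.
    """
    mismatched = []
    length = max(len(copy_a), len(copy_b))
    for offset in range(0, length, block_size):
        block_a = copy_a[offset:offset + block_size]
        block_b = copy_b[offset:offset + block_size]
        if block_a != block_b:
            mismatched.append(offset // block_size)
    return mismatched
```

A single flipped bit shows up as one mismatched block index, and that
is all the information you get out of it.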

Now, you can mitigate that in other layers of your storage solution.
For example, if you're storing par2 files on your RAID, then a RAID
failure isn't the last line of defense, and if you use par2 to
rewrite the damaged data in place then the RAID will overwrite the
conflicting blocks on both mirrors with the good copy supplied from
above.  Likewise, if you're using a distributed filesystem that stores
its data on top of something like ZFS at the underlying layer, then
you're much less likely to have silent corruptions passed along to the
distributed layer.  And of course if your RAID is implemented in ZFS then it uses
checksums to validate which mirrored copy is correct.
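
As a rough sketch of the idea (not ZFS's actual code or on-disk
format), a checksum stored separately from the data is what lets a
mirror pick the good copy and repair the bad one.  The name
read_with_checksum and the use of SHA-256 here are my own assumptions
for illustration.

```python
import hashlib


def read_with_checksum(copies: list[bytearray], expected_sha256: str) -> bytes:
    """Return the data from a copy that matches the stored checksum.

    Hypothetical illustration: any copy that does not match is
    overwritten with the verified data, loosely mimicking a self-heal.
    """
    good = None
    for copy in copies:
        if hashlib.sha256(copy).hexdigest() == expected_sha256:
            good = bytes(copy)
            break
    if good is None:
        # Every copy is corrupt; the checksum at least tells us so.
        raise IOError("no copy matches the stored checksum")
    for copy in copies:
        if bytes(copy) != good:
            copy[:] = good  # rewrite the bad mirror from the verified copy
    return good
```

The important part is that the checksum is recorded when the data is
written, so a later mismatch points at a specific copy rather than
just saying the two sides differ.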

There are a bunch of ways to go about it, but the bottom line is to
ask yourself: when your storage solution delivers you 20GB of
video data, how do you know that it is error-free, and if you have
multiple copies, how do you know which copies are good?  That
capability might be built into an out-of-the-box solution, or
something you bolt on top.  Just keep in mind that most conventional
RAID solutions do not deliver this.  They handle errors, but they
don't detect them.  They expect the underlying layer (often physical)
to detect/report when it cannot deliver the data that was originally
stored.  Most hard drives cannot do this reliably.
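
If you go the bolt-on route, one simple approach is a checksum
manifest kept alongside each copy of the archive.  The following is
only a sketch under my own assumptions (SHA-256, and the hypothetical
function names build_manifest and verify_manifest); it detects bad
files but does not repair them, so you would still need a second copy
or something like par2 for recovery.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large video files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_file(p) != expected:
            bad.append(rel)
    return bad
```

Run build_manifest once when the archive is written, store the result
somewhere other than inside the archive directory, and run
verify_manifest against every copy periodically.  A copy that verifies
clean is the one to restore the others from.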

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug