Re: [PLUG] New Hard Drive Testing Practices

Rich Freeman via plug on 13 Jul 2021 07:42:07 -0700

From: Rich Freeman via plug <plug@lists.phillylinux.org>
To: Casey Bralla <MailList@nerdworld.org>
Subject: Re: [PLUG] New Hard Drive Testing Practices
Date: Tue, 13 Jul 2021 10:41:51 -0400
Cc: Philadelphia Linux User's Group Discussion List <plug@lists.phillylinux.org>
Reply-to: Rich Freeman <r-plug@thefreemanclan.net>
Sender: "plug" <plug-bounces@lists.phillylinux.org>

On Tue, Jul 13, 2021 at 10:20 AM Casey Bralla via plug
<plug@lists.phillylinux.org> wrote:
>
> Interesting question I never thought about.  For me, without a "mission
> critical" application, I just swap in the drive, format it, copy data to
> it, and go.  Modern drives are so reliable now, I assume any defects
> are caught in final testing at the manufacturer, or will show up almost
> immediately when formatted. However, I also assume the big drives you're
> dealing with are less reliable than the typical 2-4 TBytes of a consumer
> drive since they are pushing the technology limits.
>

I do suspect the most likely conveyance (shipping) damage is going to
be to the heads, etc., and is going to show up in any use, or in just a
short test (which only takes a few minutes, which is why I think
running one is a no-brainer).
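
For anyone who wants to script that, here is a minimal sketch. It
assumes the "short test" means a SMART short self-test started with
smartctl from smartmontools; the helper name smart_commands and the
device /dev/sdX are placeholders of my own. It only builds and prints
the commands, it does not run them.

```python
import shlex

def smart_commands(device, kind="short"):
    """Build the smartctl invocations for starting a self-test and reading its log.

    `kind` is limited to the self-test types named here; whether a given
    drive supports the conveyance test depends on the drive.
    """
    if kind not in ("short", "long", "conveyance"):
        raise ValueError(f"unsupported self-test type: {kind}")
    return [
        ["smartctl", "-t", kind, device],        # ask the drive to start the self-test
        ["smartctl", "-l", "selftest", device],  # afterwards: show the self-test log
    ]

# /dev/sdX is a placeholder; substitute the real device before running anything.
for argv in smart_commands("/dev/sdX"):
    print(shlex.join(argv))
```

The test runs inside the drive itself, so the second command is only
useful once the drive has had time to finish.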

I'm not sure the big drives are really any less reliable (unless using
a formula that accounts for the fact that a total drive failure
impacts more data).  The biggest practical issue I've seen with large
drives is that it takes a good day or more to transfer their entire
contents (that is at peak sequential transfer speed - obviously any
kind of seeking or inefficiency significantly slows that).  So, when
you're scrubbing, replacing, backing up, restoring, etc., the downtime
(or, for online operations, the time at risk) can be significant.  This
is why there has been a move towards tolerance for two disk failures
as drive sizes have increased: your chance of a double failure goes
up as the time to replace a drive increases.
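
To put rough numbers on that, here is a back-of-the-envelope sketch.
The 16 TB capacity, 200 MB/s sustained rate, 2% annual failure rate,
and five remaining drives are all illustrative assumptions rather than
measurements, and the failure model (independent drives, constant
failure rate) is deliberately simplistic.

```python
import math

def full_pass_hours(capacity_tb, mb_per_s):
    """Hours to read or write a whole drive once at a sustained sequential rate."""
    capacity_bytes = capacity_tb * 1e12          # decimal TB, as drives are sold
    seconds = capacity_bytes / (mb_per_s * 1e6)  # decimal MB/s
    return seconds / 3600

def second_failure_probability(remaining_drives, annual_failure_rate, window_hours):
    """Chance that at least one remaining drive fails within the window.

    Assumes independent failures at a constant rate, which ignores
    same-batch drives and the extra load a rebuild puts on the array.
    """
    rate_per_hour = -math.log(1 - annual_failure_rate) / (365 * 24)
    return 1 - math.exp(-rate_per_hour * remaining_drives * window_hours)

# Assumed example values, not data from any real array.
for capacity_tb in (4, 16):
    hours = full_pass_hours(capacity_tb, mb_per_s=200)
    risk = second_failure_probability(5, 0.02, hours)
    print(f"{capacity_tb} TB: {hours:.1f} h per full pass, "
          f"second-failure risk during it {risk:.6f}")
```

Under these assumptions a 16 TB drive needs about 22 hours for one full
pass at the best-case rate, four times as long as a 4 TB drive, and the
exposure to a second failure grows roughly in proportion to that window.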

This is also why RAID or a similar solution matters a lot for uptime
when you have a lot of data.  Even if you have full backups, and even
if the administration of the restoration process is very efficient
(which is rare in casual setups), just the downtime while a
restoration occurs can be pretty substantial when you get into
multiple TBs of data.  With RAID, going to an offline backup is a
worst-case scenario, while the routine loss of hard drives is handled
without downtime.  I don't have a TON of storage at home, but at this
point drive replacements seem to happen once or twice a year.

-- 
Rich
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
