Jeff Abrahamson on 9 Jul 2006 16:14:39 -0000


[PLUG] firewire drive errors


I bought a Maxtor 300 GB hard drive, external firewire, from Staples
yesterday.  This email concerns a hardware or kernel problem with the
drive.  As I have had problems with other Maxtor drives, but much more
rarely with non-Maxtor drives, I also wonder whether the solution may
simply be to avoid Maxtor drives.  But my sample does not reach the
level of statistical significance, as I don't deal with *that* many
drives in my life.  I'm curious what others think.

Below are the gory details.  Thanks in advance for any thoughts.


Kernel 2.6.15-1 (Debian) recognizes the new drive; I create an XFS
filesystem on it and start copying files.  All's well for now.  The
copy completes.  Looks good.
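
For reference, the setup was roughly the following (the mount point
and source path are placeholders, not necessarily what I actually
typed; /dev/sdb1 matches the kernel logs further down):

    # create the XFS filesystem on the new drive's single partition
    mkfs.xfs /dev/sdb1

    # mount it (mount point is a placeholder)
    mount -t xfs /dev/sdb1 /mnt/maxtor

    # copy the data over (source path is a placeholder)
    rsync -a /path/to/data/ /mnt/maxtor/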

Just after midnight last night, with the disk largely quiescent, the
following appears in kern.log:

    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 63
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 0
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 1
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 2
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 3
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 4
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 5
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 6
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 7
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 8
    Jul  9 00:23:28 astra kernel: Buffer I/O error on device sdb1, logical block 9
    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 191
    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 63
    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 586099263
    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 586099263
    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 586099263
    Jul  9 00:23:28 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:23:28 astra kernel: end_request: I/O error, dev sdb, sector 63

I may have typed "df" at 00:23, but I'm very sure nothing else
happened that would have touched that disk at that time.  The contents
of the drive appear corrupted: four of the five directories at the
root level do not even look like directories to ls.  I unmount the
drive, power cycle it, and (this morning) remount.  To my surprise,
the remount is fine, despite the frightening unmount log messages from
last night:

    Jul  9 00:40:06 astra kernel: usb 1-2: USB disconnect, address 16
    Jul  9 00:42:13 astra kernel: sd 16:0:1:0: Device not ready.
    Jul  9 00:42:13 astra kernel: end_request: I/O error, dev sdb, sector 127
    Jul  9 00:42:13 astra kernel: I/O error in filesystem ("sdb1") meta-data dev sdb1 block 0x40       ("xfs_trans_read_buf") error 5 buf count 8192
    [...block repeats several times]
    Jul  9 00:43:40 astra kernel: end_request: I/O error, dev sdb, sector 63
    Jul  9 00:43:40 astra kernel: xfs_force_shutdown(sdb1,0x1) called from line 339 of file fs/xfs/xfs_rw.c.  Return address = 0xf8a87874
    Jul  9 00:43:40 astra kernel: Filesystem "sdb1": I/O Error Detected.  Shutting down file system: sdb1
    Jul  9 00:43:40 astra kernel: Please umount the filesystem, and rectify the problem(s)
    Jul  9 00:43:40 astra kernel: xfs_force_shutdown(sdb1,0x1) called from line 339 of file fs/xfs/xfs_rw.c.  Return address = 0xf8a87874


The remount nonetheless looked normal enough:

    Jul  9 10:40:52 astra kernel: XFS mounting filesystem sdb1
    Jul  9 10:40:52 astra kernel: Starting XFS recovery on filesystem: sdb1 (logdev: internal)
    Jul  9 10:40:52 astra kernel: Ending XFS recovery on filesystem: sdb1 (logdev: internal)
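
Before trusting the recovery completely, it's probably worth
unmounting and running a read-only xfs_repair pass (standard XFS
tooling; the mount point is again a placeholder):

    # xfs_repair must not run on a mounted filesystem
    umount /mnt/maxtor

    # -n = no-modify mode: report problems without writing anything
    xfs_repair -n /dev/sdb1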

The volume appears to be intact.  But as I rsync to confirm that the
drive's contents are OK, I see lots of these error messages,
reminiscent of a problem I had with another drive before I turned on
io_serialize.  But io_serialize is still on.  Note that each of these
aborts coincides with an I/O timeout, so throughput is abysmal.

    Jul  9 12:01:39 astra kernel: ieee1394: sbp2: aborting sbp2 command
    Jul  9 12:01:39 astra kernel: sd 18:0:1:0:
    Jul  9 12:01:39 astra kernel:         command: Write (10): 2a 00 05 87 03 1f 00 00 10 00
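
The setting I call io_serialize above is, if I have the spelling
right, the sbp2 module option serialize_io.  One way to confirm it
really is on (assuming the kernel exposes the parameter in sysfs):

    # the sbp2 driver exposes its parameters here on 2.6 kernels
    cat /sys/module/sbp2/parameters/serialize_io

    # and check whether anything overrides it at module load time
    grep -r sbp2 /etc/modprobe.d/ /etc/modprobe.conf 2>/dev/null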

I am inclined to consider this a problem with the drive, since I have
another firewire drive on the same controller, and it works fine.
Indeed, of the various drive problems I have had over the last two
years, most have been related to Maxtor drives.  But I'm curious if
anyone has other thoughts.  Blaming Maxtor is the easy way out and may
not really solve anything.
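
One test I'm tempted to run before blaming anyone: a read-only
badblocks pass over the whole device while watching kern.log, to see
whether the errors stick to particular sectors (which would point at
the medium) or keep appearing at arbitrary offsets alongside sbp2
aborts (which would point more at the bridge or the transport):

    # read-only scan (no -w or -n, so nothing is written);
    # -s shows progress, -v reports bad blocks as they are found
    badblocks -sv /dev/sdb

    # in another terminal, watch for sbp2 aborts and I/O errors
    tail -f /var/log/kern.log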

-- 
 Jeff

 Jeff Abrahamson  <http://jeff.purple.com/>          +1 215/837-2287
 GPG fingerprint: 1A1A BA95 D082 A558 A276  63C6 16BF 8C4C 0D1D AE4B
