Stephen Brown on Tue, 29 Aug 2000 19:54:02 -0400 (EDT)



Re: [PLUG] Odd CRC file errors and md5 checksum mismatches - Any ideas?


Pete Foley wrote:
> Well, like I said, I do not believe it is the network, because the
> other machines are fine.  And the box that is having problems is also
> a gateway, so all traffic basically goes through there.  I am thinking
> that it may be hardware.  The HD is fine (I think).  It is a pretty new
> Maxtor drive, and I had it in another box a few months ago.  I don't
> know if this is on target at all, but I think it may have to do with
> transfer rate.  Initially, I had 2 HDs on the same IDE port: the 1st
> one (a 3.6 gig Western Digital - the drive that the OS is on) as
> primary and the Maxtor as slave.  I ran some tests and found that the
> transfer rate on the 1st was about 6mb/s and only 2mb/s for the 2nd.
> So I moved the Maxtor to be the master on the 2nd IDE port, and played
> with the hard drive config tool (I forget what it is called) and got
> it up to 6mb/s.  That seemed to lower the errors, but obviously not
> completely.  The system is a 233 AMD w/ 64 RAM.  Could the system be
> working faster than the HDs can write? (Note that this problem does
> occur on both HDs.)  I was thinking about getting a Promise UDMA66
> controller card to speed things up more and to see if that helps.
> 
> I will mess with the memory a bit and see if that is the cause.  Is there any way that I can definitely rule out the network?  I don't want to have to buy some expensive line tester though.
<snip>
If you were tweaking the hd params, I would start there: save
what you have now, go back to the default (= slow) settings,
and see what changes.

To eliminate the network & cables as a problem, grab a crossover cable
and directly connect the gateway box to a box you know is
good and re-run your tests. If you still have problems, 
that eliminates the cables and hub/switch. (I doubt the network
is the problem if you have problems within the box. That list is my 
standard troubleshooting checklist, in order of frequency of 
problems, and ease of elimination)

Other random things to try:
- Swap around the IDE cables & try a new one.
- Copy bulk stuff from one drive to the other, and then the other
  way, see if you get no errors, or errors in only one direction.
- If you have the drives partitioned, try and narrow down problems
  to a set of partitions.
- Clean out any dust in the case
- Take out & re-seat memory
- If you have lm_sensors installed, run sensors and check that the
  power supply voltages are decent.
- If you are running a 2.3 kernel, try a 2.2 for a while;
  if running 2.2, try 2.3 ;) Troll through the kernel lists
  and see if anyone has mentioned DMA, southbridge or IDE
  problems with AMDs for your motherboard/chipset.
- If you have a spare drive, throw it into the mix and test with
  that in addition to your other 2.
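
For the drive-to-drive copy test in the list above, here is a rough
Python sketch of one way to automate it. It is only an illustration:
the function names, the test file names, and the default sizes are all
made up for this example, and dir_a and dir_b stand for directories on
the two drives (or partitions) being compared. It writes a file of
random data, copies it across several times, and counts MD5 mismatches
in each direction. Keep in mind that the read-back can be served from
the buffer cache rather than the disk, so for a meaningful run the test
file probably needs to be larger than RAM.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_test_file(directory, size_mb=8):
    """Write size_mb megabytes of random data (hypothetical file name)."""
    path = os.path.join(directory, "copytest_source.bin")
    with open(path, "wb") as handle:
        for _ in range(size_mb):
            handle.write(os.urandom(1 << 20))
    return path


def copy_and_verify(src_file, dst_dir, passes=3):
    """Copy src_file into dst_dir `passes` times; return the mismatch count."""
    expected = md5_of(src_file)
    mismatches = 0
    for i in range(passes):
        dst_file = os.path.join(dst_dir, "copytest_%d.bin" % i)
        shutil.copyfile(src_file, dst_file)
        if md5_of(dst_file) != expected:
            mismatches += 1
        os.remove(dst_file)
    return mismatches


def run_both_directions(dir_a, dir_b, size_mb=8, passes=3):
    """Run the copy test A -> B and then B -> A; return mismatches per direction."""
    results = {}
    for label, src_dir, dst_dir in (("a_to_b", dir_a, dir_b),
                                    ("b_to_a", dir_b, dir_a)):
        src = make_test_file(src_dir, size_mb)
        try:
            results[label] = copy_and_verify(src, dst_dir, passes)
        finally:
            os.remove(src)
    return results
```

Calling run_both_directions() with a directory on each drive returns a
dict with one mismatch count per direction. Errors in only one
direction would point at one drive or its cable; errors both ways point
more toward memory, the controller, or the power supply.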

If none of those help and you have had the memory checked out, you probably
should back up each drive, and then run the drive manufacturer's testing
programs on it to see if you need to get it replaced. When I had to
return drives in the past, I had to run the test program's full test
suite on it and submit the error codes it returned. Unfortunately the
full tests hose all data on the drive so you'll want to try other
things first.

-- 
Stephen Brown                             Data Clarity, Inc. 
steve@dataclarity.net    tel:(877)496-3527 fax:(801)382-1525


______________________________________________________________________
Philadelphia Linux Users Group       -      http://www.phillylinux.org
Announcements-http://lists.phillylinux.org/mail/listinfo/plug-announce
General Discussion  -  http://lists.phillylinux.org/mail/listinfo/plug