Naresh on Mon, 12 Aug 2002 21:38:05 -0400



Re: [PLUG] smp/ext3/nfs/raid == solid lockup on 2.4.18 and 2.4.19


Fred,
I don't know if this will help, but I had many problems with 2.4.18 +
ext3, so I moved to 2.4.19 + ext3 + xfs and have had far fewer problems
since. I also read that there is a nasty bug in 2.4.18 + ext3 and that you
should upgrade to 2.4.19 ASAP.

Good luck,
NaRESH

On Mon, 12 Aug 2002, Fred K Ollinger wrote:

> I'm getting a solid lockup on linux 2.4.18 and 2.4.19. I just upgraded to
> 2.4.19 for a short time and got a crash in 24 hours. Now I'm back to
> 2.4.18.
>
> It is an ext3 error (posted below); these errors keep getting spewed to
> the messages log right before a crash.
>
> I'm running a Dell PowerEdge 4300 dual-processor SMP 500 MHz system. I am
> sharing out a RAID 5 array (PowerVault 2115). All my filesystems are ext3.
>
> I got a solid lockup. This was in the middle of a backup (dump).
>
> The array is NFS-mounted on 5 clients.
>
> These lockups have been happening periodically ever since we got the new
> RAID array. We are using the aacraid driver for it.
>
> Does anyone have any other ideas on how to start troubleshooting this?
>
> I have spent some time reading up on all the pertinent error messages, but
> I am coming up short in the ideas stage:
> 1. what caused it
> 2. how to fix it
>
> ---start log and commentary---
>
> Here are some errors and comments:
>
> Here's a link:
>
> http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=200205141238.11104.kiza%40gmx.net&rnum=2&prev=/groups%3Fq%3D%2Bext3%2B%25222.4%2B18%2B%2522%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3D200205141238.11104.kiza%2540gmx.net%26rnum%3D2
>
> [snipped duplicate errors]
>
> Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)) in
> ext3_reserve_inode_write: IO failure
>
> Most of these are people w/ similar problems. No solutions. I'm just
> documenting that we are not the only people to have these problems.
>
> http://groups.google.com/groups?q=ext3_reserve_inode_write:&hl=en&lr=lang_en&ie=UTF-8&selm=linux.kernel.Pine.LNX.4.33.0203242328170.2544-100000%40devel.blackstar.nl&rnum=4
>
> Aug 12 11:16:36 wernicke kernel: EXT3-fs error (device sd(8,18)) in
> ext3_new_inode: IO failure
>
> http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=linux.kernel.3D2F3331.376FB6D2%40zip.com.au&rnum=13&prev=/groups%3Fq%3Dext3_new_inode:%26start%3D10%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3Dlinux.kernel.3D2F3331.376FB6D2%2540zip.com.au%26rnum%3D13
>
> This one is particularly weird as this is supposed to mean that we are out
> of inodes. We are not. The inode usage on /data is < 3%.
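
One way to double-check the inode figure quoted above is to read the
counters straight from the filesystem. This is only a minimal Python sketch,
assuming a Unix-like system where os.statvfs reports inode counts. The helper
name inode_usage_percent is made up for illustration, and "/" stands in for
/data so that it runs anywhere:

```python
import os


def inode_usage_percent(path):
    """Return the percentage of inodes in use on the filesystem holding path.

    Returns None when the filesystem does not report an inode total.
    """
    st = os.statvfs(path)
    if st.f_files == 0:
        # Some filesystems report no inode total at all.
        return None
    used = st.f_files - st.f_ffree  # total inodes minus free inodes
    return 100.0 * used / st.f_files


if __name__ == "__main__":
    # Replace "/" with the mount point of interest, e.g. "/data" (assumption).
    pct = inode_usage_percent("/")
    if pct is None:
        print("filesystem does not report inode counts")
    else:
        print("inode usage: %.1f%%" % pct)
```

If this agrees with a low figure like the one reported, the ext3_new_inode
message is unlikely to be a genuine out-of-inodes condition; the text of the
message itself says "IO failure".
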
>
> Aug 12 11:16:36 wernicke kernel: EXT3-fs error (device sd(8,18)):
> ext3_add_entry: bad entry in directory #886488: rec_len %% 4 != 0 -
> offset=0, inode=3889333976,
> rec_len=11254, name_len=124
>
> http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=linux.kernel.20020729123706.GC463%40gzp2.gzp.hu&rnum=1&prev=/groups%3Fq%3D%2522bad%2Bentry%2Bin%2Bdirectory%2522%2B2.4.18%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3Dlinux.kernel.20020729123706.GC463%2540gzp2.gzp.hu%26rnum%3D1
>
>
> Here's where the really nasty errors occur:
>
> Aug 12 11:13:02 wernicke kernel: SCSI disk error : host 5 channel 2 id 0
> lun 0 return code = 25040001
> Aug 12 11:13:02 wernicke kernel:  I/O error: dev 08:12, sector 588530016
> Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)):
> ext3_readdir: directory #36853974 contains a hole at offset 0
>
> http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=linux.kernel.20011217161538.GA17099%40spylog.ru&rnum=3&prev=/groups%3Fq%3Dext3_readdir%2Bhole%2B2.4%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3Dlinux.kernel.20011217161538.GA17099%2540spylog.ru%26rnum%3D3
>
>
> Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)):
> ext3_readdir: bad entry in directory #36853974: rec_len %% 4 != 0 -
> offset=0, inode=1330206934,
> rec_len=51, name_len=1
> Aug 12 11:14:32 wernicke kernel: SCSI disk error : host 5 channel 2 id 0
> lun 0 return code = 25040001
> Aug 12 11:14:32 wernicke kernel:  I/O error: dev 08:12, sector 89147976
>
> [same errors, a lot]
>
> Aug 12 11:17:22 wernicke kernel: EXT3-fs error (device sd(8,18)):
> ext3_readdir: bad entry in directory #36853974: rec_len %% 4 != 0 -
> offset=0, inode=1330206934,
> rec_len=51, name_len=1
> Aug 12 11:17:36 wernicke last message repeated 20 times
>
> restarted
> --end of log---
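
Since the log is full of repeats, a tally per ext3 function can show which
errors dominate in the minutes before a lockup. This is a rough Python sketch
only. It assumes each kernel message sits on a single line in the real log
(the wraps above look like mail line-wrapping) and that the device is printed
in the sd(major,minor) style shown. The name tally_ext3_errors and the regex
are illustrative, not taken from any tool:

```python
import re
from collections import Counter

# Matches both message shapes seen in the log above:
#   EXT3-fs error (device sd(8,18)) in ext3_new_inode: IO failure
#   EXT3-fs error (device sd(8,18)): ext3_readdir: directory ...
EXT3_ERROR = re.compile(r"EXT3-fs error \(device [^)]*\)\)(?: in|:) (\w+):")


def tally_ext3_errors(lines):
    """Count EXT3-fs error lines, keyed by the reporting ext3 function."""
    counts = Counter()
    for line in lines:
        match = EXT3_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the post, rejoined onto single lines.
    sample = [
        "Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)) in ext3_reserve_inode_write: IO failure",
        "Aug 12 11:16:36 wernicke kernel: EXT3-fs error (device sd(8,18)) in ext3_new_inode: IO failure",
        "Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)): ext3_readdir: directory #36853974 contains a hole at offset 0",
        "Aug 12 11:13:02 wernicke kernel:  I/O error: dev 08:12, sector 588530016",
    ]
    for func, count in tally_ext3_errors(sample).most_common():
        print(func, count)
```

Note that syslog's "last message repeated N times" lines are not counted
here, so the real totals would be higher than what this reports.
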
>
> Thanks so much for your time (a lot of reading to get here). :)
>
> Fred Ollinger (follinge@sas.upenn.edu)
> CCN sysadmin
>
>
> _________________________________________________________________________
> Philadelphia Linux Users Group        --       http://www.phillylinux.org
> Announcements - http://lists.netisland.net/mailman/listinfo/plug-announce
> General Discussion  --   http://lists.netisland.net/mailman/listinfo/plug
>
>
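
A note on the device numbers in the log above: the ext3 lines print
sd(8,18) in decimal, while the I/O error lines print dev 08:12 in
hexadecimal, so both point to the same major/minor pair. Under the classic
SCSI disk numbering (major 8, 16 minors per disk), that would be the second
partition on the second disk. Here is a small Python sketch of the
arithmetic. The function names are hypothetical, and only the first 16 disks
on major 8 are handled:

```python
import re

SD_MAJOR = 8          # classic SCSI disk major number
MINORS_PER_DISK = 16  # minor 0 = whole disk, 1..15 = partitions


def parse_ext3_device(text):
    """Extract (major, minor) from an 'sd(8,18)' style field (decimal)."""
    m = re.search(r"sd\((\d+),(\d+)\)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def parse_io_error_device(text):
    """Extract (major, minor) from a 'dev 08:12' style field (hexadecimal)."""
    m = re.search(r"dev ([0-9a-fA-F]{2}):([0-9a-fA-F]{2})", text)
    return (int(m.group(1), 16), int(m.group(2), 16)) if m else None


def sd_name(major, minor):
    """Map a major-8 device number to a name such as 'sdb2'."""
    if major != SD_MAJOR or not 0 <= minor < 256:
        raise ValueError("only major 8 with minors 0..255 is handled")
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


if __name__ == "__main__":
    ext3_dev = parse_ext3_device("EXT3-fs error (device sd(8,18)) in")
    io_dev = parse_io_error_device(" I/O error: dev 08:12, sector 588530016")
    print(ext3_dev == io_dev, sd_name(*io_dev))
```

If both kinds of message resolve to the same partition, as they appear to
here, the SCSI I/O errors and the ext3 errors are hitting the same device,
which may point at the array or its driver at least as much as at ext3
itself.
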
