[PLUG] smp/ext3/nfs/raid == solid lockup on 2.4.18 and 2.4.19
I'm getting solid lockups on Linux 2.4.18 and 2.4.19. I upgraded to
2.4.19 for a short time and got a crash within 24 hours, so now I'm back
on 2.4.18.
It shows up as an ext3 error (posted below): these messages keep getting
spewed to the messages log right before a crash.
I'm running a Dell PowerEdge 4300, a dual-processor (SMP) 500 MHz system.
I am sharing out a RAID 5 array (PowerVault 2115). All my filesystems are
ext3.
The solid lockup happened in the middle of a backup (dump).
The array is NFS-mounted on 5 clients.
These lockups have been happening periodically ever since we got the new
RAID array. We are using the aacraid driver for it.
Does anyone have any other ideas on how to start troubleshooting this?
I have spent some time reading up on all the pertinent error messages, but
I am coming up short in the ideas stage:
1. what caused it
2. how to fix it
---start log and commentary---
Here are some errors and comments:
Here's a link:
http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=200205141238.11104.kiza%40gmx.net&rnum=2&prev=/groups%3Fq%3D%2Bext3%2B%25222.4%2B18%2B%2522%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3D200205141238.11104.kiza%2540gmx.net%26rnum%3D2
[snipped duplicate errors]
Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)) in
ext3_reserve_inode_write: IO failure
Most of these are people w/ similar problems. No solutions. I'm just
documenting that we are not the only people to have these problems.
http://groups.google.com/groups?q=ext3_reserve_inode_write:&hl=en&lr=lang_en&ie=UTF-8&selm=linux.kernel.Pine.LNX.4.33.0203242328170.2544-100000%40devel.blackstar.nl&rnum=4
Aug 12 11:16:36 wernicke kernel: EXT3-fs error (device sd(8,18)) in
ext3_new_inode: IO failure
http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=linux.kernel.3D2F3331.376FB6D2%40zip.com.au&rnum=13&prev=/groups%3Fq%3Dext3_new_inode:%26start%3D10%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3Dlinux.kernel.3D2F3331.376FB6D2%2540zip.com.au%26rnum%3D13
This one is particularly weird, as it is supposed to mean that we are out
of inodes. We are not. The inode usage on /data is < 3%.
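
For anyone who wants to double-check the inode figure without reading
df -i output by eye, here is a minimal sketch in Python. It assumes the
filesystem is mounted and readable; the function name is made up for
illustration, and the path passed in is up to you (on this server it would
be /data).

```python
import os


def inode_usage_percent(path):
    """Return the percentage of inodes in use on the filesystem holding path."""
    st = os.statvfs(path)
    if st.f_files == 0:
        # Some filesystems do not report an inode total at all.
        return 0.0
    used = st.f_files - st.f_ffree
    return 100.0 * used / st.f_files


if __name__ == "__main__":
    # "/" is only a placeholder; point this at the mount you care about.
    print("inode usage: %.1f%%" % inode_usage_percent("/"))
```

A low number here supports the point above: the ext3_new_inode failure is
reported as an IO failure, not as a genuine shortage of inodes.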
Aug 12 11:16:36 wernicke kernel: EXT3-fs error (device sd(8,18)):
ext3_add_entry: bad entry in directory #886488: rec_len %% 4 != 0 -
offset=0, inode=3889333976,
rec_len=11254, name_len=124
http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=linux.kernel.20020729123706.GC463%40gzp2.gzp.hu&rnum=1&prev=/groups%3Fq%3D%2522bad%2Bentry%2Bin%2Bdirectory%2522%2B2.4.18%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3Dlinux.kernel.20020729123706.GC463%2540gzp2.gzp.hu%26rnum%3D1
Here's where the really nasty errors occur:
Aug 12 11:13:02 wernicke kernel: SCSI disk error : host 5 channel 2 id 0
lun 0 return code = 25040001
Aug 12 11:13:02 wernicke kernel: I/O error: dev 08:12, sector 588530016
Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)):
ext3_readdir: directory #36853974 contains a hole at offset 0
http://groups.google.com/groups?hl=en&lr=lang_en&ie=UTF-8&threadm=linux.kernel.20011217161538.GA17099%40spylog.ru&rnum=3&prev=/groups%3Fq%3Dext3_readdir%2Bhole%2B2.4%26hl%3Den%26lr%3Dlang_en%26ie%3DUTF-8%26selm%3Dlinux.kernel.20011217161538.GA17099%2540spylog.ru%26rnum%3D3
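
A side note on reading these lines: the block layer prints the device as
hex major:minor ("dev 08:12"), while ext3 prints it in decimal
("sd(8,18)"). They are the same device. The sketch below shows the
conversion, assuming the standard Linux numbering for SCSI disks on major
8 (16 minors per disk). The function names are made up for illustration.

```python
def decode_hex_dev(dev):
    """Turn a hex 'major:minor' string such as '08:12' into decimal numbers."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def scsi_disk_name(major, minor):
    """Map a (major, minor) pair on SCSI disk major 8 to a name like 'sdb2'."""
    if major != 8:
        raise ValueError("only SCSI disk major 8 is handled in this sketch")
    disk, partition = divmod(minor, 16)  # 16 minors per disk on major 8
    name = "sd" + chr(ord("a") + disk)
    return name if partition == 0 else name + str(partition)


if __name__ == "__main__":
    major, minor = decode_hex_dev("08:12")
    print(major, minor)                  # 8 18
    print(scsi_disk_name(major, minor))  # sdb2
```

So the I/O errors and the ext3 errors are being reported against the same
partition.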
Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)):
ext3_readdir: bad entry in directory #36853974: rec_len %% 4 != 0 -
offset=0, inode=1330206934,
rec_len=51, name_len=1
Aug 12 11:14:32 wernicke kernel: SCSI disk error : host 5 channel 2 id 0
lun 0 return code = 25040001
Aug 12 11:14:32 wernicke kernel: I/O error: dev 08:12, sector 89147976
[same errors, a lot]
Aug 12 11:17:22 wernicke kernel: EXT3-fs error (device sd(8,18)):
ext3_readdir: bad entry in directory #36853974: rec_len %% 4 != 0 -
offset=0, inode=1330206934,
rec_len=51, name_len=1
Aug 12 11:17:36 wernicke last message repeated 20 times
restarted
--end of log---
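
Since the same errors repeat a lot, a rough tally can be easier to read
than the raw log. The following is a minimal sketch in Python, assuming
the input holds only syslog lines in the format shown above (no commentary
mixed in) and that wrapped continuation lines should be glued back onto
the line before them. The function names and regular expressions are my
own, written against these sample lines only.

```python
import re
from collections import Counter

# A syslog line starts with something like "Aug 12 11:13:02 ".
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")
# Matches both "(device sd(8,18)) in func:" and "(device sd(8,18)): func:".
EXT3_ERROR = re.compile(r"EXT3-fs error \(device (\S+?)\)(?: in|:)\s+(\w+):")
IO_ERROR = re.compile(r"I/O error: dev ([0-9a-f]{2}:[0-9a-f]{2}), sector (\d+)")


def join_wrapped(lines):
    """Glue continuation lines (no leading timestamp) onto the previous record."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(records):
    """Count ext3 errors per reporting function and collect failed sectors."""
    ext3_counts = Counter()
    failed_sectors = []
    for record in records:
        ext3 = EXT3_ERROR.search(record)
        if ext3:
            ext3_counts[ext3.group(2)] += 1
        io = IO_ERROR.search(record)
        if io:
            failed_sectors.append((io.group(1), int(io.group(2))))
    return ext3_counts, failed_sectors


if __name__ == "__main__":
    sample = [
        "Aug 12 11:13:02 wernicke kernel: I/O error: dev 08:12, sector 588530016",
        "Aug 12 11:13:02 wernicke kernel: EXT3-fs error (device sd(8,18)):",
        "ext3_readdir: directory #36853974 contains a hole at offset 0",
    ]
    counts, sectors = summarize(join_wrapped(sample))
    print(counts)
    print(sectors)
```

Lining up the failed sectors against the ext3 errors by timestamp is one
way to see whether every ext3 complaint follows a SCSI I/O error.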
Thanks so much for your time (lot of reading to get here). :)
Fred Ollinger (follinge@sas.upenn.edu)
CCN sysadmin
_________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.netisland.net/mailman/listinfo/plug-announce
General Discussion -- http://lists.netisland.net/mailman/listinfo/plug