Eric Allan Lucas on Mon, 25 Sep 2000 07:13:22 -0400 (EDT)


[PLUG] Postmortem


I had a strange occurrence with my Linux workstation and I was wondering if anyone could advise me on how to determine what happened (and therefore how to prevent a repeat).

The computer is a 200 MHz Pentium MMX with 64 MB of RAM and an 18 GB SCSI hard drive (there are also several IDE drives, mostly old Windows stuff).  It's running Red Hat 6.1 with the latest Helix GNOME update.

I walked into the office about 8:15 last evening and the disk access light was on steadily - I could hear a drive working.
The X screen saver was frozen and V E R Y slowly started to paint the unlock password box when I moved the mouse.  There was no discernible network traffic, so the activity was all internal, and the machine was obviously _very_ busy.  I switched to a console and after 2 tries (the failed attempts timed out after 60 seconds) managed to log in as root.

At this point I wanted to run top or ps but nothing worked - every command died with a bus error.  I typed "cat /proc/meminfo" and saw something like this:

          total:       used:     free:
Mem:    64573440    64573440         0
Swap:  139821056   139821056         0
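
A minimal sketch, in Python, of reading the two summary lines above and turning them into a percentage.  It assumes the older /proc/meminfo layout shown here (byte counts on the Mem: and Swap: lines); the function names are made up for illustration, and newer kernels format this file differently.

```python
# Sketch only: parses the byte-count summary lines of an old-style
# /proc/meminfo. Function names are hypothetical, chosen for illustration.

SAMPLE = """\
        total:    used:    free:
Mem:   64573440  64573440    0
Swap: 139821056  139821056   0
"""


def parse_meminfo_header(text):
    """Return {'Mem': {...}, 'Swap': {...}} with total/used/free in bytes."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Only the 'Mem:' and 'Swap:' rows carry the three byte counts.
        if len(parts) >= 4 and parts[0] in ("Mem:", "Swap:"):
            total, used, free = (int(p) for p in parts[1:4])
            result[parts[0].rstrip(":")] = {
                "total": total,
                "used": used,
                "free": free,
            }
    return result


def percent_used(entry):
    """Percentage of 'total' that is in use; 0.0 if total is zero."""
    if entry["total"] == 0:
        return 0.0
    return 100.0 * entry["used"] / entry["total"]


if __name__ == "__main__":
    for name, entry in parse_meminfo_header(SAMPLE).items():
        print("%s: %.1f%% used, %d bytes free"
              % (name, percent_used(entry), entry["free"]))
```

With the figures above, both RAM and swap come out fully used with nothing free, which is consistent with programs failing to start.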

At this point I just wanted my workstation back to do some work, so I tried the reboot command.  It gave me the bus error message too, but still managed to work well enough to get the box rebooted.  After that, everything appeared normal.  I poked around in the /var/log directory looking for clues, and the only thing I found was 160 of these strange messages at the end of the dmesg file:

VFS: Disk change detected on device ide1(22,64)

My conclusion is that some process "ran away", consuming all available memory, including swap.  Maybe it was triggered by some hardware glitch on the ide1 port (SWAG *)?  The problem is that I don't know how to find out which process it was, why it happened, or, most importantly, how to prevent it (if possible) in the future.
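
One way to catch the culprit next time would be to record which processes hold the most memory before the box becomes unusable.  The following is only a sketch under assumptions: it reads the Name: and VmRSS: fields from /proc/<pid>/status, the helper names are hypothetical, and running it periodically (say from cron, appending the output to a log file) is a suggestion, not something tested on this machine.

```python
# Sketch only: rank processes by resident memory using /proc/<pid>/status.
# Helper names are hypothetical; proc_root is a parameter so the code can
# be pointed at a test directory instead of the real /proc.
import os


def read_rss_kb(status_text):
    """Return the VmRSS value in kB, or 0 if the field is absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def read_name(status_text):
    """Return the process name from the Name: field, or '?' if missing."""
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            fields = line.split(None, 1)
            return fields[1].strip() if len(fields) > 1 else "?"
    return "?"


def top_memory_users(proc_root="/proc", limit=5):
    """Return up to `limit` (rss_kb, pid, name) tuples, largest first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                text = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((read_rss_kb(text), int(entry), read_name(text)))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss_kb, pid, name in top_memory_users():
        print("%8d kB  pid %-6d %s" % (rss_kb, pid, name))
```

Resident size alone misses memory that has already been pushed out to swap, so this is a rough indicator and not an exact accounting.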

Any ideas?

Does anybody have suggestions about handling a situation like this?  Was I too quick on the reboot command?  

Any advice is appreciated (except the advice to buy myself a better computer... that's already in the works :-) )

Thanks
Eric Lucas

* SWAG -> Scientific Wild Ass Guess !




______________________________________________________________________
Philadelphia Linux Users Group       -      http://www.phillylinux.org
Announcements-http://lists.phillylinux.org/mail/listinfo/plug-announce
General Discussion  -  http://lists.phillylinux.org/mail/listinfo/plug