eric@lucii.org on Sat, 19 Jul 2003 17:56:27 -0400
- Lessons learned -

Well, thanks to those who replied to this email... I really had a tough
time with this until last week. The "freezes" were so random and so
rare (about once a week at the _most_) that I was at a loss to pinpoint
a single cause.

Then, last week I went away for 2 days and left the computer off. When
I returned and powered it up there was a green LED but no activity...
no video, no whirring of disks... zilch. Disassembly revealed a foul
odor from the power supply, which leads me to believe that it was
"going - going - GONE!". I replaced it on Friday the 11th of July and
so far (knock on wood) there is no sign of the "freezes".

Had this not happened, I was preparing (if the problem did not cease)
to replace the memory and the power supply as a SWAG* solution.

Thanks again.

Eric

* "Scientific Wild Ass Guess"

On Mon, Apr 14, 2003 at 12:22:01AM -0400, Martin DiViaio wrote:
>
> I had a similar problem with a server I used to work on. I eventually
> tracked it down to a problem with one of the device modules and SMP
> support. I recompiled the kernel without SMP and the problem went away.
>
> --
> GPG Fingerprint: C900 18EF 0C36 4EAF A93C F073 85D4 8B3C F3D8 077B
>
>
> On the 13th day of April in the year 2003 you wrote:
>
> > Date: Sun, 13 Apr 2003 17:13:46 -0400
> > From: "eric@lucii.org" <eric@lucii.org>
> > To: PLUG <plug@lists.phillylinux.org>
> > X-Spam-Status: No, hits=-4.3 required=5.0
> > tests=SIGNATURE_LONG_DENSE,SPAM_PHRASE_00_01,
> > TO_LOCALPART_EQ_REAL,USER_AGENT,USER_AGENT_MUTT
> > version=2.44
> > Subject: [PLUG] Mysterious system freeze
> >
> > Occasionally in the past three months, my primary Linux workstation
> > (sol), will, for no apparent reason, stop functioning.
> >
> > It would stop responding to commands, not start new logins, xterms, or
> > shells and eventually I would have to press the reset button or power
> > off. It might do this once every two weeks or so (quite infrequently).
> >
> > Friday, it went one step further and simply "froze". No mouse movement,
> > no keyboard input - cannot even switch to a VC. I tried to ssh to the
> > workstation from another computer to look at the logs but there was no
> > response. It would not even respond to a "ping".
> >
> > Dmesg appears to only hold current information. The (I hope) relevant
> > portion of /var/log/messages is at the bottom of this message. Note
> > the odd time shift (syslogd restarts 37 minutes BEFORE the previous
> > crontab entry :-P )
> >
> > One thing I do notice is that in the reboot process the reiser fsck
> > finds a number of things to correct:
> >
> > > clm-6006: writing inode 7644 on readonly FS
> > > clm-6006: writing inode 7644 on readonly FS
> > > clm-6006: writing inode 7644 on readonly FS
> > > clm-6006: writing inode 7644 on readonly FS
> > > clm-6006: writing inode 7644 on readonly FS
> > etc. (about 300+ times).
> >
> > (This is also visible in the /var/log/messages snippet below.)
> >
> > I don't know if that is a problem, or the result of the problem.
> >
> > The system is SuSE 7.3 with some upgrades from Yast Online Update. It's
> > an 800 MHz Athlon system with 256 Meg RAM. It runs on a three year old
> > Fujitsu 18 GB SCSI hard drive divided up like this (in relevant part):
> >
> > [eric@sol eric]$ df -h
> > Filesystem            Size  Used Avail Use% Mounted on
> > /dev/sda6             5.6G  4.4G  1.2G  78% /
> > /dev/sda2              23M  3.9M   17M  18% /boot
> > /dev/sda7             5.5G  3.1G  2.1G  59% /home
> > shmfs                 125M     0  124M   0% /dev/shm
> >
> > Any help/suggestions/hints are appreciated.
> >
> > Eric
> >
> >
> > --------------- portion of /var/log/messages follows --------------
> > Apr 11 21:41:47 sol PAM-unix2[9477]: session started for user eric, service xdm
> > Apr 11 21:50:00 sol /USR/SBIN/CRON[12698]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 21:59:00 sol /USR/SBIN/CRON[12710]: (root) CMD ( rm -f /var/spool/cron/lastrun/cron.hourly)
> > Apr 11 22:00:00 sol /USR/SBIN/CRON[12715]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:10:00 sol /USR/SBIN/CRON[12758]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:20:01 sol /USR/SBIN/CRON[12816]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:30:00 sol /USR/SBIN/CRON[12876]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:40:00 sol /USR/SBIN/CRON[12938]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:03:06 sol syslogd 1.4.1: restart.
> > Apr 11 22:03:09 sol webmin[343]: Webmin starting
> > Apr 11 22:03:11 sol kernel: klogd 1.4.1, log source = /proc/kmsg started.
> > Apr 11 22:03:11 sol kernel: Inspecting /boot/System.map-2.4.10-4GB
> > Apr 11 22:03:11 sol kernel: Loaded 11709 symbols from /boot/System.map-2.4.10-4GB.
> > Apr 11 22:03:11 sol kernel: Symbols match kernel version 2.4.10.
> > Apr 11 22:03:11 sol kernel: Loaded 439 symbols from 13 modules.
> > Apr 11 22:03:11 sol kernel: g inode 7644 on readonly FS
> > Apr 11 22:03:11 sol kernel: clm-6006: writing inode 7644 on readonly FS
> > Apr 11 22:03:11 sol last message repeated 234 times
> > Apr 11 22:03:11 sol kernel: clm-6005: writing inode 7644 on readonly FS
> > Apr 11 22:03:11 sol kernel: clm-6006: writing inode 7644 on readonly FS
> > Apr 11 22:03:11 sol last message repeated 110 times
> > Apr 11 22:03:11 sol kernel: clm-6005: writing inode 7644 on readonly FS
> > Apr 11 22:03:11 sol kernel: ip_tables: (c)2000 Netfilter core team
> > Apr 11 22:03:11 sol kernel: ip_conntrack (2047 buckets, 16376 max)
> > Apr 11 22:03:11 sol kernel: PCI: Found IRQ 10 for device 00:09.0
> > Apr 11 22:03:11 sol kernel: 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
> > Apr 11 22:03:11 sol kernel: 00:09.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xdc00. Vers LK1.1.16
> > Apr 11 22:03:11 sol kernel: IPv6 v0.8 for NET4.0
> > Apr 11 22:03:11 sol kernel: IPv6 over IPv4 tunneling driver
> > Apr 11 22:03:16 sol kernel: eth0: no IPv6 routers present
> > Apr 11 22:03:41 sol /usr/sbin/cron[778]: (CRON) STARTUP (fork ok)
> > Apr 11 22:03:45 sol kernel: isapnp: Scanning for PnP cards...
> > Apr 11 22:03:45 sol kernel: isapnp: Calling quirk for 01:00
> > Apr 11 22:03:45 sol kernel: isapnp: SB audio device quirk - increasing port range
> > Apr 11 22:03:45 sol kernel: isapnp: Card 'Creative ViBRA16X PnP'
> > Apr 11 22:03:45 sol kernel: isapnp: 1 Plug & Play card detected total
> > Apr 11 22:03:50 sol kernel: nvidia: loading NVIDIA Linux x86 NVdriver Kernel Module 1.0-3123 Tue Aug 27 15:56:48 PDT 2002
> > Apr 11 22:03:51 sol kernel: Linux agpgart interface v0.99 (c) Jeff Hartmann
> > Apr 11 22:03:51 sol kernel: agpgart: Maximum main memory to use for agp memory: 203M
> > Apr 11 22:03:51 sol kernel: agpgart: Detected Via Apollo Pro KT133 chipset
> > Apr 11 22:03:51 sol kernel: agpgart: AGP aperture is 64M @ 0xd0000000
> > Apr 11 22:03:51 sol kernel: NVRM: AGPGART: VIA Apollo KT133 chipset
> > Apr 11 22:03:51 sol kernel: NVRM: AGPGART: aperture: 64M @ 0xd0000000
> > Apr 11 22:03:51 sol kernel: NVRM: AGPGART: aperture mapped from 0xd0000000 to 0xd3adf000
> > Apr 11 22:03:51 sol kernel: NVRM: AGPGART: mode 2x
> > Apr 11 22:03:51 sol kernel: NVRM: AGPGART: allocated 16 pages
> > Apr 11 22:03:56 sol kernel: Switching off penguin.
> > Apr 11 22:08:33 sol kdm[906]: Abnormal helper termination, code 1, signal 0
> > Apr 11 22:08:33 sol kdm[906]: fatal IO error 32 (Broken pipe)
> > Apr 11 22:08:33 sol kernel: NVRM: AGPGART: freed 16 pages
> > Apr 11 22:08:33 sol kernel: NVRM: AGPGART: backend released
> > Apr 11 22:08:34 sol kernel: NVRM: AGPGART: VIA Apollo KT133 chipset
> > Apr 11 22:08:34 sol kernel: NVRM: AGPGART: aperture: 64M @ 0xd0000000
> > Apr 11 22:08:34 sol kernel: NVRM: AGPGART: aperture mapped from 0xd0000000 to 0xd3adf000
> > Apr 11 22:08:34 sol kernel: NVRM: AGPGART: mode 2x
> > Apr 11 22:08:34 sol kernel: NVRM: AGPGART: allocated 16 pages
> > Apr 11 22:10:00 sol /USR/SBIN/CRON[1054]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:20:00 sol /USR/SBIN/CRON[1085]: (root) CMD ( /usr/lib/sa/sa1 )
> > Apr 11 22:30:00 sol /USR/SBIN/CRON[1093]: (root) CMD ( /usr/lib/sa/sa1 )
> >
> >
> >
>
> _________________________________________________________________________
> Philadelphia Linux Users Group -- http://www.phillylinux.org
> Announcements - http://lists.netisland.net/mailman/listinfo/plug-announce
> General Discussion -- http://lists.netisland.net/mailman/listinfo/plug
>
>

--
------------------------------------------------------------------------
# Eric Lucas
========================================================================
Today, wanting someone else's money is called "need", wanting to keep
your own money is called "greed", and "compassion" is when politicians
arrange the transfer. -- Joseph Sobran
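
The "odd time shift" mentioned in the original message (syslogd restarting
with a timestamp earlier than the preceding cron entry) is the kind of
thing that is easy to miss by eye in a long /var/log/messages. Below is a
minimal Python sketch that flags any log line whose timestamp is earlier
than the one before it. The function name find_backward_jumps and the
year default are made up for illustration; classic syslog timestamps
carry no year, so one has to be assumed, and month names are parsed as
English abbreviations.

```python
from datetime import datetime, timedelta


def find_backward_jumps(lines, year=2003):
    """Yield (previous_ts, current_ts, text) for each line whose
    timestamp is earlier than the previous parsed line's timestamp.

    Assumes the classic syslog prefix "Mon DD HH:MM:SS" (15 characters).
    The year is an assumption supplied by the caller.
    """
    prev = None
    for line in lines:
        # Drop mail quoting such as "> > " so quoted logs work too.
        text = line.lstrip("> ").rstrip("\n")
        try:
            ts = datetime.strptime(f"{year} {text[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped log line; skip it
        if prev is not None and ts < prev:
            yield prev, ts, text
        prev = ts


if __name__ == "__main__":
    sample = [
        "> > Apr 11 22:40:00 sol /USR/SBIN/CRON[12938]: (root) CMD ( /usr/lib/sa/sa1 )",
        "> > Apr 11 22:03:06 sol syslogd 1.4.1: restart.",
    ]
    for before, after, text in find_backward_jumps(sample):
        gap: timedelta = before - after
        print(f"clock went back by {gap}: {text}")
```

Run against the two lines quoted from the log above, this reports a
backward jump of just under 37 minutes at the syslogd restart, which
matches the shift noted in the message. A jump like that usually just
marks a reboot with a different clock setting, so it is a pointer to
where the crash happened rather than a diagnosis.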