Carlos Konstanski on 28 Jul 2005 05:12:32 -0000



Re: [PLUG] APIC errors, weird crashes


I dealt with this kind of thing once before, when we got 5 new boxes
that were not our usual build.  We had to add "nolapic" to the kernel boot
arguments.  These were single-processor machines.  I forget which CPU
and motherboard they had, but I think the boards were Asus.

With "nolapic" specified, these machines have been reliable; one is even
running as a QA server for a huge Tomcat app.
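
To confirm that a box really booted with "nolapic", one option is to look
at the command line the running kernel was given.  The sketch below is
only an illustration: the function name has_boot_param is made up for this
example, and it assumes a Linux system that exposes /proc/cmdline.

```shell
#!/bin/sh
# has_boot_param PARAM [FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# FILE defaults to /proc/cmdline; the optional argument exists only so
# the check can be tried against a saved copy.
# (has_boot_param is a hypothetical helper name, not a standard tool.)
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # One argument per line, then an exact, fixed-string, whole-line match
    # so that "nolapic" does not match "nolapic_timer".
    tr -s ' ' '\n' < "$file" | grep -qxF -- "$param"
}

if has_boot_param nolapic; then
    echo "nolapic is active"
else
    echo "nolapic is not set"
fi
```

How the argument gets onto the command line depends on the boot loader,
so that part is left out here.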

Carlos

On Thu, 28 Jul 2005, Jeff Abrahamson wrote:

Date: Thu, 28 Jul 2005 01:07:28 -0400
From: Jeff Abrahamson <jeff@purple.com>
Reply-To: Philadelphia Linux User's Group Discussion List
    <plug@lists.phillylinux.org>
To: PLUG <plug@lists.phillylinux.org>
Subject: [PLUG] APIC errors, weird crashes

I set up a new machine that has been getting weird crashes.  (So far
gnome terminal, mozilla, emacs21, exim4, X, clock applet, workspace
applet, xterm, and ogg123 have crashed.)

At first I thought this was APIC related, as I saw a few kernel log
messages to this effect (see below).  But, for the most part, the
crashes have not been accompanied by anything tell-tale in the logs.
It happens often enough to be annoying but not so often that it's
feasible to sit around and watch it crash.

I'm running Debian testing, but no updates have been posted for
several days.  I'm hoping it's not hardware.

Any suggestions as to what might be going on, or what to do?


[ The remainder of this message details the APIC kernel error, for those who are interested and for posterity. Most can stop reading now. ]

Here is an example of the APIC kernel error, but this is relatively rare:

   jeff@astra:kernel-source-2.6.8 $ dmesg | grep -i apic
   ENABLING IO-APIC IRQs
   init IO_APIC IRQs
    IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
   Using local APIC timer interrupts.
   calibrating APIC timer ...
   ACPI: Using IOAPIC for interrupt routing
   number of IO-APIC #2 registers: 24.
   testing the IO APIC.......................
   IO APIC #2......
   .......    : physical APIC id: 02
   .......     : IO APIC version: 0003
   APIC error on CPU0: 00(60)
    <6>APIC error on CPU0: 60(60)
   APIC error on CPU0: 60(60)
   jeff@astra:kernel-source-2.6.8 $ uname -a
   Linux astra 2.6.8-2-686 #1 Thu May 19 17:53:30 JST 2005 i686 GNU/Linux
   jeff@astra:kernel-source-2.6.8 $
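
For anyone trying to read the "APIC error on CPU0: 60(60)" lines above:
the two numbers are hex readings of the local APIC error status register.
The sketch below decodes one such value into bit names.  The names are an
assumption, taken from the comment in smp_error_interrupt() as best
recalled, so check them against your own kernel source.  The function name
decode_apic_esr is made up for this example.

```shell
#!/bin/sh
# decode_apic_esr HEX
# Print one line for each bit set in a local APIC error status value,
# given as the two hex digits shown in the kernel log (e.g. 60).
# Bit names are assumed from the comment in arch/i386/kernel/apic.c;
# decode_apic_esr itself is a hypothetical helper, not a kernel tool.
decode_apic_esr() {
    value=$((0x$1))
    bit=0
    for name in \
        "send CS (checksum) error" \
        "receive CS (checksum) error" \
        "send accept error" \
        "receive accept error" \
        "reserved" \
        "send illegal vector" \
        "received illegal vector" \
        "illegal register address"
    do
        # Test bit number $bit of the value.
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $name"
        fi
        bit=$((bit + 1))
    done
}

decode_apic_esr 60
```

Under those assumed names, 60 hex has bits 5 and 6 set, which would read as
"send illegal vector" and "received illegal vector".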

APIC errors, though, seem like they should only happen on SMP
machines.  (Cf. arch/i386/kernel/apic.c, function
smp_error_interrupt().)  My kernel is not compiled for SMP (see uname,
above) and I only have one processor.

   jeff@astra:kernel-source-2.6.8 $ cat /proc/cpuinfo
   processor       : 0
   vendor_id       : GenuineIntel
   cpu family      : 15
   model           : 4
   model name      : Intel(R) Pentium(R) 4 CPU 3.00GHz
   [...]
   jeff@astra:kernel-source-2.6.8 $ cat /proc/cpuinfo | grep processor
   processor       : 0
   jeff@astra:kernel-source-2.6.8 $

BTW, I found a cool info site here:

   http://wiki.linuxquestions.org/wiki/APIC

--
Jeff

Jeff Abrahamson  <http://www.purple.com/jeff/>    +1 215/837-2287
GPG fingerprint: 1A1A BA95 D082 A558 A276  63C6 16BF 8C4C 0D1D AE4B

___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug