Subject: NetBSD GENERIC.MP hang
To: None <current-users@netbsd.org>
From: Milos Urbanek <urbanek@openbsd.cz>
List: current-users
Date: 06/13/2003 11:59:50
Hello,

I have a rather old SMP machine. The GENERIC (uniprocessor) NetBSD kernel
runs on it without problems. However, when I tried GENERIC.MP (built from
-current sources from May 2003), it hangs during boot. Here is the serial
console output (generated with mp_verbose set).


>> NetBSD/i386 BIOS Boot, Revision 2.16
>> (root@satai, Wed Mar  5 13:10:46 UTC 2003)
>> Memory: 639/97280 k
Press return to boot now, any other key for boot menu
booting hd0a:netbsd - starting in 0 
type "?" or "help" for help.
> boot netbsd.mp -d
booting hd0a:netbsd.mp (howto 0x40)
6141460+132680+323112 [309344+284464]=0x6dd614
Loaded initial symtab at 0xc074c684, strtab at 0xc0797ee4, # entries 19334
kgdb waiting...connected.
BIOS CFG: Model-SubM-Rev: fc-01-00, 0x74<EBDA,KBDINT,RTC,IC2>
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 1.6R (GENERIC.MP) #0: Fri Jun 13 00:04:44 UTC 2003
        root@crash:/usr/src/sys/arch/i386/compile/GENERIC.MP
total memory = 97916 KB
avail memory = 83536 KB
using 1249 buffers containing 4996 KB of memory
BIOS32 rev. 0 found at 0xfdb80
mainbus0 (root)
mainbus0: scanning 0x9fc00 to 0x9fff0 for MP signature
mainbus0: scanning 0x9f800 to 0x9fbf0 for MP signature
mainbus0: scanning 0xf0000 to 0xffff0 for MP signature
mainbus0: MP floating pointer found in bios at 0xfb5c0
mainbus0: MP config table at 0xf5740, 260 bytes long
mainbus0: Intel MP Specification (Version 1.1) (INTEL    430HX       )
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Pentium (P54C) (586-class), 132.97 MHz, id 0x52c
cpu0: features 3bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,APIC>
cpu0: calibrating local timer
cpu0: apic clock running at 66 MHz
cpu0: kstack at 0xc88b0000 for 16384 bytes
cpu0: idle pcb at 0xc88b0000, idle sp at 0xc88b3f98
cpu1 at mainbus0: apid 1 (application processor)
cpu1: starting
cpu1: Intel Dual Pentium (P54C) (586-class), 166.19 MHz, id 0x252c
cpu1: features 3bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,APIC>
cpu1: kstack at 0xc88b8000 for 16384 bytes
cpu1: idle pcb at 0xc88b8000, idle sp at 0xc88bbf98
mpbios: bus 0 is type PCI   
mpbios: bus 1 is type ISA   
mpbios: bus 2 is type EISA  
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000, virtual wire mode, version 11, 16 pins
ioapic0: int0 attached to ExtINT (type 3<type=3=ExtINT> flags 0<pol=0,trig=0>)
ioapic0: int1 attached to isa0 irq 1 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int2 attached to isa0 irq 0 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int3 attached to isa0 irq 3 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int4 attached to isa0 irq 4 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int5 attached to isa0 irq 5 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int6 attached to isa0 irq 6 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int7 attached to isa0 irq 7 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int8 attached to isa0 irq 8 (type 0<type=0> flags 5<pol=1=Act Hi,trig=1=Edge>)
ioapic0: int9 attached to isa0 irq 9 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int10 attached to isa0 irq 10 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int11 attached to isa0 irq 11 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int12 attached to isa0 irq 12 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int13 attached to isa0 irq 13 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int14 attached to isa0 irq 14 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int15 attached to isa0 irq 15 (type 0<type=0> flags 0<pol=0,trig=0>)
local apic: int0 attached to ExtINT (type 3<type=3=ExtINT> flags 0<pol=0,trig=0>)
local apic: int1 attached to NMI (type 1<type=1=NMI> flags 0<pol=0,trig=0>)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82439HX System Controller (TXC) (rev. 0x02)
pceb0 at pci0 dev 7 function 0
pceb0: Intel 82375EB/SB PCI-EISA Bridge (PCEB) (rev. 0x05)
ahc1 at pci0 dev 8 function 0: unable to map registers
vga1 at pci0 dev 10 function 0: Silicon Integrated System product 0x0204 (rev. 0x21)
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
ne2 at pci0 dev 12 function 0: RealTek 8029 Ethernet
ne2: Ethernet address 00:e0:7d:76:94:d0
ne2: 10base2, 10baseT, 10baseT-FDX, auto, default [0x00 0x30] auto
ne2: interrupting at apic 2 int 11 (irq 11)
eisa0 at pceb0
eisa0: can't map I/O space for slot 15
isa0 at pceb0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns8250 or ns16450, no fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns8250 or ns16450, no fifo
com1: kgdb
pckbc0 at isa0 port 0x60-0x64
pckbdprobe: reset error 5
pmsprobe: reset error 5
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 channel 0 drive 0: <ST310211A>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 9641 MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sect x 19746720 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
cpu0: prelint0 700<vector=0,delmode=7,dest=0> 0<target=0>
cpu0: prelint1 400<vector=0,delmode=4,dest=0> 0<target=0>
cpu0: timer0 300c0<vector=c0,delmode=0,masked,dest=0> 0<target=0>
cpu0: pcint0 0<vector=0,delmode=0,dest=0> 0<target=0>
cpu0: lint0 10700<vector=0,delmode=7,masked,dest=0> 0<target=0>
cpu0: lint1 400<vector=0,delmode=4,dest=0> 0<target=0>
cpu0: err0 1000f<vector=f,delmode=0,masked,dest=0> 0<target=0>
ioapic0: enabling
ioapic0: int3 1d1<vector=d1,delmode=1,dest=0> 0<target=0>
ioapic0: int4 1d0<vector=d0,delmode=1,dest=0> 0<target=0>
ioapic0: int6 161<vector=61,delmode=1,dest=0> 0<target=0>
ioapic0: int11 f170<vector=70,delmode=1,pending,actlo,irrpending,level,dest=0> 0<target=0>
ioapic0: int14 160<vector=60,delmode=1,dest=0> 0<target=0>
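
For readers who want to check the ioapic0 lines above by hand, here is a minimal sketch that decodes the first hex word of each entry (the low 32 bits of the redirection entry). The bit positions are taken from the Intel 82093AA I/O APIC datasheet; the function name `decode_redir_lo` and the dictionary keys are made up for illustration and are not kernel identifiers.

```python
# Decode the low 32 bits of an I/O APIC redirection table entry.
# Bit layout per the Intel 82093AA datasheet; names here are illustrative.

DELIVERY_MODES = {
    0: "fixed", 1: "lowest priority", 2: "SMI",
    4: "NMI", 5: "INIT", 7: "ExtINT",
}


def decode_redir_lo(value):
    """Split a redirection entry's low word into its documented fields."""
    delmode = (value >> 8) & 0x7
    return {
        "vector": value & 0xFF,                     # bits 7:0
        "delmode": delmode,                         # bits 10:8
        "delmode_name": DELIVERY_MODES.get(delmode, "reserved"),
        "logical_dest": bool(value & (1 << 11)),    # destination mode
        "pending": bool(value & (1 << 12)),         # delivery status
        "active_low": bool(value & (1 << 13)),      # pin polarity
        "remote_irr": bool(value & (1 << 14)),      # remote IRR
        "level": bool(value & (1 << 15)),           # trigger mode
        "masked": bool(value & (1 << 16)),          # interrupt mask
    }


if __name__ == "__main__":
    # Raw values copied from the boot log above.
    entries = [(3, 0x1D1), (4, 0x1D0), (6, 0x161), (11, 0xF170), (14, 0x160)]
    for pin, raw in entries:
        print(f"int{pin}: {decode_redir_lo(raw)}")
```

Decoded this way, int11 (the ne2 PCI interrupt) is the only entry that is level-triggered and active-low, and the only one with both the delivery-status and remote-IRR bits set at the point where the boot stops. Whether that is related to the hang is not something this sketch can show.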

Here the boot hangs. When I attach with kgdb, I can see the machine running in:

(gdb) bt
#0  kgdb_connect (verbose=1) at /usr/src/sys/arch/i386/i386/kgdb_machdep.c:258
#1  0xc0231f46 in comintr (arg=0xc0ad1a00) at /usr/src/sys/dev/ic/com.c:2013
#2  0xc01033c4 in Xintr_ioapic3 ()
#3  0x3 in ?? ()


It looks like it is servicing the com1 interrupt. Just in case, I am attaching
the gdb single-stepping log.

Cannot access memory at address 0xd
(gdb) s
259                     printf("connected.\n");
(gdb) n
261             kgdb_debug_panic = 1;
(gdb) n
262     }
(gdb) n
comintr (arg=0xc0ad1a00) at /usr/src/sys/dev/ic/com.c:2014
2014                                    continue;
(gdb) s
2152                || ((iir & IIR_IMASK) == 0));
(gdb) bt
#0  comintr (arg=0xc0ad1a00) at /usr/src/sys/dev/ic/com.c:2152
#1  0xc01033c4 in Xintr_ioapic3 ()
#2  0x3 in ?? ()
Cannot access memory at address 0xd
(gdb) s
2159            if (ISSET(lsr, LSR_TXRDY)) {
(gdb) s
2164                    if (sc->sc_heldchange) {
(gdb) s
2172                    if (sc->sc_tbc > 0) {
(gdb) print sc->sc_heldchange
$1 = 0 '\000'
(gdb) print *sc
$2 = {sc_dev = {dv_class = DV_TTY, dv_list = {tqe_next = 0xc0aac380, 
      tqe_prev = 0xc0ad1c04}, dv_cfdata = 0xc06e208c, 
    dv_cfdriver = 0xc06df040, dv_cfattach = 0xc06f3ba0, dv_unit = 1, 
    dv_xname = "com1", '\000' <repeats 11 times>, dv_parent = 0xc0aac700, 
    dv_flags = 1}, sc_si = 0xc0ad2e80, sc_tty = 0xc88ae108, sc_diag_callout = {
    c_list = {cq_next = 0x0, cq_prev = 0x0}, c_func = 0, c_arg = 0x0, 
    c_time = 0, c_flags = 0}, sc_iobase = 760, sc_frequency = 1843200, 
  sc_iot = 0, sc_ioh = 760, sc_hayespioh = 0, sc_overflows = 0, sc_floods = 0, 
  sc_errors = 0, sc_hwflags = 160, sc_swflags = 0, sc_fifolen = 1, 
  sc_r_hiwat = 0, sc_r_lowat = 0, sc_rbget = 0xc0ad9000 "", 
  sc_rbput = 0xc0ad9000 "", sc_rbavail = 2048, sc_rbuf = 0xc0ad9000 "", 
  sc_ebuf = 0xc0ada000 "ďž­Ţ4 ­Ŕ௭Ŕďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţďž­Ţh ­Ŕ\004 ­Ŕ", sc_tba = 0x0, sc_tbc = 0, sc_heldtbc = 0, 
  sc_rx_flags = 0 '\000', sc_tx_busy = 0 '\000', sc_tx_done = 0 '\000', 
  sc_tx_stopped = 0 '\000', sc_st_check = 0 '\000', sc_rx_ready = 0 '\000', 
  sc_heldchange = 0 '\000', sc_msr = 0 '\000', sc_msr_delta = 0 '\000', 
  sc_msr_mask = 0 '\000', sc_mcr = 11 '\013', sc_mcr_active = 0 '\000', 
  sc_lcr = 0 '\000', sc_ier = 1 '\001', sc_fifo = 0 '\000', 
  sc_dlbl = 0 '\000', sc_dlbh = 0 '\000', sc_efr = 0 '\000', 
  sc_mcr_dtr = 0 '\000', sc_mcr_rts = 0 '\000', sc_msr_cts = 0 '\000', 
  sc_msr_dcd = 0 '\000', enable = 0, disable = 0, enabled = 1, 
  sc_ppsmask = 0 '\000', sc_ppsassert = 0 '\000', sc_ppsclear = 0 '\000', 
  ppsinfo = {assert_sequence = 0, clear_sequence = 0, assert_tu = {tspec = {
        tv_sec = 0, tv_nsec = 0}, ntplfp = {integral = 0, fractional = 0}, 
      longpair = {0, 0}}, clear_tu = {tspec = {tv_sec = 0, tv_nsec = 0}, 
      ntplfp = {integral = 0, fractional = 0}, longpair = {0, 0}}, 
    current_mode = 0}, ppsparam = {api_version = 0, mode = 0, assert_off_tu = {
      tspec = {tv_sec = 0, tv_nsec = 0}, ntplfp = {integral = 0, 
        fractional = 0}, longpair = {0, 0}}, clear_off_tu = {tspec = {
        tv_sec = 0, tv_nsec = 0}, ntplfp = {integral = 0, fractional = 0}, 
      longpair = {0, 0}}}, sc_lock = {lock_data = 1}}
(gdb) s
2183                            if (ISSET(sc->sc_ier, IER_ETXRDY)) {
(gdb) s
2187                            if (sc->sc_tx_busy) {
(gdb) s
2194            if (!ISSET((iir = bus_space_read_1(iot, ioh, com_iir)), IIR_NOPEND))
(gdb) n
2201            softintr_schedule(sc->sc_si);
(gdb) n
2217            return (1);
(gdb) s
0xc01033c4 in Xintr_ioapic3 ()
(gdb) s
Single stepping until exit from function Xintr_ioapic3, 
which has no line number information.
x86_intunlock (iframe={if_ppl = 7, if_gs = 4194320, if_fs = -1070596048, 
      if_es = 16, if_ds = 16, if_edi = -1066209792, if_esi = -1072689508, 
      if_ebp = -1065472308, if_ebx = 7, if_edx = -1066439360, if_ecx = 8, 
      if_eax = -1062550528, __if_trapno = 3, __if_err = 0, 
      if_eip = -1072689508, if_cs = 8, if_eflags = 535, if_esp = -1066209792, 
      if_ss = 7}) at /usr/src/sys/arch/x86/x86/intr.c:635
635             if (iframe.if_ppl < IPL_SCHED)
(gdb) s
636                     spinlockmgr(&kernel_lock, LK_RELEASE, 0);
(gdb) n
0xc01033d3 in Xintr_ioapic3 ()
(gdb) s
Single stepping until exit from function Xintr_ioapic3, 
which has no line number information.
warning: Cannot insert breakpoint 0:
Cannot access memory at address 0x7
(gdb) n
Single stepping until exit from function Xdoreti, 
which has no line number information.
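
gdb prints the iframe fields and several softc fields in the log above as signed decimals, which makes kernel addresses and port numbers hard to recognise. A small sketch for converting them back to unsigned 32-bit hex follows; the helper name `to_hex32` is hypothetical, and the sample values are copied from the log.

```python
def to_hex32(value):
    """Render a possibly negative 32-bit integer as unsigned hex."""
    return hex(value & 0xFFFFFFFF)


if __name__ == "__main__":
    # Values as printed by gdb in the session above.
    samples = {
        "if_eip": -1072689508,
        "if_esi": -1072689508,
        "sc_iobase": 760,
    }
    for name, value in samples.items():
        print(f"{name} = {to_hex32(value)}")
```

With this conversion, if_eip comes out as 0xc0100e9c and sc_iobase as 0x2f8, the latter matching the com1 port range reported at attach time. Since Xintr_ioapic3 has no debugging information, the iframe gdb shows there may not reflect the real trap frame.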

Here the machine hangs, with

kgdb: caught trap 0x6 at 0xc03b5799
kgdb: caught trap 0x1 at 0xc01009f9
kgdb: caught trap 0x1 at 0xc01009f9
kgdb: caught trap 0x1 at 0xc01009f9
kgdb: caught trap 0x1 at 0xc01009f9
kgdb: caught trap 0x1 at 0xc01009f9
kgdb: caught trap 0x1 at 0xc01009f9
kgdb: caught trap 0x1 at 0xc01009f9

and so on.
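
For reference, a rough sketch for putting names on the trap numbers in the kgdb lines above. It assumes kgdb reports the kernel's own i386 T_* trap types as defined in the trap.h header; the table covers only the first few values and should be checked against the source tree in use. The function name `annotate` is made up for illustration.

```python
import re

# Assumed NetBSD/i386 trap type numbers (T_* constants from trap.h).
# Only the first eight are listed; verify against your own tree.
TRAP_NAMES = {
    0: "T_PRIVINFLT (privileged instruction)",
    1: "T_BPTFLT (breakpoint trap)",
    2: "T_ARITHTRAP (arithmetic trap)",
    3: "T_ASTFLT (asynchronous system trap)",
    4: "T_PROTFLT (protection fault)",
    5: "T_TRCTRAP (trace trap)",
    6: "T_PAGEFLT (page fault)",
    7: "T_ALIGNFLT (alignment fault)",
}

KGDB_LINE = re.compile(r"kgdb: caught trap (0x[0-9a-fA-F]+) at (0x[0-9a-fA-F]+)")


def annotate(line):
    """Append the assumed trap name to a 'kgdb: caught trap' line."""
    match = KGDB_LINE.match(line)
    if match is None:
        return line  # not a kgdb trap line, leave untouched
    trap = int(match.group(1), 16)
    return f"{line}  [{TRAP_NAMES.get(trap, 'unknown')}]"


if __name__ == "__main__":
    print(annotate("kgdb: caught trap 0x6 at 0xc03b5799"))
    print(annotate("kgdb: caught trap 0x1 at 0xc01009f9"))
```

Under that assumption, the first line would be a page fault at 0xc03b5799, followed by repeated breakpoint traps at 0xc01009f9.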

Could some ioapic guru please take a look at this, or give me some hints?

Milos
--