Subject: First real (non-crashing) problem with SMP kernel
To: None <tech-smp@netbsd.org>
From: Jeff Rizzo <riz@boogers.sf.ca.us>
List: tech-smp
Date: 08/31/2000 16:36:15
I've been running this dual-processor box for a couple of days now, and
it's been remarkably stable.  I *have* seen a few silent reboots that
appear to be softdep-related (similar to what Sean Doran reports), but no
major problems until this afternoon.

I was using scp to copy a binary file (a perl package tarfile) to another
host, and I get data corruption whenever I try to copy this particular
file from the multiprocessor box to any other machine.  I can copy it
between other boxes just fine, and to/from the multiprocessor box when it
is booted with a non-SMP kernel built from today's sources and configured
identically (except for the ioapic and CPU bits).

Here's what it looks like from the command line, and it's quite repeatable:


test12# scp perl-5.00503.tgz riz-test80:/tmp
root@riz-test80's password: 
perl-5.00503.tgz          |        392 KB | 196.0 kB/s | ETA: 00:00:14 |  12%Corrupted check bytes on input.
lost connection
test12# 

Data corruption concerns me, and it does seem to be related to the MP kernel.
I'm happy to help resolve this any way I can...
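
One way to narrow this down might be to take ssh and the network out of
the picture and see whether the file even reads back consistently on the
MP box.  The following is only a rough sketch: the function name
count_distinct_cksums is made up for illustration, and it assumes
cksum(1), sort(1), wc(1) and tr(1) are available.  It checksums the same
file several times and prints how many distinct results it saw.

```shell
# Hypothetical helper (not an existing tool): checksum FILE several
# times and print the number of distinct checksums observed.
# A result of 1 means every pass read back identical data.
count_distinct_cksums() {
    file=$1
    passes=${2:-5}      # number of read passes, default 5
    i=0
    while [ "$i" -lt "$passes" ]; do
        # Read via stdin so the file name is not part of the output.
        cksum < "$file"
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
}
```

For example, "count_distinct_cksums perl-5.00503.tgz 10" printing anything
other than 1 would point at local reads rather than the network path.  A
result of 1 would not rule the disk side out, though, since repeated reads
may simply be served from the buffer cache.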

The boot messages from the MP kernel are at the end of this message.  
This is the same info that I posted yesterday.

Thanks! 
+j
--

>> (riz@dhcp.nj.equinix.net, Wed Jan 26 15:32:41 PST 2000)
>> Memory: 636/261120 k
Use hd1a:netbsd to boot sd0 when wd0 is also installed
Press return to boot now, any other key for boot menu
booting wd0a:netbsd - starting in 0
2377674+150616+290896 [65+169024+140104]=0x2fd638
[ using 309648 bytes of netbsd ELF symbol table ]
Copyright (c) 1996, 1997, 1998, 1999, 2000
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

bank 0 has 60562 pages
bank 1 has 3067 pages
bank 2 has 155 pages
202 pool entries; page size 4096
WARNING: static pool `pmaptlbpl' dropped below low water mark
NetBSD 1.5E (TEST.MP) #1: Tue Aug 29 12:33:41 PDT 2000
    riz@hubba.boogers.sf.ca.us:/usr/work/netbsd/src/sys/arch/i386/compile/TEST.MP
total memory = 255 MB
avail memory = 230 MB
using 3297 buffers containing 13188 KB of memory
biostramp installed @ 1000
BIOS32 rev. 0 found at 0xfdb50
PCI BIOS rev. 2.1 found at 0xfdb71
PCI BIOS has 12 Interrupt Routing table entries
mainbus0 (root)
mainbus0: scanning 0x9f000 to 0x9f3f0 for MP signature
mainbus0: scanning 0x9ec00 to 0x9eff0 for MP signature
mainbus0: scanning 0xf0000 to 0xffff0 for MP signature
mainbus0: MP floating pointer found in bios at 0xfb560
mainbus0: MP config table at 0xf6470, 284 bytes long
mainbus0: Intel MP Specification (Version 1.1)
mainbus0: MP OEM INTEL    Product 440GX       
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: family 6 model 8 step 1
cpu0: Intel Pentium III (E) (686-class)
cpu0: features 387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 387fbff<PGE,MCA,CMOV,FGPAT,PSE36,PN,MMX,FXSR,XMM>
cpu0: calibrating local timer
cpu0: apic clock running at 100 MHz
cpu0: kstack at 0xd24ca000 for 8192 bytes
cpu0: idle pcb at 0xd24ca000, idle sp at 0xd24cbfa0
cpu1 at mainbus0: apid 1 (application processor)
cpu1: family 6 model 7 step 3
cpu1: Intel Pentium III (686-class)
cpu1: features 387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu1: features 387fbff<PGE,MCA,CMOV,FGPAT,PSE36,PN,MMX,FXSR,XMM>
cpu1: kstack at 0xd24cc000 for 8192 bytes
cpu1: idle pcb at 0xd24cc000, idle sp at 0xd24cdfa0
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000, virtual wire mode, version 11, 24 pins
ioapic0: int0 attached to ExtINT (type 3<type=3=ExtINT> flags 0<pol=0,trig=0>)
ioapic0: int1 attached to isa0 irq 1 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int2 attached to isa0 irq 0 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int3 attached to isa0 irq 3 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int4 attached to isa0 irq 4 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int6 attached to isa0 irq 6 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int7 attached to isa0 irq 7 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int8 attached to isa0 irq 8 (type 0<type=0> flags 5<pol=1=Act Hi,trig=1=Edge>)
ioapic0: int12 attached to isa0 irq 12 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int13 attached to isa0 irq 13 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int14 attached to isa0 irq 14 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int15 attached to isa0 irq 15 (type 0<type=0> flags 0<pol=0,trig=0>)
ioapic0: int16 attached to pci0 device 14 INT_A (type 0<type=0> flags f<pol=3=Act Lo,trig=3=Level>)
ioapic0: int17 attached to pci0 device 11 INT_A (type 0<type=0> flags f<pol=3=Act Lo,trig=3=Level>)
ioapic0: int18 attached to pci0 device 12 INT_A (type 0<type=0> flags f<pol=3=Act Lo,trig=3=Level>)
ioapic0: int18 attached to pci0 device 11 INT_B (type 0<type=0> flags f<pol=3=Act Lo,trig=3=Level>)
ioapic0: int19 attached to pci0 device 13 INT_A (type 0<type=0> flags f<pol=3=Act Lo,trig=3=Level>)
ioapic0: int19 attached to pci0 device 7 INT_D (type 0<type=0> flags f<pol=3=Act Lo,trig=3=Level>)
ioapic0: int23 attached to SMI (type 2<type=2=SMI> flags 0<pol=0,trig=0>)
local apic: int0 attached to ExtINT (type 3<type=3=ExtINT> flags 0<pol=0,trig=0>)
local apic: int1 attached to NMI (type 1<type=1=NMI> flags 0<pol=0,trig=0>)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: Intel product 0x71a0 (rev. 0x00)
ppb0 at pci0 dev 1 function 0: Intel product 0x71a1 (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
pcib0 at pci0 dev 7 function 0
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
pciide0 at pci0 dev 7 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
atapibus0 at pciide0 channel 0
cd0 at atapibus0 drive 0: <CD-540E, , 1.0A> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2
pciide0: primary channel interrupting at irq 14
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
pciide0: disabling secondary channel (no drives)
uhci0 at pci0 dev 7 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
uhci0: interrupting at apic 2 int 19 (irq 9)
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 7 function 3 not configured
siop0 at pci0 dev 11 function 0: Symbios Logic 53c896 (ultra2-wide scsi)
siop0: can't map on-board RAM
siop0: interrupting at apic 2 int 17 (irq 5)
scsibus0 at siop0: 16 targets, 8 luns per target
siop1 at pci0 dev 11 function 1: Symbios Logic 53c896 (ultra2-wide scsi)
siop1: can't map on-board RAM
siop1: interrupting at apic 2 int 18 (irq 11)
scsibus1 at siop1: 16 targets, 8 luns per target
eap0 at pci0 dev 12 function 0: Ensoniq AudioPCI 97 (rev. 0x08)
eap0: interrupting at apic 2 int 18 (irq 11)
eap0: Crystal CS4297 codec; headphone, 18 bit DAC, 18 bit ADC, no 3D stereo
audio0 at eap0: full duplex, mmap, independent
midi0 at eap0: AudioPCI MIDI UART
fxp0 at pci0 dev 13 function 0: Intel i82557 Ethernet, rev 8
fxp0: interrupting at apic 2 int 19 (irq 9)
fxp0: detected 64 word EEPROM
fxp0: Ethernet address 00:e0:81:10:93:d7, 10/100 Mb/s
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vga1 at pci0 dev 14 function 0: ATI Technologies Mach64 GV (rev. 0x3a)
wsdisplay0 at vga1
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbdprobe: reset error 5
pmsprobe: reset error 5
pmsiprobe: reset error 5
lm0 at isa0 port 0x290-0x297: W83782D
lpt0 at isa0 port 0x378-0x37b irq 7
pcppi0 at isa0 port 0x61
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
biomask 0 netmask 0 ttymask 0
cpu0: prelint0 700<vector=0,delmode=7,dest=0> 0<target=0>
cpu0: prelint1 400<vector=0,delmode=4,dest=0> 0<target=0>
cpu0: timer0 300d0<vector=d0,delmode=0,masked,dest=0> 0<target=0>
cpu0: pcint0 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
cpu0: lint0 10700<vector=0,delmode=7,masked,dest=0> 0<target=0>
cpu0: lint1 400<vector=0,delmode=4,dest=0> 0<target=0>
cpu0: err0 1000f<vector=f,delmode=0,masked,dest=0> 0<target=0>
ioapic0: enabling
ioapic0: int0 10700<vector=0,delmode=7,masked,dest=0> 0<target=0>
ioapic0: int1 10100<vector=0,delmode=1,masked,dest=0> 0<target=0>
ioapic0: int2 10100<vector=0,delmode=1,masked,dest=0> 0<target=0>
ioapic0: int3 1e1<vector=e1,delmode=1,dest=0> 0<target=0>
ioapic0: int4 1e2<vector=e2,delmode=1,dest=0> 0<target=0>
ioapic0: int5 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int6 170<vector=70,delmode=1,dest=0> 0<target=0>
ioapic0: int7 1a0<vector=a0,delmode=1,dest=0> 0<target=0>
ioapic0: int8 10100<vector=0,delmode=1,masked,dest=0> 0<target=0>
ioapic0: int9 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int10 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int11 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int12 10100<vector=0,delmode=1,masked,dest=0> 0<target=0>
ioapic0: int13 10100<vector=0,delmode=1,masked,dest=0> 0<target=0>
ioapic0: int14 171<vector=71,delmode=1,dest=0> 0<target=0>
ioapic0: int15 10100<vector=0,delmode=1,masked,dest=0> 0<target=0>
ioapic0: int16 1a100<vector=0,delmode=1,actlo,level,masked,dest=0> 0<target=0>
ioapic0: int17 a172<vector=72,delmode=1,actlo,level,dest=0> 0<target=0>
ioapic0: int18 a1c0<vector=c0,delmode=1,actlo,level,dest=0> 0<target=0>
ioapic0: int19 a181<vector=81,delmode=1,actlo,level,dest=0> 0<target=0>
ioapic0: int20 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int21 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int22 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
ioapic0: int23 200<vector=0,delmode=2,dest=0> 0<target=0>
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 0 lun 0: <SEAGATE, ST318275LW, 0001> SCSI2 0/direct fixed
siop0: target 0 using 16bit transfers
siop0: target 0 now synchronous at 40.0Mhz, offset 15
sd0: 17366 MB, 11721 cyl, 10 head, 303 sec, 512 bytes/sect x 35566480 sectors
scsibus1: waiting 2 seconds for devices to settle...
IPsec: Initialized Security Association Processing.
boot device: sd0
root on sd0a dumps on sd0b
mountroot: trying msdos...
mountroot: trying cd9660...
mountroot: trying nfs...
mountroot: trying ffs...
root file system type: ffs
cpu1: starting
cpu1: prelint0 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
cpu1: prelint1 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
cpu1: timer0 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
cpu1: pcint0 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
cpu1: lint0 10700<vector=0,delmode=7,masked,dest=0> 0<target=0>
cpu1: lint1 400<vector=0,delmode=4,dest=0> 0<target=0>
cpu1: err0 10000<vector=0,delmode=0,masked,dest=0> 0<target=0>
cpu1: CPU 1 running
init: copying out path `/sbin/init' 11
swapctl: adding /dev/sd0b as swap device at priority 0
Automatic boot in progress: starting file system checks.
/dev/rsd0a: file system is clean; not checking
/dev/rsd0e: file system is clean; not checking
Setting tty flags.
Setting securelevel: kern.securelevel: 0 -> 1
Setting sysctl variables:
Starting network.
Hostname: test12
add net 127.0.0.0: gateway 127.0.0.1
Configuring network interfaces:.
add net fe80::: gateway ::1
add net fec0::: gateway ::1
add net ::ffff:0.0.0.0: gateway ::1
add net ::224.0.0.0: gateway ::1
add net ::127.0.0.0: gateway ::1
add net ::0.0.0.0: gateway ::1
add net ::255.0.0.0: gateway ::1
add net 2002:e000::: gateway ::1
add net 2002:7f00::: gateway ::1
add net 2002:0000::: gateway ::1
add net 2002:ff00::: gateway ::1
add net ::0.0.0.0: gateway ::1
IPv6 mode: host
Starting syslogd.
Checking for core dump...
savecore: no core dump
Starting rpcbind.
Mounting all filesystems...
Checking quotas: done.
Starting mountd.
Starting nfsd.
Building databases...
Clearing /tmp.
starting local daemons: sshd.
Updating motd.
Starting ntpd.
Starting inetd.
Starting dhclient.
Internet Software Consortium DHCP Client V3.0b2pl0-20000708
Copyright 1995-2000 Internet Software Consortium.
All rights reserved.

Please contribute if you find this software useful.
For info, please visit http://www.isc.org/dhcp-contrib.html

Listening on BPF/fxp0/00:e0:81:10:93:d7
Sending on   BPF/fxp0/00:e0:81:10:93:d7
Sending on   Socket/fallback
DHCPREQUEST on fxp0 to 255.255.255.255 port 67
DHCPACK from 172.16.3.254
New Network Number: 172.16.2.0
New Broadcast Address: 172.16.3.255
bound to 172.16.2.188 -- renewal in 3127 seconds.
Starting cron.
Tue Aug 29 15:27:29 PDT 2000

NetBSD/i386 (test12) (tty00)

login:
-- 
Jeff Rizzo                                         http://boogers.sf.ca.us/~riz