Subject: kern/34751: regular panics in tcp_sack_option on NetBSD/alpha 3.0_STABLE
To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
From: Eric Schnoebelen <eric@cirr.com>
List: netbsd-bugs
Date: 10/08/2006 01:50:01
>Number: 34751
>Category: kern
>Synopsis: panics in tcp_sack_option on NetBSD/alpha 3.0_STABLE
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Oct 08 01:50:00 +0000 2006
>Originator: Eric Schnoebelen
>Release: NetBSD 3.0_STABLE
>Organization:
>Environment:
System: NetBSD milo.cirr.com 3.0_STABLE NetBSD 3.0_STABLE (Milo: based on ALPHA-$Revision: 1.202.2.3 $) #2: Wed Jul 26 08:30:51 CDT 2006 root@milo.cirr.com:/usr/src/sys/arch/alpha/compile/MILO alpha
Architecture: alpha
Machine: alpha
>Description:
I'm running NetBSD/alpha on an assortment of alpha
hardware, but mostly DS10L's. One of them, running 3.0_STABLE
(circa 26 July 2006), is seeing the following panics on a
semi-regular basis, most of them in tcp_sack_option():
[-- eric@localhost attached -- Tue Sep 26 19:09:14 2006]
db> bt
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
trap() at netbsd:trap+0x120
XentUna() at netbsd:XentUna+0x20
--- unaligned access fault (from ipl 1) ---
tcp_sack_option() at netbsd:tcp_sack_option+0x13c
tcp_dooptions() at netbsd:tcp_dooptions+0x278
tcp_input() at netbsd:tcp_input+0xa20
ip_input() at netbsd:ip_input+0xb4c
ipintr() at netbsd:ipintr+0xa0
netintr() at netbsd:netintr+0x158
softintr_dispatch() at netbsd:softintr_dispatch+0x160
exception_return() at netbsd:exception_return+0x7c
--- root of call graph ---
[-- eric@localhost attached -- Mon Aug 21 00:39:48 2006]
db> bt
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
pool_get() at netbsd:pool_get+0x1b8
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x170
pmap_lev1map_create() at netbsd:pmap_lev1map_create+0x80
pmap_create() at netbsd:pmap_create+0xe4
uvmspace_init() at netbsd:uvmspace_init+0xa8
uvmspace_alloc() at netbsd:uvmspace_alloc+0x58
uvmspace_exec() at netbsd:uvmspace_exec+0x54
sys_execve() at netbsd:sys_execve+0x6e0
syscall_plain() at netbsd:syscall_plain+0xc4
XentSys() at netbsd:XentSys+0x5c
--- syscall (59) ---
--- user mode ---
[-- eric@localhost attached -- Thu Aug 10 22:37:52 2006]
db> bt
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
trap() at netbsd:trap+0x120
XentUna() at netbsd:XentUna+0x20
--- unaligned access fault (from ipl 1) ---
tcp_sack_option() at netbsd:tcp_sack_option+0x13c
tcp_dooptions() at netbsd:tcp_dooptions+0x278
tcp_input() at netbsd:tcp_input+0xa20
ip_input() at netbsd:ip_input+0xb4c
ipintr() at netbsd:ipintr+0xa0
netintr() at netbsd:netintr+0x158
softintr_dispatch() at netbsd:softintr_dispatch+0x160
exception_return() at netbsd:exception_return+0x7c
--- root of call graph ---
[-- eric@localhost attached -- Thu Aug 10 14:33:47 2006]
db> bt
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
trap() at netbsd:trap+0x120
XentUna() at netbsd:XentUna+0x20
--- unaligned access fault (from ipl 1) ---
tcp_sack_option() at netbsd:tcp_sack_option+0x13c
tcp_dooptions() at netbsd:tcp_dooptions+0x278
tcp_input() at netbsd:tcp_input+0xa20
ip_input() at netbsd:ip_input+0xb4c
ipintr() at netbsd:ipintr+0xa0
netintr() at netbsd:netintr+0x158
softintr_dispatch() at netbsd:softintr_dispatch+0x160
exception_return() at netbsd:exception_return+0x7c
--- root of call graph ---
[-- eric@localhost attached -- Mon Jul 24 17:52:22 2006]
db> bt
cpu_Debugger() at netbsd:cpu_Debugger+0x4
panic() at netbsd:panic+0x1f8
trap() at netbsd:trap+0x120
XentUna() at netbsd:XentUna+0x20
--- unaligned access fault (from ipl 1) ---
tcp_sack_option() at netbsd:tcp_sack_option+0x13c
tcp_dooptions() at netbsd:tcp_dooptions+0x278
tcp_input() at netbsd:tcp_input+0xa20
ip_input() at netbsd:ip_input+0xb4c
ipintr() at netbsd:ipintr+0xa0
netintr() at netbsd:netintr+0x158
softintr_dispatch() at netbsd:softintr_dispatch+0x160
exception_return() at netbsd:exception_return+0x7c
--- root of call graph ---
dmesg:
Loaded initial symtab at 0xfffffc0000b9ccc0, strtab at 0xfffffc0000c0ec50, # entries 19369
consinit: not using prom console
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 3.0_STABLE (Milo: based on ALPHA-$Revision: 1.202.2.3 $) #2: Wed Jul 26 08:30:51 CDT 2006
root@milo.cirr.com:/usr/src/sys/arch/alpha/compile/MILO
AlphaServer DS10L 617 MHz, s/n AY10605785
8192 byte page size, 1 processor.
total memory = 1024 MB
(2912 KB reserved for PROM, 1021 MB used by NetBSD)
avail memory = 993 MB
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21264A-9
cpu0: VAX FP support, IEEE FP support, Primary Eligible
cpu0: Architecture extensions: 307<PAT,MVI,CIX,FIX,BWX>
tsc0 at mainbus0: 21272 Core Logic Chipset, Cchip rev 0
tsc0: 2 Dchips, 1 memory bus of 16 bytes
tsc0: arrays present: 1024MB (split), 0MB, 0MB, 0MB, Dchip 0 rev 1
tsp0 at tsc0
pci0 at tsp0 bus 0
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
sio0 at pci0 dev 7 function 0: Acer Labs M1543 PCI-ISA Bridge (rev. 0xc3)
tlp0 at pci0 dev 9 function 0: DECchip 21143 Ethernet, pass 4.1
tlp0: interrupting at dec 6600 irq 29
tlp0: Ethernet address 00:10:64:30:1c:a9
tlp0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlp1 at pci0 dev 11 function 0: DECchip 21143 Ethernet, pass 4.1
tlp1: interrupting at dec 6600 irq 30
tlp1: Ethernet address 00:10:64:30:1c:ab
tlp1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
aceride0 at pci0 dev 13 function 0
aceride0: Acer Labs M5229 UDMA IDE Controller (rev. 0xc1)
aceride0: bus-master DMA support present
aceride0: primary channel wired to compatibility mode
aceride0: primary channel interrupting at isa irq 14
atabus0 at aceride0 channel 0
aceride0: secondary channel wired to compatibility mode
aceride0: secondary channel interrupting at isa irq 15
atabus1 at aceride0 channel 1
isa0 at sio0
lpt0 at isa0 port 0x3bc-0x3bf irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbdprobe: reset error 5
pmsprobe: reset error 5
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
isabeep0 at pcppi0
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
mcclock0 at isa0 port 0x70-0x71: mc146818 or compatible
raidattach: Asked for 8 units
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
wd0 at atabus0 drive 0: <Maxtor 53073H4>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 28629 MB, 58168 cyl, 16 head, 63 sec, 512 bytes/sect x 58633344 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
wd1 at atabus1 drive 0: <Maxtor 6B250R0>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 233 GB, 486344 cyl, 16 head, 63 sec, 512 bytes/sect x 490234752 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd1(aceride0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
Searching for RAID components...
stray isa irq 14
stray isa irq 15
root on wd0a dumps on wd0b
mountroot: trying nfs...
mountroot: trying msdos...
mountroot: trying cd9660...
wd0: transfer error, downgrading to Ultra-DMA mode 1
wd0(aceride0:0:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA)
wd0a: DMA error reading fsbn 64 of 64-67 (wd0 bn 64; cn 0 tn 1 sn 1), retrying
stray isa irq 14
wd0: soft error (corrected)
mountroot: trying lfs...
mountroot: trying ffs...
root file system type: ffs
readclock: 6/9/27/0/17/44=>1159316264 (1159311193)
init: copying out path `/sbin/init' 11
stray isa irq 15
wd1: transfer error, downgrading to Ultra-DMA mode 1
wd1(aceride0:1:0): using PIO mode 4, Ultra-DMA mode 1 (using DMA)
wd1a: DMA error reading fsbn 16 of 16-31 (wd1 bn 16; cn 0 tn 0 sn 16), retrying
stray isa irq 15
stray isa irq 15
wd1: soft error (corrected)
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
aceride0:1:0: lost interrupt
type: ata tc_bcount: 8192 tc_skip: 0
aceride0:1:0: bus-master DMA error: missing interrupt, status=0x21
wd1: transfer error, downgrading to PIO mode 4
wd1(aceride0:1:0): using PIO mode 4
wd1f: DMA error reading fsbn 16 of 16-31 (wd1 bn 280132624; cn 312648 tn 0 sn 16), retrying
stray isa irq 15
stray isa irq 15
wd1: soft error (corrected)
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
stray isa irq 15
setclock: 6/9/27/0/59/47
stray isa irq 15
stray isa irq 15
>How-To-Repeat:
Let the machine run under network load?
>Fix:
Simon Burge says:
This looks like it happened in netinet/tcp_sack.c at:
        for (i = 0; i < num_sack_blks; i++, lp += 2) {
                memcpy(&left, lp, sizeof(*lp));
                memcpy(&right, lp + 1, sizeof(*lp));
--->            left = ntohl(left);
                right = ntohl(right);
Disassembly of tcp_sack.o shows:
../../../../netinet/tcp_sack.c:225
168: a2 09 e4 43 cmplt zero,t3,t1
../../../../netinet/tcp_sack.c:224
16c: 8f 0c 61 44 cmovle t2,t0,fp
../../../../netinet/tcp_sack.c:225
170: 0e 04 ff 47 clr s5
174: 20 00 40 e4 beq t1,1f8 <tcp_sack_option+0x1b8>
../../../../netinet/tcp_sack.c:228
178: 00 00 0c a2 ldl a0,0(s3)
../../../../netinet/tcp_sack.c:227
17c: 04 00 2c a1 ldl s0,4(s3)
I think gcc is optimising the memcpy() calls away and emitting plain
ldl instructions, which trap when lp is not 4-byte aligned. We probably
need some sort of qualifier on a variable somewhere?
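
One common way to keep the compiler from assuming alignment is to do the
copy through a byte pointer rather than a 32-bit pointer. The sketch
below illustrates that idea only; it is not the code from tcp_sack.c and
not necessarily the fix that was or will be committed. The helper names
load_be32() and sack_block_decode() are hypothetical, and the layout
(pairs of 32-bit big-endian left/right edges) is assumed from the loop
quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* ntohl() */

/*
 * Hypothetical helper (not from tcp_sack.c): read one 32-bit
 * network-order value from a byte pointer that may have any alignment.
 * Because p is a byte pointer, the compiler cannot assume 4-byte
 * alignment when it inlines the memcpy().
 */
static uint32_t
load_be32(const unsigned char *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return ntohl(v);
}

/*
 * Hypothetical helper: decode the i-th SACK block (left edge followed
 * by right edge, 8 bytes per block) from raw option data.
 */
static void
sack_block_decode(const unsigned char *opt, size_t i,
    uint32_t *left, uint32_t *right)
{
	*left = load_be32(opt + 8 * i);
	*right = load_be32(opt + 8 * i + 4);
}
```

With a u_int32_t pointer, the compiler is entitled to assume the
pointee is naturally aligned, which would explain why the memcpy()
turned into a bare ldl. Keeping the pointer byte-typed until after the
copy avoids giving it that assumption.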