Subject: kern/37400: panic in ath_rate_findrate(): ndx is 0
To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org
From: j+nbsd@2007.salmi.ch
List: netbsd-bugs
Date: 11/17/2007 14:10:01
>Number:         37400
>Category:       kern
>Synopsis:       panic in ath_rate_findrate(): ndx is 0
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Nov 17 14:10:01 +0000 2007
>Originator:     Jukka Salmi
>Release:        NetBSD 4.0_RC4
>Environment:
System: NetBSD clam.salmi.ch 4.0_RC4 NetBSD 4.0_RC4 (CLAM) #0: Fri Nov  9 21:40:09 UTC 2007  root@moray.salmi.ch:/b/build/nbsd/4/i386/sys/arch/i386/compile/CLAM i386
Architecture: i386
Machine: i386
Some sysctl settings:
net.link.ieee80211.vap0.parent = ath0
hw.ath.dwell = 200
hw.ath.calibrate = 30
hw.ath.outdoor = 1
hw.ath.countrycode = 0
hw.ath.regdomain = 0
hw.ath.debug = 0
hw.ath.rxbuf = 40
hw.ath.txbuf = 100
hw.ath.hal.version = 0.9.17.2
hw.ath.hal.dma_brt = 2
hw.ath.hal.sw_brt = 10
hw.ath.hal.swba_backoff = 0
hw.ath0.smoothing_rate = 95
hw.ath0.sample_rate = 10
hw.ath0.countrycode = 0
hw.ath0.debug = 0
hw.ath0.slottime = 9
hw.ath0.acktimeout = 48
hw.ath0.ctstimeout = 48
hw.ath0.softled = 0
hw.ath0.ledpin = 0
hw.ath0.ledon = 0
hw.ath0.ledidle = 270
hw.ath0.txantenna = 0
hw.ath0.rxantenna = 1
hw.ath0.diversity = 1
hw.ath0.txintrperiod = 5
hw.ath0.diag = 0
hw.ath0.tpscale = 0
hw.ath0.tpc = 0
hw.ath0.tpack = 63
hw.ath0.tpcts = 63
hw.ath0.regdomain = 0

>Description:
This system acts mainly as a WLAN access point and routes traffic
between three IPv4 networks. Once or twice a week it panics as
described below. After slightly modifying sys/dev/ic/athrate-sample.c
and waiting for the next panic, I found that both ndx and
sn->num_rates were indeed zero.
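
To illustrate why ndx == 0 together with sn->num_rates == 0 is fatal:
if the check that fires requires ndx to be below sn->num_rates (an
assumption, the source line is not quoted here), then no index can pass
it for a node without rates. The sketch below is not the driver code
and not a proposed patch. The struct and function names are
hypothetical stand-ins, and only the field named in this report
(num_rates) is modeled. It shows a selection routine that reports
"no rates" to its caller instead of asserting.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the per-node rate state.  Only num_rates
 * comes from this report; current_rate is invented for illustration.
 */
struct sample_node_sketch {
	int num_rates;		/* negotiated rates; 0 in the panic case */
	int current_rate;	/* index chosen last time */
};

/*
 * Return a rate index in [0, num_rates), or -1 if the node has no
 * rates at all, so the caller can drop or defer the frame.
 */
static int
pick_rate_index(const struct sample_node_sketch *sn)
{
	if (sn == NULL || sn->num_rates <= 0)
		return -1;	/* nothing valid to pick */
	if (sn->current_rate < 0 || sn->current_rate >= sn->num_rates)
		return 0;	/* stale index: fall back to lowest rate */
	return sn->current_rate;
}
```

A caller in this sketch would treat -1 as "skip this frame for now".
Whether that is the right behaviour for the real driver, or whether the
node should never reach the transmit path with an empty rate set, is
the open question of this report.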

>How-To-Repeat:
I have not yet found a way to trigger the panic deliberately. When it
occurs, the panic message is `panic: ndx is 0', and ddb shows:

panic: ndx is 0
Stopped at      netbsd:cpu_Debugger+0x4:        popl    %ebp
db> bt
cpu_Debugger(c0ee88c0,0,0,c0ee8854,0) at netbsd:cpu_Debugger+0x4
panic(c026dee7,0,0,0,20) at netbsd:panic+0x12b
ath_rate_findrate(c10e5000,c10f4000,0,108,c0ee896f) at netbsd:ath_rate_findrate+0x3de
ath_start(c10e503c,c1185700,2,5,c1185700) at netbsd:ath_start+0x941
ifq_enqueue(c10e503c,c118dc00,c10e5160,c10e5160,c0ee8a6c) at netbsd:ifq_enqueue+0xb5
ether_output(c10e503c,c1185700,c0e962f8,c1048908,0) at netbsd:ether_output+0x3bc
ip_output(c118dc00,0,c0e962f4,1,0) at netbsd:ip_output+0x996
ip_forward(c118dc00,1,8004e39,8004e39,c0ee8b10) at netbsd:ip_forward+0x1b2
ip_input(c118dc00,0,c0ee8b48,c012202e,c106ea80) at netbsd:ip_input+0x4fb
ipintr(c0ee0010,30,10,c0ee0010,c0ee8ce0) at netbsd:ipintr+0x59
DDB lost frame for netbsd:Xsoftnet+0x41, trying 0xc0ee8b50
Xsoftnet() at netbsd:Xsoftnet+0x41
--- interrupt ---
0x246:
db> show registers
ds          0x10
es          0x10
fs          0x30
gs          0x10
edi         0xc10f41f8
esi         0xc026dee7  copyright+0x9d67
ebp         0xc0ee8828  _prop_array_pool+0x46e28
ebx         0
edx         0x7
ecx         0x286
eax         0x1
eip         0xc0205411  cpu_Debugger+0x4
cs          0x8
eflags      0x246
esp         0xc0ee8828  _prop_array_pool+0x46e28
ss          0x10
netbsd:cpu_Debugger+0x4:        popl    %ebp
db> reboot
syncing disks... sip1: receive ring overrun
sip0: receive ring overrun
done

>Fix:
...would be most appreciated.