Subject: kern/29839: kernel panic (with the help of pf) on diskless sun4m
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <fanch@enki.dyndns.org>
List: netbsd-bugs
Date: 03/30/2005 19:08:01
>Number: 29839
>Category: kern
>Synopsis: kernel panic (with the help of pf) on diskless sun4m
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Mar 30 19:08:00 +0000 2005
>Originator: fanch
>Release: NetBSD 3.99.1 (-current 26/03/05)
>Organization:
>Environment:
System: NetBSD bean.reso 3.99.1 NetBSD 3.99.1 (SUN4MC) #1: Tue Mar 29 21:56:44 CEST 2005 root@ptitordi:/usr/tmp/src/usr/src/sys/arch/sparc/compile/SUN4MC
Architecture: sparc
Machine: sparc
>Description:
On a diskless sun4m (SS10, SuperSparc II, 96 MB) used for setting up a
firewall, the kernel crashes a few seconds to a few minutes after enabling pf.
pf is effectively disabled: the ruleset only passes traffic
(pass in/out on le0 all).
The firewall is intended to run on a sun4c (SS1+, 32 MB), so the kernel is
stripped to a minimum and is sun4m/sun4c compatible.
A ddb session:
panic: lockmgr: no context
Stopped at netbsd:cpu_Debugger+0x4: or %o7, %g0, %g1
db> bt
cpu_Debugger(0xf01dd528, 0xf020b634, 0xf04e7000, 0x2, 0x100, 0xf0215000) at netbsd:lockmgr+0x28c
lockmgr(0xf022e2bc, 0x1, 0x0, 0xf048b2a8, 0x0, 0xf052605c) at netbsd:uvmfault_lookup+0x18c
uvmfault_lookup(0xf020b780, 0x0, 0xf020b6b0, 0xf04ac8f0, 0x0, 0xf0526048) at netbsd:uvm_fault+0x58
uvm_fault(0xf022e2b8, 0x0, 0x0, 0x1, 0x1236, 0xeb0) at netbsd:mem_access_fault4m+0x350
mem_access_fault4m(0x9, 0x326, 0x14, 0xf020b8e8, 0x0, 0xf006ec00) at 0xf0006408
0xf0006408(0x0, 0xf04e7000, 0xf020ba04, 0x0, 0x6, 0xf020b8e0) at netbsd:pfil4_wrapper+0x34
pfil4_wrapper(0x0, 0xf020ba04, 0xf04e7000, 0x1, 0xf04c46e0, 0xffff) at netbsd:pfil_run_hooks+0x88
pfil_run_hooks(0xf02263c0, 0xf020bac4, 0xf04e7000, 0x1, 0x0, 0xf0480c64) at netbsd:ip_input+0x228
ip_input(0xf0480c00, 0xf0480c00, 0x440, 0xeedb, 0x100, 0xf0223a98) at netbsd:ipintr+0x88
ipintr(0x0, 0xfe029010, 0x0, 0x6a23, 0x100, 0x400) at netbsd:softnet+0x78
softnet(0xf020bbb0, 0xf019fd68, 0x100, 0x408000e7, 0x37, 0x424ae261) at 0xf0006870
0xf0006870(0x1, 0x0, 0xf00e6fa8, 0xf0232000, 0x200d2950, 0x60) at netbsd:cpu_exit+0xb8
db> ps
PID PPID PGRP UID S FLAGS LWPS COMMAND WAIT
436 479 436 0 2 0x4002 1 ksh ttyin
479 478 479 1000 2 0x4102 1 su wait
478 467 478 1000 2 0x4002 1 ksh pause
467 471 467 0 2 0x4103 1 login wait
471 441 441 0 2 0x4000 1 telnetd poll
438 315 438 0 2 0x4002 1 tcpdump bpf
315 428 315 0 2 0x4002 1 ksh pause
428 1 428 0 2 0x4103 1 login wait
405 1 405 0 2 0 1 cron nanosle
441 1 441 0 2 0 1 inetd kqread
397 1 397 0 2 0x100 1 sendmail select
363 1 363 0 2 0 1 sshd select
317 1 317 0 2 0 1 ntpd pause
122 1 122 0 2 0 1 ifwatchd netio
120 1 120 0 2 0 1 syslogd kqread
9 0 0 0 2 0x20200 1 aiodoned aiodone
8 0 0 0 2 0x20200 1 ioflush syncer
7 0 0 0 2 0x20200 1 pagedaemon pgdaemo
6 0 0 0 2 0x20200 1 nfsio nfsidl
5 0 0 0 2 0x20200 1 nfsio nfsidl
4 0 0 0 2 0x20200 1 nfsio nfsidl
3 0 0 0 2 0x20200 1 nfsio nfsidl
2 0 0 0 2 0x20200 1 scsibus0 sccomp
1 0 1 0 2 0x4000 1 init wait
0 -1 0 0 2 0x20200 1 swapper schedul
db> show uvmexp
Current UVM status:
pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
23293 VM pages: 4539 active, 0 inactive, 889 wired, 16380 free
min 10% (25) anon, 10% (25) file, 5% (12) exec
max 80% (204) anon, 50% (128) file, 30% (76) exec
pages 2321 anon, 2028 file, 1417 exec
freemin=64, free-target=85, inactive-target=0, wired-max=7764
faults=83541, traps=42953, intrs=156516, ctxswitch=19434
softint=19057, syscalls=43360, swapins=0, swapouts=0
fault counts:
noram=0, noanon=0, pgwait=0, pgrele=0
ok relocks(total)=704(704), anget(retrys)=9669(0), amapcopy=5585
neighbor anon/obj pg=8034/60898, gets(lock/unlock)=26345/704
cases: anon=6264, anoncow=3405, obj=13427, prcopy=12918, przero=6190
daemon and swap counts:
woke=0, revs=0, scans=0, obscans=0, anscans=0
busy=0, freed=0, reactivate=0, deactivate=0
pageouts=0, pending=0, nswget=0
nswapdev=1, nanon=25929, nanonneeded=25929 nfreeanon=24000
swpages=4096, swpginuse=0, swpgonly=0 paging=0
db>
Looking at the code of cpu_exit, I see a curlwp=NULL in it, which is exactly
the condition that triggers the lockmgr panic message seen above. The offset
in cpu_exit is different at each panic; the other function offsets remain the
same.
In case it helps:
$ nm -n netbsd
[...]
f0006324 t memfault_sun4m
f0006444 t normal_mem_fault
[...]
f0006758 t softintr_common
f00068a0 T sparc_interrupt4m
[...]
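
The two unnamed frames in the backtrace (0xf0006408 and 0xf0006870) can be
matched against the nm excerpt above. The following is only a sketch: the
helper names parse_nm and resolve are made up for illustration, and the symbol
list is limited to the four lines quoted in this report, so it assumes no
other symbol sits between each pair of adjacent entries.

```python
import bisect

# Symbol lines copied from the `nm -n netbsd` excerpt in this report.
# The excerpt is partial, so results are only meaningful for addresses
# that fall between two entries that are adjacent in the excerpt.
NM_EXCERPT = """\
f0006324 t memfault_sun4m
f0006444 t normal_mem_fault
f0006758 t softintr_common
f00068a0 T sparc_interrupt4m
"""


def parse_nm(text):
    """Return a sorted list of (address, name) pairs from nm-style output."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip "[...]" markers and blank lines
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Map addr to 'symbol+0xoffset' using the nearest preceding symbol."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    base, name = symbols[i]
    return "%s+0x%x" % (name, addr - base)


SYMBOLS = parse_nm(NM_EXCERPT)
for raw in (0xF0006408, 0xF0006870):
    print("0x%x -> %s" % (raw, resolve(SYMBOLS, raw)))
```

With the excerpt above, this maps 0xf0006408 to memfault_sun4m+0xe4 and
0xf0006870 to softintr_common+0x118, i.e. the fault-handling path and the
soft interrupt path respectively.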
>How-To-Repeat:
Enable pf on a diskless host?
>Fix:
- interrupts not disabled?
- non-pageable text made pageable?