Subject: reaper removal fallout in pmap
To: None <port-sparc64@netbsd.org>
From: john heasley <heas@shrubbery.net>
List: port-sparc64
Date: 02/09/2004 19:04:09
I have a problem where exiting processes (e.g. tset from a single-user shell)
are triggering a panic in ctx_alloc().  This appears to result from
the reaper removal.

As best I can tell, the exiting process's ctx is removed as its pmap is
switched to proc0's in preparation for the pmap removal.  The panic is in
ctx_alloc(), a KASSERT that the pmap is not the kernel pmap.

panic: kernel diagnostic assertion "pm != pmap_kernel()" failed: file "../../../../arch/sparc64/sparc64/pmap.c", line 3124
kdb breakpoint at 115c278
Stopped in pid 9.1 (tset) at    netbsd:cpu_Debugger+0x4:        nop
db> tra
__assert(1205690, 1239e58, c34, 123aab8, 1ca1f40, 0) at netbsd:__assert+0x18
ctx_alloc(1897960, 1211f98, 356, 0, 40100, f) at netbsd:ctx_alloc+0x1e8
pmap_activate_pmap(1897960, 400006, 6, 1212d28, 140, 1823c00) at netbsd:pmap_act
ivate_pmap+0x14
uvm_proc_exit(f4e44a0, 0, 0, 1, 12, 800) at netbsd:uvm_proc_exit+0x34
exit1(f4e8800, 0, 402060e0, ffffffffffffd7c8, 12, ffffffffffffd4c0) at netbsd:ex
it1+0x2e8
sys_exit(0, f547dd0, f547dc0, 40700000, 9182009200, ff000000000000) at netbsd:sy
s_exit+0x38
syscall(f547ed0, 1, 40730654, 800, 1875800, 0) at netbsd:syscall+0x2d4
?(0, 81c06000, ffffffffffffdaa8, 2080, 0, 800) at 0x1009614

db> ps
 PID           PPID     PGRP        UID S   FLAGS LWPS          COMMAND    WAIT
>9                8        8          0 2  0x6002    1             tset
 8                1        8          0 2  0x4002    1               sh

db> mach ctx
process 0xf4d9680:pid:9 pmap:0x1897960 ctx:0
        lwp 0xf4e8800: lid:1 tf:0xf547ed0 fpstate 0x4211800 lastcall:cpu_lwp_for
k()
process 0x188bd80:pid:0 pmap:0x1897960 ctx:0
        lwp 0x188c058: lid:1 tf:0x0 fpstate 0x0 lastcall:cpu_lwp_fork()
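
The sequence described above can be sketched as a toy model in C.  This is
not the NetBSD code: the toy_* names, the struct layout, and the idea that
activation treats "ctx == 0" as "needs a context" are all assumptions made
for illustration.  The only facts taken from the trace are that pid 9 ends
up on the same pmap as proc0 with ctx 0, and that ctx_alloc() asserts the
pmap is not the kernel pmap.  The guarded variant only marks where such a
check would sit; it is not a tested or proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not NetBSD's. */
struct toy_pmap {
	int  ctx;        /* 0 means "no context yet" for a user pmap */
	bool is_kernel;  /* the kernel pmap permanently owns context 0 */
};

static int next_ctx = 1;  /* next free user context number */

/* Stand-in for ctx_alloc(): returns -1 where the real code would panic. */
static int toy_ctx_alloc(struct toy_pmap *pm)
{
	assert(pm != NULL);
	if (pm->is_kernel)
		return -1;  /* models KASSERT(pm != pmap_kernel()) firing */
	pm->ctx = next_ctx++;
	return pm->ctx;
}

/* Naive activation (assumed): ctx == 0 is read as "allocate one". */
static int toy_activate_naive(struct toy_pmap *pm)
{
	if (pm->ctx == 0)
		return toy_ctx_alloc(pm);
	return pm->ctx;
}

/* Guarded activation: the kernel pmap keeps context 0, nothing to allocate. */
static int toy_activate_guarded(struct toy_pmap *pm)
{
	if (pm->is_kernel)
		return 0;
	return toy_activate_naive(pm);
}
```

Calling toy_activate_naive() on a pmap with is_kernel set and ctx 0 takes the
same path as the backtrace (activate, then allocate, then the assertion),
while an ordinary user pmap with ctx 0 simply gets a fresh context.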

I'm not sure how to fix this.  I'd have a go at it, but can't do it right
now.

btw, a commit to pmap.c a few days ago suggested that cpu_switch() would
re-allocate a ctx if the pmap remove (possibly other things?) were
interrupted.  A novice I am, but cpu_switch() doesn't appear to do that if
it's using proc0's.