Subject: Re: kern/22080: switching with held simple_lock
To: NetBSD GNATS submissions and followups <gnats-bugs@gnats.netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: port-alpha
Date: 08/25/2003 19:03:08
I finally got a custom kernel built with both the MULTIPROCESSOR and
LOCKDEBUG options (my alpha was powered off during the mini Ontario
energy crisis last week :-).

I had been doing a wee bit of compiling and messing around under the
new kernel, and while I was manually copying some files from NFS to
local disk it just hung, rock solid.

I think this may relate to PR#22080, though given what I've been
reading about LOCKDEBUG I find it surprising that no error message
was printed on the console.

(I still haven't looked into why it doesn't see BREAK on the serial
console, but luckily you can halt it from the RCM.)

Aug 25 18:36:54 building su: woods to root on /dev/ttyp0
[halt sent]

RCM>status

Firmware Rev: V1.1
Escape Sequence: ^]^]RCM
Remote Access: DISABLE
Alerts: DISABLE
Alert Pending: NO
Temp (C): 37.0
RCM Power Control: ON
External Power: OFF
Server Power: ON

RCM>halt

Focus returned to COM port

halted CPU 0
CPU 1 is not halted

halt code = 1
operator initiated halt
PC = fffffc000044f51c
P00>>>cont

continuing CPU 0
CP - RESTORE_TERM routine to be called
panic: user requested console halt
Stopped in pid 10838 (sh) at    cpu_Debugger+0x4:       ret     zero,(ra)
db{0}> trace
cpu_Debugger() at cpu_Debugger+0x4
panic() at panic+0x168
console_restart() at console_restart+0x74
XentRestart() at XentRestart+0x90
--- console restart (from ipl 4) ---
_simple_lock() at _simple_lock+0x15c
wakeup() at wakeup+0xcc
schedcpu() at schedcpu+0x34c
softclock() at softclock+0x2b4
hardclock() at hardclock+0x7c0
interrupt() at interrupt+0x180
XentInt() at XentInt+0x1c
--- interrupt (from ipl 0) ---
pmap_enter() at pmap_enter+0xb08
uvm_fault() at uvm_fault+0x2020
uvm_fault_wire() at uvm_fault_wire+0x74
uvm_fork() at uvm_fork+0x98
fork1() at fork1+0x548
sys_fork() at sys_fork+0x38
syscall_plain() at syscall_plain+0x164
XentSys() at XentSys+0x5c
--- syscall (2) ---
--- user mode ---
db{0}> ps
 PID             PPID       PGRP        UID S   FLAGS          COMMAND    WAIT
>10838          10837      10837          0 7 0x84006               sh
 10837          10720      10837          0 3 0x84086           nbmake  piperd
 10720            251      10720          0 3 0x84086              ksh   pause
 686              676        686       1000 3 0x84086              ksh   ttyin
 676              240        676          0 3 0x84084          rlogind  select
 251              249        251       1000 3 0x84086              ksh   pause
 249              240        249          0 3 0x84084          rlogind  select
 248                1        248          0 3 0x84086            getty   ttyin
 246                1        246          0 3 0x80084             cron nanosle
 240                1        240          0 3 0x80084            inetd  select
 211                1        211          0 3 0x80084             ntpd   pause
 176                1        163          0 3 0x80084             nfsd    nfsd
 175                1        163          0 3 0x80084             nfsd    nfsd
 174                1        163          0 3 0x80084             nfsd    nfsd
 173                1        163          0 3 0x80084             nfsd    nfsd
 135                0          0          0 3 0xa0284            nfsio  nfsidl
 134                0          0          0 3 0xa0284            nfsio  nfsidl
 133                0          0          0 3 0xa0284            nfsio  nfsidl
 132                0          0          0 3 0xa0284            nfsio  nfsidl
 130                1        130          0 3 0x80084        mount_mfs  mfsidl
 121                1        121          0 3 0x80084          rpcbind  select
 106                1        106          0 3 0x80084            ipmon nanosle
 99                 1         99          0 3 0x80084          syslogd  select
 9                  0          0          0 3 0xa0204         aiodoned aiodone
 8                  0          0          0 3 0xa0204          ioflush  syncer
 7                  0          0          0 3 0x20204           reaper  reaper
 6                  0          0          0 3 0xa0204       pagedaemon pgdaemo
 5                  0          0          0 3 0xa0204             pms0 pmsrese
 4                  0          0          0 3 0xa0204          mlxtask  mlxzzz
 3                  0          0          0 3 0xa0204         scsibus1  sccomp
 2                  0          0          0 3 0xa0204         scsibus0  sccomp
 1                  0          1          0 3 0x84084             init    wait
 0                 -1          0          0 2 0xa0204          swapper
db{0}> cont
syncing disks... [halt sent]

RCM>halt

Focus returned to COM port
CP - RESTORE_TERM exited with hlt_req = 0, r0 = 00000007.00000000

halted CPU 0
CPU 1 is not halted

halt code = 0
PC = fffffc000044f51c
P00>>>cont
Slot context is not valid
P00>>>init
Initializing...


-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>