Port-alpha archive


Re: isp panic with 64-bit card



In article <20141229222448.GB4064%knightrider.shangtai.net@localhost>,
Staffan Thomén  <duck%shangtai.net@localhost> wrote:
>> We'll see if I have an opportunity to get a stack trace.
>
> Well, that didn't take long. I discovered that the clock battery was dead flat
> and replacing that cleared the "no timer interrupts on CPU 0" problem at least.
> Let's hope that fixes the random resets.
>
> Here's a stack trace as promised. While running in single-user mode, I ran
> fsck without issue (not considering that it might trigger a panic), but:
>
># find . > /dev/null
>(nothing happens)
>
># find . > testfil  
>isp1: Unhandled Response Type 0x2
>isp1: Not RESPONSE in RESPONSE Queue (type 0x2) @ idx 13 (next 14) nlooked 1
>isp1: Request Queue Entry:                                                  
>isp1: 0x00000000: 02 01 00 00 00 fc ff ff 06 00 00 00 00 00 00 00
>isp1: 0x00000010: 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00
>isp1: 0x00000020: cc 01 43 00 00 fc ff ff 60 57 25 00 00 fc ff ff
>isp1: 0x00000030: b8 38 cc 00 00 fc ff ff 06 00 00 00 00 00 00 00
>
>(long pause)
>
>panic: kernel diagnostic assertion "(l->l_pflag & LP_INTR) == 0" failed:
>file "/data/tmp/src/sys/kern/kern_synch.c", line 187 
>Stopped in pid 0.5 (system) at  netbsd:cpu_Debugger+0x14:       or      zero,s6,sp
>db> bt
>cpu_Debugger() at netbsd:cpu_Debugger+0x14
>vpanic() at netbsd:vpanic+0x28c           
>kern_assert() at netbsd:kern_assert+0x74
>tsleep() at netbsd:tsleep+0x80          
>isp_mbox_wait_complete() at netbsd:isp_mbox_wait_complete+0x338
>isp_mboxcmd() at netbsd:isp_mboxcmd+0x52c                      
>isp_control() at netbsd:isp_control+0x1560
>isp_dog() at netbsd:isp_dog+0x2a8         
>callout_softclock() at netbsd:callout_softclock+0x6e0
>softint_execute() at netbsd:softint_execute+0x2d0    
>softint_thread() at netbsd:softint_thread+0x80   
>exception_return() at netbsd:exception_return 
>--- root of call graph ---                   
>db> 

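Yes, it is trying to sleep from the watchdog, which runs as a callout in
soft interrupt context, where sleeping is not allowed (hence the failed
LP_INTR assertion in tsleep()). This untested patch will probably at least
work around the problem: it clears mbox_sleep_ok for the duration of
isp_dog(), so the mailbox wait should not call tsleep(), and restores the
previous value on every exit path.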

Index: isp_netbsd.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/isp_netbsd.c,v
retrieving revision 1.87
diff -u -u -r1.87 isp_netbsd.c
--- isp_netbsd.c	18 Oct 2014 08:33:27 -0000	1.87
+++ isp_netbsd.c	30 Dec 2014 02:44:26 -0000
@@ -819,9 +819,12 @@
 	XS_T *xs = arg;
 	struct ispsoftc *isp = XS_ISP(xs);
 	uint32_t handle;
+	int sok;
 
 
 	ISP_ILOCK(isp);
+	sok = isp->isp_osinfo.mbox_sleep_ok;
+	isp->isp_osinfo.mbox_sleep_ok = 0;
 	/*
 	 * We've decided this command is dead. Make sure we're not trying
 	 * to kill a command that's already dead by getting its handle and
@@ -835,15 +838,13 @@
 		if (XS_CMD_DONE_P(xs)) {
 			isp_prt(isp, ISP_LOGDEBUG1,
 			    "watchdog found done cmd (handle 0x%x)", handle);
-			ISP_IUNLOCK(isp);
-			return;
+			goto out;
 		}
 
 		if (XS_CMD_WDOG_P(xs)) {
 			isp_prt(isp, ISP_LOGDEBUG1,
 			    "recursive watchdog (handle 0x%x)", handle);
-			ISP_IUNLOCK(isp);
-			return;
+			goto out;
 		}
 
 		XS_CMD_S_WDOG(xs);
@@ -884,10 +885,8 @@
 			XS_CMD_C_WDOG(xs);
 			callout_reset(&xs->xs_callout, hz, isp_dog, xs);
 			qe = isp_getrqentry(isp);
-			if (qe == NULL) {
-				ISP_UNLOCK(isp);
-				return;
-			}
+			if (qe == NULL)
+				goto out;
 			XS_CMD_S_GRACE(xs);
 			ISP_MEMZERO((void *) mp, sizeof (*mp));
 			mp->mrk_header.rqs_entry_count = 1;
@@ -900,6 +899,8 @@
 	} else {
 		isp_prt(isp, ISP_LOGDEBUG0, "watchdog with no command");
 	}
+out:
+	isp->isp_osinfo.mbox_sleep_ok = sok;
 	ISP_IUNLOCK(isp);
 }
 

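For anyone following along, here is a small user-space sketch of the
pattern the patch relies on: save the flag, clear it while in the watchdog,
and restore it through a single exit label. All names in it (fake_softc,
fake_mbox_wait, fake_watchdog) are made up for illustration and are not
part of the isp driver, and it assumes that a cleared mbox_sleep_ok makes
the mailbox wait poll instead of sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver softc; only the flags matter here. */
struct fake_softc {
	bool mbox_sleep_ok;	/* true: mailbox waits are allowed to sleep */
	bool in_softint;	/* true while in (simulated) soft interrupt */
	int polled;		/* waits that completed by polling */
	int slept;		/* waits that completed by sleeping */
};

/* Models a mailbox wait: sleeping is only legal outside soft interrupts. */
static void
fake_mbox_wait(struct fake_softc *sc)
{
	if (sc->mbox_sleep_ok) {
		/* Plays the role of the kernel assertion that fired. */
		assert(!sc->in_softint);
		sc->slept++;
	} else {
		sc->polled++;
	}
}

/* Models the watchdog: save the flag, force polling, restore on all exits. */
static void
fake_watchdog(struct fake_softc *sc, bool cmd_already_done)
{
	bool sok = sc->mbox_sleep_ok;

	sc->in_softint = true;
	sc->mbox_sleep_ok = false;

	if (cmd_already_done)
		goto out;	/* early exit still restores the flag */

	fake_mbox_wait(sc);	/* polls, because sleeping is disabled */
out:
	sc->mbox_sleep_ok = sok;
	sc->in_softint = false;
}
```

Calling fake_watchdog() on a structure whose mbox_sleep_ok is set bumps the
polled counter and leaves the flag set afterwards; without the save and
clear, the assert in fake_mbox_wait() would fire, much like the panic above.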

