Subject: Re: kern/30398: panic in ohci_softintr
To: None <gnats-bugs@netbsd.org>
From: Karl Janmar <karl@utopiafoundation.org>
List: netbsd-bugs
Date: 08/24/2005 15:11:43
This is a multi-part message in MIME format.
--------------070803040104010807080506
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

(This PR refers to the same problem as PR:
kern/30331 [serious/medium]: kernel panic somethimes with USB printer)

I posted a message to tech-kern@ some time ago, but I didn't get any reply.
http://mail-index.netbsd.org/tech-kern/2005/07/13/0007.html

Since the USB guys haven't answered my question ("what happens if we just ignore
this condition and don't issue a panic?"), I guess they don't consider this a
general problem worth prioritizing.

As I wrote in my post to tech-kern, I think Linux just prints an error message
when this condition occurs and continues. I have made a patch for NetBSD that does
the same. Now my machine no longer reboots every 3 days or so, but I get a lot of
error messages. Anyway, that's still better than having a constantly panicking system.
I don't claim this is the best fix for the problem, but at least it stops the
system from constantly panicking. And it looks like I'm not the only one running
into this problem. Why prevent people from using NetBSD on the same hardware that
they could run Linux on (if this isn't a sw bug)? I have been running Linux on
this particular machine for about 1 year without any USB-related panics.

Patch against netbsd-3 appended.
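
To make the idea concrete outside the kernel, here is a minimal standalone sketch
of "warn and stop walking" instead of "panic" when a done-list address cannot be
mapped back to a known descriptor. It is not the code from ohci.c: the names
soft_desc, lookup_desc and walk_done_list are made up for illustration, and a
linear table stands in for whatever lookup the driver really uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a driver's soft descriptor (not ohci_soft_td_t). */
struct soft_desc {
	uint32_t physaddr;	/* address the controller would report */
	uint32_t next;		/* physical address of the next done entry, 0 = end */
};

/* Map a physical address back to a known descriptor, or NULL if unknown. */
static const struct soft_desc *
lookup_desc(const struct soft_desc *tab, size_t n, uint32_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].physaddr == addr)
			return &tab[i];
	return NULL;
}

/*
 * Walk the done list starting at 'done'. Returns how many descriptors were
 * collected. If an unknown address is hit, a warning is printed, the address
 * is stored in *unknown, and the walk stops; otherwise *unknown is 0.
 */
static size_t
walk_done_list(const struct soft_desc *tab, size_t n, uint32_t done,
    uint32_t *unknown)
{
	size_t collected = 0;

	*unknown = 0;
	/* 'steps <= n' is a crude guard against cycles in this sketch only. */
	for (size_t steps = 0; done != 0 && steps <= n; steps++) {
		const struct soft_desc *d = lookup_desc(tab, n, done);

		if (d == NULL) {
			/* Where the driver would panic, warn and bail out. */
			printf("walk_done_list: WARNING!! addr 0x%08lx not found\n",
			    (unsigned long)done);
			*unknown = done;
			break;
		}
		collected++;
		done = d->next;
	}
	return collected;
}
```

The trade-off is the same as in the patch: whatever was collected before the
unknown address can still be processed, but anything that was queued behind it
is dropped, so this hides the symptom rather than explaining where the bad
address comes from.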

Here are the kernel messages from my system:
Jul 16 18:16:01 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306c7d0 not found
Jul 27 09:01:17 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306c7a0 not found
Jul 27 10:50:54 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306c620 not found
Jul 27 14:09:20 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306c620 not found
Jul 27 20:21:08 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306c7d0 not found
Jul 27 22:13:46 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306c5f0 not found
Jul 28 17:35:17 ngong /netbsd: ohci_softintr: WARNING!! addr 0x0306b7d0 not found
Jul 29 22:17:24 ngong /netbsd: ohci_softintr: WARNING!! addr 0x030697d0 not found
Jul 29 22:43:46 ngong /netbsd: ohci_softintr: WARNING!! addr 0x03069740 not found
Jul 30 13:20:35 ngong /netbsd: ohci_softintr: WARNING!! addr 0x03069680 not found
Jul 30 13:39:40 ngong /netbsd: ohci_softintr: WARNING!! addr 0x030697d0 not found



Regards,
Karl Janmar
karl@utopiafoundation.org


dogcow@babymeat.com wrote:
>>Number:         30398
>>Category:       kern
>>Synopsis:       panic in ohci_softintr
>>Confidential:   no
>>Severity:       serious
>>Priority:       low
>>Responsible:    kern-bug-people
>>State:          open
>>Class:          sw-bug
>>Submitter-Id:   net
>>Arrival-Date:   Thu Jun 02 05:48:00 +0000 2005
>>Originator:     Tom Spindler
>>Release:        NetBSD 3.99.5
>>Organization:
> 
> 	
> 
>>Environment:
> 
> 	
> 	
> System: NetBSD mercury.babymeat.com 3.99.5 NetBSD 3.99.5 (MERCURY) #19: Mon May 23 22:37:22 PDT 2005 dogcow@mercury.babymeat.com:/media/tmp/obj/obj/usr/src/sys/arch/i386/compile/MERCURY i386
> Architecture: i386
> Machine: i386
> 
>>Description:
> 
> 	
> While trying to get gphoto2 to recognize my camera, I ended up
> (apparently) unplugging something at the wrong time. I have no idea
> if having simultaneous ehci and ohci devices active exacerbated
> the problem.
> 
> panic: ohci_softintr: addr 0x%08lx not found
> #0  0xc0450000 in ?? ()
> (gdb) bt
> #0  0xc0450000 in ?? ()
> #1  0xc026d822 in cpu_reboot (howto=256, bootstr=0x0)
>     at /usr/src/sys/arch/i386/i386/machdep.c:752
> #2  0xc01eed8b in panic (
>     fmt=0xc034a600 "ohci_softintr: addr 0x%08lx not found")
>     at /usr/src/sys/kern/subr_prf.c:245
> #3  0xc015969a in ohci_softintr (v=0xc11c5000)
>     at /usr/src/sys/dev/usb/ohci.c:1287
> #4  0xc0267b6c in softintr_dispatch (which=1)
>     at /usr/src/sys/arch/x86/x86/softintr.c:104
> (gdb) up
> #1  0xc026d822 in cpu_reboot (howto=256, bootstr=0x0)
>     at /usr/src/sys/arch/i386/i386/machdep.c:752
> 752                     dumpsys();
> (gdb) up
> #2  0xc01eed8b in panic (
>     fmt=0xc034a600 "ohci_softintr: addr 0x%08lx not found")
>     at /usr/src/sys/kern/subr_prf.c:245
> 245             cpu_reboot(bootopt, NULL);
> (gdb) up
> #3  0xc015969a in ohci_softintr (v=0xc11c5000)
>     at /usr/src/sys/dev/usb/ohci.c:1287
> 1287                    panic("ohci_softintr: addr 0x%08lx not found", (u_long)done);
> (gdb) print done
> $1 = 1097136
> (gdb) l
> 1282                            done = le32toh(sitd->itd.itd_nextitd);
> 1283                            sidone = sitd;
> 1284                            DPRINTFN(5,("add ITD %p\n", sitd));
> 1285                            continue;
> 1286                    }
> 1287                    panic("ohci_softintr: addr 0x%08lx not found", (u_long)done);
> 1288            }
> 1289
> 1290            DPRINTFN(10,("ohci_softintr: sdone=%p sidone=%p\n", sdone, sidone));
> 1291
> (gdb) print sitd
> $2 = (ohci_soft_itd_t *) 0xcb58bd54
> (gdb) print *sitd
> $3 = {itd = {itd_flags = 239044224, itd_bp0 = 0, itd_nextitd = 3411590852, 
>     itd_be = 3411590664, itd_offset = {1, 0, 54204, 49546, 48556, 52056, 
>       54635, 49180}}, nextitd = 0xcb58bdb4, dnext = 0x206, 
>   physaddr = 3228564800, hnext = {le_next = 0x2, le_prev = 0xcb58bdb4}, 
>   xfer = 0x0, flags = 65280, isdone = 7 '\a'}
> (gdb) up
> #4  0xc0267b6c in softintr_dispatch (which=1)
>     at /usr/src/sys/arch/x86/x86/softintr.c:104
> 104                     (*sih->sih_fn)(sih->sih_arg);
> (gdb) print *sih
> $4 = {sih_q = {tqe_next = 0x0, tqe_prev = 0xc0391f70}, 
>   sih_intrhead = 0xc0391f70, sih_fn = 0xc01591e0 <ohci_softintr>, 
>   sih_arg = 0xc11c5000, sih_pending = 0}
> 
> 
>>How-To-Repeat:
> 
> Unknown.
> 	
> 
>>Fix:
> 
> 	
> 
> 
>>Unformatted:
> 
>  	
>  	

--------------070803040104010807080506
Content-Type: text/plain;
 name="ohci_nopanic.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ohci_nopanic.patch"

Index: dev/usb/ohci.c
===================================================================
RCS file: /cvsroot/src/sys/dev/usb/ohci.c,v
retrieving revision 1.157
diff -u -r1.157 ohci.c
--- dev/usb/ohci.c	11 Mar 2005 19:25:22 -0000	1.157
+++ dev/usb/ohci.c	24 Aug 2005 13:00:29 -0000
@@ -1275,7 +1275,8 @@
 			DPRINTFN(5,("add ITD %p\n", sitd));
 			continue;
 		}
-		panic("ohci_softintr: addr 0x%08lx not found", (u_long)done);
+		printf("ohci_softintr: WARNING!! addr 0x%08lx not found\n", (u_long)done);
+		break;
 	}
 
 	DPRINTFN(10,("ohci_softintr: sdone=%p sidone=%p\n", sdone, sidone));

--------------070803040104010807080506--