NetBSD-Bugs archive


Re: kern/59497: Panic in ucompoll



On Sat, Jul 05, 2025 at 09:20:02AM +0000, Christoph Badura via gnats wrote:
> 
>  On Sat, Jul 05, 2025 at 07:45:02AM +0000, Paul Ripke via gnats wrote:
>  >  Re msgbuf, after dumping it out, I realise this crash was actually due to
>  >  a failing USB hub flaking out intermittently. I have seen this device
>  >  intermittently disconnect/reconnect without that hub, so there's still
>  >  something going on.
>  
>  I'm confused.  Does the device also intermittently disconnect/reconnect
>  without the hub?

Yes, it does - but it seems the only crash dump I had was one due to the
failing hub.

>  Anyway, even a flaky USB hub shouldn't cause a panic.

Indeed.

>  >  [ 3804001.2307417] ucom1: detached
>  >  [ 3804001.2307417] uplcom0: detached
>  >  [ 3804001.2307417] uplcom0: at uhub9 port 1 (addr 13) disconnected
>  >  [ 3804001.2407419] ucom0: detached
>  >  [ 3804001.2537420] uhub8: detached
>  >  [ 3804001.2537420] uhub8: at uhub1 port 5 (addr 1) disconnected
>  >  [ 3804001.7737508] uhub8 at uhub1 port 5: GenesysLogic (0x05e3) USB2.0 Hub (0x0610), class 9/0, rev 2.10/92.26, addr 14
>  >  [ 3804001.7737508] uhub8: multiple transaction translators
>  >  [ 3804001.7887511] uhub8: 4 ports with 1 removable, self powered
>  >  [ 3804002.1207568] uvm_fault(0xffffb1c7ac104780, 0x0, 1) -> e
>  >  [ 3804002.1207568] fatal page fault in supervisor mode
>  
>  This looks to me like the trap happens while uplcom0 (did it move from
>  uplcom1?) was disconnected/detached.

It may have moved - I do have two of them, and have used both on occasion.

>  If you are testing the suggested changes (check for sc != NULL and/or the
>  change for spec_poll()) could you add a printf when it triggers so that we
>  can verify that this happens while the uplcom/ucom is disconnected?

I ran a test with the 'if (sc == NULL) return POLLHUP' patch applied and
drivewire running on /dev/dtyU0, then unplugged the USB device:

[ 2724.7698958] xhci0: xhci_reset_endpoint: endpoint 0x0: timed out
[ 2724.7738960] WARNING: pipe closed with active xfers on addr 4
[ 2724.7808961] ucom0: detached
[ 2724.7808961] uplcom0: detached
[ 2724.7808961] uplcom0: at uhub8 port 3 (addr 4) disconnected
[ 2725.6829076] ucompoll: sc == NULL
[ 2725.6829076] uvm_fault(0xffff80e5b04f1848, 0x0, 1) -> e
[ 2725.6829076] fatal page fault in supervisor mode
[ 2725.6829076] trap type 6 code 0 rip 0xffffffff80497ab7 cs 0x8 rflags 0x10246 cr2 0xe8 ilevel 0 rsp 0xffff881237c78cd0
[ 2725.6829076] curlwp 0xffff80e6239de680 pid 6220.6226 lowest kstack 0xffff881237c742c0
[ 2725.6829076] panic: trap
[ 2725.6829076] cpu1: Begin traceback...
[ 2725.6829076] vpanic() at netbsd:vpanic+0x183
[ 2725.6839076] panic() at netbsd:panic+0x3c
[ 2725.6849078] trap() at netbsd:trap+0xbaf
[ 2725.6849078] --- trap (number 6) ---
[ 2725.6849078] ucomread() at netbsd:ucomread+0x2a
[ 2725.6849078] cdev_read() at netbsd:cdev_read+0x87
[ 2725.6859078] spec_read() at netbsd:spec_read+0x2d3
[ 2725.6859078] VOP_READ() at netbsd:VOP_READ+0x42
[ 2725.6869079] vn_read() at netbsd:vn_read+0x18e
[ 2725.6869079] dofileread() at netbsd:dofileread+0x79
[ 2725.6869079] sys_read() at netbsd:sys_read+0x49
[ 2725.6879078] syscall() at netbsd:syscall+0x1fc
[ 2725.6879078] --- syscall (number 3) ---
[ 2725.6879078] netbsd:syscall+0x1fc:
[ 2725.6879078] cpu1: End traceback...

So, this exited the poll, and died in read, which I guess is an
improvement? If I get the chance, I'll try to figure out how this
is supposed to work.
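
For what it's worth, the cr2 value of 0xe8 in the trap line looks consistent
with ucomread() dereferencing a field of a NULL softc, i.e. the same problem
the poll patch guards against, just one entry point further along. Below is a
rough user-space sketch of the guard applied to both entry points. It is not
the actual ucom(4) code: mock_softc, mock_units, mock_lookup, mock_poll and
mock_read are all made-up names, and returning EIO from read is my assumption
about what a sensible error would be.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>

#define MOCK_NUNITS 4

/* Hypothetical per-unit state; stands in for the driver's softc. */
struct mock_softc {
	int sc_dying;		/* set once detach has started */
	int sc_rx_ready;	/* nonzero if input is available */
};

/* Hypothetical unit table; a detached unit is represented by NULL. */
struct mock_softc *mock_units[MOCK_NUNITS];

/* Return the softc for a unit, or NULL if out of range or detached. */
struct mock_softc *
mock_lookup(int unit)
{
	if (unit < 0 || unit >= MOCK_NUNITS)
		return NULL;
	return mock_units[unit];
}

/* Poll entry point: report hangup instead of touching a missing softc. */
int
mock_poll(int unit, int events)
{
	struct mock_softc *sc = mock_lookup(unit);

	if (sc == NULL || sc->sc_dying)
		return POLLHUP;
	return sc->sc_rx_ready ? (events & POLLIN) : 0;
}

/* Read entry point: same guard, returning an errno-style error (assumed EIO). */
int
mock_read(int unit)
{
	struct mock_softc *sc = mock_lookup(unit);

	if (sc == NULL || sc->sc_dying)
		return EIO;
	return 0;	/* the real transfer would happen here */
}
```

A NULL check alone does not close the race between a detach and an entry
point that has already looked up the softc, so this only shows the shape of
the check, not a complete fix.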

btw: this is still on the 10.1 branch.

Cheers,
-- 
Paul Ripke
"Great minds discuss ideas, average minds discuss events, small minds
 discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.

