Re: Current port status

To: matthew green <mrg%eterna.com.au@localhost>
Subject: Re: Current port status
From: Frank Wille <frank%phoenix.owl.de@localhost>
Date: Sun, 15 Apr 2012 13:33:09 +0200

matthew green wrote:

>> > panic: kernel diagnostic assertion "cur->pcg_avail == cur->pcg_size"
>> > failed: file "/usr/src/sys/kern/subr_pool.c", line 2573 [...]
> what kernel options do you use?  i am using these, plus more:
>
> include "arch/ofppc/conf/GENERIC"
> options         DEBUG
> options         LOCKDEBUG
> options         DIAGNOSTIC
> options         KMEM_GUARD_DEPTH=0x1000

Ok, now I can reproduce it! Sorry, I should have noticed that this output
can only occur with DEBUG and/or DIAGNOSTIC. :)

I don't think it has anything to do with pools or NFS, but more likely with
the vr(4) driver. I could reproduce all kinds of kernel assertion and
LOCKDEBUG panics with a simple FTP transfer over vr(4).

For example, I could reproduce a diagnostic assertion similar to yours after
running into the infamous "vr0: device timeout" during an FTP file
transfer. Then I pressed CTRL-C to break, and this happened (manually
copied from screen):

ftp> get etc.tgz
vr0: device timeout
...
CTRL-C
receive aborted. Waiting for remote to finish abort.
panic: kernel diagnostic assertion "cur->pcg_avail == cur->pcg_size" failed:
file "/home/frank/netbsd/current/src/sys/kern/subr_pool.c", line 2573
...
db> bt
vpanic+0x21c
kern_assert+0x68
pool_cache_put_slow+0x30c
pool_cache_put_paddr+0x18c
m_ext_free+0xf0
soreceive+0xcc4
soo_read+0x28
dofileread+0x8c
syscall_plain+0x1f8
user SC trap #3 by 0xfde30b3c

Not a real indication of vr(4) there, but here is another one, also during
an FTP receive (there is a vr_start in it):

panic: kernel diagnostic assertion "object != NULL" failed: file
"/home/frank/netbsd/current/src/sys/kern/subr_pool.c", line 2475 
cpu0: Begin traceback...
0xa941d970: at kern_assert+0x68
0xa941d9b0: at pool_cache_get_paddr+0x214
0xa941da00: at m_get+0x3c
0xa941da10: at m_gethdr+0xc
0xa941da20: at vr_start+0xe4
0xa941da80: at ifq_enqueue+0xd0
0xa941daa0: at ether_output+0x324
0xa941daf0: at ip_output+0xab8
0xa941dba0: at tcp_output+0x11f4
0xa941dc70: at tcp_input+0x1038
0xa941de50: at ip_input+0x368
0xa941de80: at ipintr+0xdc
0xa941dec0: at softint_dispatch+0x158
0xa941df20: at softint_fast_dispatch+0xdc
0xa941dfe4: at 0x59e3ba90
trap: kernel read DSI trap @ 0xa932cfe4 by 0x140d58 (DSISR 0x40000000,
err=14), lr 0x141370
Press a key to panic.

And finally, the most frequent panic for me is a LOCKDEBUG panic, which might
indicate that vr(4) is locking the same mutex during a soft interrupt
(vr_start), and then again during a hardware interrupt (vr_intr)?

---8<---
Mutex error: lockdebug_wantlock: locking against myself

lock address : 0x00000000a002b074 type     :               spin
initialized  : 0x0000000000329424
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  1
current cpu  :                  0 last held:                  0
current lwp  : 0x00000000a0094840 last held: 0x00000000a0094840
last locked* : 0x0000000000329950 unlocked : 0x0000000000329acc
owner field  : 000000000000000000 wait/spin:                0/1

panic: LOCKDEBUG
cpu0: Begin traceback...
0xa941d540: at panic+0x4c
0xa941d580: at lockdebug_abort1+0xdc
0xa941d5a0: at mutex_enter+0x32c
0xa941d5e0: at pool_get+0x7c
0xa941d620: at pool_cache_get_slow+0x214
0xa941d650: at pool_cache_get_paddr+0x290
0xa941d6a0: at m_get+0x3c
0xa941d6b0: at m_gethdr+0xc
0xa941d6c0: at vr_rxeof+0x3c8
0xa941d730: at vr_intr+0x314
0xa941d770: at intr_deliver+0x7c
0xa941d7b0: at pic_handle_intr+0x1ac
0xa941d800: at trapstart+0x684
0xa941d8d0: at lockdebug_locked+0x238
0xa941d8f0: at lockdebug_unlocked+0x88
0xa941d920: at mutex_exit+0x194
0xa941d940: at pool_get+0x1f8
0xa941d980: at pool_cache_get_slow+0x214
0xa941d9b0: at pool_cache_get_paddr+0x290
0xa941da00: at m_get+0x3c
0xa941da10: at m_gethdr+0xc
0xa941da20: at vr_start+0xe4
0xa941da80: at ifq_enqueue+0xd0
0xa941daa0: at ether_output+0x324
0xa941daf0: at ip_output+0xab8
0xa941dba0: at tcp_output+0x11f4
0xa941dc70: at tcp_input+0x1038
0xa941de50: at ip_input+0x368
0xa941de80: at ipintr+0xdc
0xa941dec0: at softint_dispatch+0x158
0xa941df20: at softint_fast_dispatch+0xdc
0xa941dfe4: at 0x59e3ba90
trap: kernel read DSI trap @ 0xa932cfe4 by 0x140d58 (DSISR 0x40000000,
err=14), lr 0x141370
Press a key to panic.
---8<---
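
To illustrate the suspected mechanism, here is a minimal userland sketch in
C. It is not NetBSD code: every toy_* name is made up for this example, and
the "interrupt" is just a nested function call. It only shows why a
non-recursive spin mutex reports "locking against myself" when a handler on
the same CPU asks for a lock that the interrupted code still holds, and why
deferring the handler until after the release avoids it.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a non-recursive spin mutex. */
struct toy_mutex {
	bool held;
	int  owner_cpu;		/* only meaningful while held is true */
};

enum toy_result {
	TOY_OK,
	TOY_LOCKING_AGAINST_MYSELF,	/* same CPU already holds the lock */
	TOY_WOULD_SPIN			/* another CPU holds the lock */
};

static enum toy_result
toy_mutex_enter(struct toy_mutex *m, int cpu)
{
	if (m->held) {
		/* Spinning on a lock held by this CPU could never finish. */
		if (m->owner_cpu == cpu)
			return TOY_LOCKING_AGAINST_MYSELF;
		return TOY_WOULD_SPIN;
	}
	m->held = true;
	m->owner_cpu = cpu;
	return TOY_OK;
}

static void
toy_mutex_exit(struct toy_mutex *m)
{
	m->held = false;
}

/* Stand-in for an interrupt handler that needs the same lock. */
static enum toy_result
toy_intr(struct toy_mutex *lock, int cpu)
{
	enum toy_result r = toy_mutex_enter(lock, cpu);

	if (r == TOY_OK)
		toy_mutex_exit(lock);
	return r;
}

/*
 * Stand-in for the interrupted code path. When intr_blocked is false the
 * "interrupt" is delivered while the lock is still held; when it is true
 * the handler only runs after the lock has been released.
 */
static enum toy_result
toy_start(struct toy_mutex *lock, int cpu, bool intr_blocked)
{
	enum toy_result r = toy_mutex_enter(lock, cpu);

	if (r != TOY_OK)
		return r;
	if (!intr_blocked)
		r = toy_intr(lock, cpu);	/* interrupt arrives here */
	toy_mutex_exit(lock);
	if (intr_blocked)
		r = toy_intr(lock, cpu);	/* deferred until release */
	return r;
}
```

Calling toy_start() with intr_blocked set to false yields
TOY_LOCKING_AGAINST_MYSELF, and with it set to true yields TOY_OK. Whether
the real panic comes from the interrupt not being blocked while the lock is
held is only a guess based on the traceback above.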

The final proof for me that it is a vr(4)-related problem is that when I
switch to mvgbe(4) the problems disappear!

I can reproduce the vr(4) panics 100% of the time. Just connect to a (local)
FTP server and start transferring the binary/sets archives.

-- 
Frank Wille
