tech-kern archive
Re: Panic / wedge from bridging tap and vlan
On 12/02/2025 at 21:26, John Klos wrote:
Because this is used for NAT for Internet, I can't test whenever I'd
like. I'm still looking for a window.
Can you take all of that information -- the panic message, the stack
trace, the steps to reproduce -- and file a PR with it so it doesn't
get lost and we can track pullups and close it when done?
Certainly. As soon as I can test again, I will.
FWIW, I think I just hit the same issue on one of my hosts here (also
10.1), and I can capture the serial console via the BMC before the
reboot happens. The stack trace looked like this:
[373.4520985] panic: lock error: Mutex: mutex_vector_enter, 548: locking
against myself: lock 0xffff8526c7f79888 cpu 2 lwp 0xffff851f8a7856c0
[373.4528985] cpu2: Begin traceback...
[373.4528985] vpanic() at netbsd:vpanic+0x183
[373.4628981] panic() at netbsd:panic+0x3c
[373.4720975] lockdebug_abort() at netbsd:lockdebug_abort+0x114
[373.4828972] mutex_vector_enter() at netbsd:mutex_vector_enter+0x32b
[373.4828972] bridge_input() at netbsd:bridge_input+0x9f1
[373.4928971] vlan_input() at netbsd:vlan_input+0x143
[373.5828969] ether_input() at netbsd:ether_input+0x4c2
[373.5828969] bridge_input() at netbsd:bridge_input+0xa10
[373.5128967] if_percpuq_softint() at netbsd:if_percpuq_softint+0x8d
[373.5220965] softint_dispatch() at netbsd:softint_dispatch+0x95
[373.5228965] DDB lost frame for netbsd:Xsoftintr+0x4c, trying
0xffff888939eda0f0
[373.5328966] Xsoftintr() at netbsd:Xsoftintr+0x4c
[373.5328966] --- interrupt ---
[373.5328966] 0:
[373.5320966] cpu2: End traceback...
[373.5328966] dumping to dev 18,17 (offset=8, size=8359657):
[373.5320966] dump device bad
I cannot extensively test steps to reproduce, the host is a critical
fileserver. But looking at the ifconfig.if files:
- create/up bridge0
- attach a real PHY to it (in my case, wm0)
- create/up a second bridge(4) (bridge10)
- create vlan10
- ifconfig vlan10 vlan 10 vlanif wm0
- vlan10 up
- brconfig bridge10 add vlan10
A few seconds later the host panicked, which happens before rc.d reaches
the login prompt. Commenting out the last brconfig(8) line from
single-user mode is enough to have everything working again.
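This looks like a locking problem caused by entering the bridge code
twice: once for the PHY, then again during vlan handling (pure
speculation, though). The trace is consistent with that: bridge_input()
appears twice on the same stack, with ether_input() and vlan_input() in
between.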
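
To make the speculation concrete, here is a minimal userland sketch (not
NetBSD kernel code) of what "locking against myself" means: the same
thread tries to take a non-recursive mutex it already holds. The
model_* functions only mirror the call order seen in the trace; their
bodies, the single bridge_lock, and the use of a POSIX error-checking
mutex are all assumptions made for illustration. Which kernel lock is
actually involved is not known from the trace alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the lock that is taken twice in the trace. */
static pthread_mutex_t bridge_lock;

static int model_bridge_input(int depth);

/* Models vlan_input() handing the frame to the bridge that has vlan10. */
static int model_vlan_input(int depth)
{
    return model_bridge_input(depth + 1);
}

/* Models ether_input() passing a tagged frame on to the vlan code. */
static int model_ether_input(int depth)
{
    return model_vlan_input(depth);
}

/* Models bridge_input(): take the lock, then pass the frame up the stack. */
static int model_bridge_input(int depth)
{
    int error = pthread_mutex_lock(&bridge_lock);
    if (error != 0)
        return error;   /* EDEADLK: this thread already owns the lock */

    /* Only the first pass forwards the frame; that is enough to re-enter. */
    if (depth == 0)
        error = model_ether_input(depth);

    pthread_mutex_unlock(&bridge_lock);
    return error;
}

/* Runs the model once and returns the error seen by the nested lock attempt. */
int model_run(void)
{
    pthread_mutexattr_t attr;
    int error;

    pthread_mutexattr_init(&attr);
    /* An error-checking mutex reports a self-relock instead of hanging. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&bridge_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    error = model_bridge_input(0);

    pthread_mutex_destroy(&bridge_lock);
    return error;
}
```

In this model, model_run() returns EDEADLK from the nested lock attempt.
The kernel has no such error return for an adaptive mutex: as the trace
shows, mutex_vector_enter() goes through lockdebug_abort() and panics.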
--
jym@