NetBSD-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: [10.0_STABLE] Hard lock



Michael van Elst a écrit :
> On Fri, Nov 22, 2024 at 12:42:59PM +0100, BERTRAND Joël wrote:
>> Michael van Elst a écrit :
>>> On Thu, Nov 21, 2024 at 08:26:11PM +0100, BERTRAND Joël wrote:
>>> Except for the wild mixup of early kernel messages (at time ~2.204)
>>> with iscsi debug messages (at ~77.724) this only hints to a cleanup
>>> of a session.
>>
>> 	I understand. But how can this session be cleaned up ?
>> iSCSI target is used and fsck has just started.
> 
> When the network fails (e.g. timeout), a connection will be closed.
> The code tries to reconnect automatically, but after some retries
> it gives up. When it was the last connection (usually there is only
> one per session), then the session is closed and the disk device
> is detached from the system.
> 
> We have already seen some network problem and it may even cause this.
> But the result should be a failure when accessing the disk. If
> that's fsck, it should report some I/O error. In no way should it
> lock up the system.

	New hard lock this night (but system reboots has it ran with a lock
debug kernel) :


Nov 24 04:31:42 legendre /netbsd: [ 195210.6117279] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 195605.2522802] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 196621.8736806] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 197170.5744073] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 197263.6445312] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 197369.3246735] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 198077.3456182] S3C1: freeing UNUSED pdu
Nov 24 04:31:42 legendre /netbsd: [ 200062.4366678] ccb_timeout: num=1
total=0 d
isp=0 invalid ccb=0xffff928021d2d408
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] cpu0[3 softnet/0]:
hogging k
ernel lock
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560]
ipi_msg_cpu_handler() at net
bsd:ipi_msg_cpu_handler+0x68
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] ipi_cpu_handler() at
netbsd:
ipi_cpu_handler+0x99
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] x86_ipi_handler() at
netbsd:
x86_ipi_handler+0x79
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] Xresume_lapic_ipi()
at netbs
d:Xresume_lapic_ipi+0x18
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] --- interrupt ---
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] mutex_enter() at
netbsd:mute
x_enter+0x50f
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] pool_get() at
netbsd:pool_ge
t+0x78
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560]
pool_cache_get_slow() at net
bsd:pool_cache_get_slow+0x13d
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560]
pool_cache_get_paddr() at ne
tbsd:pool_cache_get_paddr+0x264
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] m_get() at
netbsd:m_get+0x37
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] m_gethdr() at
netbsd:m_gethd
r+0x9
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] tcp_output() at
netbsd:tcp_o
utput+0x135c
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] tcp_input() at
netbsd:tcp_in
put+0x29dc
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] ipintr() at
netbsd:ipintr+0x
993
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] softint_dispatch()
at netbsd:softint_dispatch+0x11c
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] DDB lost frame for
netbsd:Xsoftintr+0x4c, trying 0xffff9284371ae0f0
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] Xsoftintr() at
netbsd:Xsoftintr+0x4c
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] --- interrupt ---
Nov 24 04:31:42 legendre /netbsd: [ 207938.8837560] 4406abff750ac8e5:
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] Kernel lock error:
_kernel_lock,266: spinout
Nov 24 04:31:42 legendre /netbsd:
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] lock address :
netbsd:kernel_lock
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] type         : spin
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] initialized  :
netbsd:main+0x72
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] shared holds :
            0 exclusive:                  1
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] shares wanted:
            0 exclusive:                  5
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] relevant cpu :
            1 last held:                  0
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] relevant lwp :
0xffff8041b670e640 last held: 0xffff8044b4e1d480
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] last locked* :
netbsd:ipintr+0x44
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] unlocked     :
netbsd:tcp_rcvd_wrapper+0x51
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] curcpu holds :
            0 wanted by: 0xffff8041b670e640
Nov 24 04:31:42 legendre /netbsd:
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] panic: LOCKDEBUG:
Kernel lock error: _kernel_lock,266: spinout
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] cpu1: Begin traceback...
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] vpanic() at
netbsd:vpanic+0x183
Nov 24 04:31:42 legendre /netbsd: [ 207938.8998476] panic() at
netbsd:panic+0x3c
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] lockdebug_abort1()
at netbsd:lockdebug_abort1+0xe6
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] _kernel_lock() at
netbsd:_kernel_lock+0x2a7
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] mb_drain() at
netbsd:mb_drain+0x17
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] pool_grow() at
netbsd:pool_grow+0x3b9
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] pool_get() at
netbsd:pool_get+0x3e5
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543]
pool_cache_get_slow() at netbsd:pool_cache_get_slow+0x13d
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543]
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0x264
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] m_get() at
netbsd:m_get+0x37
Nov 24 04:31:42 legendre /netbsd: [ 207938.9037543] m_gethdr() at
netbsd:m_gethdr+0x9
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] wm_add_rxbuf() at
netbsd:wm_add_rxbuf+0x3a
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] wm_rxeof() at
netbsd:wm_rxeof+0x114
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] wm_handle_queue() at
netbsd:wm_handle_queue+0xff
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] softint_dispatch()
at netbsd:softint_dispatch+0x11c
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] DDB lost frame for
netbsd:Xsoftintr+0x4c, trying 0xffff9284376cd0f0
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] Xsoftintr() at
netbsd:Xsoftintr+0x4c
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] --- interrupt ---
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] 0:
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] cpu1: End traceback...
Nov 24 04:31:42 legendre /netbsd:
Nov 24 04:31:42 legendre /netbsd: [ 207938.9137542] dumping to dev 18,1
(offset=253015, size=4162677):

	Kernel writes it dumps a core in /var/crash. But...

legendre# pwd
/var/crash
legendre# ls -l
total 2
-rw-------  1 root  wheel  2 Sep 30  2017 bounds
legendre#

Attachment: signature.asc
Description: OpenPGP digital signature



Home | Main Index | Thread Index | Old Index