NetBSD-Bugs archive


Re: Re: kern/58422: kernel crash when using the iscsi initiator



On Mon, 15 Jul 2024, Michael van Elst wrote:

Date: Mon, 15 Jul 2024 04:45:01 +0000 (UTC)
From: Michael van Elst <mlelstv%serpens.de@localhost>
Reply-To: gnats-bugs%netbsd.org@localhost
To: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
    netbsd-bugs%netbsd.org@localhost, 6bone%6bone.informatik.uni-leipzig.de@localhost
Subject: [Extern] Re: kern/58422: kernel crash when using the iscsi initiator

The following reply was made to PR kern/58422; it has been noted by GNATS.

From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: kern/58422: kernel crash when using the iscsi initiator
Date: Mon, 15 Jul 2024 04:42:28 -0000 (UTC)

6bone%6bone.informatik.uni-leipzig.de@localhost writes:

>The dump from the bug report already comes from a kernel with DIAGNOSTIC
>enabled. I will try to cause the crash with hw.iscsi.debug=9.

This is a bit strange. A DIAGNOSTIC kernel should have triggered an
assertion in wake_ccb() instead of crashing with a UVM fault.

The reason is probably that your kernel doesn't have iscsi built in,
but loads it as a module (which is built without DIAGNOSTIC).


You're right. iscsi is loaded as a module. I'm building a new kernel with iscsi built in.

You're probably right that it's a network problem. I noticed that when the iscsi load is high, CPU 0 spends almost all of its time handling interrupts (95.2% in the top output below).

load averages:  3.73,  3.73,  2.93;               up 0+11:53:45        08:59:41
53 processes: 1 runnable, 50 sleeping, 2 on CPU
CPU0 states:  0.0% user,  0.0% nice,  4.8% system, 95.2% interrupt,  0.0% idle
CPU1 states:  0.0% user,  0.0% nice, 95.6% system,  0.0% interrupt,  4.4% idle
CPU2 states:  0.0% user,  0.0% nice, 89.5% system,  1.0% interrupt,  9.5% idle
CPU3 states:  0.0% user,  0.0% nice, 96.8% system,  0.0% interrupt,  3.2% idle
Memory: 24G Act, 12G Inact, 17M Wired, 24M Exec, 36G File, 19M Free
Swap: 59G Total, 59G Free / Pools: 8105M Used / Network: 1400K In, 3312K Out

  PID USERNAME PRI NICE   SIZE   RES STATE       TIME   WCPU    CPU  COMMAND
    0 root     123    0     0K  422M CPU/3     805:15   238%   238%  [system]
10668 mirror    25    0   298M  248M RUN/1       6:42 56.69% 56.69%  rsync
...

With the CPU load this high, network packets are probably being lost.
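
To make this easier to check across several captures, here is a minimal sketch that parses the per-CPU lines of top output and lists the CPUs that spend most of their time in interrupt handling. The function names and the 90% threshold are my own assumptions for illustration and not part of any NetBSD tool; the sample is the output pasted above.

```python
import re

# Matches per-CPU lines such as:
# CPU0 states:  0.0% user,  0.0% nice,  4.8% system, 95.2% interrupt,  0.0% idle
CPU_LINE = re.compile(r"^CPU(\d+) states:\s*(.*)$")
FIELD = re.compile(r"([\d.]+)%\s+(\w+)")


def parse_cpu_states(text):
    """Return {cpu_number: {state_name: percent}} for each per-CPU line."""
    result = {}
    for line in text.splitlines():
        m = CPU_LINE.match(line.strip())
        if not m:
            continue  # skip load averages, memory lines, process table
        cpu = int(m.group(1))
        result[cpu] = {name: float(pct) for pct, name in FIELD.findall(m.group(2))}
    return result


def interrupt_bound(states, threshold=90.0):
    """List CPUs whose interrupt share is at or above threshold (assumed value)."""
    return sorted(
        cpu for cpu, s in states.items() if s.get("interrupt", 0.0) >= threshold
    )


SAMPLE = """\
CPU0 states:  0.0% user,  0.0% nice,  4.8% system, 95.2% interrupt,  0.0% idle
CPU1 states:  0.0% user,  0.0% nice, 95.6% system,  0.0% interrupt,  4.4% idle
CPU2 states:  0.0% user,  0.0% nice, 89.5% system,  1.0% interrupt,  9.5% idle
CPU3 states:  0.0% user,  0.0% nice, 96.8% system,  0.0% interrupt,  3.2% idle
"""

states = parse_cpu_states(SAMPLE)
print(interrupt_bound(states))  # prints [0]
```

For the capture above this reports only CPU 0, which matches what I see: all interrupt handling lands on one CPU while the other three are busy in system time.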


Thank you for your efforts

Regards
Uwe

