Taylor R Campbell wrote:
>> Date: Fri, 4 Oct 2024 10:37:24 +0200
>> From: BERTRAND Joël <joel.bertrand%systella.fr@localhost>
>>
>> -tco* at tcoichbus? # TCO watch dog timer
>> +tco* at ichlpcib? # TCO watch dog timer
>
> This a curious change to make; what prompted it? Are you using the
> watchdog timer? I'm slightly surprised this builds at all, and I'm
> not sure it will work.

I don't remember making this change... My config file was written a long time ago. For your information, the server crashed again last night.

>> I have upgraded my tree maybe 10 days ago. Before this upgrade system
>> was stable (uptime greater than 120 days).
>
> When was your tree previously updated? This might help to narrow down
> which change might have introduced the problem. (And, if you can
> bisect, that would be even more helpful!)

The previously running kernel had an uptime greater than 100 days. I then rebooted with an up-to-date -10.0 kernel. Thus, I think the faulty patch was introduced after May 2024.

>> I've just rebuild a new kernel. I don't know if someone use a system
>> with a similar configuration (I suspect a bad interaction between ccd
>> and iscsi). But how can I found more information to debug ?

I have rebuilt a kernel (same tree) with all diagnostic options enabled. It panics in the iSCSI routines when iscsictl tries to connect to the first iSCSI volume.

[ 74.238270] panic: mutex_vector_enter,517: uninitialized lock (lock=0xffff938021d86010, from=ffffffff80f71234)
[ 74.238270] cpu1: Begin traceback...
[ 74.238270] vpanic() at netbsd:vpanic+0x183
[ 74.238270] panic() at netbsd:panic+0x3c
[ 74.238270] lockdebug_wantlock() at netbsd:lockdebug_wantlock+0x180
[ 74.248268] mutex_enter() at netbsd:mutex_enter+0x23f
[ 74.248268] send_pdu() at netbsd:send_pdu+0x1b5
[ 74.248268] send_logout() at netbsd:send_logout+0x1d4
[ 74.248268] kill_connection() at netbsd:kill_connection+0x2fa
[ 74.248268] kill_session() at netbsd:kill_session+0x134
[ 74.248268] iscsiioctl() at netbsd:iscsiioctl+0x30f
[ 74.248268] sys_ioctl() at netbsd:sys_ioctl+0x56d
[ 74.248268] syscall() at netbsd:syscall+0x196
[ 74.248268] --- syscall (number 54) ---
[ 74.248268] netbsd:syscall+0x196:
[ 74.248268] cpu1: End traceback...

You can download the faulty kernel (with and without debug options) from ftp://newton.systella.fr (files NETBSD.;1 and NETBSD.GDB;1). Please note that this server runs OpenVMS; use binary transfer mode.

> You could try a current kernel. If the problem is there in current,
> it may be detected -- and reported in a more obvious way -- by the new
> heartbeat(9) diagnostic where each CPU's progress is periodically
> checked on by some other CPU

I will try.

One last note: I cannot reboot my server with shutdown -r now unless I have first killed altqd (with kill -9). For me, it's not a real issue, as this server is two floors below my office, but it could be one for users whose server is far away...

Best regards,

JB