NetBSD-Bugs archive


Re: Re: bin/56728: iscsi redundancy doesn't work



Hello,

I tested with kernel version 9.99.96. The problem is still there. I now think there are three independent problems.

Problem 1: it is possible to reproduce a kernel crash with iscsi. This is independent of the iscsi redundancy.

Problem 2: in my environment it is not possible to set up a redundant environment.

Problem 3: the manpage is wrong.


Details on problem 1: If you use iscsictl to open and close connections to the iscsi target, the kernel crashes. No iscsi devices are mounted at that time. Here is the kernel panic output:


[ 40275.994983] dk0 at sd3 (NetApp01) deleted
[ 40275.994983] sd3: detached
[ 40275.994983] scsibus4: detached
[ 40282.903575] uvm_fault(0xffffffff81903e00, 0xffffa08067901000, 2) -> e
[ 40282.903575] fatal page fault in supervisor mode
[ 40282.903575] trap type 6 code 0x2 rip 0xffffffff8022d84c cs 0x8 rflags 0x10246 cr2 0xffffa080679011c0 ilevel 0 rsp 0xffffa08397034f08
[ 40282.903575] curlwp 0xffffecdb7be4db00 pid 0.581 lowest kstack 0xffffa083970302c0
[ 40282.903575] panic: trap
[ 40282.903575] cpu0: Begin traceback...
[ 40282.903575] vpanic() at netbsd:vpanic+0x183
[ 40282.903575] panic() at netbsd:panic+0x3c
[ 40282.903575] trap() at netbsd:trap+0xb27
[ 40282.903575] --- trap (number 6) ---
[ 40282.903575] mutex_enter() at netbsd:mutex_enter+0xc
[ 40282.903575] send_nop_out() at iscsi:send_nop_out+0x133
[ 40282.903575] connection_timeout() at iscsi:connection_timeout+0x4d
[ 40282.903575] iscsi_cleanup_thread() at iscsi:iscsi_cleanup_thread+0x7b2
[ 40282.903575] cpu0: End traceback...
[ 40282.903575] dumping to dev 4,1 (offset=22227071, size=12581616):
[ 40282.903575] dump <7>lagg0: link state DOWN (was UP)
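
To compare several panics like the one above, the traceback frames can be pulled out of the console log mechanically. The following is only a minimal sketch in Python; the names FRAME_RE, parse_frames and first_frame_in are made up for this illustration and are not part of any NetBSD tool. It assumes the frame format shown in the log ("function() at module:symbol+0xoffset").

```python
import re

# Matches frames such as "] send_nop_out() at iscsi:send_nop_out+0x133".
# The pattern is an assumption based on the traceback lines in this report.
FRAME_RE = re.compile(r"\]\s+(\w+)\(\) at (\w+):(\w+)\+0x([0-9a-f]+)")


def parse_frames(log_text):
    """Return (module, symbol, offset) tuples in the order they were printed."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            module, symbol = match.group(2), match.group(3)
            frames.append((module, symbol, int(match.group(4), 16)))
    return frames


def first_frame_in(frames, module):
    """Return the first printed (innermost) frame from `module`, or None."""
    for frame in frames:
        if frame[0] == module:
            return frame
    return None


# Traceback lines copied from the panic above.
SAMPLE = """\
[ 40282.903575] vpanic() at netbsd:vpanic+0x183
[ 40282.903575] panic() at netbsd:panic+0x3c
[ 40282.903575] trap() at netbsd:trap+0xb27
[ 40282.903575] --- trap (number 6) ---
[ 40282.903575] mutex_enter() at netbsd:mutex_enter+0xc
[ 40282.903575] send_nop_out() at iscsi:send_nop_out+0x133
[ 40282.903575] connection_timeout() at iscsi:connection_timeout+0x4d
[ 40282.903575] iscsi_cleanup_thread() at iscsi:iscsi_cleanup_thread+0x7b2
"""

if __name__ == "__main__":
    for module, symbol, offset in parse_frames(SAMPLE):
        print(f"{module}:{symbol}+{offset:#x}")
```

For the log above, first_frame_in(parse_frames(SAMPLE), "iscsi") picks out the send_nop_out frame, which is the first iscsi function below the faulting mutex_enter call.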



Details on problem 2: We are using a NetApp iscsi cluster. Multiple NetApp controllers are used for redundancy, and they all provide the same device (iqn). If you establish a session to the second controller, NetBSD does not recognize that it is the same iqn and treats it as a new device:

[ 39843.450611] scsibus5 at iscsi0: 1 target, 16 luns per target
[ 39843.450611] sd4 at scsibus5 target 0 lun 11: <NETAPP, LUN C-Mode, 9700> disk fixed
[ 39843.450611] sd4: 10240 GB, 65129 cyl, 16 head, 20607 sec, 512 bytes/sect x 21474836480 sectors
[ 39843.450611] sd4: GPT GUID: d644c65b-110e-4dd0-9500-0cfc70900463
[ 39843.450611] dk1 at sd4: "02203a72-215e-4912-9815-a5025bdd8124", 21474836413 blocks at 34, type: ffs
[ 39843.450611] autoconfiguration error: sd4: wedge named 'NetApp01' already existed, using '02203a72-215e-4912-9815-a5025bdd8124'
[ 39843.450611] sd4: async, 8-bit transfers, tagged queueing
[ 39929.604877] dk1 at sd4 (02203a72-215e-4912-9815-a5025bdd8124) deleted
[ 39929.604877] sd4: detached
[ 39929.604877] scsibus5: detached
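
The "wedge named ... already existed" message in the log above is a convenient marker for this situation: it suggests that the new disk carries a partition label that is already in use, as one would expect when the same LUN shows up a second time. The following minimal Python sketch scans a log for that message. The names DUP_WEDGE_RE and find_duplicate_wedges are made up for this illustration, and the pattern is an assumption based only on the single message shown here.

```python
import re

# Pattern derived from the one autoconfiguration error in this report;
# other kernel versions may word the message differently.
DUP_WEDGE_RE = re.compile(
    r"autoconfiguration error: (\w+): wedge named '([^']+)' "
    r"already existed, using '([^']+)'"
)


def find_duplicate_wedges(log_text):
    """Return one dict per duplicate-wedge message found in the log text."""
    hits = []
    for line in log_text.splitlines():
        match = DUP_WEDGE_RE.search(line)
        if match:
            hits.append(
                {
                    "disk": match.group(1),      # disk the wedge was found on
                    "label": match.group(2),     # label that was already taken
                    "fallback": match.group(3),  # name the kernel used instead
                }
            )
    return hits


# Lines copied from the log above.
SAMPLE = """\
[ 39843.450611] sd4: GPT GUID: d644c65b-110e-4dd0-9500-0cfc70900463
[ 39843.450611] autoconfiguration error: sd4: wedge named 'NetApp01' already existed, using '02203a72-215e-4912-9815-a5025bdd8124'
[ 39843.450611] sd4: async, 8-bit transfers, tagged queueing
"""

if __name__ == "__main__":
    for hit in find_duplicate_wedges(SAMPLE):
        print(f"{hit['disk']}: label {hit['label']!r} already in use")
```

A hit only shows that two wedges share a label. It does not prove that both paths lead to the same LUN.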

Problem 3: the man page for iscsictl contains errors. For example, the add_connection command requires an -I parameter that is not described. The meaning of -m is not described. There seem to be errors with other commands as well.

Thank you for your efforts


Regards
Uwe


On Wed, 2 Mar 2022, 6bone%6bone.informatik.uni-leipzig.de@localhost wrote:

Date: Wed, 2 Mar 2022 13:42:19 +0100 (CET)
From: 6bone%6bone.informatik.uni-leipzig.de@localhost
To: gnats-bugs%netbsd.org@localhost
Cc: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: [Extern] Re: bin/56728: iscsi redundancy doesn't work

Hello,

if it helps with troubleshooting, here is a screenshot of a recent iscsi crash. I can't provoke the crash deliberately, but it has happened to me several times. No dump is written, so I can only offer the image.

The crash is always at the position with "conn != zero".

https://speicherwolke.uni-leipzig.de/index.php/s/fqjRTwTXszL2pXA


Thank you for your efforts


Regards
Uwe


