NetBSD-Bugs archive

Re: Re: bin/56728: iscsi redundancy doesn't work



The following reply was made to PR bin/56728; it has been noted by GNATS.

From: 6bone%6bone.informatik.uni-leipzig.de@localhost
To: gnats-bugs%netbsd.org@localhost
Cc: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: Re: bin/56728: iscsi redundancy doesn't work
Date: Sat, 30 Apr 2022 11:19:41 +0200 (CEST)

 Hello,
 
 I tested with kernel version 9.99.96. The problem is still there. I now 
 think there are three independent problems.
 
 Problem 1: a kernel crash can be reproduced with iSCSI. It is
 independent of iSCSI redundancy.
 
 Problem 2: in my environment it is not possible to set up a redundant
 configuration.
 
 Problem 3: the iscsictl man page is wrong.
 
 
 Problem 1 details: Using iscsictl to open and close connections to the
 iSCSI target crashes the kernel. No iSCSI devices are mounted at that
 time. Here is the kernel panic output:
 
 
 [ 40275.994983] dk0 at sd3 (NetApp01) deleted
 [ 40275.994983] sd3: detached
 [ 40275.994983] scsibus4: detached
 [ 40282.903575] uvm_fault(0xffffffff81903e00, 0xffffa08067901000, 2) -> e
 [ 40282.903575] fatal page fault in supervisor mode
 [ 40282.903575] trap type 6 code 0x2 rip 0xffffffff8022d84c cs 0x8 rflags 0x10246 cr2 0xffffa080679011c0 ilevel 0 rsp 0xffffa08397034f08
 [ 40282.903575] curlwp 0xffffecdb7be4db00 pid 0.581 lowest kstack 0xffffa083970302c0
 [ 40282.903575] panic: trap
 [ 40282.903575] cpu0: Begin traceback...
 [ 40282.903575] vpanic() at netbsd:vpanic+0x183
 [ 40282.903575] panic() at netbsd:panic+0x3c
 [ 40282.903575] trap() at netbsd:trap+0xb27
 [ 40282.903575] --- trap (number 6) ---
 [ 40282.903575] mutex_enter() at netbsd:mutex_enter+0xc
 [ 40282.903575] send_nop_out() at iscsi:send_nop_out+0x133
 [ 40282.903575] connection_timeout() at iscsi:connection_timeout+0x4d
 [ 40282.903575] iscsi_cleanup_thread() at iscsi:iscsi_cleanup_thread+0x7b2
 [ 40282.903575] cpu0: End traceback...
 [ 40282.903575] dumping to dev 4,1 (offset=22227071, size=12581616):
 [ 40282.903575] dump <7>lagg0: link state DOWN (was UP)
 
 
 
 Problem 2 details: We are using a NetApp iSCSI cluster in which multiple
 NetApp controllers provide redundancy. They all present the same device
 (IQN). If you establish a session to the second controller, NetBSD does
 not recognize that it is the same IQN and treats it as a new device:
 
 [ 39843.450611] scsibus5 at iscsi0: 1 target, 16 luns per target
 [ 39843.450611] sd4 at scsibus5 target 0 lun 11: <NETAPP, LUN C-Mode, 9700> disk fixed
 [ 39843.450611] sd4: 10240 GB, 65129 cyl, 16 head, 20607 sec, 512 bytes/sect x 21474836480 sectors
 [ 39843.450611] sd4: GPT GUID: d644c65b-110e-4dd0-9500-0cfc70900463
 [ 39843.450611] dk1 at sd4: "02203a72-215e-4912-9815-a5025bdd8124", 21474836413blocks at 34, type: ffs
 [ 39843.450611] autoconfiguration error: sd4: wedge named 'NetApp01' already existed, using '02203a72-215e-4912-9815-a5025bdd8124'
 [ 39843.450611] sd4: async, 8-bit transfers, tagged queueing
 [ 39929.604877] dk1 at sd4 (02203a72-215e-4912-9815-a5025bdd8124) deleted
 [ 39929.604877] sd4: detached
 [ 39929.604877] scsibus5: detached
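 
 For anyone checking their own dmesg for the same symptom, here is a
 minimal sketch that picks out the duplicate-wedge message. The regular
 expression is modeled only on the single line shown above, so other
 kernel versions may word the message differently. The function name
 find_duplicate_wedges is a hypothetical helper, not part of any NetBSD
 tool.

```python
import re

# Pattern modeled on the one "wedge named ... already existed" line in the
# log above; this wording is an assumption for other kernel versions.
WEDGE_DUP = re.compile(
    r"autoconfiguration error: (?P<disk>\w+): wedge named '(?P<name>[^']+)' "
    r"already existed, using '(?P<fallback>[^']+)'"
)


def find_duplicate_wedges(log_text):
    """Return (disk, wedge_name, fallback_name) for each duplicate-wedge message."""
    return [
        (m.group("disk"), m.group("name"), m.group("fallback"))
        for m in WEDGE_DUP.finditer(log_text)
    ]


# Sample input copied from the log excerpt above.
sample = (
    "[ 39843.450611] autoconfiguration error: sd4: wedge named 'NetApp01' "
    "already existed, using '02203a72-215e-4912-9815-a5025bdd8124'\n"
)

for disk, name, fallback in find_duplicate_wedges(sample):
    print(f"{disk}: wedge '{name}' already exists, attached as '{fallback}'")
```

 A hit means the second path was attached as a separate disk carrying
 the same wedge label, which is the behaviour described above.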
 
 Problem 3 details: the man page for iscsictl contains errors. For
 example, the add_connection command requires an -I parameter that is not
 described, and the meaning of -m is not described either. Other commands
 seem to have errors as well.
 
 Thank you for your efforts
 
 
 Regards
 Uwe
 
 
 On Wed, 2 Mar 2022, 6bone%6bone.informatik.uni-leipzig.de@localhost wrote:
 
 > Date: Wed, 2 Mar 2022 13:42:19 +0100 (CET)
 > From: 6bone%6bone.informatik.uni-leipzig.de@localhost
 > To: gnats-bugs%netbsd.org@localhost
 > Cc: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
 > Subject: Re: [Extern] Re: bin/56728: iscsi redundancy doesn't work
 > 
 > Hello,
 >
 > if it helps with troubleshooting, here is a screenshot of recent iscsi 
 > crashes. I can't provoke the crash. The problem has happened to me several 
 > times. A dump is not written. Therefore I can only offer the image.
 >
 > It is always the position with "conn != zero".
 >
 > https://speicherwolke.uni-leipzig.de/index.php/s/fqjRTwTXszL2pXA
 >
 >
 > Thank you for your efforts
 >
 >
 > Regards
 > Uwe
 >
 

