NetBSD-Bugs archive


Re: Re: bin/56728: iscsi redundancy doesn't work



Hello,

iscsictl add_send_target -a 172.18.86.130
iscsictl add_send_target -a 172.18.86.131
iscsictl refresh_targets
OK

iscsictl list_targets
     1: iqn.1992-08.com.netapp:naclaug
        2: 172.18.86.130:3260,1026
        3: 172.18.86.131:3260,1027

iscsictl login -P 2
Created Session 3, Connection 1

iscsictl list_sessions
Session 3: Target iqn.1992-08.com.netapp:naclaug

iscsictl add_connection -I 3
iscsictl: add_connection: The login failed
-> crash (no drive is mounted yet!)


[  1326.174188] scsibus4 at iscsi0: 1 target, 16 luns per target
[ 1326.174188] sd3 at scsibus4 target 0 lun 11: <NETAPP, LUN C-Mode, 9700> disk fixed
[ 1326.174188] sd3: 10240 GB, 65129 cyl, 16 head, 20607 sec, 512 bytes/sect x 21474836480 sectors
[  1326.174188] sd3: GPT GUID: d644c65b-110e-4dd0-9500-0cfc70900463
[ 1326.174188] dk0 at sd3: "NetApp01", 21474836413 blocks at 34, type: ffs
[  1326.174188] sd3: async, 8-bit transfers, tagged queueing
[  1355.946583] S3C2: Login failed (rc 4)
[  1355.946583] S3C2: *** Connection Error, status=18, logout=2, state=6
[  1356.950793] dk0 at sd3 (NetApp01) deleted
[  1356.950793] sd3: detached
[  1356.950793] scsibus4: detached
[  1358.296346] uvm_fault(0xffffffff819014c0, 0xffffd68067902000, 2) -> e
[  1358.296346] fatal page fault in supervisor mode
[ 1358.306388] trap type 6 code 0x2 rip 0xffffffff8022d80c cs 0x8 rflags 0x10246 cr2 0xffffd680679021c0 ilevel 0 rsp 0xffffd68396627f08
[ 1358.306388] curlwp 0xfffff80d32d67a00 pid 0.554 lowest kstack 0xffffd683966232c0
[  1358.306388] panic: trap
[  1358.306388] cpu1: Begin traceback...
[  1358.306388] vpanic() at netbsd:vpanic+0x156
[  1358.306388] panic() at netbsd:panic+0x3c
[  1358.306388] trap() at netbsd:trap+0xb27
[  1358.306388] --- trap (number 6) ---
[  1358.306388] mutex_enter() at netbsd:mutex_enter+0xc
[  1358.306388] send_nop_out() at iscsi:send_nop_out+0x133
[  1358.306388] connection_timeout() at iscsi:connection_timeout+0x4d
[  1358.306388] iscsi_cleanup_thread() at iscsi:iscsi_cleanup_thread+0x7b2
[  1358.306388] cpu1: End traceback...

[  1358.306388] dumping to dev 4,1 (offset=22227071, size=12581616):
[  1358.306388] dump <4>mfi0: workqueue busy: updates stopped
[  1390.048733] coretemp0: workqueue busy: updates stopped
[  1390.048733] coretemp1: workqueue busy: updates stopped
[  1390.048733] coretemp2: workqueue busy: updates stopped
[  1390.048733] coretemp3: workqueue busy: updates stopped
ipmi0: workqueue busy: updates stopped

(gdb) target kvm netbsd.31.core
0xffffffff80226145 in cpu_reboot (howto=howto@entry=260,
    bootstr=bootstr@entry=0x0)
at /mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/arch/amd64/amd64/machdep.c:720
720                     dumpsys();

(gdb) bt
#0  0xffffffff80226145 in cpu_reboot (howto=howto@entry=260,
    bootstr=bootstr@entry=0x0)
at /mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/arch/amd64/amd64/machdep.c:720
#1  0xffffffff80d37917 in kern_reboot (howto=howto@entry=260,
    bootstr=bootstr@entry=0x0)
    at /mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/kern/kern_reboot.c:73
#2  0xffffffff80d7afe2 in vpanic (fmt=fmt@entry=0xffffffff81390116 "trap",
    ap=ap@entry=0xffffd68396627cc8)
    at /mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/kern/subr_prf.c:290
#3  0xffffffff80d7b0a7 in panic (fmt=fmt@entry=0xffffffff81390116 "trap")
    at /mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/kern/subr_prf.c:209
#4  0xffffffff80228f67 in trap (frame=0xffffd68396627e10)
at /mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/arch/amd64/amd64/trap.c:326
#5  0xffffffff80221023 in alltraps ()
#6  0xffffd680679021c0 in ?? ()
#7  0x0000000000000000 in ?? ()

list *(0xffffffff80226145)
0xffffffff80226145 is in cpu_reboot (/mnt/iscsi_iqn.1992-08.com.netapp/usr/src/sys/arch/amd64/amd64/machdep.c:720).
715             /* Disable interrupts. */
716             s = splhigh();
717
718             /* Do a dump if requested. */
719             if ((howto & (RB_DUMP | RB_HALT)) == RB_DUMP)
720                     dumpsys();
721
722     haltsys:
723             doshutdownhooks();
724


Thank you for your efforts

Regards
Uwe



On Wed, 23 Feb 2022, Michael van Elst wrote:

Date: Wed, 23 Feb 2022 04:00:02 +0000 (UTC)
From: Michael van Elst <mlelstv%serpens.de@localhost>
Reply-To: gnats-bugs%netbsd.org@localhost
To: gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost,
    6bone%6bone.informatik.uni-leipzig.de@localhost
Subject: [Extern] Re: bin/56728: iscsi redundancy doesn't work

The following reply was made to PR bin/56728; it has been noted by GNATS.

From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: bin/56728: iscsi redundancy doesn't work
Date: Wed, 23 Feb 2022 03:58:57 -0000 (UTC)

6bone%6bone.informatik.uni-leipzig.de@localhost writes:

>On Tue, 22 Feb 2022, Michael van Elst wrote:

>>
>> Do you have any messages from that crash? A Backtrace ?
>>

>Does that help?


>[ 98560.538286] sd3d: error reading fsbn 7415496930 of
>7415496930-7415496993 (sd3 bn 7415496930; cn 22490 tn 13 sn 6159)
>[ 98591.063909] uvm_fault(0xffffffff81901bc0, 0xffff860067907000, 1) -> e
>[ 98591.063909] fatal page fault in supervisor mode
>[ 98591.063909] trap type 6 code 0 rip 0xffffffff8025381b cs 0x8 rflags
>0x10282 cr2 0xffff860067907070 ilevel 0 rsp 0xffff860396d57c20
>[ 98591.063909] curlwp 0xffff8186f0a7c940 pid 0.390 lowest kstack


Looks like sd_diskstart is running with a NULL periph pointer.
I don't see how that happens yet, but it is probably the result of
detaching the sd device while in use. The detach message is
probably not yet printed.

So that's one problem.

Detaching the sd device happens when no connection to the iscsi
server exists and no connection can be re-established
either.

For multiple connections to an iscsi server you need to do something
like:

add_send_target    -> add target to list
refresh_targets    -> get portals
login              -> establish session (creates sd)
add_connection     -> add redundant connection to session

The man page doesn't look correct.
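
The four steps above can be sketched as a small dry-run helper. This is only an illustration, not part of iscsictl: the function name and its arguments are hypothetical, it only prints the commands instead of running them, and it uses just the flags that appear in this thread (-a, -P, -I). The portal ID and session ID are not known in advance; they have to be read from the list_targets and login output.

```shell
#!/bin/sh
# Print (do not run) the iscsictl sequence for one session with a
# redundant connection. Function name and arguments are hypothetical.
# Usage: iscsi_redundant_plan PORTAL_ID SESSION_ID ADDR [ADDR ...]
iscsi_redundant_plan() {
    portal_id=$1     # portal number as shown by "iscsictl list_targets"
    session_id=$2    # session number as reported by "iscsictl login"
    shift 2
    for addr in "$@"; do
        echo "iscsictl add_send_target -a $addr"    # add target to list
    done
    echo "iscsictl refresh_targets"                  # get portals
    echo "iscsictl login -P $portal_id"              # establish session (creates sd)
    echo "iscsictl add_connection -I $session_id"    # add redundant connection
}

# Example: the two portal addresses and the IDs from the report above.
iscsi_redundant_plan 2 3 172.18.86.130 172.18.86.131
```

Printing the plan first makes it easy to check the order of the steps before running anything against a live target.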

# iscsictl add_send_target -a x.x.x.x
Added Send Target 1
# iscsictl refresh_targets
OK
# iscsictl list_targets
     1: iqn.2007-09.jp.ne.peach.istgt:pbulk1
        2: x.x.x.x:3260,1
     3: iqn.2007-09.jp.ne.peach.istgt:test
        4: x.x.x.x:3260,1
# iscsictl login -P 4
Created Session 2, Connection 1
# iscsictl list_sessions
Session 2: Target iqn.2007-09.jp.ne.peach.istgt:test
# iscsictl add_connection -I 2
Added Connection 2

tcp        0      0  y.y.y.y.65330       x.x.x.x.3260         ESTABLISHED
tcp        0      0  y.y.y.y.65331       x.x.x.x.3260         ESTABLISHED

# tcpdrop y.y.y.y 65330 x.x.x.x 3260

[ 793856.693477] S2C2: *** Connection Error, status=18, logout=2, state=3
[ 793856.693477] S2C2: Write failed sock 0xffff8524621c5480 (ret: 32, req: 48, resid: 48)
[ 793856.693477] S2C2: *** Connection Error, status=18, logout=-1, state=5
[ 793858.693531] S2C2: Connection ReCreated successfully - status 0


tcp        0      0  y.y.y.y.65329       x.x.x.x.3260         ESTABLISHED
tcp        0      0  y.y.y.y.65331       x.x.x.x.3260         ESTABLISHED

Not exactly the same (only a single target IP), but it shows how the
connection gets re-established.
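
To check that both connections are back after such a test, the socket listing can be counted. A minimal sketch, assuming netstat-style lines in the layout shown above (foreign address in the fifth column, port after the last dot); the function name is hypothetical and the netstat flags in the comment are the usual BSD ones, not taken from this thread.

```shell
#!/bin/sh
# Count ESTABLISHED TCP connections to port 3260 in netstat-style input
# read from stdin. Assumes "address.port" notation as in the listing above.
count_iscsi_conns() {
    awk '$1 ~ /^tcp/ && $NF == "ESTABLISHED" && $5 ~ /\.3260$/ { n++ }
         END { print n + 0 }'
}

# Typical use on a live system (assumed flags):
#   netstat -n -p tcp | count_iscsi_conns
# Here the two lines from the listing above are fed in directly.
printf '%s\n' \
  'tcp        0      0  y.y.y.y.65329       x.x.x.x.3260         ESTABLISHED' \
  'tcp        0      0  y.y.y.y.65331       x.x.x.x.3260         ESTABLISHED' |
  count_iscsi_conns
```

A result lower than the number of connections added to the session would suggest that one of them has not been re-established.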



