Current-Users archive


iscsi-target problem with VirtualBox initiator



(long post, sorry)

I've been playing around with iscsi-target (current CVS version) and
have found what seems to be a problem regarding its handling of
sequence numbers on outgoing iscsi responses.  I've attached some
tcpdumps of attempts to boot a VirtualBox instance off an iscsi target. 
If you load them up in Wireshark, it'll decode the iscsi packets. 
Basically, it doesn't get past the login stage, and pops up an error
box with a generic "VERR_TIMEOUT" message.  No more detail is in the
VBox.log file.

What I found interesting about the dump was the sequence numbers:
( ">" are from VirtualBox, "<" are from iscsi-target)

1> Login Command:      CmdSN: 0x00000001  ExpStatSN: 0x00000001
2< Login Response:  ExpCmdSN: 0x00000001     StatSN: 0x00000001
3> Login Command:      CmdSN: 0x00000001  ExpStatSN: 0x00000002
4< Login Response:  ExpCmdSN: 0x00000001     StatSN: 0x00000001
... timeout

Packet 4 is responding to a packet that has ExpStatSN=2, so shouldn't
its StatSN be 2 instead of 1?

Here are the numbers from a (successful) VirtualBox login to a Solaris
iscsi target:

1> Login Command:      CmdSN: 0x00000001  ExpStatSN: 0x00000001
2< Login Response:  ExpCmdSN: 0x00000001     StatSN: 0x00000001
3> Login Command:      CmdSN: 0x00000001  ExpStatSN: 0x00000002
4< Login Response:  ExpCmdSN: 0x00000001     StatSN: 0x00000002
5> SCSI Command:       CmdSN: 0x00000001  ExpStatSN: 0x00000003
6< SCSI Data In:    ExpCmdSN: 0x00000002     StatSN: 0x00000003

... etc.  Each end seems to take the Exp*SN from the previous packet
and use it for its next outgoing *SN.
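
As a rough sketch of that pattern, the check below walks a list of
command/response pairs and reports the first reply whose StatSN does not
match the ExpStatSN the initiator sent.  The struct and function names are
made up for illustration (they are not from iscsi-target), and the rule it
tests is only the behaviour observed in the Solaris trace, not a claim
about what RFC 3720 requires.

```c
#include <stddef.h>
#include <stdint.h>

/* One command/response exchange taken from a capture (hypothetical type). */
struct exchange {
    uint32_t cmd_exp_stat_sn;  /* ExpStatSN in the initiator's PDU */
    uint32_t rsp_stat_sn;      /* StatSN in the target's reply */
};

/*
 * Return the index of the first exchange whose reply StatSN differs from
 * the ExpStatSN the initiator sent, or -1 if every exchange matches.
 */
static int first_statsn_mismatch(const struct exchange *x, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (x[i].rsp_stat_sn != x[i].cmd_exp_stat_sn)
            return (int)i;
    }
    return -1;
}

/* Solaris trace above: packets 1/2, 3/4, 5/6. */
static const struct exchange solaris_trace[] = {
    { 0x1, 0x1 }, { 0x2, 0x2 }, { 0x3, 0x3 },
};

/* Unpatched iscsi-target trace above: packets 1/2, 3/4. */
static const struct exchange bsd_trace[] = {
    { 0x1, 0x1 }, { 0x2, 0x1 },
};
```

Fed the two traces, the Solaris exchanges all match, while the
iscsi-target trace mismatches at the second exchange (packets 3/4).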

The code responsible for setting StatSN on login packets is in
target.c:login_command_t(), around line 940.  First it sets rsp.StatSN
to cmd.ExpStatSN, and then if the response is not an error response, it
resets it to ++(sess->StatSN).  With the following patch, logins
succeed:

--- target.c    2008-10-21 14:31:39.964698894 -0500
+++ target.c    2008-10-21 13:25:50.713985000 -0500
@@ -952,7 +952,6 @@
                if (rsp.transit && (rsp.nsg == ISCSI_LOGIN_STAGE_FULL_FEATURE)) {
                        rsp.version_max = ISCSI_VERSION;
                        rsp.version_active = ISCSI_VERSION;
-                       rsp.StatSN = ++(sess->StatSN);
                        rsp.tsih = sess->tsih;
                }
        }
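
To make the effect of that one-line removal concrete, here is a cut-down
model of the StatSN choice for a successful login response.  It is only a
sketch: struct login_sess and login_rsp_statsn() are hypothetical names,
not the real iscsi-target types, and it assumes the session counter starts
at 0, which is inferred from the traces rather than checked against the
source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the session state; only StatSN is modelled. */
struct login_sess {
    uint32_t StatSN;
};

/*
 * StatSN chosen for a successful login response.
 *   final_transit: the response moves the session to full-feature phase
 *   patched:       the ++(sess->StatSN) overwrite has been removed
 */
static uint32_t login_rsp_statsn(struct login_sess *sess,
                                 uint32_t cmd_exp_stat_sn,
                                 bool final_transit, bool patched)
{
    uint32_t stat_sn = cmd_exp_stat_sn;   /* initial assignment */

    if (final_transit && !patched)
        stat_sn = ++sess->StatSN;         /* the line the patch removes */
    return stat_sn;
}
```

Starting from a counter of 0, the unpatched path yields StatSN 1 for both
login responses (ExpStatSN 1, then ExpStatSN 2 on the final transit),
which is what the failing trace shows; the patched path yields 1 and then
2, and leaves the session counter at 0.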


I then get tripped up on regular SCSI commands:

1> Login Command:      CmdSN: 0x00000001  ExpStatSN: 0x00000001
2< Login Response:  ExpCmdSN: 0x00000001     StatSN: 0x00000001
3> Login Command:      CmdSN: 0x00000001  ExpStatSN: 0x00000002
4< Login Response:  ExpCmdSN: 0x00000001     StatSN: 0x00000002
5> SCSI Command:       CmdSN: 0x00000001  ExpStatSN: 0x00000003
6< SCSI Data In:    ExpCmdSN: 0x00000002     StatSN: 0x00000000
7< SCSI Response:   ExpCmdSN: 0x00000002     StatSN: 0x00000001

Packet 7 should have a StatSN of 3, but it's 1 instead.

The code that sets StatSN for regular commands is in
target.c:scsi_command_t(), around line 387.  Applying the following
patch in addition to my first patch lets me fully boot a VirtualBox VM:

--- target.c    2008-10-21 14:31:39.964698894 -0500
+++ target.c    2008-10-21 13:25:50.713985000 -0500
@@ -391,10 +391,10 @@ response:
                scsi_rsp.length = scsi_cmd.status ? scsi_cmd.length : 0;
                scsi_rsp.tag = scsi_cmd.tag;
                /* If r2t send, then the StatSN is already incremented */
                if (sess->StatSN < scsi_cmd.ExpStatSN) {
-                       ++sess->StatSN;
+                       sess->StatSN = scsi_cmd.ExpStatSN;
                }
                scsi_rsp.StatSN = sess->StatSN;
                scsi_rsp.ExpCmdSN = sess->ExpCmdSN;
                scsi_rsp.MaxCmdSN = sess->MaxCmdSN;
                scsi_rsp.ExpDataSN = (!scsi_cmd.status && scsi_cmd.input) ? DataSN : 0;
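
The difference between the two versions of that branch can be sketched in
isolation.  As before, struct scsi_sess and scsi_rsp_statsn() are
hypothetical names used only for illustration, and the starting counter
value of 0 is an assumption read off the StatSN of packet 6 in the trace,
not something confirmed from the code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the session state; only StatSN is modelled. */
struct scsi_sess {
    uint32_t StatSN;
};

/*
 * StatSN placed in a SCSI response.
 *   patched == false: original code, bump the counter by one
 *   patched == true:  jump the counter straight to the initiator's ExpStatSN
 */
static uint32_t scsi_rsp_statsn(struct scsi_sess *sess,
                                uint32_t cmd_exp_stat_sn, bool patched)
{
    if (sess->StatSN < cmd_exp_stat_sn) {
        if (patched)
            sess->StatSN = cmd_exp_stat_sn;
        else
            ++sess->StatSN;
    }
    return sess->StatSN;
}
```

With a session counter of 0 and ExpStatSN 3 from packet 5, the original
branch produces StatSN 1 (as seen in packet 7), while the patched branch
produces 3.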


The problem is, I don't understand RFC 3720 well enough to say which of
VirtualBox or iscsi-target is behaving badly (or whether they both
are).  My patches work (and don't break either the Windows or Solaris
native initiators' ability to connect), but my guess is that there's a
more correct fix.  Help!


Description of attachments:

vbox-solaris.pcap : successful login from vbox to solaris as reference
vbox-bsd-broken1.pcap : failed login to iscsi-target before patching
vbox-bsd-broken2.pcap : failed login to iscsi-target with first patch
vbox-bsd-works.pcap : successful login to iscsi-target with both patches

172.16.0.207 - XP running VirtualBox 2.0.4
172.16.0.101 - BSD running iscsi-target
172.16.0.137 - Solaris running iscsitgtd

In case the attachments get stripped, they are also available at
http://www.evoy.net/iscsi-target/ .

-- 
        Dan Nelson
        dnelson%allantgroup.com@localhost

Attachment: vbox-bsd-broken1.pcap
Description: application/cap

Attachment: vbox-bsd-broken2.pcap
Description: application/cap

Attachment: vbox-bsd-works.pcap
Description: application/cap

Attachment: vbox-solaris.pcap
Description: application/cap


