Subject: Re: (Sort of ?) bug found in blk_drv/scsi/scsi.c, sd.
To: Linux-Activists <linux-activists@Niksula.hut.fi>
From: Drew Eckhardt <drew@ophelia.cs.Colorado.EDU>
List: macbsd-development
Date: 01/14/1994 11:17:48
    Quite a couple of recompilations and printk's later, I discovered that the
    reason was the multiple LUN support. I'm not sure yet if this is a bug
    in the code, 

It's usually a feature, but there may be a bug in the low level driver 
for the Amiga onboard SCSI or problems in the drives' firmware (very
unlikely, I haven't seen any problems with either vendor's devices).

A few vendors ship drives with broken firmware that responds to an
INQUIRY command on every LUN with a device type other than 0x7f in the
inquiry data (0x7f is what a target returns when there's no device on
that LUN).  So, we added the blacklist[] to scsi.c and modified
scan_scsis() so that non-zero LUNs on these devices wouldn't be tried,
while multi-LUN support would remain for people with SCSI->SMD
bridges, etc.
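
A rough sketch of that scan logic, for illustration only. This is not
the code in scsi.c: the struct layout, the placeholder vendor/model
strings, and the helper names is_blacklisted(), lun_present() and
max_lun_to_scan() are all assumptions made up for this example.

```c
#include <string.h>

/* Hypothetical blacklist entry: vendor and model strings as they would
 * appear in the INQUIRY data. The entry below is a placeholder, not a
 * real device. */
struct blacklist_entry {
    const char *vendor;
    const char *model;
};

static const struct blacklist_entry blacklist[] = {
    { "EXAMPLE", "BROKEN-DRIVE" },  /* placeholder entry */
    { NULL, NULL }                  /* terminator */
};

/* Byte 0 of the INQUIRY data is 0x7f when no device is on the LUN. */
#define NO_DEVICE_ON_LUN 0x7f

/* Return 1 if the device answers for LUNs it does not really have. */
static int is_blacklisted(const char *vendor, const char *model)
{
    int i;

    for (i = 0; blacklist[i].vendor != NULL; i++) {
        if (strcmp(blacklist[i].vendor, vendor) == 0 &&
            strcmp(blacklist[i].model, model) == 0)
            return 1;
    }
    return 0;
}

/* Return 1 if the INQUIRY data says a device exists on this LUN. */
static int lun_present(unsigned char inquiry_byte0)
{
    return inquiry_byte0 != NO_DEVICE_ON_LUN;
}

/* Highest LUN worth probing: only LUN 0 for blacklisted devices,
 * all eight (0-7) otherwise. */
static int max_lun_to_scan(const char *vendor, const char *model)
{
    return is_blacklisted(vendor, model) ? 0 : 7;
}
```

A scan loop would probe LUN 0 first, read the vendor and model from
the INQUIRY data, and then stop at max_lun_to_scan() for that target.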

    or maybe my disks are too dumb (although the Fujitsu is reported
    as having the SCSI-1 CCS command set, and the Quantum even understands 
    SCSI-2).

Unlikely.
    
    In my own words, here's what I believe is wrong:
    During startup phase, the scsi0 host adapter is scanned for devices
    (scsi.c/scan_scsis()). This fills up an array of scsi_devices[] structures.
    Similarly, in sd.c/sd_init_onedisk() and sd.c/sd_init(), rscsi_disks[]
    structures are filled up (I haven't yet fully understood the relationship
    or difference between these two). 

Since different structures are needed for different devices (CD, tape, disk,
tape robot), I needed to either use a union or have separate structures 
for the different device types.  A union would break the abstraction:
ultimately (using the C++ global initializer linker features that Eric 
pointed out) you should be able to relink devices into the kernel without
recompiling anything, which means the internal structures can't change.
So I couldn't use one, and I needed separate structures for the generic
'all device' fields and the device-specific fields.

Since at boot time all memory must be allocated contiguously, and I had 
decided to use arrays for the sake of simplicity, I couldn't allocate a 
generic structure and then a disk structure for each device in turn.

So, the initialization/detection is done in several passes: a scan for 
SCSI devices, allocation of all of the SCSI disk structures needed for
all SCSI disks (sd_init), and finally per-slave initialization.
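
The passes can be pictured roughly as below. This is a simplified
sketch, not the real scsi.c/sd.c code: struct generic_dev, struct
disk_dev, count_disks() and attach_disks() are hypothetical names, and
a caller-provided array stands in for the contiguous boot-time
allocation.

```c
#include <stddef.h>

#define TYPE_DISK 0x00  /* SCSI direct-access device type */
#define TYPE_TAPE 0x01  /* SCSI sequential-access device type */

/* Generic 'all device' fields (hypothetical, heavily simplified). */
struct generic_dev {
    int id;
    int lun;
    int type;
};

/* Disk-specific fields, kept apart from the generic ones. */
struct disk_dev {
    struct generic_dev *device;  /* link back to the generic entry */
    unsigned long capacity;      /* filled in by per-disk init */
};

/* After the scan pass: count the disks so that one array of exactly
 * the right size can be allocated in a single contiguous chunk. */
static int count_disks(const struct generic_dev *devs, int n)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
        if (devs[i].type == TYPE_DISK)
            count++;
    return count;
}

/* Attach pass: tie each generic disk entry to the next free slot in
 * the disk array. Returns the number of disks attached. */
static int attach_disks(struct generic_dev *devs, int n,
                        struct disk_dev *disks, int ndisks)
{
    int i, used = 0;

    for (i = 0; i < n && used < ndisks; i++) {
        if (devs[i].type != TYPE_DISK)
            continue;
        disks[used].device = &devs[i];
        disks[used].capacity = 0;
        used++;
    }
    return used;
}
```

Counting first is what lets both arrays stay plain contiguous arrays
instead of interleaved generic and disk structures.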

    Then, at the end of scsi.c in 
    scsi_dev_init(), a scsi_devices[] structure is attached to a rscsi_disks[]
    structure by calling sd.c/sd_attach(). This is done for all disks and all
    LUNs. Problem is, I only have one disk at each scsi id/LUN, so there should
    be only 2 disks. However, scsi.c/scan_scsis() reported 16 disks. 

    So, the only thing that seems to be missing is the handling of multiple
    LUNs in scsi.c and sd.c. However, I don't know how this should be done
    correctly - maybe someone with more understanding in the SCSI standard could
    fill this in.

The code already handles this correctly - I know of a number of users running
multiple SCSI->ST-506 and SMD bridge boards, with multiple drives, or CD
changers (one LUN per disc), off of this.

The problem is probably in the low level driver.

Linux initializes the LUN field (byte 1, bits 7-5) as appropriate
before sending a Scsi_Cmnd structure to the low level SCSI driver. However,
the mid level code does nothing to ensure that the right messages get sent - 
it's up to the low level driver to pick out appropriate messages (i.e.,
allow or disallow disconnect/reconnect in the IDENTIFY message, etc).

The original SCSI spec allowed the user to specify the LUN in the 
top three bits of byte 1 in the CDB -or- in an IDENTIFY message sent
to the drive, specifying that if an IDENTIFY message was sent, the
LUN field is ignored.
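
For reference, a small sketch of the CDB encoding described above: the
LUN occupies bits 7-5 of byte 1 and the low five bits are left alone.
The helper names cdb_set_lun() and cdb_get_lun() are made up for this
example and are not taken from the kernel source.

```c
/* Store lun (0-7) in the top three bits of CDB byte 1, preserving the
 * low five bits that belong to the command itself. */
static void cdb_set_lun(unsigned char *cdb, unsigned int lun)
{
    cdb[1] = (unsigned char)((cdb[1] & 0x1f) | ((lun & 0x07) << 5));
}

/* Read the LUN back out of CDB byte 1. */
static unsigned int cdb_get_lun(const unsigned char *cdb)
{
    return (cdb[1] >> 5) & 0x07;
}
```

If the target also receives an IDENTIFY message, this field is ignored
and the LUN in the message wins.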

The low level driver is probably sending an IDENTIFY message of 
0x80 or 0xc0, which has the low three bits set to 0 indicating 
LUN 0 regardless of what the LUN field in the CDB says.

If the low level driver is simply sending an IDENTIFY message of 
0x80 or 0xc0, all commands are being handled for LUN 0 on the 
device, and scan_scsis() gets back a valid response to the INQUIRY 
on what it thinks are all eight luns.

When an identify message is being sent, you should be using the
IDENTIFY macro from scsi.h - 

	IDENTIFY(can_disconnect, lun)

Where can_disconnect is non-zero if you wish to allow the device to
disconnect, and lun is the lun (not bit fielded, 1 = lun 1, 7 = lun 7).
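
To show why a hard-coded 0x80 or 0xc0 always selects LUN 0, here is a
sketch of how the IDENTIFY message byte is laid out: bit 7 marks the
message as IDENTIFY, bit 6 grants the disconnect privilege, and the low
three bits carry the LUN. The function identify_msg() is a hypothetical
stand-in for illustration, not a copy of the macro in scsi.h.

```c
#define IDENTIFY_BIT        0x80  /* bit 7: this is an IDENTIFY message */
#define IDENTIFY_DISCONNECT 0x40  /* bit 6: target may disconnect */

/* Build an IDENTIFY message byte for the given LUN (0-7). */
static unsigned char identify_msg(int can_disconnect, unsigned int lun)
{
    return (unsigned char)(IDENTIFY_BIT |
                           (can_disconnect ? IDENTIFY_DISCONNECT : 0) |
                           (lun & 0x07));
}
```

With this layout, LUN 3 with disconnect allowed comes out as 0xc3,
while a constant 0x80 or 0xc0 addresses LUN 0 no matter which LUN the
mid level code asked for.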



