Subject: kern/3022: scsi_interpret_sense prints nonsense for "extra_len" data
To: None <gnats-bugs@gnats.netbsd.org>
From: John F. Woods <jfw@jfwhome.funhouse.com>
List: netbsd-bugs
Date: 12/11/1996 22:29:18
>Number:         3022
>Category:       kern
>Synopsis:       scsi_interpret_sense prints nonsense for "extra_len" data
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Dec 12 17:20:02 1996
>Last-Modified:
>Originator:     John F. Woods
>Organization:
Misanthropes-R-Us
>Release:        1.2_BETA
>Environment:
	
System: NetBSD jfwhome.funhouse.com 1.2 NetBSD 1.2 (JFW) #5: Sun Nov 3 10:33:41 EST 1996 jfw@jfwhome.funhouse.com:/usr/src/sys/arch/i386/compile/JFW i386


>Description:
The routine scsi_interpret_sense prints nonsense for the "extra_len" data
(error code 0x70 handling code, down at the end).
The sense block Additional Sense Length field counts the number of bytes past
the Additional Sense Length field, covering Command-specific Information,
Additional Sense Code, Additional Sense Code Qualifier, FRU Code, Sense-key
Specific, and "Additional sense bytes".  (section 8.2.14)  The driver, however,
prints "extra_len" bytes starting at "extra_bytes[]", which misses all the
useful information, and prints uninitialized crap to boot.

>How-To-Repeat:
1. Get a disk error.
2. Try to understand the disk error by looking at the SCSI standard and
assuming that the bytes printed mean something.
3. Read the code to discover that they don't, and that the ASC/ASQ fields
didn't even come close to being printed.

>Fix:
I haven't upgraded my sources to -current yet, I don't have the original
scsi_base.c to diff against, and the code I have is a bit rough, intended
just to enable me to see the actual error.

At the end of the code where error code 0x70 is handled, I changed the printf
statement like so:

                        if ((sense->error_code & SSD_ERRCODE_VALID) != 0) {
				switch(key) {
				...
				}
			}
			if (sense->extra_len != 0) {
				int n;
!!				printf(", ASC/Q %x %x",
!!					sense->add_sense_code,
!!					sense->add_sense_code_qual);
				printf(", data = ");
				...

That's not precisely correct, but it worked to reveal the actual disk error
I was getting.  :-)  An example precisely correct code, if overkill, is from
the SCSI_DEBUG code well before that point, which I changed like so:


                printf("info: %x %x %x %x followed by %d extra bytes\n",
                    sense->info[0],
                    sense->info[1],
                    sense->info[2],
                    sense->info[3],
                    sense->extra_len);
                if (sense->extra_len >= 4)
                    printf("cmd spec info: %x %x %x %x; ",
                           sense->cmd_spec_info[0],sense->cmd_spec_info[1],
                           sense->cmd_spec_info[2],sense->cmd_spec_info[3]);
                if (sense->extra_len >=6)
                    printf("ASC/Q: %x %x; ",sense->add_sense_code,
                            sense->add_sense_code_qual);
                if (sense->extra_len >= 7)
                    printf("FRU: %x\n", sense->fru);
                if (sense->extra_len >= 10)
                    printf("SKSV: %x  SKS: %x %x %x\n",
                           !!(sense->sense_key_spec_1 & SSD_SCS_VALID),
                           sense->sense_key_spec_1,
                           sense->sense_key_spec_2,
                           sense->sense_key_spec_3);
                printf("extra: ");
                for (count = 0; count < sense->extra_len - 10; count++)
                        printf("%x ", sense->extra_bytes[count]);
                printf("\n");

The minimum change that would at least get the correct information printed
would be to hex-dump extra_len bytes starting at (char *)&sense->cmd_spec_info;
somewhat nicer would be to adopt the first change with some fiddling to
guarantee that the ASC/Q bytes actually are present; much nicer would be some
reduced version of the above debug code moved down to replace the current bogus
printf.
>Audit-Trail:
>Unformatted: