Subject: Re: kern/36716: cd(4) problem with transfers exceeding 65535 bytes
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Stephen M. Rumble <stephen.rumble@utoronto.ca>
List: netbsd-bugs
Date: 08/01/2007 16:35:03
The following reply was made to PR kern/36716; it has been noted by GNATS.

From: "Stephen M. Rumble" <stephen.rumble@utoronto.ca>
To: gnats-bugs@NetBSD.org, Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
	netbsd-bugs@NetBSD.org
Subject: Re: kern/36716: cd(4) problem with transfers exceeding 65535 bytes
Date: Wed, 01 Aug 2007 12:29:59 -0400

 Quoting Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>:
 
 > The following reply was made to PR kern/36716; it has been noted by GNATS.
 >
 > From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
 > To: gnats-bugs@NetBSD.org
 > Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org,
 > 	netbsd-bugs@NetBSD.org, tsutsui@ceres.dti.ne.jp
 > Subject: Re: kern/36716: cd(4) problem with transfers exceeding 65535 bytes
 > Date: Wed, 1 Aug 2007 21:26:54 +0900
 >
 >  rumble@ephemeral.org wrote:
 >
 >  > >How-To-Repeat:
 >  > It should be easily repeatable with an atapi cdrom when used
 >  > with a disklabel where d_secsize != 2048.
 >
 >  I saw the similar problem on sgimips with CD-ROM which had sgivolhdr:
 >  http://mail-index.netbsd.org/port-sgimips/2005/09/12/0000.html
 >  http://mail-index.netbsd.org/port-sgimips/2005/09/14/0000.html
 >
 >  It seems the proper workaround to fix readdisklabel(9) to
 >  always return d_secsize = 2048 for CDs, but I'm not sure.
 
 If I understand things correctly, the problem was that non-2048-byte
 sectors weren't working properly, and your solution was to use
 2048-byte sectors and update the disklabel offsets/lengths in memory,
 yes? If that's the case, I think it only hides the problem. We need
 d_secsize == 512 for EFS images, and it should 'just work'.
 
 If you're able to repeat the tests, I'd be interested to know if my  
 aforementioned cd(4) bounce patch resolves the issue for you.
 
 The problem as I see it is that cd(4) will make requests larger than
 MAXPHYS when bounce buffers are used and we need to round up by one
 more sector to accommodate unaligned accesses. Is sending such a
 request to scsipi an invalid thing to do? If it's not permitted, it
 should be asserted somewhere, and I think my patch is the proper
 solution.
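
 To make the rounding concrete, here is a rough sketch of the
 arithmetic. It is not taken from cd.c: the 2048-byte native sector
 size, the 64 KiB MAXPHYS value, and the helper name bounce_nblks()
 are all assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define CD_SECSIZE	2048u		/* assumed native CD sector size */
#define MAXPHYS_ASSUMED	(64u * 1024u)	/* assumed value of MAXPHYS */

/*
 * Hypothetical helper: number of native 2048-byte sectors a bounce
 * buffer must cover to satisfy a read of len bytes that starts at
 * byte offset byte_off on the medium.
 */
uint32_t
bounce_nblks(uint64_t byte_off, uint32_t len)
{
	uint64_t first, last;

	assert(len > 0);
	/* Round the start down and the end up to sector boundaries. */
	first = byte_off / CD_SECSIZE;
	last = (byte_off + len + CD_SECSIZE - 1) / CD_SECSIZE;
	return (uint32_t)(last - first);
}
```

 With these assumptions, an aligned MAXPHYS-sized read needs 32
 sectors, but the same read starting 512 bytes into a sector needs 33
 sectors (67584 bytes), which is larger than MAXPHYS.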
 
 If, however, it is entirely valid, we have problems elsewhere. So long
 as nblks at cd.c:796 is <= 32 (corresponding to a maximum transfer of
 MAXPHYS), the underlying subsystems seem happy. However, if b_bcount
 is > (2^16), we can't just cap nblks; rather, we appear to need to
 split up requests. I'm assuming nblks is passed to the device, and
 b_bcount is used to initiate DMA transfers until everything is read.
 Is it illegal to send a device a read command that cannot be handled
 in one DMA?
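
 For illustration, splitting could look roughly like the sketch below.
 This is not what cd(4) or my patch does; struct chunk,
 split_request(), and the 32-block limit (32 * 2048 bytes = 64 KiB)
 are hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_XFER_BLKS	32u	/* assumed: 32 * 2048 bytes = 64 KiB */

struct chunk {			/* hypothetical, not a kernel structure */
	uint32_t blkno;		/* first block of this piece */
	uint32_t nblks;		/* number of blocks in this piece */
};

/*
 * Break a request of nblks blocks starting at blkno into pieces of at
 * most MAX_XFER_BLKS blocks each.  Returns the number of pieces
 * written to out, which never exceeds max_chunks.
 */
size_t
split_request(uint32_t blkno, uint32_t nblks, struct chunk *out,
    size_t max_chunks)
{
	size_t n = 0;

	assert(out != NULL);
	while (nblks > 0 && n < max_chunks) {
		uint32_t take = nblks < MAX_XFER_BLKS ? nblks : MAX_XFER_BLKS;

		out[n].blkno = blkno;
		out[n].nblks = take;
		blkno += take;
		nblks -= take;
		n++;
	}
	return n;
}
```

 Under these assumptions a 33-block request becomes one 32-block piece
 followed by a 1-block piece, so no single command exceeds 64 KiB.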
 
 What's really confusing to me is that, in the atapi case, if we leave
 everything else alone (letting nblks be 33, i.e. 66k) and instead
 limit the DMA transfer size in atapi_wdc.c to 0xfffd, everything
 works. However, if we permit transfers of 0xfffe or 0xffff, it breaks.
 If we cap nblks at 32, then 0xffff-sized transfers are fine in the
 sense that interrupts don't get lost, but we don't end up getting all
 of the requested data from the drive.
 
 It seems that restricting the DMA size in atapi_wdc.c just happens to  
 work by chance and wouldn't fix the real problem (which appears to  
 exist for SCSI as well).
 
 So, in short, questions I'd love to have answered:
    1) Is issuing requests > MAXPHYS to scsipi inherently bad or unsupported?
    2) To satisfy my curiosity, can anybody tell me why limiting the ATAPI DMA
       transfer size to 0xfffd or lower avoids the problem, while limiting to
       0xfffe or 0xffff does not?
 
 Steve