Subject: Re: kern/3460: block io unit for sd should be sector size
To: None <cgd@cs.cmu.edu>
From: Koji Imada - je4wob/2 <koji@math.human.nagoya-u.ac.jp>
List: netbsd-bugs
Date: 04/10/1997 06:19:03
>>>>> "Chris" == Chris G Demetriou <cgd@cs.cmu.edu> writes:

 Chris> OK, I wasn't clear on what changes, exactly, had to be made in each
 Chris> case.

I will add diffs for each case at last of this mail. Check it if you want.

Diff for "first method" doesn't include msdosfs
modification. kern/2896 should be applied too.

Anyway, i noticed new problems.

One  about disklabel which sector size != physical sector size.

How can we get disklabel? if you dd'ed diskimage including disklabel
from 1k/sector disk to 512/sector disk, you can't read disklabel.

The readdisklabel in sys/arch/i386/i386/disksubr.c read LABELSECTOR
sector in condition 'disklabel sector size == physical sector
size'. But there isn't. So copied disklabel from 1k/sector disk can't
be read. To read this, supplemental information will be necessary or
we should try to read from multiples of physical sector size.

Another problem is about cd.c and cd9660 file system. In that,
disklabel sector size and physical io unit is inconsistent. So we
need to decide policy. My idea is:

	1. disklabel information is based on disklabel sector size
		unit. And disklabel sector size maybe integer multiple
		of physical sector size if position problem of
		disklabel is resolved.

	2. Block io unit for disk device should be standardized.
		(maybe DEV_BSIZE or disklabel sector size. It's policy 
		problem).

For 2, DEV_BSIZE corresponds to my "first method" and requires file
system modification. The disklabel sector size corresponds to my
"second method". This is simpler but limited(enough?).

Current cd9660 implementation assumes DEV_BSIZE case, it will require
some modification if disklabel sector size is selected. And on the
other hand(DEV_BSIZE case), disklabel initialization code of cd.c
needs modification.

If these problem is cleared, It maybe possible to make disklabel
partitions and ffs for each partition on CD-ROM.

 Chris> In this case, i'd say "avoid modifying the FS code where possible, and
 Chris> require that fs sector size == disklabel sector size."

To select this policy, problem to read disklabel should be
resolved. 

"fs sector size != disklabel sector size" condition means file system
image(partition?) from other physical sector size disk. In this
case(my "first method"), "disklabel sector size == physical sector
size" and problem reading disklabel won't happen.

Following are diffs.

For first method. this add new field in struct fs which is calclated
when mounting.
--- start here ---
Index: sys/arch/i386/i386/disksubr.c
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/arch/i386/i386/disksubr.c,v
retrieving revision 1.1.1.2
diff -c -r1.1.1.2 disksubr.c
*** disksubr.c	1997/04/01 07:21:40	1.1.1.2
--- disksubr.c	1997/04/09 20:12:09
***************
*** 100,105 ****
--- 100,109 ----
  	if (osdep && (dp = osdep->dosparts) != NULL) {
  		/* read master boot record */
  		bp->b_blkno = DOSBBSECTOR;
+ 		if (lp->d_secsize > DEV_BSIZE)
+ 			bp->b_blkno *= lp->d_secsize / DEV_BSIZE;
+ 		else
+ 			bp->b_blkno /= DEV_BSIZE / lp->d_secsize;
  		bp->b_bcount = lp->d_secsize;
  		bp->b_flags = B_BUSY | B_READ;
  		bp->b_cylin = DOSBBSECTOR / lp->d_secpercyl;
***************
*** 148,153 ****
--- 152,161 ----
  	
  	/* next, dig out disk label */
  	bp->b_blkno = dospartoff + LABELSECTOR;
+ 	if (lp->d_secsize > DEV_BSIZE)
+ 		bp->b_blkno *= lp->d_secsize / DEV_BSIZE;
+ 	else
+ 		bp->b_blkno /= DEV_BSIZE / lp->d_secsize;
  	bp->b_cylin = cyl;
  	bp->b_bcount = lp->d_secsize;
  	bp->b_flags = B_BUSY | B_READ;
***************
*** 300,305 ****
--- 308,317 ----
  	if (osdep && (dp = osdep->dosparts) != NULL) {
  		/* read master boot record */
  		bp->b_blkno = DOSBBSECTOR;
+ 		if (lp->d_secsize > DEV_BSIZE)
+ 			bp->b_blkno *= lp->d_secsize / DEV_BSIZE;
+ 		else
+ 			bp->b_blkno /= DEV_BSIZE / lp->d_secsize;
  		bp->b_bcount = lp->d_secsize;
  		bp->b_flags = B_BUSY | B_READ;
  		bp->b_cylin = DOSBBSECTOR / lp->d_secpercyl;
***************
*** 333,338 ****
--- 345,354 ----
  
  	/* next, dig out disk label */
  	bp->b_blkno = dospartoff + LABELSECTOR;
+ 	if (lp->d_secsize > DEV_BSIZE)
+ 		bp->b_blkno *= lp->d_secsize / DEV_BSIZE;
+ 	else
+ 		bp->b_blkno /= DEV_BSIZE / lp->d_secsize;
  	bp->b_cylin = cyl;
  	bp->b_bcount = lp->d_secsize;
  	bp->b_flags = B_BUSY | B_READ;
***************
*** 374,385 ****
  {
  	struct partition *p = lp->d_partitions + DISKPART(bp->b_dev);
  	int labelsector = lp->d_partitions[2].p_offset + LABELSECTOR;
! 	int sz;
  
  	sz = howmany(bp->b_bcount, lp->d_secsize);
  
! 	if (bp->b_blkno + sz > p->p_size) {
! 		sz = p->p_size - bp->b_blkno;
  		if (sz == 0) {
  			/* If exactly at end of disk, return EOF. */
  			bp->b_resid = bp->b_bcount;
--- 390,406 ----
  {
  	struct partition *p = lp->d_partitions + DISKPART(bp->b_dev);
  	int labelsector = lp->d_partitions[2].p_offset + LABELSECTOR;
! 	int sz, blkno;
! 
! 	if (lp->d_secsize > DEV_BSIZE)
! 		blkno = bp->b_blkno / (lp->d_secsize / DEV_BSIZE);
! 	else
! 		blkno = bp->b_blkno * (DEV_BSIZE / lp->d_secsize);
  
  	sz = howmany(bp->b_bcount, lp->d_secsize);
  
! 	if (blkno + sz > p->p_size) {
! 		sz = p->p_size - blkno;
  		if (sz == 0) {
  			/* If exactly at end of disk, return EOF. */
  			bp->b_resid = bp->b_bcount;
***************
*** 391,403 ****
  			goto bad;
  		}
  		/* Otherwise, truncate request. */
! 		bp->b_bcount = sz << DEV_BSHIFT;
  	}
  
  	/* Overwriting disk label? */
! 	if (bp->b_blkno + p->p_offset <= labelsector &&
  #if LABELSECTOR != 0
! 	    bp->b_blkno + p->p_offset + sz > labelsector &&
  #endif
  	    (bp->b_flags & B_READ) == 0 && !wlabel) {
  		bp->b_error = EROFS;
--- 412,424 ----
  			goto bad;
  		}
  		/* Otherwise, truncate request. */
! 		bp->b_bcount = sz * lp->d_secsize;
  	}
  
  	/* Overwriting disk label? */
! 	if (blkno + p->p_offset <= labelsector &&
  #if LABELSECTOR != 0
! 	    blkno + p->p_offset + sz > labelsector &&
  #endif
  	    (bp->b_flags & B_READ) == 0 && !wlabel) {
  		bp->b_error = EROFS;
***************
*** 405,412 ****
  	}
  
  	/* calculate cylinder for disksort to order transfers with */
! 	bp->b_cylin = (bp->b_blkno + p->p_offset) /
! 	    (lp->d_secsize / DEV_BSIZE) / lp->d_secpercyl;
  	return (1);
  
  bad:
--- 426,432 ----
  	}
  
  	/* calculate cylinder for disksort to order transfers with */
! 	bp->b_cylin = (blkno + p->p_offset) / lp->d_secpercyl;
  	return (1);
  
  bad:
Index: sys/scsi/sd.c
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/scsi/sd.c,v
retrieving revision 1.1.1.2
diff -c -r1.1.1.2 sd.c
*** sd.c	1997/04/01 07:19:17	1.1.1.2
--- sd.c	1997/04/09 18:10:56
***************
*** 583,589 ****
  		     p = &sd->sc_dk.dk_label->d_partitions[SDPART(bp->b_dev)];
  		     blkno += p->p_offset;
  		}
! 		nblks = howmany(bp->b_bcount, sd->sc_dk.dk_label->d_secsize);
  
  		/*
  		 *  Fill out the scsi command.  If the transfer will
--- 583,590 ----
  		     p = &sd->sc_dk.dk_label->d_partitions[SDPART(bp->b_dev)];
  		     blkno += p->p_offset;
  		}
! 		blkno *= sd->sc_dk.dk_label->d_secsize / sd->params.blksize;
! 		nblks = howmany(bp->b_bcount, sd->params.blksize);
  
  		/*
  		 *  Fill out the scsi command.  If the transfer will
Index: sys/ufs/ffs/ffs_vfsops.c
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.1.1.2
diff -c -r1.1.1.2 ffs_vfsops.c
*** ffs_vfsops.c	1997/04/01 07:20:32	1.1.1.2
--- ffs_vfsops.c	1997/04/09 18:54:57
***************
*** 306,312 ****
--- 306,314 ----
  	struct csum *space;
  	struct buf *bp;
  	struct fs *fs, *newfs;
+ #ifdef notyet
  	struct partinfo dpart;
+ #endif
  	int i, blks, size, error;
  	int32_t *lp;
  
***************
*** 321,330 ****
--- 323,337 ----
  	/*
  	 * Step 2: re-read superblock from disk.
  	 */
+ #ifdef notyet
  	if (VOP_IOCTL(devvp, DIOCGPART, (caddr_t)&dpart, FREAD, NOCRED, p) != 0)
  		size = DEV_BSIZE;
  	else
  		size = dpart.disklab->d_secsize;
+ #else
+ 	size = DEV_BSIZE;
+ #endif
+ 
  	error = bread(devvp, (daddr_t)(SBOFF / size), SBSIZE, NOCRED, &bp);
  	if (error)
  		return (error);
***************
*** 342,347 ****
--- 349,355 ----
  	 */
  	bcopy(&fs->fs_csp[0], &newfs->fs_csp[0], sizeof(fs->fs_csp));
  	newfs->fs_maxcluster = fs->fs_maxcluster;
+ 	newfs->fs_fsbtosb = fs->fs_fsbtosb;
  	bcopy(newfs, fs, (u_int)fs->fs_sbsize);
  	if (fs->fs_sbsize < SBSIZE)
  		bp->b_flags |= B_INVAL;
***************
*** 423,429 ****
--- 431,439 ----
  	struct buf *bp;
  	register struct fs *fs;
  	dev_t dev;
+ #ifdef notyet
  	struct partinfo dpart;
+ #endif
  	caddr_t base, space;
  	int blks;
  	int error, i, size, ronly;
***************
*** 451,460 ****
--- 461,474 ----
  	error = VOP_OPEN(devvp, ronly ? FREAD : FREAD|FWRITE, FSCRED, p);
  	if (error)
  		return (error);
+ #ifdef notyet
  	if (VOP_IOCTL(devvp, DIOCGPART, (caddr_t)&dpart, FREAD, cred, p) != 0)
  		size = DEV_BSIZE;
  	else
  		size = dpart.disklab->d_secsize;
+ #else
+ 	size = DEV_BSIZE;
+ #endif
  
  	bp = NULL;
  	ump = NULL;
***************
*** 472,477 ****
--- 486,507 ----
  		error = EROFS;		/* XXX what should be returned? */
  		goto out;
  	}
+ 	/* XXX bread assumes b_blkno in DEV_BSIZE unit. Calculate fsbtosb */
+ 
+ 	size = fs->fs_fsize / fs->fs_nspf;
+ 	fs->fs_fsbtosb = fs->fs_fsbtodb;
+ 	if (size >= DEV_BSIZE) {
+ 		size = size / DEV_BSIZE;
+ 		for (i = 0; size > 1; size >>= 1)
+ 			i ++;
+ 		fs->fs_fsbtosb += i;
+ 	} else {
+ 		size = DEV_BSIZE / size;
+ 		for (i = 0; size > 1; size >>= 1)
+ 			i ++;
+ 		fs->fs_fsbtosb -= i;
+ 	}
+ 
  	ump = malloc(sizeof *ump, M_UFSMNT, M_WAITOK);
  	bzero((caddr_t)ump, sizeof *ump);
  	ump->um_fs = malloc((u_long)fs->fs_sbsize, M_UFSMNT,
***************
*** 522,528 ****
  	ump->um_dev = dev;
  	ump->um_devvp = devvp;
  	ump->um_nindir = fs->fs_nindir;
! 	ump->um_bptrtodb = fs->fs_fsbtodb;
  	ump->um_seqinc = fs->fs_frag;
  	for (i = 0; i < MAXQUOTAS; i++)
  		ump->um_quotas[i] = NULLVP;
--- 552,558 ----
  	ump->um_dev = dev;
  	ump->um_devvp = devvp;
  	ump->um_nindir = fs->fs_nindir;
! 	ump->um_bptrtodb = fs->fs_fsbtosb;
  	ump->um_seqinc = fs->fs_frag;
  	for (i = 0; i < MAXQUOTAS; i++)
  		ump->um_quotas[i] = NULLVP;
***************
*** 920,926 ****
  	register struct buf *bp;
  	int i, error = 0;
  
! 	bp = getblk(mp->um_devvp, SBOFF >> (fs->fs_fshift - fs->fs_fsbtodb),
  	    (int)fs->fs_sbsize, 0, 0);
  	bcopy((caddr_t)fs, bp->b_data, (u_int)fs->fs_sbsize);
  	/* Restore compatibility to old file systems.		   XXX */
--- 950,956 ----
  	register struct buf *bp;
  	int i, error = 0;
  
! 	bp = getblk(mp->um_devvp, SBOFF >> (fs->fs_fshift - fs->fs_fsbtosb),
  	    (int)fs->fs_sbsize, 0, 0);
  	bcopy((caddr_t)fs, bp->b_data, (u_int)fs->fs_sbsize);
  	/* Restore compatibility to old file systems.		   XXX */
Index: sys/ufs/ffs/fs.h
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/ufs/ffs/fs.h,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 fs.h
*** fs.h	1997/03/31 08:57:31	1.1.1.1
--- fs.h	1997/04/09 18:26:54
***************
*** 228,234 ****
  	int32_t	 *fs_maxcluster;	/* max cluster in each cyl group */
  	int32_t	 fs_cpc;		/* cyl per cycle in postbl */
  	int16_t	 fs_opostbl[16][8];	/* old rotation block list head */
! 	int32_t	 fs_sparecon[49];	/* reserved for future constants */
  	time_t	 fs_fscktime;		/* last time fsck(8)ed */
  	int32_t	 fs_contigsumsize;	/* size of cluster summary array */ 
  	int32_t	 fs_maxsymlinklen;	/* max length of an internal symlink */
--- 228,235 ----
  	int32_t	 *fs_maxcluster;	/* max cluster in each cyl group */
  	int32_t	 fs_cpc;		/* cyl per cycle in postbl */
  	int16_t	 fs_opostbl[16][8];	/* old rotation block list head */
! 	int32_t  fs_fsbtosb;		/* fsbtodb and dbtofsb shift constant */
! 	int32_t	 fs_sparecon[48];	/* reserved for future constants */
  	time_t	 fs_fscktime;		/* last time fsck(8)ed */
  	int32_t	 fs_contigsumsize;	/* size of cluster summary array */ 
  	int32_t	 fs_maxsymlinklen;	/* max length of an internal symlink */
***************
*** 394,401 ****
   * Turn file system block numbers into disk block addresses.
   * This maps file system blocks to device size blocks.
   */
! #define fsbtodb(fs, b)	((b) << (fs)->fs_fsbtodb)
! #define	dbtofsb(fs, b)	((b) >> (fs)->fs_fsbtodb)
  
  /*
   * Cylinder group macros to locate things in cylinder groups.
--- 395,402 ----
   * Turn file system block numbers into disk block addresses.
   * This maps file system blocks to device size blocks.
   */
! #define fsbtodb(fs, b)	((b) << (fs)->fs_fsbtosb)
! #define	dbtofsb(fs, b)	((b) >> (fs)->fs_fsbtosb)
  
  /*
   * Cylinder group macros to locate things in cylinder groups.
--- end here -----

for second method
--- start here ---
Index: sys/arch/i386/i386/disksubr.c
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/arch/i386/i386/disksubr.c,v
retrieving revision 1.1.1.2
diff -c -r1.1.1.2 disksubr.c
*** disksubr.c	1997/04/01 07:21:40	1.1.1.2
--- disksubr.c	1997/04/09 20:06:30
***************
*** 186,195 ****
--- 186,197 ----
  			/* read a bad sector table */
  			bp->b_flags = B_BUSY | B_READ;
  			bp->b_blkno = lp->d_secperunit - lp->d_nsectors + i;
+ #if 0
  			if (lp->d_secsize > DEV_BSIZE)
  				bp->b_blkno *= lp->d_secsize / DEV_BSIZE;
  			else
  				bp->b_blkno /= DEV_BSIZE / lp->d_secsize;
+ #endif
  			bp->b_bcount = lp->d_secsize;
  			bp->b_cylin = lp->d_ncylinders - 1;
  			(*strat)(bp);
***************
*** 391,397 ****
  			goto bad;
  		}
  		/* Otherwise, truncate request. */
! 		bp->b_bcount = sz << DEV_BSHIFT;
  	}
  
  	/* Overwriting disk label? */
--- 393,399 ----
  			goto bad;
  		}
  		/* Otherwise, truncate request. */
! 		bp->b_bcount = sz * lp->d_secsize;
  	}
  
  	/* Overwriting disk label? */
***************
*** 405,412 ****
  	}
  
  	/* calculate cylinder for disksort to order transfers with */
! 	bp->b_cylin = (bp->b_blkno + p->p_offset) /
! 	    (lp->d_secsize / DEV_BSIZE) / lp->d_secpercyl;
  	return (1);
  
  bad:
--- 407,413 ----
  	}
  
  	/* calculate cylinder for disksort to order transfers with */
! 	bp->b_cylin = (bp->b_blkno + p->p_offset) / lp->d_secpercyl;
  	return (1);
  
  bad:
Index: sys/scsi/sd.c
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/scsi/sd.c,v
retrieving revision 1.1.1.2
diff -c -r1.1.1.2 sd.c
*** sd.c	1997/04/01 07:19:17	1.1.1.2
--- sd.c	1997/04/09 02:23:09
***************
*** 577,589 ****
  		 * First, translate the block to absolute and put it in terms
  		 * of the logical blocksize of the device.
  		 */
! 		blkno =
! 		    bp->b_blkno / (sd->sc_dk.dk_label->d_secsize / DEV_BSIZE);
  		if (SDPART(bp->b_dev) != RAW_PART) {
  		     p = &sd->sc_dk.dk_label->d_partitions[SDPART(bp->b_dev)];
  		     blkno += p->p_offset;
  		}
! 		nblks = howmany(bp->b_bcount, sd->sc_dk.dk_label->d_secsize);
  
  		/*
  		 *  Fill out the scsi command.  If the transfer will
--- 577,589 ----
  		 * First, translate the block to absolute and put it in terms
  		 * of the logical blocksize of the device.
  		 */
! 		blkno = bp->b_blkno;
  		if (SDPART(bp->b_dev) != RAW_PART) {
  		     p = &sd->sc_dk.dk_label->d_partitions[SDPART(bp->b_dev)];
  		     blkno += p->p_offset;
  		}
! 		blkno *= (sd->sc_dk.dk_label->d_secsize / sd->params.blksize);
! 		nblks = howmany(bp->b_bcount, sd->params.blksize);
  
  		/*
  		 *  Fill out the scsi command.  If the transfer will
***************
*** 812,819 ****
  	lp->d_flags = 0;
  
  	lp->d_partitions[RAW_PART].p_offset = 0;
! 	lp->d_partitions[RAW_PART].p_size =
! 	    lp->d_secperunit * (lp->d_secsize / DEV_BSIZE);
  	lp->d_partitions[RAW_PART].p_fstype = FS_UNUSED;
  	lp->d_npartitions = RAW_PART + 1;
  
--- 812,818 ----
  	lp->d_flags = 0;
  
  	lp->d_partitions[RAW_PART].p_offset = 0;
! 	lp->d_partitions[RAW_PART].p_size = lp->d_secperunit;
  	lp->d_partitions[RAW_PART].p_fstype = FS_UNUSED;
  	lp->d_npartitions = RAW_PART + 1;
  
Index: sys/msdosfs/msdosfs_vfsops.c
===================================================================
RCS file: /mnt2/NetBSD/cvsroot/netbsd/sys/msdosfs/msdosfs_vfsops.c,v
retrieving revision 1.1.1.2
diff -c -r1.1.1.2 msdosfs_vfsops.c
*** msdosfs_vfsops.c	1997/04/01 07:16:20	1.1.1.2
--- msdosfs_vfsops.c	1997/04/08 00:14:29
***************
*** 316,322 ****
  	 * Read the boot sector of the filesystem, and then check the
  	 * boot signature.  If not a dos boot sector then error out.
  	 */
! 	if ((error = bread(devvp, 0, 512, NOCRED, &bp)) != 0)
  		goto error_exit;
  	bp->b_flags |= B_AGE;
  	bsp = (union bootsector *)bp->b_data;
--- 316,322 ----
  	 * Read the boot sector of the filesystem, and then check the
  	 * boot signature.  If not a dos boot sector then error out.
  	 */
! 	if ((error = bread(devvp, 0, BLKDEV_IOSIZE, NOCRED, &bp)) != 0)
  		goto error_exit;
  	bp->b_flags |= B_AGE;
  	bsp = (union bootsector *)bp->b_data;
--- end here -----