Subject: kern/11989: raidframe disk geometry pessimizes ffs layout
To: None <gnats-bugs@gnats.netbsd.org>
From: None <sommerfeld@hamachi.org>
List: netbsd-bugs
Date: 01/18/2001 06:03:18
>Number:         11989
>Category:       kern
>Synopsis:       raidframe disk geometry pessimizes ffs layout
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jan 18 06:03:00 PST 2001
>Closed-Date:
>Last-Modified:
>Originator:     Bill Sommerfeld
>Release:        20010116
>Organization:
	hah
>Environment:
	
System: NetBSD hydra.hamachi.org 1.5Q NetBSD 1.5Q (HYDRA) #3: Tue Jan 9 22:33:17 EST 2001 sommerfeld@snoop:/usr/smpsys/arch/i386/compile/HYDRA i386
Architecture: i386
Machine: i386
>Description:

sys/dev/raidframe/rf_netbsdkintf.c constructs a fabricated label for
the raid device with geometry as follows:

	/* fabricate a label... */
	lp->d_nsectors = raidPtr->Layout.dataSectorsPerStripe;
	lp->d_ntracks = 1;
	lp->d_ncylinders = raidPtr->totalSectors / 
		(lp->d_nsectors * lp->d_ntracks);

This yields "too many cylinders, too few tracks", which tricks newfs
into creating a very large number of very small cylinder groups.  That
causes severe problems under certain workloads (see kern/11983).  In
short, when a significant number of files are about the size of a
cylinder group or larger, cylinder groups tend to fill up, and
allocations which prefer those cylinder groups spend a lot of time
hunting for free blocks.

While the fix to ffs_dirpref() for kern/11983 will alleviate some of
the pain, overall system performance will still suffer because
individual cylinder groups will be more likely to be completely full.
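
To make the scale concrete, here is a minimal standalone sketch of the
same arithmetic outside the kernel.  The struct and function names are
made up for illustration, 512-byte sectors are assumed, and the total
used in the note after the code is back-computed from a label quoted
in the Fix section rather than measured.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three disklabel fields involved. */
struct cur_geom {
	uint32_t nsectors;
	uint32_t ntracks;
	uint32_t ncylinders;
};

/* Mirror the fabricated-label arithmetic: one stripe per cylinder. */
struct cur_geom
cur_label_geom(uint64_t total_sectors, uint32_t data_sectors_per_stripe)
{
	struct cur_geom g;

	assert(data_sectors_per_stripe > 0);
	g.nsectors = data_sectors_per_stripe;
	g.ntracks = 1;
	g.ncylinders = (uint32_t)(total_sectors /
	    ((uint64_t)g.nsectors * g.ntracks));
	return g;
}

/* Size of one cylinder in bytes, assuming 512-byte sectors. */
uint64_t
cur_cylinder_bytes(const struct cur_geom *g)
{
	return (uint64_t)g->nsectors * g->ntracks * 512;
}
```

With a 64-sector stripe and 12563584 total sectors (64 * 196306), this
gives 64/1/196306, i.e. cylinders of only 32 KB each.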

>How-To-Repeat:
	see kern/11983
	
>Fix:

Use a larger value for ntracks, and a correspondingly smaller value for
ncylinders.

The first thing which comes to mind is:

	raid.d_nsectors = component.d_nsectors
	raid.d_ntracks = component.d_ntracks * N
	raid.d_ncylinders = component.d_ncylinders

(N here is the "number of disks' worth of real data", which appears to
be "Layout.numDataCol")

however, because of how ffs does its layout, a cylinder really should
be an integral number of stripes long.
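
A rough sketch of why the component-based geometry falls short,
written as standalone C rather than kernel code.  The struct and
function names are hypothetical, and N = 3 for a 4-disk raid 5 is an
assumption (one disk's worth of parity).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for sectors/tracks/cylinders. */
struct comp_geom {
	uint32_t nsectors;
	uint32_t ntracks;
	uint32_t ncylinders;
};

/* Component geometry with ntracks scaled by the number of data columns. */
struct comp_geom
scaled_geom(struct comp_geom component, uint32_t num_data_col)
{
	struct comp_geom raid;

	assert(num_data_col > 0);
	raid.nsectors = component.nsectors;
	raid.ntracks = component.ntracks * num_data_col;
	raid.ncylinders = component.ncylinders;
	return raid;
}

/*
 * Sectors left over when one cylinder is cut into whole stripes;
 * 0 means the cylinder is an integral number of stripes long.
 */
uint32_t
cyl_stripe_remainder(struct comp_geom g, uint32_t sectors_per_stripe)
{
	assert(sectors_per_stripe > 0);
	return (g.nsectors * g.ntracks) % sectors_per_stripe;
}
```

For a 168/20/5273 component with N = 3 this gives 168/60/5273, so a
cylinder is 10080 sectors; against a 128-sector stripe that leaves a
remainder of 96, so cylinders would not start on stripe boundaries.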

perhaps:
	estcyl = component.d_ncylinders
	estcylsize = totalsize / estcyl

	stripepercyl = (estcylsize+sectorsperstripe-1)/sectorsperstripe

	nsectors = sectorsperstripe
	ntracks = stripepercyl
	cylinders = totalsectors / (nsectors * ntracks)
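
The same calculation as a standalone C sketch, in case it helps to
check the numbers.  The struct and function names are invented for
illustration and are not raidframe identifiers; the clamp to at least
one stripe per cylinder is an extra guard not in the pseudocode above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the three geometry fields. */
struct raid_geom {
	uint32_t nsectors;
	uint32_t ntracks;
	uint32_t ncylinders;
};

/*
 * Pick a geometry whose cylinder is a whole number of stripes and
 * whose cylinder count is close to that of one component disk.
 */
struct raid_geom
raid_geom_estimate(uint64_t total_sectors, uint32_t sectors_per_stripe,
    uint32_t component_ncylinders)
{
	struct raid_geom g;
	uint64_t estcylsize, stripepercyl;

	assert(sectors_per_stripe > 0 && component_ncylinders > 0);

	/* Cylinder size we would get with the component's cylinder count. */
	estcylsize = total_sectors / component_ncylinders;

	/* Round up to a whole number of stripes. */
	stripepercyl = (estcylsize + sectors_per_stripe - 1) /
	    sectors_per_stripe;
	if (stripepercyl == 0)
		stripepercyl = 1;	/* guard for tiny arrays (added here) */

	g.nsectors = sectors_per_stripe;
	g.ntracks = (uint32_t)stripepercyl;
	g.ncylinders = (uint32_t)(total_sectors /
	    ((uint64_t)g.nsectors * g.ntracks));
	return g;
}
```

Fed 53320572 total sectors, 128 sectors per stripe and 5273 component
cylinders, this returns 128/79/5272.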

sample values from two raids I have set up:

	4 disks of 168/20/5273 in raid 5, stripe unit of 128

	totalsize = 53320572
	estcyl = 5273
	estcylsize = 10111
	stripepercyl = 79
	nsectors = 128
	ntracks = 79
	ncylinders = 5272

(vs 128/1/138855)

	3 disks of 184/5/6810 in raid 5, stripe unit of 32

	estcyl = 6810
	estcylsize = 922
	stripepercyl = 29
	nsectors = 64
	ntracks = 29
	ncylinders = 6769

(vs 64/1/196306)
	
>Release-Note:
>Audit-Trail:
>Unformatted: