Subject: kern/20191: RAIDframe RAID5 + softdep can deplete kernel memory
To: None <gnats-bugs@gnats.netbsd.org>
From: Paul Ripke <stix@stix.homeunix.net>
List: netbsd-bugs
Date: 02/04/2003 14:46:11
>Number:         20191
>Category:       kern
>Synopsis:       RAIDframe RAID5 + softdep can deplete kernel memory
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Feb 03 19:47:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Paul Ripke
>Release:        NetBSD 1.6M (-current 2003-01-29)
>Organization:
>Environment:
System: NetBSD stix-pc.stix.org.au 1.6M NetBSD 1.6M (STIX-PC) #37: Fri Jan 31 14:28:33 EST 2003 stix@stix-amd.stix.org.au:/usr/src/sys/arch/i386/compile/STIX-PC i386
Architecture: i386
Machine: i386
RAM: 128MB
>Description:
When testing RAIDframe and newfs settings across a three-disk RAID
level 5 set, using three 1GB vnd devices on three distinct physical disks,
I have repeatedly hung the kernel. "show uvmexp" in ddb and gdb on cores
show uvmexp.free less than 5, often only 1.

The hang also occurs reliably when testing a degraded 3-disk RAID5 set,
set up on two 30GB wd devices.

I have yet to see the hang on a mirror or stripe.

From a dump taken from an earlier kernel:
(gdb) print uvmexp
$1 = {pagesize = 4096, pagemask = 4095, pageshift = 12, npages = 31381, free = 4, active = 16949, inactive = 8493,
  paging = 180, wired = 686, ncolors = 32, colormask = 31, zeropages = 0, reserve_pagedaemon = 1, reserve_kernel = 5,
  anonpages = 12784, filepages = 13198, execpages = 2683, freemin = 64, freetarg = 85, inactarg = 8480,
  wiredmax = 10460, anonmin = 25, execmin = 12, filemin = 25, anonminpct = 10, execminpct = 5, fileminpct = 10,
  anonmax = 204, execmax = 76, filemax = 128, anonmaxpct = 80, execmaxpct = 30, filemaxpct = 50, nswapdev = 1,
  swpages = 262079, swpginuse = 11241, swpgonly = 9386, nswget = 2294, nanon = 291415, nanonneeded = 291415,
  nfreeanon = 272082, faults = 1641588, traps = 2781446, intrs = 16918234, swtch = 7362904, softs = 14611102,
  syscalls = 9382809, pageins = 2182, swapins = 373, swapouts = 439, pgswapin = 0, pgswapout = 11526, forks = 6379,
  forks_ppwait = 1204, forks_sharevm = 1254, pga_zerohit = 0, pga_zeromiss = 478622, zeroaborts = 0,
  colorhit = 9403421, colormiss = 66632, fltnoram = 3, fltnoanon = 0, fltpgwait = 6, fltpgrele = 0, fltrelck = 6860,
  fltrelckok = 6860, fltanget = 998920, fltanretry = 2188, fltamcopy = 72135, fltnamap = 129168, fltnomap = 738405,
  fltlget = 191039, fltget = 4670, flt_anon = 849972, flt_acow = 57721, flt_obj = 167057, flt_prcopy = 23982,
  flt_przero = 442479, pdwoke = 2668, pdrevs = 2347, pdswout = 2347, pdfreed = 154, pdscans = 1744940,
  pdanscan = 11501, pdobscan = 588268, pdreact = 310694, pdbusy = 5380, pdpageouts = 748, pdpending = 748,
  pddeact = 1897808, pdreanon = 646097, pdrefile = 2340, pdreexec = 180660}
(gdb) xps
              proc   pid     flag st              wchan comm
        0xcb6997c0  6377     4006  4         0xc04023d0 pax (uvn_fp1)
        0xcb893244  6349    20204  3         0xc07460d4 raidio0 (raidiow)
        0xcb7fc998  6348    20204  3         0xc04023d0 raid0 (km_getwait2)
        0xcb5a53e4   474     4006  4         0xc04023d0 top (flt_noram5)
        0xcb516b5c   451      105  4         0xc04023d0 screen-3.9.11 (flt_noram1)
        0xcb51601c   288        4  4         0xc04023d0 cron (flt_noram3)
...

I have seen the pagedaemon in "emergva" on many occasions.
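
For reference, the figures in the uvmexp dump above can be checked against
the reserve thresholds with a small shell sketch. The values are copied from
the dump; the variable names and the "status" labels are only illustrative.

```shell
#!/bin/sh
# Values copied from the "print uvmexp" output in this report.
pagesize=4096
npages=31381
free=4
reserve_kernel=5
freemin=64

# Convert page counts to kilobytes.
free_kb=$(( free * pagesize / 1024 ))
total_kb=$(( npages * pagesize / 1024 ))
echo "free: ${free} pages (${free_kb} KB) of ${npages} pages (${total_kb} KB)"

# Compare the free page count with the two thresholds from the dump.
# The status strings are illustrative labels, not kernel terminology.
if [ "$free" -lt "$reserve_kernel" ]; then
	status="below reserve_kernel"
elif [ "$free" -lt "$freemin" ]; then
	status="below freemin"
else
	status="ok"
fi
echo "status: $status"
```

With the dump's numbers, free (4 pages) is below reserve_kernel (5 pages),
and far below freemin (64 pages).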

>How-To-Repeat:
RAID5 sets tested ranged from 8 to 64 sect/SU, with fifo from 1 to 100.
newfs usually "-i 32k -f 4k -b 32k", but other parameters have been tried.
mount must be "-o softdep" for the problem to occur.
With a standard kernel, the system hangs partway through unpacking
gcc-2.95.2.tar.gz.
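
A minimal sketch of the filesystem setup described above, assuming the RAID
set is already configured. The device names (/dev/rraid0a, /dev/raid0a) are
assumptions for illustration and are not taken from the test system; the
newfs and mount options are the ones listed above, and /mnt is the mount
point used by the test script. The sketch only prints the commands rather
than running them.

```shell
#!/bin/sh
# Hypothetical device names; substitute the partition on your RAID set.
rawdev=/dev/rraid0a
blkdev=/dev/raid0a
mnt=/mnt

# newfs parameters as usually used in these tests.
newfs_cmd="newfs -i 32k -f 4k -b 32k $rawdev"
# softdep is required for the problem to occur.
mount_cmd="mount -o softdep $blkdev $mnt"

# Dry run: print the commands for review instead of executing them.
echo "$newfs_cmd"
echo "$mount_cmd"
```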

With the following kernel parameters changed, the script below hung
on the 60th extract of the gcc tarball.

uvmexp.reserve_kernel = 128
uvmexp.freemin = 192
uvmexp.freetarg = 256

#!/bin/ksh
cd /mnt
i=0
while [ $i -le 100 ]; do
	echo $i
	pax -rvzf /tmp/gcc-2.95.2.tar.gz 2>&1 | tail -1
	mv gcc-2.95.2 z$i
	let i++
done

>Fix:
Workaround: mount the filesystem without softdep.
>Release-Note:
>Audit-Trail:
>Unformatted: