Subject: Re: kern/28621: 1.6.x "vp != NULL" crash in ffs_softdep.c:4653 while unmounting a softdep (+quota) filesystem
To: None <netbsd-bugs@netbsd.org>
From: David Young <dyoung@pobox.com>
List: netbsd-bugs
Date: 01/07/2005 22:35:19
I (dyoung@netbsd.org) see this, too, on a Soekris board.  The Soekris
panic is below; it happens when the script further down runs umount(8).
This is a very -current kernel, 2.99.11, and I see the panic often.  I can
provide console access to a Soekris board where it occurs.  It would be a
real pity to have to give up softdep, because it shortens the script's
run time by several minutes. -dcy

  Stopped in pid 26781.1 (umount) at      netbsd:cpu_Debugger+0x4:        popl    %ebp
  db> trace/u
  cpu_Debugger(0,c385e934,2,c3978d0c,c0271fb9) at netbsd:cpu_Debugger+0x4
  panic(c02c50a0,c029f1fc,c02a206d,c02b3340,1408) at netbsd:panic+0xa9
  __assert(c029f1fc,c02b3340,1408,c02a206d,0) at netbsd:__assert+0x19
  flush_inodedep_deps(c0546800,80,0,0,0) at netbsd:flush_inodedep_deps+0x3b
  softdep_sync_metadata(c3978e1c,0,0,4,c385e934) at netbsd:softdep_sync_metadata+0x6b
  ffs_full_fsync(c3978e1c,c3a5a2b8,c05f7000,c3a5c04c,0) at netbsd:ffs_full_fsync+0x20e
  ffs_fsync(c3978e1c,c02800a0,c385e934,c2e3f0fc,1) at netbsd:ffs_fsync+0x48
  VOP_FSYNC(c385e934,c2e3f0fc,1,0,0) at netbsd:VOP_FSYNC+0x4c
  softdep_flushworklist(c05f7000,c3978ea4,c3906008,0,0) at netbsd:softdep_flushworklist+0x87
  ffs_sync(c05f7000,1,c2e3f0fc,c3906008,0) at netbsd:ffs_sync+0x10e
  dounmount(c05f7000,0,c3906008,c3906008,bfbfe930) at netbsd:dounmount+0xc3
  sys_unmount(c36efe70,c3978f70,c3978f68,c02c9ce8,0) at netbsd:sys_unmount+0xf2
  syscall_plain() at netbsd:syscall_plain+0xc2
  --- syscall (number 22) ---
  ?(4807053c,bfbfedbc,0,8068000,8065d68) at 0x4807b78b
  Bad user frame pointer: 0x48078200
  db>

*************
*************

This is a representative mount(8) output from one of the Soekris boxes:

# mount 
/dev/wd0a on / type ffs (read-only, local)
mfs:10 on /dev type mfs (synchronous, local)
/etc on /permanent/etc type null (local)
mfs:1390 on /etc type mfs (synchronous, noatime, local)
/home on /permanent/home type null (local)
mfs:1398 on /home type mfs (synchronous, noatime, local)
/tmp on /permanent/tmp type null (local)
mfs:1420 on /tmp type mfs (synchronous, noexec, nosuid, nodev, noatime, local)
/var on /permanent/var type null (local)
mfs:1429 on /var type mfs (synchronous, noatime, local)

*************
*************

Here is the script:

#!/bin/sh
# $Id: upgrade 2288 2004-12-23 07:16:30Z dyoung $

[ "$(whoami)" = "root" ] || { 
	echo "This script is intended to be run as root on a CUW node." 1>&2; 
	exit 1; 
}
[ $# -ge 1 ] || { echo "Usage: $0 user@host:/path/to/tar" 1>&2; exit 1; }

gripe () {
	echo "$*" 1>&2
}

bomb () {
	gripe "$*"
	cd
	if mount | grep -q "on /mnt " ; then
		umount /mnt
	fi
	exit 1
}

attempt () {
	eval $* || bomb "Upgrade failed on $* [$?]"
}

set -u

extract_scp_format="\([^@]*\)@\([^:]*\):\(.*\)"
user=$(echo $1 | sed -n -e "s/$extract_scp_format/\1/p")
host=$(echo $1 | sed -n -e "s/$extract_scp_format/\2/p")
tar=$(echo $1 | sed -n -e "s/$extract_scp_format/\3/p")

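# The suffix of the root device (a or e) tells us which partition is live.
current=$(mount | sed -n -e "s/^[^ ]*\([ae]\) on \/ .*/\1/p")
if [ "$current" = 'a' ] ; then
	setactive=1
	dev="/dev/wd0e"
	rdev="/dev/rwd0e"
elif [ "$current" = 'e' ] ; then
	setactive=0
	dev="/dev/wd0a"
	rdev="/dev/rwd0a"
else
	bomb "Cannot tell whether / is on wd0a or wd0e."
fi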
ddev="/dev/rwd0d"

echo "Preparing for upgrade on $dev."

attempt newfs $rdev
attempt mount -o noatime $dev /mnt
attempt cd /mnt

echo "Installing the upgrade."
attempt "ssh $user@$host cat $tar | pax -pe -r -z"

echo "Updating etc/fstab"
fstab=$(mktemp /var/tmp/$(basename $0).fstab.XXXXXX)
sed -e "s|/dev/wd0[ae] / \(.*\)|$dev / \1|" /mnt/etc/fstab > $fstab
attempt install -o root -g wheel -m 0644 $fstab /mnt/etc/fstab
rm $fstab

echo "Updating bootstrap"
attempt fdisk -f -a -$setactive $ddev
attempt mbrlabel -frw $ddev
attempt installboot -o console=com0kbd,speed=19200 $rdev /mnt/usr/mdec/bootxx_ffsv1

if [ -f /usr/share/cuw_config.subr ] ; then
	. /usr/share/cuw_config.subr
	if [ -f $cuw_conf_file ] ; then
		echo "Copying existing $cuw_conf_file."
		cp $cuw_conf_file /mnt/$cuw_conf_file
	else
		echo "No existing configuration found."
	fi
fi

cd -
umount /mnt

echo "Upgrade complete ($dev)."

# $Id: upgrade 2288 2004-12-23 07:16:30Z dyoung $
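
For anyone trying to reproduce this, here is a minimal sketch of how the
script derives its parameters before it touches the disk.  It reuses the
two sed expressions from the script unchanged.  The scp-style argument
(upgrade@build.example.org:/dist/cuw.tgz) is a made-up example, and the
mount line is copied from the mount(8) output above rather than read from
a live system.

```shell
#!/bin/sh
# Hypothetical argument, for illustration only; not from the report.
arg='upgrade@build.example.org:/dist/cuw.tgz'

# Same pattern as the script: user@host:path, split into three groups.
extract_scp_format='\([^@]*\)@\([^:]*\):\(.*\)'
user=$(echo "$arg" | sed -n -e "s/$extract_scp_format/\1/p")
host=$(echo "$arg" | sed -n -e "s/$extract_scp_format/\2/p")
tar=$(echo "$arg" | sed -n -e "s/$extract_scp_format/\3/p")

# Root line taken from the mount(8) output in this message.
mount_line='/dev/wd0a on / type ffs (read-only, local)'

# Same expression as the script: keep only the partition letter of /.
current=$(echo "$mount_line" | sed -n -e "s/^[^ ]*\([ae]\) on \/ .*/\1/p")

# The upgrade goes to whichever of wd0a/wd0e is not the running root.
case "$current" in
a) dev=/dev/wd0e ;;
e) dev=/dev/wd0a ;;
*) dev= ;;
esac

echo "user=$user host=$host tar=$tar current=$current dev=$dev"
```

With the root on wd0a, as in the mount output above, the upgrade target
comes out as /dev/wd0e, which is the filesystem that is later unmounted.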

-- 
David Young             OJC Technologies
dyoung@ojctech.com      Urbana, IL * (217) 278-3933