Subject: does fsync(2) still ignore some filesystem metadata in 1.6.x?
To: NetBSD Kernel Technical Discussion List <tech-kern@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 11/26/2004 17:38:03

The following description of changes to src/sys/ufs/ffs/ffs_vnops.c
suggests that in 1.6.x an fsync(2) call will _not_ necessarily cause the
inode data to be flushed:

----------------------------
revision 1.61
date: 2003/10/25 19:52:21;  author: kleink;  state: Exp;  lines: +5 -8
Remove the present incarnation of FSYNC_DATAONLY use from ffs_fsync() and
ffs_full_fsync(); while it is supposed to hint that the update of _file_
metadata (as in timestamps et al.) may be omitted it doesn't mean the
same for _filesystem_ metadata.
----------------------------


If so, could this explain why installboot(8) sometimes(?) fails silently
during an install (at least on i386)?

If so, would it be safe and wise to pull that change up to the
netbsd-1-6 branch?  Is it the only change needed for a complete fix?
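
For context, here is a minimal sketch in C of the write-then-fsync step
that a tool like installboot would depend on.  It assumes, and this is
only an assumption, that the tool writes /boot through the filesystem
and then reads that file's inode back from the raw device.  The helper
name write_and_sync() is made up for illustration and is not taken from
the installboot source.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from installboot): write a buffer to a file
 * and ask the kernel to push it to stable storage with fsync(2).
 * Returns 0 on success, -1 on any failure.
 */
static int
write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd == -1)
		return -1;

	/* write(2) may be short, so loop until everything is written. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n <= 0) {
			(void)close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/*
	 * The caller relies on this flushing both the file data and the
	 * inode (size and block pointers), not only the data blocks.
	 */
	if (fsync(fd) == -1) {
		(void)close(fd);
		return -1;
	}
	return close(fd);
}
```

If fsync(2) returns before the inode's block pointers reach the disk, a
later read of that inode through the raw device could see a file with no
blocks, which would be consistent with the "Will load 0 blocks." line in
the transcript below.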

During the last couple of i386 installs I've done from my 1.6.x tree
using sysinst, I've had to re-run installboot manually, as otherwise the
systems fail to boot at all.  Finally, almost by accident, I've figured
out why it fails and how to reproduce it:

# mount -u -o async /
# mount
/dev/sd0a on / type ffs (asynchronous, local)
# /usr/mdec/installboot -v /usr/mdec/biosboot_com0.sym /dev/rsd0a
/usr/mdec/biosboot_com0.sym: entry point 0x8063000
proto bootblock size 48640
room for 10 filesystem blocks at 0x580
renamed //boot -> //boot.bak
Will load 0 blocks.
BSD partition starts at sector 63
deleting //boot.bak
#

The resulting PBR is now totally broken: it loads no blocks and then
jumps to what is likely random memory, usually causing an immediate reboot.

However, if the async option is turned off, it works as it should:

# mount -u -o noasync /
# /usr/mdec/installboot -v /usr/mdec/biosboot_com0.sym /dev/rsd0a
/usr/mdec/biosboot_com0.sym: entry point 0x8063000
proto bootblock size 48640
room for 10 filesystem blocks at 0x580
renamed //boot -> //boot.bak
Will load 80 blocks.
dblk: 14752, num: 16
dblk: 14768, num: 16
dblk: 14784, num: 16
dblk: 14800, num: 16
dblk: 14816, num: 16
BSD partition starts at sector 63
deleting //boot.bak
# 

(I'm going to add an error check to installboot so that it bails if it
finds the block count in the /boot inode to be zero, and in the meantime
I'm going to take the use of '-o async' out of my version of sysinst!)
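
The check described above might look roughly like the following sketch.
The function name check_boot_blocks() and its parameters are
hypothetical; the real installboot code gathers the block list from the
on-disk inode in its own way, and this only shows the "bail if zero"
idea.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sanity check (names are illustrative, not from the
 * installboot source): refuse to continue when the on-disk inode of
 * the second-stage boot file maps no blocks at all.
 * Returns 0 if the count looks usable, -1 if the caller should bail.
 */
static int
check_boot_blocks(const char *path, uint32_t nblocks)
{
	if (nblocks == 0) {
		fprintf(stderr,
		    "%s: on-disk inode maps no blocks, "
		    "not writing boot block\n", path);
		return -1;
	}
	return 0;
}
```

With a check like this the async case above would stop with an error
instead of quietly writing a PBR that loads nothing.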

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>