NetBSD-Bugs archive


Re: bin/50108 (fsck_ffs fails replaying wapbl journal on filesystem with 4k sectors)



The following reply was made to PR bin/50108; it has been noted by GNATS.

From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/50108 (fsck_ffs fails replaying wapbl journal on filesystem with 4k sectors)
Date: Mon, 17 Aug 2015 23:11:51 +0000 (UTC)

 dudinea%gmail.com@localhost (Eugene) writes:
 
 >Why? Michael's explanation and fix look right: I've tested his proposed
 >solution, and it works right for both 4k and regular 512 byte sectors. 
 
 It has already been committed to -current and netbsd-7.
 
 >BTW, on the other hand, the idea of the fix - passing same function
 >argument with different units depending on user/kernel mode -  does not
 >look elegant to me. 
 
 All disk I/O code originally addressed a block by physical block
 number and DEV_BSIZE was the (one and only) physical block size.
 
 At some point, about when the SCSI subsystem was integrated into
 the kernel, the model was changed: the kernel now uses fixed
 DEV_BSIZE=512 coordinates and the driver translates them into
 physical blocks.
 Userland, however, still maintains the old view, and there was a
 lot of code in the kernel that treated DEV_BSIZE as the physical
 block size.
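 
 As a rough illustration of that driver-side translation (a sketch
 only, not taken from any NetBSD driver; the helper names are
 hypothetical and the sector size is assumed to be a multiple of
 DEV_BSIZE):

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSHIFT 9                    /* log2(DEV_BSIZE) */
#define DEV_BSIZE  (1U << DEV_BSHIFT)   /* fixed 512-byte kernel unit */

/*
 * Hypothetical helper (not a NetBSD function): map a kernel block
 * number in DEV_BSIZE units to a physical sector number on a device
 * with secsize-byte sectors.
 */
static inline int64_t
devblk_to_physblk(int64_t blkno, uint32_t secsize)
{
	/* Sketch assumes the sector size is a multiple of DEV_BSIZE. */
	assert(secsize >= DEV_BSIZE && secsize % DEV_BSIZE == 0);
	return blkno / (int64_t)(secsize / DEV_BSIZE);
}

/* Inverse: physical sector number back to DEV_BSIZE units. */
static inline int64_t
physblk_to_devblk(int64_t pblkno, uint32_t secsize)
{
	assert(secsize >= DEV_BSIZE && secsize % DEV_BSIZE == 0);
	return pblkno * (int64_t)(secsize / DEV_BSIZE);
}
```

 With 4096-byte sectors, DEV_BSIZE block 16 maps to physical sector 2;
 with 512-byte sectors the two numbers are identical.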
 
 This wasn't a problem as virtually all disks used 512-byte blocks
 at that time. Both views were effectively the same.
 
 When disks with other block sizes became popular, it was necessary
 to decide how to handle this in kernel and userland.
 
 Userland, using physical blocks for addressing, had no problems
 except for determining the block size. Low-level tools get it by
 querying the device driver; filesystem tools get it from the
 filesystem parameters (== superblock).
 
 The kernel mainly needed to fix all places where filesystem-to-disk
 block translations still didn't use DEV_BSIZE.
 
 One problem is code shared between kernel and userland. But
 since all this code only dealt with the translation between filesystem
 coordinates and disk coordinates, this is handled in exactly one
 place: the fsbtodb/dbtofsb macros, which are now defined differently
 for kernel and userland.
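 
 To make the two definitions concrete, here is a minimal sketch
 written as two functions side by side rather than as the real
 macros. The struct is a hypothetical stand-in for the superblock,
 the function names are made up, and the exact shift expressions are
 an assumption about how such a translation would look:

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSHIFT 9   /* log2(DEV_BSIZE), DEV_BSIZE = 512 */

/* Hypothetical stand-in for the two superblock fields used here. */
struct sketch_fs {
	int fs_fshift;   /* log2(fragment size in bytes) */
	int fs_fsbtodb;  /* log2(fragment size / physical sector size) */
};

/* Kernel view: fragment number -> disk address in DEV_BSIZE units. */
static inline int64_t
kernel_fsbtodb(const struct sketch_fs *fs, int64_t b)
{
	assert(fs->fs_fshift >= DEV_BSHIFT);
	return b << (fs->fs_fshift - DEV_BSHIFT);
}

/* Userland view: fragment number -> physical sector number. */
static inline int64_t
user_fsbtodb(const struct sketch_fs *fs, int64_t b)
{
	assert(fs->fs_fsbtodb >= 0);
	return b << fs->fs_fsbtodb;
}
```

 For 4096-byte fragments on a 4096-byte-sector disk (fs_fshift 12,
 fs_fsbtodb 0), fragment 10 is block 80 in the kernel view and
 sector 10 in the userland view. On a 512-byte-sector disk both
 functions return the same number.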
 
 WAPBL is an exception. It's the only filesystem-related code that
 uses disk coordinates rather than filesystem coordinates (in the
 kernel that's DEV_BSIZE units), and these are then passed in the
 journal to userland code.
 And since WAPBL is again code shared between kernel and userland, the
 different views need to be reflected by that code.
 
 
 >Wouldn't it be better to modify fsck_ffs code (that is the
 >wapbl_read()/wapbl_write() functions in sbin/fsck_ffs/wapbl.c -
 >kern/vfs_wapbl.c calls them back for IO when working in user mode) 
 >to use the same DEV_BSIZE unit as kernel when replaying the journal? 
 
 The functions are used for both types of coordinates. When reading/writing
 the journal it's filesystem coordinates. When transferring data between
 filesystem and journal it's disk coordinates (as written by the kernel).
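 
 A sketch of why the unit matters to a userland I/O routine that has
 to turn an address into a byte offset. This is not the fsck_ffs
 code; the enum and function are hypothetical, and which unit each
 caller actually passes is left to the caller here:

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BSIZE 512U   /* fixed kernel disk unit */

/* Hypothetical tag for the unit an address is expressed in. */
enum coord_kind {
	COORD_PHYS_SECTOR,   /* address counts physical sectors */
	COORD_DEV_BSIZE      /* address counts DEV_BSIZE units */
};

/*
 * Hypothetical helper: byte offset on the device for an address in
 * the given unit, on a disk with secsize-byte sectors.
 */
static inline int64_t
coord_to_byte_offset(int64_t addr, enum coord_kind kind, uint32_t secsize)
{
	assert(secsize >= DEV_BSIZE && secsize % DEV_BSIZE == 0);
	return addr * (int64_t)(kind == COORD_DEV_BSIZE ? DEV_BSIZE : secsize);
}
```

 On a 512-byte-sector disk both kinds give the same offset, which is
 why a mix-up goes unnoticed there. With 4096-byte sectors, address 16
 in DEV_BSIZE units and address 2 in physical sectors both mean byte
 offset 8192, but reading 16 as a sector number would land at 65536.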
 
 The only way to get out of this split is to have the same view on
 both sides again. Either revert the kernel to use physical block
 numbers again (i.e. change all drivers and most filesystems)
 or change userland to use DEV_BSIZE coordinates (and become
 incompatible with other BSDs and third-party software).
 
 On the other hand, it doesn't really add complexity. Unit translations
 are done all over the place and you just have to keep in mind that
 kernel and userland have different views.
 
 
 -- 
                                 Michael van Elst
 Internet: mlelstv%serpens.de@localhost
                                 "A potential Snark may lurk in every tree."
 

