current-users: Re: 'ffs_alloccg: map corrupted' with UFS2 kernel

Subject: Re: 'ffs_alloccg: map corrupted' with UFS2 kernel
To: None <current-users@netbsd.org>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: current-users
Date: 04/12/2003 12:30:11

On Sat, Apr 12, 2003 at 07:08:26PM +0900, enami tsugutomo wrote:
> > From: David Brownlee <abs@netbsd.org>
> > bsize   32768   shift   15      mask    0xffff8000
> > fsize   4096    shift   12      mask    0xfffff000
> 
> I could also reproduce the same panic using the same block/frag size,
> and the following patch fixes it for me.  We can't leave valid data in
> the cache if the block we tried isn't a real superblock.  If we leave
> it, and some other data (in this case a cg) starts at the exact same
> offset with a larger size, we'll see junk at the end of the chunk.

Heh!  I had a bug like this a long time ago when I was trying to
see if it was feasible to replace the statically-allocated buffers with
malloc()ed ones.  It can be hard to remember that one of the invariants
of the vfs_bio cache is that you must treat the device as if it were
made up of "blocks" of the filesystem block size that can only be
accessed by I/O of whole "blocks" -- else you will get double-caching of
data and corruption if an earlier cached entry is written back; this
seems like a different failure mode from the same error.

For example, if you're accessing block 4 with a transfer size of 8k,
you must not access block 2 with a transfer size of 8k; the resulting
cache entries will overlap!  I thought there were a few suspicious
areas in the existing code, but Charles and a few others confirmed
to me at the time that what I just stated was, at least, how it was
all *supposed* to work (I was about to embark on some grandiose extent
scheme to ensure that overlapping transfers DTRT, and so forth...)
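
To make the arithmetic in that example concrete, here is a rough sketch
(not code from the tree).  It assumes block numbers are counted in
512-byte units, which the example above does not actually say; the
names struct xfer, xfer_overlaps() and SECTOR_SIZE are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: block numbers are in 512-byte units (hypothetical constant). */
#define SECTOR_SIZE 512

/* Hypothetical description of one cached transfer: start block and length. */
struct xfer {
	int64_t blkno;   /* starting block number, in SECTOR_SIZE units */
	int64_t bcount;  /* transfer size in bytes */
};

/*
 * Return true if the byte ranges covered by the two transfers intersect.
 * Each transfer covers the half-open range [start, start + bcount).
 */
static bool
xfer_overlaps(struct xfer a, struct xfer b)
{
	assert(a.bcount > 0 && b.bcount > 0);

	int64_t a_start = a.blkno * SECTOR_SIZE;
	int64_t a_end = a_start + a.bcount;
	int64_t b_start = b.blkno * SECTOR_SIZE;
	int64_t b_end = b_start + b.bcount;

	return a_start < b_end && b_start < a_end;
}
```

Under that assumption, block 4 with 8k covers bytes 2048 through 10239
and block 2 with 8k covers bytes 1024 through 9215, so xfer_overlaps()
returns true for the pair.  Two 8k transfers at blocks 0 and 16 sit
back to back and do not overlap.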

Thor