Subject: Re: LFS writes and network receive (too much splhigh?)
To: Juan RP <juan@xtrarom.org>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: tech-kern
Date: 10/22/2006 16:29:48
On Sun, Oct 22, 2006 at 10:20:13PM +0200, Juan RP wrote:
> On Sun, 22 Oct 2006 15:07:02 -0400
> Thor Lancelot Simon <tls@rek.tjls.com> wrote:
> 
> > If I restore a backup full of large files (such that the smooth
> > syncer schedules writes when _a file's_ oldest data is 30 seconds
> > old) over the network onto LFS, the following happens:
> > 
> > 1) Writes queue up for 30 seconds (this is a design flaw in the
> > smooth syncer)
> 
> Design flaw that in 4.0 can be changed via sysctl:

No, it can't.  The design flaw is that the syncer writes *all* the
dirty data for any file when the oldest dirty data for that file is
syncdelay (or filedelay) old.  That is an incorrect implementation of
the smooth sync algorithm, which should write each page of data (and
perhaps any immediately adjacent dirty data, for the sake of
efficiency) when that page itself is syncdelay (or filedelay) old.
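
To make the difference concrete, here is a minimal C sketch of the two
policies.  The names here (struct dirty_page, SYNCDELAY, flush_page)
are invented for illustration and are not the real kernel interfaces:

/*
 * Sketch only: these structures and names are made up, not the
 * actual NetBSD syncer data structures.
 */
#include <stddef.h>
#include <time.h>

#define SYNCDELAY 30                    /* seconds, like syncdelay */

struct dirty_page {
        time_t dirtied;                 /* when this page was dirtied */
        struct dirty_page *next;        /* list kept oldest-first */
};

/* What we have: the oldest page trips the timer, everything goes out. */
static void
sync_per_file(struct dirty_page *pages, time_t now,
    void (*flush_page)(struct dirty_page *))
{
        struct dirty_page *p;

        if (pages == NULL || now - pages->dirtied < SYNCDELAY)
                return;
        for (p = pages; p != NULL; p = p->next)
                flush_page(p);          /* 30 seconds of data, one burst */
}

/* What smooth sync should do: a page goes out when *it* is old enough. */
static void
sync_per_page(struct dirty_page *pages, time_t now,
    void (*flush_page)(struct dirty_page *))
{
        struct dirty_page *p;

        for (p = pages; p != NULL && now - p->dirtied >= SYNCDELAY;
            p = p->next)
                flush_page(p);          /* small, steady writes instead */
}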

The implementation we have degenerates to the equivalent of non-smooth
sync in the case in which all the data dirtied during the sync
interval belongs to the same few large files: it does huge writes every
30 seconds, instead of doing smaller writes at a lower, constant rate.
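
To put rough numbers on it (the rate here is made up for the sake of
the example): at a steady 10 MB/s of incoming dirty data, the per-file
policy accumulates about 300 MB and issues it as one burst every 30
seconds, where a correct per-page policy would issue roughly 10 MB/s
continuously, trailing the writer by 30 seconds.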

For LFS, it could perhaps be helped by maintaining an estimate of how
much dirty data there is, and syncing whenever that estimate adds up
to more than a segment.  But there would then need to be a way to stop
the sync so as not to do partial-segment writes at the end.
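
A rough sketch of that heuristic, again with invented names (SEGSIZE,
note_dirty, and write_one_segment are illustrative stand-ins, not the
actual LFS interfaces):

#include <stddef.h>

#define SEGSIZE (1024 * 1024)           /* one LFS segment; size made up */

static size_t dirty_estimate;           /* running estimate of dirty bytes */

/* Hypothetical hook: assemble and write exactly one full segment. */
static void
write_one_segment(void)
{
        /* ... gather SEGSIZE bytes of dirty pages, issue the write ... */
}

/* Called (hypothetically) each time a page is dirtied. */
void
note_dirty(size_t len)
{
        dirty_estimate += len;
        while (dirty_estimate >= SEGSIZE) {
                write_one_segment();
                dirty_estimate -= SEGSIZE;      /* leave the tail pending,
                                                 * so we never write a
                                                 * partial segment */
        }
}

Stopping at the segment boundary and carrying the remainder forward is
what avoids the partial-segment write at the end.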

Thor