Subject: Re: NTP loses sync if st driver pushed hard?
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Matthew Jacob <mjacob@feral.com>
List: tech-kern
Date: 09/17/2001 17:26:58

On Mon, 17 Sep 2001, Thor Lancelot Simon wrote:

> On Mon, Sep 17, 2001 at 12:34:23AM -0700, Matthew Jacob wrote:
> > > 
> > > 1) The clock slippage *may* be limited to the end of the backup runs; I
> > >    can't shake the feeling that I've seen it otherwise, but... it may
> > >    only happen when I do the close(); however, it does happen with either
> > >    the rewinding or non-rewinding device.
> > 
> > The filemark writes can take a long time, since all data in the write
> > buffer has to flush first. But this shouldn't cause any lossage.
> > 
> > I don't recall what HBA this is... let me go back and look at old mail...you
> > said "various machines"- did they all have the same HBA type (Advansys)?
> 
> No; the machine with the DLT2000XT in it has a Qlogic 1280.
> 
> > Looking again at #2 above- this makes me wonder if this isn't some scheduler
> > problem, oddly enough. It sounds to me like the backup process runs and
> > consumes a very large quantum of runtime because it's being so successful at
> > pushing data- so much so that when it has to wait a bit (because the internal
> > tape buffer is full and needs to flush a bit), that it gets penalized and
> > doesn't get started quickly enough when the command that waited on a tape
> > buffer flush finishes. But this may make no sense at all.
> 
> I wondered about *exactly* this, so I tried raising HZ on my machines to
> make the scheduler quantum smaller.  Throughput increased very slightly,
> but I still couldn't stream -- and I lost much more time, too.

Hmm. Well, that shot my wad for theorizing. Until I can actually reproduce
something like it on reasonable h/w (I just plain suspect a 12-year-old
Archive QIC-150 on anything), it'd all be WAGs: it's not completion thread
activations getting in the way, and it's not the scheduler. I dunno.

-matt