Subject: Re: ktr from freebsd
To: Andrey Petrov <petrov@netbsd.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 04/27/2004 13:00:57
--xHFwDpU9dbj6ez1V
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Apr 27, 2004 at 11:03:38AM -0700, Andrey Petrov wrote:
> On Fri, Apr 23, 2004 at 04:47:33PM -0700, Bill Studenmund wrote:
> > On Thu, Apr 22, 2004 at 08:11:37PM -0700, Andrey Petrov wrote:
> > > Have to agree 'too tied' is overstatement, I looked at uvmhist _after_
> > > I started using ktr so things like multiple buffers, explicit initial=
ization,
> > > strlens and UVMHIST_FUNC presets looked excessive for what I need.
> >=20
> > I don't think it's excessive. It's great for the time when you need jus=
t a=20
> > little bit more than what you have now.
> >=20
> > Multiple buffers are so that different subsystems can log or not log as=
=20
> > they see fit.
>=20
> Control over subsystem logging can be done by mask, while merging differe=
nt
> buffers when you need to see interactions between subsystems could be
> somewhat difficult.

If you want things in the same log, just code them to use the same history=
=20
log.

If we use a bitmask, we force different systems to use the same log=20
regardless of what is really needed. As we go towards more & more=20
concurrency (yes, it is a ways off for the kernel), we will be limiting=20
potential concurrency.

> >               UVMHIST_FUNC lets you name what routine the events come=
=20
> > from. I agree you pretty much always want the name of the routine,so=20
> > __FUNCTION__ would have been fine..
> >=20
> > The strlens seemed weird for me at first. But they make sense when you=
=20
> > remember that the history's dumped by a libkvm-using program in userlan=
d.=20
> > By having explicit lengths, extracting data from the kernel is much mor=
e=20
> > efficient.
>=20
> Seems that priority is reversed here, it's more important to be efficient
> when data is being collected, then when it's dumped.

Nope. For two reasons:

1) If you're logging a very busy service, the act of dumping the log can=20
impact what you're doing. So you need to think about logging as a whole=20
process; a change that makes collecting a little faster is no good if it=20
makes dumping hideously worse.

It's much much more efficient to store the length than not. Think about=20
what it takes for libkvm to get data out of the log. If we have the=20
lengths, we extract the struct uvm_history_ent. We then extract fmtlen=20
bytes from fmt and fnlen bytes from fn. Three reads of known-sizes of data=
=20
from known addresses. And three system calls.

If we don't store the lengths, how are we going to read the strings?=20
kvm_read() reads a fixed-number of bytes. So we have to kvm_read() a byte=
=20
at a time and compare it with zero. So for the case above, we have one=20
system call for the overall entry and one per character in each string,=20
plus one each for the trailing NUL.

We could just snprintf() into a ring buffer in the kernel, so the logger
only has to read characters. But then we're putting a call to printf() in
the code we're trying to log, which will make logging impact things even
more. It's one thing to call strlen() (which will probably be inlined!)
and store about 12 ints worth of data (ILP32) each log entry, it's another
to call printf().

Yes, I am assuming logging in a case where your logging has an impact on=20
what you're doing but that you can process the logs (in a userland program=
=20
for instance) fast enough to keep up.

2) In uvmhist, half of the string lengths are of the format strings. gcc=20
will turn strlen("Foo") into a constant (3 in this case) at compile time.=
=20
So by storing lengths, our compiler is removing half the calls to=20
strlen().

The uvm history system is designed to handle logging a very busy kernel=20
subsystem to disk very efficiently. While ktr or something like it may be=
=20
simpler, I don't think it will perform as well under heavy load. Since we=
=20
will need to log such subsystems, why not focus on one logging system that=
=20
will work well for all (almost all?) our needs, rather than have two=20
subsystems, one of which is only good some of the time.

The one thing I wish we had with uvm hist is an ability to dump the raw=20
uvmhist_ent's to disk then we run a program to turn that binary dump into=
=20
a text dump. That way we don't even call printf() while we're actually=20
logging.

Take care,

Bill

--xHFwDpU9dbj6ez1V
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (NetBSD)

iD8DBQFAjrv5Wz+3JHUci9cRAtqSAKCKe6AZzhsaUuAliU6w2WLyx7FLNwCfXQZt
wf07/mmjPs+XimrMu+M8EbY=
=BE54
-----END PGP SIGNATURE-----

--xHFwDpU9dbj6ez1V--