Subject: Re: ktr from freebsd
To: Andrey Petrov <petrov@netbsd.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 04/27/2004 17:13:11

On Tue, Apr 27, 2004 at 02:46:01PM -0700, Andrey Petrov wrote:
> We probably see the purpose of a kernel trace facility differently.
> For me it's needed when you can't use printf because it's too
> intrusive and using it leads to different system behaviour.
> So lightweight and no side effects would be priority 1.
>
> Second would be reliability; I don't want to lose information,
> so everything that is collected can be trusted.

From looking at your port, I'd say just hammer on UVMHIST. Hammer hard,
but hammer on it.

I say that as I think NetBSD will want something different from what
FreeBSD has, and so it'd be better to not call it the same thing. So since
our ktr won't be their ktr, let's just do something different. It can end
up looking very similar, though. :-)

From looking at the code for uvmhist and ktr, the big differences I see
are:

1) 4 arguments for uvmhist, 6 for ktr. I think 4 will probably be fine
most of the time, and the times when we need more, just do two loggings.
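
A minimal sketch of the "two loggings" idea, assuming a hypothetical
4-argument entry (the struct and function names here are made up for
illustration and are not the real UVMHIST layout):

```c
/* Hypothetical 4-argument trace entry; NOT the real UVMHIST layout. */
struct trace_ent {
	const char *fmt;
	long v[4];
};

#define TRACE_NENT 8			/* hypothetical ring size */
struct trace_ent trace_buf[TRACE_NENT];
unsigned trace_next;

/* Record one entry with up to four values. */
void
trace_log4(const char *fmt, long a, long b, long c, long d)
{
	struct trace_ent *e = &trace_buf[trace_next++ % TRACE_NENT];

	e->fmt = fmt;
	e->v[0] = a;
	e->v[1] = b;
	e->v[2] = c;
	e->v[3] = d;
}

/* Six values: split over two consecutive records. */
void
trace_log6(const char *fmt_lo, const char *fmt_hi,
    long a, long b, long c, long d, long e, long f)
{
	trace_log4(fmt_lo, a, b, c, d);
	trace_log4(fmt_hi, e, f, 0, 0);
}
```

The cost is that the two records are not atomic with respect to other
loggers, so another entry could land between them.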

2) ktr has an asm version. I'd say ignore that, at least for now. I don't
think we have much asm in the kernel now, and what we do have is perfectly
happy calling C code. So keep one routine for simplicity. Also, we then
don't have to port the logging to each different arch.

3) ktr logs file/line while uvmhist logs function. I think function name
is simpler. Together with the message, it uniquely identifies the log
entry, so we don't need more.

Also, we could use CPP string concatenation and prepend __FUNCTION__ to
the given format string, so we only need one string, the "format" string.
And since we could format the concatenation (say it's "funcname: <format
str>"), we can have display tools strip it out if wanted.
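
A rough sketch of the single-string idea. One caveat it works around:
in standard C, __func__ is not a string literal (and newer GCC treats
__FUNCTION__ the same way), so it cannot be pasted onto a literal at
compile time. The sketch therefore assumes the tag is passed as a
literal. TRACE_FMT and trace_strip_prefix are hypothetical names.

```c
#include <string.h>

/*
 * Hypothetical macro: glue a literal tag onto the format at compile
 * time, giving one string of the form "tag: format".
 */
#define TRACE_FMT(tag, fmt)	tag ": " fmt

/*
 * Display-side helper: skip everything up to and including the first
 * ": " separator.  If there is no separator, return the string as is.
 */
const char *
trace_strip_prefix(const char *fmt)
{
	const char *sep = strstr(fmt, ": ");

	return sep != NULL ? sep + 2 : fmt;
}
```

For example, TRACE_FMT("uvm_fault", "va=%lx") yields the single literal
"uvm_fault: va=%lx". Note the helper would also cut an untagged format
that happens to contain ": ", so a real tool would want a less
ambiguous separator.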

4) ktr has a ddb log dumper. Given how similar everything is, it shouldn't
be hard to add the same thing for uvmhist. Basically, the ddb support you
added, handed a uvmhist and taught to print its entries, would do it.

5) uvmhist records the lengths of the strings too. If we only look at the
log from within the kernel (say ddb), this is a waste. However uvmhist
already has tools to read the log from userland & then dump it out. libkvm
does NOT support reading strings easily, so this is a very good thing. If
we added a libkvm readstring ioctl or something like that, then we could
get rid of the string length. The other option is to have the logger
speculatively read say 256 bytes at a time, but then we'd need to figure
out what happens if part of the data we end up reading isn't mapped. :-)
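
To illustrate why the stored length helps a userland reader, here is a
small sketch. It models kernel memory as a plain byte array and uses a
made-up mem_read() as a stand-in for a fixed-size read of the kind
libkvm offers; struct hist_str and hist_fetch_str() are hypothetical
and not the real uvmhist definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: where the string lives and how long it is. */
struct hist_str {
	size_t off;	/* offset in "kernel" memory (stand-in for an address) */
	size_t len;	/* length without the terminating NUL */
};

/* Stand-in for a fixed-size read; returns bytes read or -1. */
long
mem_read(const char *mem, size_t memsz, size_t off, void *buf, size_t n)
{
	if (off > memsz || n > memsz - off)
		return -1;	/* would touch "unmapped" memory */
	memcpy(buf, mem + off, n);
	return (long)n;
}

/* Fetch exactly len bytes and NUL-terminate; no guessing, no over-read. */
int
hist_fetch_str(const char *mem, size_t memsz, const struct hist_str *s,
    char *out, size_t outsz)
{
	if (s->len >= outsz)
		return -1;
	if (mem_read(mem, memsz, s->off, out, s->len) != (long)s->len)
		return -1;
	out[s->len] = '\0';
	return 0;
}
```

With the length in the record, the reader asks for exactly the bytes it
needs. The speculative 256-byte read corresponds to calling mem_read()
with a guessed size, which is the case that fails here when the guess
runs past the end of the mapped region.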

6) locking. uvmhist uses splhigh() + spinlocking, while ktr uses
atomic_cmpset_rel_int(). As I understand that, it's a form of RAS. In the
long run, that'd be cool to do. But most of the NetBSD kernel just uses
spl for now.

If you do check ktr in, you MUST address this point. I think the
spinlocking from uvmhist is the way to go.
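
For comparison, the two ways of claiming the next log slot can be
sketched in userland with C11 atomics. This is only an analogy: it does
not use splhigh(), the kernel's spinlocks, or atomic_cmpset_rel_int(),
and all names below are hypothetical.

```c
#include <stdatomic.h>

#define TRACE_NENT 4			/* hypothetical ring size */

/* Variant A: compare-and-swap loop, no lock held. */
static atomic_uint cas_idx;

unsigned
trace_claim_cas(void)
{
	unsigned old = atomic_load_explicit(&cas_idx, memory_order_relaxed);
	unsigned next;

	do {
		next = (old + 1) % TRACE_NENT;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak_explicit(&cas_idx, &old, next,
	    memory_order_release, memory_order_relaxed));
	return old;
}

/* Variant B: spin on a flag, then update a plain index. */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static unsigned lock_idx;

unsigned
trace_claim_locked(void)
{
	unsigned slot;

	while (atomic_flag_test_and_set_explicit(&lock_flag,
	    memory_order_acquire))
		continue;	/* spin until the flag is free */
	slot = lock_idx;
	lock_idx = (lock_idx + 1) % TRACE_NENT;
	atomic_flag_clear_explicit(&lock_flag, memory_order_release);
	return slot;
}
```

Both variants only hand out a slot number. Variant A does not protect
the filling of the entry itself, so a reader could see a half-written
record; with variant B the entry could be filled in before the flag is
cleared, at the cost of holding the lock longer.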

7) uvmhist has separate loggers and ktr uses a mask of well-known
log types.

This one I'm not so sure of. At first, I thought that uvmhist's different
logging facilities were fine. Then I looked at the number of masks in the
ktr header, and they have 24 predefined mask items. I really hadn't
envisioned us having 24 separate uvmhist logs.

So I'm not sure what to say about this one. If we turn uvmhist into a
kernel-wide facility, we will want some way to control what is and isn't
logged. I was originally thinking maybe 4 or 6 uvmhists, covering general
areas of the kernel.

I'm not sure what to say for this.
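
For reference, the mask approach boils down to something like the
sketch below. The class names and bit assignments are invented for
illustration and are not ktr's actual mask values.

```c
#include <stdint.h>

/* Hypothetical trace classes, one bit each. */
#define TR_VM	(1u << 0)
#define TR_NET	(1u << 1)
#define TR_FS	(1u << 2)
#define TR_PROC	(1u << 3)

/* Run-time adjustable: which classes currently get logged. */
uint32_t trace_mask = TR_VM | TR_FS;

/* Cheap check done at every trace point before any logging work. */
int
trace_enabled(uint32_t cls)
{
	return (trace_mask & cls) != 0;
}
```

All classes share one buffer and the mask decides what goes in, whereas
separate uvmhists give each area its own buffer, so a noisy subsystem
can't push another one's entries out.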

> I can't tell if a complete log is needed; so far the tail of
> it was enough. From my experience setting buffer size and trace mask
> let me have a sufficient amount of trace.
>
> Saving to file seems a nice but not necessary feature;
> the target user of kernel trace should be able to use ddb. A utility to
> extract the log from a corefile would be useful though.

I disagree strongly here. I think logging to a file is an essential
feature. While I agree it isn't needed if you're trying to figure out why
the kernel just crashed & dumped you into ddb, it's good for a lot of
other things. It's really nice to be able to get the log, save it, and
process it later. Also, uvmhist's using libkvm means we do get dumps from
core files too. :-)

[snip]

> > The uvm history system is designed to handle logging a very busy kernel
> > subsystem to disk very efficiently. While ktr or something like it may be
> > simpler, I don't think it will perform as well under heavy load. Since we
> > will need to log such subsystems, why not focus on one logging system that
> > will work well for all (almost all?) our needs, rather than have two
> > subsystems, one of which is only good some of the time.
> >
>
> Would be interesting to see how libkvm-based utility does the job.
> vmstat seems to do only a snapshot.

Hmm, yeah. But all we need to do is sleep & wake up every now & then. If we
assume we won't overflow the log too fast (like in a second), then this
should work.
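
The catch-up step of such a polling reader might look like the sketch
below. It assumes a hypothetical ring with a monotonically increasing
write counter (the real uvmhist layout may differ) and leaves out the
sleep loop and any synchronization with a concurrent writer.

```c
#include <stddef.h>

#define HIST_NENT 8			/* hypothetical ring size */

struct hist_ring {
	unsigned long total;		/* entries ever written */
	int ent[HIST_NENT];		/* payload stand-in */
};

/* Writer side: overwrite the oldest slot once the ring is full. */
void
hist_put(struct hist_ring *r, int v)
{
	r->ent[r->total % HIST_NENT] = v;
	r->total++;
}

/*
 * Reader side: copy out entries written since *seen.  Returns the
 * number copied and sets *lost to how many were overwritten before
 * the reader got to them.  out must hold HIST_NENT entries.
 */
size_t
hist_drain(const struct hist_ring *r, unsigned long *seen, int *out,
    unsigned long *lost)
{
	unsigned long first = *seen;
	size_t n = 0;

	*lost = 0;
	if (r->total - first > HIST_NENT) {
		*lost = r->total - first - HIST_NENT;
		first = r->total - HIST_NENT;
	}
	while (first < r->total)
		out[n++] = r->ent[first++ % HIST_NENT];
	*seen = first;
	return n;
}
```

A tool would call hist_drain() after each wakeup and append the result
to a file. A non-zero *lost means the log overflowed between wakeups,
which is the signal to sleep for less time or use a bigger buffer.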

Take care,

Bill
