Subject: an in-kernel getcwd() implementation
To: None <tech-kern@netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 03/07/1999 02:35:16

I've done what I think is a more-or-less complete implementation of an
in-kernel getcwd().

This has several benefits:
	1) it's about 28 times faster than userspace getcwd() 
		on a simple benchmark (getcwd 6 directories deep)
	2) in theory, it's less likely to get hung up on unrelated
	   wedged filesystems (e.g., NFS mounted from servers which are down),
	   since it doesn't wildly stat() left and right when it
	   crosses a mount point.
	3) in theory, it will enable our Linux emulation to run a
	   number of interesting binaries (e.g., Oracle).

The native kernel interface is similar to that used by the Linux
syscall, which is to say it returns the length of the buffer actually
used rather than the pointer to the buffer.  It's called __getcwd
since _getcwd was already in use in our libc as a weak alias for
getcwd().
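
To illustrate the difference between a length-returning interface and
the pointer-returning libc getcwd(), here is a minimal sketch of how a
wrapper could sit on top of such a call.  fake_sys_getcwd() is a
hypothetical stand-in, not the real __getcwd; the fixed path and the
convention that the returned length includes the terminating NUL are
assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a length-returning kernel interface.
 * Copies a made-up path into buf and returns the number of bytes used
 * (assumed here to include the terminating NUL), or -1 with errno set
 * to ERANGE if the buffer is too small.
 */
static int
fake_sys_getcwd(char *buf, size_t len)
{
	static const char cwd[] = "/usr/src/sys/kern";	/* made-up path */

	if (len < sizeof(cwd)) {
		errno = ERANGE;
		return -1;
	}
	memcpy(buf, cwd, sizeof(cwd));
	return (int)sizeof(cwd);
}

/* libc-style wrapper: hand back the buffer pointer, or NULL on failure. */
static char *
wrapped_getcwd(char *buf, size_t len)
{
	assert(buf != NULL);
	if (fake_sys_getcwd(buf, len) < 0)
		return NULL;
	return buf;
}
```

A caller would use wrapped_getcwd(buf, sizeof(buf)) exactly as it would
use getcwd(); the length is simply discarded by the wrapper.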

I haven't yet committed it to the tree since it affects some rather
critical functionality (the name-to-vnode cache), so I want some review
of the code first; moreover, I do not believe that it's ready for
production use yet.

It uses the kernel name-to-vnode cache (implemented in
sys/kern/vfs_cache.c) tweaked to also work as a vnode-to-name cache.
vfs_cache.c and namei.h were changed to add a second hash chain to
each cache entry and to build a second hash table, keyed by the
destination vnode; namecache entries that point to directory vnodes,
except those named "." and "..", are linked into this second hash table.
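
The two-chain idea can be sketched in a few lines of standalone C.
This is a toy model, not the code in vfs_cache.c: the struct layouts,
the names (ncentry, nc_enter, nc_revlookup), the hash functions and
the table size are all invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 8			/* arbitrary size for the toy tables */

struct vnode {				/* hypothetical, heavily simplified */
	int id;
	int is_dir;
};

struct ncentry {
	struct ncentry *fwd_next;	/* chain in the (parent, name) table */
	struct ncentry *rev_next;	/* chain in the table keyed by target */
	struct vnode *dvp;		/* parent directory */
	struct vnode *vp;		/* destination vnode */
	char name[32];
};

static struct ncentry *fwd_table[NBUCKETS];
static struct ncentry *rev_table[NBUCKETS];

static unsigned
fwd_hash(const struct vnode *dvp, const char *name)
{
	unsigned h = (unsigned)dvp->id;

	while (*name != '\0')
		h = h * 31 + (unsigned char)*name++;
	return h % NBUCKETS;
}

static unsigned
rev_hash(const struct vnode *vp)
{
	return (unsigned)vp->id % NBUCKETS;
}

/* Enter a name; directories other than "." and ".." also go in rev_table. */
static void
nc_enter(struct ncentry *e, struct vnode *dvp, struct vnode *vp,
    const char *name)
{
	unsigned h;

	assert(e != NULL);
	e->dvp = dvp;
	e->vp = vp;
	strncpy(e->name, name, sizeof(e->name) - 1);
	e->name[sizeof(e->name) - 1] = '\0';

	h = fwd_hash(dvp, e->name);
	e->fwd_next = fwd_table[h];
	fwd_table[h] = e;

	e->rev_next = NULL;
	if (vp->is_dir && strcmp(e->name, ".") != 0 &&
	    strcmp(e->name, "..") != 0) {
		h = rev_hash(vp);
		e->rev_next = rev_table[h];
		rev_table[h] = e;
	}
}

/* vnode-to-name lookup: find the entry whose destination is vp. */
static struct ncentry *
nc_revlookup(const struct vnode *vp)
{
	struct ncentry *e;

	for (e = rev_table[rev_hash(vp)]; e != NULL; e = e->rev_next)
		if (e->vp == vp)
			return e;
	return NULL;
}
```

In this model a reverse lookup on a directory yields both its name and
its parent, so repeating the lookup on the parent walks toward the root
without touching the disk.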

If the cache misses, it falls back to the approach used by the libc
getcwd, which is to crawl up ".." pointers, reading the parent
directory for a match.  Fortunately, since it can do this in O(n) time
rather than O(n^2) time, it's still a good deal faster than userspace
getcwd; also, with the testing I've done so far, the cache hit rate
seems to be about 99%.
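
For readers unfamiliar with the fallback, the "crawl up .. and scan
the parent" technique looks roughly like the following userspace
sketch.  It is not the kernel code and makes no claim about its
performance; walk_up_cwd() and find_name() are names invented here, and
it assumes a POSIX.1-2008 system with openat(), fstatat() and
fdopendir().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Scan directory pfd for an entry whose device/inode match *target. */
static int
find_name(int pfd, const struct stat *target, char *name, size_t namelen)
{
	struct dirent *de;
	struct stat st;
	DIR *dir;
	int dfd, found = -1;

	if ((dfd = dup(pfd)) < 0)
		return -1;
	if ((dir = fdopendir(dfd)) == NULL) {
		close(dfd);
		return -1;
	}
	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		if (fstatat(pfd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
			continue;
		if (st.st_dev == target->st_dev && st.st_ino == target->st_ino &&
		    strlen(de->d_name) < namelen) {
			strcpy(name, de->d_name);
			found = 0;
			break;
		}
	}
	closedir(dir);
	if (found != 0)
		errno = ENOENT;
	return found;
}

/* Build the cwd path right-to-left in buf; returns 0, or -1 with errno. */
static int
walk_up_cwd(char *buf, size_t len)
{
	struct stat cur, par;
	char name[256], *p;
	int fd, pfd = -1, saved;
	size_t n;

	assert(buf != NULL);
	if (len == 0) {
		errno = EINVAL;
		return -1;
	}
	p = buf + len;
	*--p = '\0';
	if ((fd = open(".", O_RDONLY | O_DIRECTORY)) < 0)
		return -1;
	if (fstat(fd, &cur) != 0)
		goto fail;
	for (;;) {
		pfd = openat(fd, "..", O_RDONLY | O_DIRECTORY);
		if (pfd < 0 || fstat(pfd, &par) != 0)
			goto fail;
		if (par.st_dev == cur.st_dev && par.st_ino == cur.st_ino)
			break;		/* ".." is ourselves: at the root */
		if (find_name(pfd, &cur, name, sizeof(name)) != 0)
			goto fail;
		n = strlen(name);
		if ((size_t)(p - buf) < n + 1) {
			errno = ERANGE;
			goto fail;
		}
		p -= n;			/* prepend "/name" */
		memcpy(p, name, n);
		*--p = '/';
		close(fd);
		fd = pfd;
		pfd = -1;
		cur = par;
	}
	if (*p == '\0') {		/* cwd is the root itself */
		if (p == buf) {
			errno = ERANGE;
			goto fail;
		}
		*--p = '/';
	}
	close(pfd);
	close(fd);
	memmove(buf, p, (size_t)(buf + len - p));
	return 0;
fail:
	saved = errno;
	if (pfd >= 0)
		close(pfd);
	close(fd);
	errno = saved;
	return -1;
}
```

Each level costs one scan of the parent directory, which is the work a
cache hit avoids.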

I've written a mostly complete regression test, which tests all the
usual system call issues -- bad pointers, bad lengths, etc., and also
tries rather hard to get into some of the dark corners of the code.
(There are still a few untested areas, involving interaction with
chroot, vn_lock failures, and mount -o union).
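
As an illustration of the "bad lengths" class of checks, here is a
small sketch written against the standard getcwd(3) interface rather
than the new system call.  It is not taken from the regression test in
the tarball; the helper names and the 4096-byte buffer are assumptions
made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Return 1 if getcwd(buf, len) fails with the expected errno. */
static int
fails_with(char *buf, size_t len, int expected)
{
	assert(buf != NULL);
	errno = 0;
	return getcwd(buf, len) == NULL && errno == expected;
}

/* Run the length checks; returns the number of failed checks. */
static int
run_length_checks(void)
{
	char buf[4096];		/* assumed large enough for the test cwd */
	int failures = 0;

	/* A zero-length buffer is invalid. */
	if (!fails_with(buf, 0, EINVAL))
		failures++;
	/* One byte cannot hold even "/" plus the terminating NUL. */
	if (!fails_with(buf, 1, ERANGE))
		failures++;
	/* A reasonably sized buffer should yield an absolute path. */
	if (getcwd(buf, sizeof(buf)) == NULL || buf[0] != '/')
		failures++;
	return failures;
}
```

A test driver would call run_length_checks() and treat any nonzero
return as a failure.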

I have not yet tweaked libc to call the system call version of getcwd.

For those of you with developer accounts on the CVS server, a copy of
my work so far can be found in:

	cvs.netbsd.org:~sommerfeld/getcwd.tar 

To build it, you'll also need to manually edit your files file to add
kern/vfs_getcwd.c, and rebuild init_sysent.c and friends from sys/kern.

I realize the code isn't as KNF'ed as many would like; fixing that is
on my list of things to do.  If/when folks think it's of suitable
quality to go in the tree, I'll assign rights to TNF, put a TNF
copyright on it and commit it.

If you have any comments on the code (trivial, substantive, or
anywhere in between), let me know.

					- Bill