Subject: Re: CVS commit: src/lib/libc
To: Frank van der Linden <fvdl@netbsd.org>
From: Christos Zoulas <christos@zoulas.com>
List: tech-userlevel
Date: 05/18/2006 10:08:02
On May 18, 11:39am, fvdl@netbsd.org (Frank van der Linden) wrote:
-- Subject: Re: CVS commit: src/lib/libc

| On Wed, May 17, 2006 at 08:36:50PM +0000, Christos Zoulas wrote:
| > Log Message:
| > PR/24324: Arne H Juul: Re-implement seekdir/telldir using a pointer of
| > locations per directory instead of a global hash table to avoid memory
| > leak issues, and incorrect results.
| 
| There is a problem with this.
| 
| Historically, directory(3) operations have allowed the following:
| 
| 	dirp = opendir("foo");
| 	dp = readdir(dirp);
| 	...
| 	pos = telldir(dirp);	/* remember where we left off */
| 	closedir(dirp);
| 	...
| 	dirp = opendir("foo");
| 	seekdir(dirp, pos);	/* continue work at old spot */
| 
| This kind of code was used, presumably, in a case of resource
| shortage, while implementing e.g. a treewalk, and the number of
| subdirectories was large, so an open directory had to be temporarily
| closed in order for another one to be opened.
| 
| I certainly don't think that's good programming practice, in fact,
| I think it's silly. But, it has always been allowed, and our
| manpage explicitly allows it:
| 
| "It is safe to use a previous telldir() value immediately after a call
|  to opendir() and before any calls to readdir()."
| 
| The standards documents are vague on this, and the language used in
| them can be explained both ways.
| 
| Either way, this is a change in semantics for the NetBSD libc.
| 
| Two choices:
| 
| 	1) Find a better way to fix this problem, or
| 	2) Go with the semantics change and delete that line
| 	   from the manual page.
| 
| It's a sucky problem. Allowing the old behavior basically means always
| leaking memory, because a 32<->64 position mapping has to be kept across
| a closedir(), and it's unknown when it can be thrown away.
| 
| I've been agonizing over a fix for this in OpenSolaris, which has the same
| issue (worse: it has du(1) and fts/ftw implementations which depend
| on these semantics), and have come to the conclusion that there is
| no good fix. The least sucky one will have to do, but I'm not sure
| which one this is.

I think I will go with the semantics change. This "feature" is questionable
and its implementation will impose a large overhead. This is why the
solaris dirent is so messed up :-)

christos