Subject: Re: calendar vs. NIS
To: tech-userlevel <firstname.lastname@example.org>
From: Joerg Klemenz <email@example.com>
Date: 11/07/2002 15:43:59
Matthias Buelow wrote:
> Just split it into two loops: one to gather UIDs; then sort the UIDs
> thus found. Then make a second go over the new list, doing the
> calendar stuff on each UID. You can either remove duplicates from
> the sorted list before going throug it for the calendar processing,
> or, since it's sorted, you can check if the next entry is the same
> UID as the current one, and just skip it.
This sounds like massive overkill to me. The vast majority of systems
have an already semi-sorted passwd file with <100 users. Calendar is
not a high-end database application.
I just looked at the code a little: it has three gotos and never
closes its temp file, but throws away the descriptor and unlinks the
pathname (with UID 0).
in the calendar file will grow the memory usage of calendar forever. I
just tried it out on a simulated machine (what else can you do with
this weather): the process gets killed in seconds when the swap runs
out. A malicous user can kill the calendar -a process. CPP should
_really_ check its input files!
> This entirely removes the highly suboptimal duplicates checking from
> the passwd-walking loop. In contrast to your linear-list
> duplicates-finding solution, which runs on O(n^2), this method would
> be in O(n log n), which should be optimal for that application.
The reverse sorted list I showed in the other post should attack that.
It has almost no performance drop for normal users while it fixes NIS
without beeing overly complex. Exponential runtime is not a problem if
the worst case scenario (100000+ users on NIS) can be handled in
seconds or minutes.
The real bottleneck is NIS itself followed by the getpw* routines. The
runtime of the sorting list routine probably makes little difference.