tech-userlevel: Re: calendar vs. NIS

Subject: Re: calendar vs. NIS
To: Joerg Klemenz <joerg@gmx.net>
From: Matthias Buelow <mkb@mukappabeta.de>
List: tech-userlevel
Date: 11/06/2002 22:24:05

Joerg Klemenz wrote:

> Uhh... that sounds complicated. How about that (it's kinda lame, but
> should work for all cases)
[linear list duplicates checking proposed by Joerg]

Just split it into two loops: one to gather UIDs; then sort the UIDs
thus found.  Then make a second go over the new list, doing the calendar
stuff on each UID.  You can either remove duplicates from the sorted
list before going throug it for the calendar processing, or, since it's
sorted, you can check if the next entry is the same UID as the current
one, and just skip it.

This entirely removes the highly suboptimal duplicates checking from
the passwd-walking loop.  In contrast to your linear-list 
duplicates-finding solution, which runs on O(n^2), this method would be
in O(n log n), which should be optimal for that application.

If you use arrays as a representation for the lists, you can use
the C lib function mergesort(3) for sorting; merge sort in general
has the advantage that it works best on presorted sequences, such as
the passwd file, which one can assume is partially sorted (new passwd
entries with higher uids tend to get added to the end.)

-- 
Matthias Buelow