Subject: pwd_mkdb - Let's do this differently!
To: None <netbsd-users@netbsd.org>
From: Stephen M Jones <smj@cirr.com>
List: netbsd-users
Date: 08/08/2001 16:34:09
Hi.  In 1996 I wanted to move from running 3b2 SysVr3 systems to PeeCees
running NetBSD.  One machine has since run NetBSD because it only has
about 20 accounts on it.  However, for another machine I was trying
to move about 10,000 user accounts and found that the BSD pwd.db/spwd.db
concepts were really not practical with a frequently frobbed passwd file.

So, I went with leenox*.  (said with a Skane accent and vodka on the breath)

Then and now I read rants on various BSD list archives about how 'pwd_mkdb'
bites the bag.  I think we need to take a different approach (I want to get
away from leenox and I'm determined to not turn back).

I don't think this is a question of tweaking or optimising 'pwd_mkdb' 
but rather a NEW method of dealing with a password file.  I realise 
this will take work and I know a few old hackers' eyebrows are raising.

But, can pwd_mkdb really be optimised?

I'll bore you with statistics for a moment.

I've got 29,365 accounts converted from a passwd/shadow setup.  To 
do something as simple as changing a passwd requires pwd.db and
spwd.db to be rebuilt.  These files are roughly 12 MB each using a
stock pwd_mkdb and take about 5 minutes total to build on a DEC Alpha
533 MHz 5305 w/ 1024 MB of RAM.

I was frobbing assignments in 'HASHINFO' hoping to get some speed ups.
I found that by increasing 'nelem' to between 1024 and 2048 (from 256) and
keeping 'ffactor' relatively small at 128 (up from 32) I could get a passwd
change done in about 4 minutes with the pwd.db and spwd.db files
2 MB smaller (about 10 MB each).

Now for some other statistics.  Since August 1st (it's now August 8th)
one production leenox system has seen 12,270 changes to passwd/shadow ..
these changes come from userdel, useradd, usermod, chfn, chsh and passwd.
Being that there are only 1440 minutes in a day, it would take about
8.52 days to complete these 12,270 changes even at one minute per rebuild,
and about 34 days at the 4 minutes I measured (and that's just in theory,
if they were all happening sequentially).
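
A rough back-of-the-envelope check of that figure, assuming every change
triggers one full rebuild and the rebuilds run strictly one after another.
The function name is made up for illustration; the 1, 4 and 5 minute values
are the one-minute case behind the 8.52 figure and the two rebuild times
quoted above.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def days_of_rebuilds(changes, minutes_per_rebuild):
    """Days needed if every change costs one full, serialized rebuild."""
    return changes * minutes_per_rebuild / MINUTES_PER_DAY


changes = 12270  # passwd/shadow changes seen in roughly one week
for minutes in (1, 4, 5):
    days = days_of_rebuilds(changes, minutes)
    print(f"{minutes} min/rebuild: {days:.2f} days")
```

This prints 8.52, 34.08 and 42.60 days respectively, so one week of
changes would not fit into a week even at one minute per rebuild.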

Bulk updates to the passwd file are out of the question.

Some thoughts .. btree versus hash? Or, why not just get rid of dbs??
Have you ever watched /etc while the passwd files are being rebuilt?
It's just a nest for race conditions.  I've had a few test users on,
and from the temp files left behind:

-rw-------   1 users       125 Aug  8 13:06 pw.20243a
-rw-------   1 users       131 Aug  8 13:37 pw.20658a
-rw-------   1 users       125 Aug  8 14:31 pw.21057a
-rw-------   1 users       121 Aug  8 14:34 pw.21069a
-rw-------   1 users       131 Aug  8 15:36 pw.21190a
-rw-------   1 users       134 Aug  8 15:57 pw.21359a
-rw-------   1 users       116 Aug  8 16:16 pw.21568a

We can see there have been some frustrations/aborts while trying to do
something as simple as changing a shell.

Last night I decided to try to port JFHaugh's shadow suite.  I was
able to get a working suite of binaries (I hope you aren't cringing)
which could successfully manipulate the /etc/passwd file and a new
/etc/shadow file .. The updates were super quick for obvious reasons:
there were no crocky databases that had to be rebuilt.  The main
drawback is that now I'm looking at either hacking up libc so that
the library calls work the way the shadow suite expects them to (this
seems to matter for login; the other utilities in the shadow suite
seem to just work with the passwd/shadow files themselves) ..
or writing my own login routines for pop3d, login, ftpd, sshd and
others to use.
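
To illustrate why a flat-file update is so quick, here is a minimal sketch
of changing one user's shell by rewriting the file and renaming it into
place.  This is not the shadow suite's code; set_shell() is a hypothetical
helper, it assumes the 7-field /etc/passwd layout, and it leaves out the
locking and permission handling a real tool would need.

```python
import os
import tempfile


def set_shell(passwd_path, user, new_shell):
    """Rewrite one user's shell field in a 7-field passwd-style file.

    Returns True if the user was found and the file was replaced.
    """
    directory = os.path.dirname(os.path.abspath(passwd_path))
    found = False
    # Write the new copy next to the original so the rename below
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(prefix="pw.", dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp, open(passwd_path) as src:
            for line in src:
                fields = line.rstrip("\n").split(":")
                if len(fields) == 7 and fields[0] == user:
                    fields[6] = new_shell
                    line = ":".join(fields) + "\n"
                    found = True
                tmp.write(line)
        if found:
            # Atomic swap on POSIX: readers see the old or the new file.
            os.replace(tmp_path, passwd_path)
        return found
    finally:
        # Clean up the temp file if it was not renamed into place.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

The cost is one sequential pass over the file per change, with no
database rebuild afterwards.  Concurrent writers would still need a lock
around the whole read-modify-rename step.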

So today I decided "let's try to optimise pwd_mkdb!" .. but in all
honesty, I'm about ready to give up.  This would be a sad reason 
to have to continue with leenox.

I have to apologise too, that I don't know the exact history or reasons
for using pwd.db/spwd.db other than keeping encrypted passwords out of
the public's eyes.  I can imagine, though, that on a VAX-11/750 with
a small number of users, using a hash table for commands like ls, finger
and even login would have been a big win ..

But from reading other messages from people who have large password
files, or more importantly large password files which are aos'ed or
frobbed often, it sounds like a fundamental change needs to be made.

smj