Subject: Re: new pid allocation code
To: None <tech-kern@netbsd.org>
From: gabriel rosenkoetter <gr@eclipsed.net>
List: tech-kern
Date: 03/17/2003 14:24:25

On Mon, Mar 17, 2003 at 05:18:40PM +0000, David Laight wrote:
> Somewhen in that last 20 years (since I did my maths degree) I seem
> to have forgotten the difference between o(n) and O(n) :-(

So had I. My first response was to say that your numbers were
meaningless, because I had the meanings of omega and little-o
confused. But I stopped, checked the definitions[1], and then didn't
ask quite the right question.

In short terms:

big-O: f(n) = O(g(n)) iff there are fixed constants c and k such
that 0 <= f(n) <= c * g(n) for all n >= k.

little-o: f(n) = o(g(n)) iff, for *every* c > 0, there is some k
such that 0 <= f(n) <= c * g(n) for all n >= k. k must not be
dependent on n, but may be dependent on c.
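A concrete instance may make the roles of c and k clearer (my own
example, nothing to do with David's measurements):

```latex
% f(n) = 3n + 5 is O(n): pick c = 4 and k = 5; then 3n + 5 <= 4n for all n >= 5.
% f(n) is NOT o(n): with c = 3 no k works, since 3n + 5 > 3n for every n.
% f(n) IS o(n^2): for any c > 0, 3n + 5 <= c * n^2 once n >= some threshold k(c),
% and k(c) grows as c shrinks -- which is exactly the "k may depend on c" clause.
```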

So my question should have been: "What's the dependence relation
between k and c?" Or, better, "Do you really mean that you're O(1),
because if so that rocks, no matter what the old algorithm was."

The issue here is that n is the (variable) number of processes, c is
a fixed constant factor, and k is a threshold on n that is either
also fixed or dependent (by some function) on c.

The actually relevant point is how large c and k are. If either is
astronomical relative to the c & k of the old algorithm, then it
doesn't matter that we're constant after a point, because we're
screwed before that point.

It's all pretty academic, though. I'd be surprised if there were a
wide variance between your constants and the existing algorithm's, or
you wouldn't be touting {o,O}(n^2) versus {o,O}(1) (because you
wouldn't have measurements that said anything like that).

> In any case my new code has a constant (small) cost for allocating
> a pid.  The existing code has an n^2 term so, for a sufficiently
> large number of processes, the pid allocation will become the
> dominant term in the cost of fork() (unless something else is worse).

If the largest cost in your new way is constant, and *any* cost in
the existing way is n^2, you win. :^>
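The contrast can be sketched like so (a hypothetical illustration,
not the actual NetBSD code; names like alloc_pid_scan and
alloc_pid_freelist are mine):

```c
#include <stdbool.h>
#include <stddef.h>

#define MAXPID 128

/* O(n^2) flavor: for each candidate pid, scan the table of live pids
 * looking for a collision.  Up to n candidates, each checked in O(n). */
static int alloc_pid_scan(const int *live, size_t nlive) {
    for (int cand = 1; cand < MAXPID; cand++) {
        bool taken = false;
        for (size_t i = 0; i < nlive; i++)
            if (live[i] == cand) { taken = true; break; }
        if (!taken)
            return cand;
    }
    return -1;                         /* table full */
}

/* O(1) flavor: keep free pids chained on a free list; allocation is a
 * constant-time pop of the head, regardless of how many procs exist. */
static int freelist[MAXPID];
static int freehead = -1;

static void freelist_init(void) {
    for (int pid = 1; pid < MAXPID; pid++) {
        freelist[pid] = freehead;      /* push pid onto the chain */
        freehead = pid;
    }
}

static int alloc_pid_freelist(void) {
    if (freehead < 0)
        return -1;                     /* nothing free */
    int pid = freehead;
    freehead = freelist[pid];          /* constant-time pop */
    return pid;
}
```

The win is that the free-list cost never grows with the process count,
so the c it hides is all that matters.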

> These 'small' costs can get significant.  One Unix kernel used a
> linked list for the fd -> file map for each process.  This meant
> that the conversion was o(fd), which made poll o(fd^2) and setting
> up a large poll list o(fd^3).  One particular test suffered badly :-)
> (IIRC it had several thousand files in the list...)

I think those are also Os, but I don't know the details of how the
constants were fixed. Certainly, based on your description, that code
can easily lead to pathological situations.
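The pathology David describes would look something like this sketch
(my own mock-up of a linked-list fd map, not the kernel in question):

```c
#include <stddef.h>

struct filedesc {
    int fd;
    struct filedesc *next;   /* singly linked: finding one fd walks the list */
};

/* O(fd) per lookup: cost grows with how deep the fd sits in the chain. */
static struct filedesc *fd_lookup(struct filedesc *head, int fd) {
    for (struct filedesc *p = head; p != NULL; p = p->next)
        if (p->fd == fd)
            return p;
    return NULL;
}

/* Checking nfds descriptors does nfds lookups, i.e. O(nfds * fd) ~
 * O(fd^2) for one poll pass; re-arming a poll list of that size pushes
 * toward O(fd^3).  An array indexed by fd would make each lookup O(1). */
static int count_present(struct filedesc *head, const int *fds, size_t nfds) {
    int found = 0;
    for (size_t i = 0; i < nfds; i++)
        if (fd_lookup(head, fds[i]) != NULL)
            found++;
    return found;
}
```

With several thousand entries in the list, that per-lookup walk is
exactly the "small" constant that stops being small.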

[1] http://www.nist.gov/dads/HTML/bigOnotation.html

--
gabriel rosenkoetter
gr@eclipsed.net
