tech-userlevel: Re: small hack

Subject: Re: small hack
To: None <tech-userlevel@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-userlevel
Date: 09/23/1998 10:34:54

> > X	/*
> > X	 * This algorithm taken from Knuth, Seminumerical Algorithms,
> > X	 * page 139.
> > X	 */
> > X	for (j = 0; j < t; j++) {
> > X		k = random()%t;
> > X		temp = shuffle[j];
> > X		shuffle[j] = shuffle[k];
> > X		shuffle[k] = temp;
> > X	}
> According to an algorythm book I've got (at home, can get reference),
> this shuffle routine isn't that great.  As j progresses, it might
> unshuffle some swaps, giving a slight preference to the initial
> dataset.

"If you disagree with Knuth, you're probably wrong."  Admittedly, I
haven't checked this code against Knuth, for three reasons: (1) I don't
have a copy of Knuth handy; (2) I trust Perry's ability to transcribe
Knuth's algorithm into C; (3) I think the algorithm is correct without
any need to proof by appeal to authority.

In short, I think your algorithms book is wrong, that the only ways in
which this doesn't generate a random shuffle are (a) the bias inherent
in t not being a divisor of the number of possible return values from
random() (if such happens to be the case), and (b) random() not
returning truly random values.

Admittedly, it's not as simple to prove as either of your suggestions,
or (my preference) the one with the first two lines reading
	for (j = 1; j < t; j++) {
		k = random() % (j+1);
But I'm convinced it's correct.  I'm still trying to find a proof of it
that's simple enough to be rendered comprehensibly into email.

					der Mouse

			       mouse@rodents.montreal.qc.ca
		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B