Subject: Re: make(1) variables: sort and disorder [patch]
To: None <tech-userlevel@NetBSD.org>
From: Alan Barrett <apb@cequrux.com>
List: tech-userlevel
Date: 06/01/2005 12:33:42
On Tue, 31 May 2005, Mike M. Volokhov wrote:
> sjg@crufty.net (Simon J. Gerraty) wrote:
> > Rather than add a new option :X, allow :O to take an optional 2nd
> > flag - eg. 
> > :O	ordered
> > :Or	reversed
> > :Ox	random
> 
> Done. Please take a look to the patch (below) and thanks for the advice.

I like this a lot, and it will be very useful in pkgsrc for shuffling
the order of download sites.  I have a few comments about the man page
and the implementation.

>  .It Cm \&:O
>  Order every word in variable alphabetically.
> +.It Cm \&:Or
> +Reverse words in variable from head to tail.

It's not clear whether "head to tail" refers to the original order
or alphabetical order.  So I'd say something like "Order words in
variable in reverse alphabetical order."  Possibly also mention that
${variable:Or} is equivalent to ${variable:O:[-1..1]}.
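
To illustrate that equivalence outside of make, here is a minimal C
sketch.  It assumes :O compares words with strcmp(), and the names
cmp_asc, cmp_desc and reverse_words are made up for this example; they
are not taken from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: ascending strcmp order (assumed behaviour of :O). */
static int
cmp_asc(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/* qsort comparator: descending order, i.e. what :Or would produce. */
static int
cmp_desc(const void *a, const void *b)
{
	return cmp_asc(b, a);
}

/* Reverse an array of word pointers in place, like :[-1..1]. */
static void
reverse_words(const char **av, size_t ac)
{
	size_t i;

	for (i = 0; i < ac / 2; i++) {
		const char *tmp = av[i];
		av[i] = av[ac - 1 - i];
		av[ac - 1 - i] = tmp;
	}
}
```

Sorting with cmp_desc should give the same word sequence as sorting
with cmp_asc and then calling reverse_words().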

> +.It Cm \&:Ox
> +Randomize words in variable. The results will be different each
> +time you are referring to the modified variable; use the assignment
> +with expansion
> +.Pq Ql Cm \&:=
> +to prevent such behaviour.

Perhaps this explanation would benefit from an example?

> If randomization will be done on unsorted
> +source sequence, it may produce an alphabetically ordered result;
> +to avoid this explicitly sort the source, and then randomize it as
> +.Ql Cm \&:O:Ox .

It should be expected that shuffling the words will occasionally produce
ordered output.  That's part of the nature of random shuffling.  There
should be no need to try to prevent it.  Trying to prevent ordered output
from a random process actually reduces the randomness of the process.
So I would remove this sentence from the man page.

>   * Input:
>   *	str		String whose words should be sorted
> + *	otype		How to order: (s)ort, (r)everse, intermi(x) 

The term "intermix" doesn't tell me that it means "random".  I suggest saying
"(x) random".

> +	    /* intermixed variable should return different values each time */
> +	    gettimeofday(&rightnow, NULL);
> +	    srandom(rightnow.tv_sec + rightnow.tv_usec);

I don't think there's any gain from calling srandom() more than once per
program execution.
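
One way to do that is to guard the seeding with a static flag.  This is
only a sketch: seed_once() is a hypothetical helper name, and the seed
is computed the same way as in the patch.

```c
#include <stdlib.h>
#include <sys/time.h>

/*
 * Seed random(3) on the first call only.
 * Returns 1 if this call did the seeding, 0 if it was already done.
 */
static int
seed_once(void)
{
	static int seeded = 0;
	struct timeval rightnow;

	if (seeded)
		return 0;
	gettimeofday(&rightnow, NULL);
	srandom((unsigned int)(rightnow.tv_sec + rightnow.tv_usec));
	seeded = 1;
	return 1;
}
```

The :Ox code could then call seed_once() unconditionally before it
draws any random numbers.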

> +	    for (i = 0; i < ac; i++) {
> +		    ai[i].rnd = random();
> +		    ai[i].avi = av[i];
> +	    }

I suggest using the shuffling method used in the get_shuffle() function
in src/usr.bin/shuffle/shuffle.c.  It has some good properties proved by
Knuth.
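
I have not copied get_shuffle() here.  The sketch below is the usual
in-place Fisher-Yates shuffle (Knuth's "Algorithm P"), which I am
assuming is the method in question.  shuffle_words() is a hypothetical
name, and the caller is assumed to have seeded random(3) already.

```c
#include <stdlib.h>

/*
 * Shuffle av[0..ac-1] in place.  Each step picks a random element
 * from the not-yet-fixed prefix av[0..i] and swaps it into slot i.
 * The modulo introduces a slight bias, which is ignored here.
 */
static void
shuffle_words(char **av, size_t ac)
{
	size_t i, j;
	char *tmp;

	if (ac < 2)
		return;
	for (i = ac - 1; i > 0; i--) {
		j = (size_t)random() % (i + 1);
		tmp = av[i];
		av[i] = av[j];
		av[j] = tmp;
	}
}
```

This needs only one random number per word and no sort pass, and it
avoids any question about what happens when two words draw the same
random key.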

--apb (Alan Barrett)