Subject: Re: adb & genetic algorithms
To: None <port-mac68k@NetBSD.ORG>
From: Rick Hawkins <rhawkins@iastate.edu>
List: port-mac68k
Date: 05/06/1997 13:42:34
> > >  And I have three batteries and a long extension
> > cord . . . and toshiba is making 810mb drives, $399 . . .
 
> Oh, cool. Where?

http://www.mcetech.com lists them for $449, but discounts $50 if you 
tell them you saw the post (in all kinds of newsgroups where it didn't 
belong).  I assume that they also exist through other venders.

 
> We know how the interfaces work for the most part, but an algorithm
> such as you describe might help us weed out some problems. Do you have
> any examples of code for such a project so we can determine how 
> difficult it would be?


I can pull up a few, but here's the rough descriptin.  They're usually 
fairly quick to write.  I'll assume here that
a) bsd has loaded enough of itself to run the program
b) the search space is known
c) that there is a way of measuring how successful the attempt was, with 
intermediate states between success & fail.

Now, in no particular languge :)

c popsize is the number of candiate solutions to use at once
c stringsize is the size in bits of the searchspace
Parameter popsize = 20, stringsize=500,groups=popsize/4

binary agents(popsize,stringsize)
integer scores(popsize),order(popsize)

c randomize the initial populations
do i=1,popsize
  agents(i,) =binaryrandom(stringsize)
c ie, completely randomize all of the bits
enddo

c  the main loop
do while not done

c give the current population a try

 do i=1,20
c reinitialize adb
   resetadb()
c try initializing with these parameters
   initadb(agents(i))
c store the evaluation of the attempt
   scores(i) = evaluation()     
 enddo

c if full solution, quit
  if (max(scores) = perfectscore ) then done = true

c now we've tried them all; breed them

c this method is tournament selection; it prevents convergence at too 
c fast a rate

c randomly order the candidates
c randomorder is assumed to be a function that returns an array
c of integers in  a random order
  order=randomorder(popsize)

c now sort each group of 4
c groupsort sorts the i-th group of 4 elements in "order",according to 
c their scores in "scores".  it needn't rearrange scores, as this is all
c we needed them for.  also, it needn't fully sort; all that it really
c has to do is to put the two highest in the first two spots
  do i=1,groups
    groupsort(order,scores,i)

c in this group, keep the two best, and breed the other two
c define the two parents to make code look cleaner while explaining
    mom = 4*i-3
    dad = 4*i-2
    brother = 4*i-1
    sister=4*i
    do j = brother, sister
      startsplice=random(stringlength)
      endsplice=random(stringlength)
c  make a string of 0 above startsplice, and above endsplice; 
c 1  elsewhere  (use modulo stringlenth; end could be before start
      splicestring = makesplice(startsplice,endsplice)
      
      agents(order(j))= (agents(order(mom)) AND splicesting)
         OR (agents(order(dad)) AND NOT splicestring)

c the offspring now have some genes from mom, & some from dad

c now maybe mutate it
      if(random < muterate)
        c if we're here, we're mutating
        mutestring = randomstring AND makesplice(random(stringlength),
           random(stringlength))
        agents(order(j)) = agents(order(j)) XOR mutestring
      endif
     enddo
  enddo    

c we now have our next generation; do it all again

  enddo

enddo
c once we're here, we have something that works
end




A couple of notes:
1) the binary string isn't usually the best choice.  generally, this 
string is a group of other values--separate parameters, etc.  It works 
better to mutate the smaller pieces (ie., if a mutation, one or more of 
these change).  mutating the whole string can cause unreasonable changes 
in high bits of one parameter, and lowbits in another.
2) similarly, if there are known bounds on a parameter (e.g., bcd rather 
than hex), bounds checking on the mutated parameters helps.
3)  the mutation helps avoid local optima.  However, GA's are 
susceptible to these.  Here, though, any "working" value would be 
usable.
4) it could be useful to dump the population at various for inspection, 
which could allow optimizing or hand-searching for values
5) If the searchspace or the adb problem can be parameterized this way, 
there is a *really* cool paper to publish in an alife journal
6) if one of the "bad" outcomes could be a machine hang, the state 
should be saved for each solution, with a score of "hang" saved.  There 
should be a mechanical device resetting the machine at some intervals 
(or if it fails to talk to a second machine often enought.  My original 
thinking, before the 180 was cracked, was to wire the relay switch on my 
old tandy 102 to the reset on the powerbook . . .
7) if there is no way to rank "how" successful an attempt was, this is 
all for nothing :(

rick