Dr. Mark Humphrys

School of Computing. Dublin City University.

Home      Blog      Teaching      Research      Contact

Search:

CA249      CA318      CA425      CA651

w2mind.computing.dcu.ie      w2mind.org

Missing
DCU student

CASE3 student Paul Bunbury is missing since Thur 2 Feb 2012.
See appeals on crime.ie and garda.ie and facebook.

He is a great coder. See DCU page and boards.ie page.
He won major coding contests in 2010 and 2011.
He is author of the brilliant "FloodItWorld".
DCU can confirm that in Jan 2012 he passed all 6 modules comfortably.

GAs - Discussion

Why let less fit individuals reproduce?

Why not have a low temperature / let only the fittest reproduce? - In extreme case, if only the single most fit individual reproduces, they do so with themselves - cloning. Crossover of the same chromosome - crossover may as well not happen. Just asexual reproduction with very slow evolution by mutation only.

- Population is very rapidly converged in 1 place. 1-individual hill-climbing.

- Diversity comes from the less fit members. Often they have some (but not all) genes to contribute to the fit.

Also, the less fit at a given point may be on the slopes of the global maximum, while the fittest at this moment are on a local maximum. The less fit have to prove this quick though.


More on noise / temperature / step size

Often have:
Probability of mutation starts high and declines.
Probability of crossover starts high and declines.

Note Pm=1   doesn't work.
But Pc=1   can still work.


Comparison with State-Space Search / Neural Networks

State-Space Search - often have a heuristic "distance from goal".
Here - we have no idea how far away we might be from the optimal solution.

Neural Networks - we have measure of error E.
Here we don't. Absolute fitness gives us less info than an E measure.

More on this in Comparison of Neural Net and GA.




Why does crossover make a random cut?

Yes, in the machine version we actually know where the genes start and end on the chromosome (which nature doesn't know in general when doing crossover). So why not just reshuffle combinations of genes instead of cutting in the middle of a gene?

The reason is - Where does the diversity of genes come from? It would only come from mutation if we never cut in the middle of a gene. Slow evolution.

In fact, people have found that if they have crossover, they don't need mutation! "Sex drives evolution. Mutation is irrelevant."? May make sense - less than 1 in a million genes mutates, whereas crossover happens non-stop.

What if Pm=0? And we rely entirely on crossover to give new diversity.
Imagine if by chance all initial population have a "0" in position i.
Then crossover can never construct an individual with "1" in position i.

Note in nature a crossover may break the gene at the crossover site. However, in the 2-copy system, this is only 1 of the copies. The other may have undergone its own crossover, but very unlikely to be the same point, so we still have a working copy just in case.




What if the new individual is not viable?

For many problems, it is not simply a matter of constructing the individuals out of random values of the n parameters or genes, and then testing the fitness. Say the individual will simply not run at all if gene 1 is not > 3.

If you just encode the parameters as real numbers, it may be that 99 percent of parameter combinations won't even run. Our initial random population will all have zero fitness, and so will almost any mutation or crossover of a fit individual.

We could change the genotype coding to reflect this complexity, so the bitstring in gene 1 was simply interpreted as from 3 upwards, the bitstring in gene 2 was interpreted as 0-7, etc. But how about constraints like: "Unless   gene 3 < gene 7 < gene 11,   the individual will simply not run"? This knocks out a huge chunk of that multi-dimensional space.

One solution is simply a loop when we are constructing new chromosomes:


given parents:

  do
  {
   crossover, constructing child1 and child2
   cout << "searching for viable crossover .. \n"
  }
  while ( ! ( child1.viable() && child2.viable() ) )

where viable() is defined in the application.
We would need a similar loop in the initial randomise population function, to ensure that all initial individuals were viable:

repeat n times:

  do
  {
   construct new randomised individual x
   cout << "searching for viable randomisation .. \n"
  }
  while ( ! x.viable() )

This can work, but only if chromosome interpretation has already done most of the job and we just have 1 or 2 further constraints to satisfy. If 99 percent produced still aren't viable, these loops will take a long time to terminate. This time may be very small compared to the time needed to actually test the solution. If so, loops probably still ok. If not:
  1. More work on redefining the interpretation of the genotype coding.
  2. Redefine crossover and mutation for this application/ class of applications (problem-specific), so that they preserve whatever constraints are demanded, yet do, as far as possible, the traditional jobs of crossover in swapping good ideas and mutation in coming up with novelty.




Using GAs as just yet another tool

Useful for optimising n parameters, all of which influence each other.

Useful for scheduling that minimises some cost function. e.g. Timetabling of lectures/exams. Minimise no. of clashes (options). Minimise no. of exams on consecutive days, etc.


Representing the function - GAs v. Neural Networks

In neural nets, we were looking to match an n-dimensional Input with its correct Output. i.e. Represent the function over its range.
Requirements: Correct Outputs for many representative Inputs (over entire range).

Here, we're trying to find the n-dimensional Input that leads to the highest Output. We don't need to build up a representation of the function.
Requirements: Given any Input, tell me the Output.

We can combine these: Do our GA search, and, as we go along, build up a network to represent the map.
But of course, we won't be representing it over the whole range, because we'll concentrate only on the higher peaks.
We get random coverage at the start, but it quickly becomes non-random.

Really need a different algorithm to represent the entire function.
GAs are for when we want to find the best combination of n parameters and then just use it. We're not particularly interested in the Outputs for other combinations.




Random number generators

Typically, we initialise our population, and make probabilistic decisions thereafter, using a pseudo-random number generator.

If we use the same seed each time, then each run of evolution will be exactly the same. Therefore, use a unique seed each time, e.g. seed based on the clock time.




Sample applications of Genetic Algorithms



Feeds      HumphrysFamilyTree.com

Bookmark and Share           On Internet since 1987.