Dr. Mark Humphrys

School of Computing. Dublin City University.

Home      Blog      Teaching      Research      Contact

Search:

CA249      CA318      CA425      CA651

w2mind.computing.dcu.ie      w2mind.org


Computational Evolution


"How evolution works" (Expressed as an algorithm)


Background reading:



Natural Evolution

Living things contain a diversity of DNA, a diversity of ways of persisting. Some things persist, some things don't.

  1. The Victorians saw it as a struggle of the fit, of the strong, where what would persist would be more intelligent, more admirable in some way.

  2. Now we see it as more complex. You can survive by being sneaky, by being a cheater, by being a parasite, by being an unnoticed harmless freeloader, by being a byproduct of something else. You can survive in symbiosis with other species, other genes. You can survive as part of a massive social grouping where the individual is irrelevant. You can get locked into spirals of sexual mate selection which have little to do with "fitness". You can lose talents as well as gain them. You can find yourself stuck in an evolutionary local maximum, where "You can't get from here to there." You can be wiped out by utterly random, and completely unfair, selections (dinosaurs). You can survive unfairly because your competition was so wiped out (mammals). You can even have mass extinctions that have basically no cause (this is related to chaos theory). Certain systems can have periodically unstable dynamics which cause mass extinction for no reason ("self-organised criticality").

    The question we always ask is: "If we come back in a million years, will these creatures be long extinct, or will they actually still be here?" e.g. Now on earth. Go away and come back in 5 million years. What will be here? Go away again and come back in 50 million years. Same question. What will persist?

    Even bodies cannot be taken as a given. The chicken is an egg's ludicrous, elaborate, and yet successful, way of making another egg.   Death cannot be taken as a given. Nor sex. Nor even a population of individuals. Nor even a mechanism of inheritance (DNA). You have to explain why these things evolved. Why they persist. This view of evolution, exemplified by the writings of Richard Dawkins, is very much how Darwin saw it.


Background reading: Memes (evolution of culture):



Computational evolution

  1. "Genetic Algorithms" use evolution as a population search algorithm, and have as a result returned very much to the simple Victorian idea of a "fair test". Normally we want our algorithm to solve a problem, and the best at solving it get to survive. e.g. Fitness = Score out of 100 in a game.

  2. Artificial Life looks at the more modern definition of evolution, and sets up dynamic systems (ecologies, economies) and just lets them run. Instead of an explicit Fitness score, and the fittest reproduce, you have rules or "laws of physics" like: "You need energy to reproduce. You get energy from food. You can eat plants, dead animals, or hunt for yourself. You can steal other creatures' food. To reproduce, you must find a mate. You must stay alive until you reproduce. Every action uses up energy. If you use up too much energy you die." Then you run the environment and see what persists. Often you are surprised at the strategies creatures evolve in order to persist.

    In fact, the above is a perfect example! There is a problem with the rules above. A successful creature might simply do nothing! Why?

    But is just seeing what persists of any interest to engineers? Surely we want to orient the machine towards some particular problem we want solved? The ALife people point out that the most impressive machines ever all arose this way. But their success in emulating this has been limited. Generally very simple things arise - although there have been plenty of insights into the evolutionary process.




Evolution as an algorithm

The simplest, crudest way to express an evolutionary process as an algorithm is to use an explicit fitness function:



We have a population of "creatures".
Their "DNA" is the instructions on how to build a creature.
They are not all identical, but rather there is a diversity of DNA.
Each represents a way of solving a problem or problems.

Each time step:
  Each creature has to solve some problem or problems.
  The creature's "fitness" is rated according to this test.
  The fittest get to reproduce.
  Reproduction involves constructing a new creature from their DNA.
  Crucially, some change in the DNA must be possible,
   otherwise no new solutions can arise.
  One of the major sources of change is sex
   (we mix up the DNA of two successful creatures to make a new one).
  Another source of change is simply small mutations of the DNA.
  The less fit simply don't reproduce.
  The new population of creatures arises from the fittest old ones
   and so, statistically, are more like the fitter end of the 
   previous population.
  The old population (fit and unfit alike) die to make way for the new.
  The new population is diverse itself, and is tested in its turn,
   and so on.




Hardware evolution

Almost all evolution algorithms are done in software only, but some people are experimenting with hardware evolution:





Cornell researchers build a robot that can reproduce




Some definitions


Genotype - The "DNA". The information that is inherited.
 The instructions for how to build a new solution.

Phenotype - The "body". What the genotype constructs.
 The machine that actually solves the problem.
 e.g. The genotype is a list of "weights" (parameters) that define a neural network.
 The phenotype is a neural network with these weights built-in.

Genome - An instance of the genotype.


Gene - A subsection of the genotype that can be isolated
 and identified to have some function.
 Sometimes in nature a gene can be related 1-1 to some function
 like eye colour. More often, phenotype attributes arise from
 the complex interaction of multiple genes
 under a developmental process itself defined by other genes.

 In engineering, we normally enforce a tidy separation of genes.
 The genotype consists of a list of all the parameters needed
 to build the machine (e.g. with a Neural Net:
  no. of nodes, links, weights, thresholds,
  learning rate, weight initialisation parameter).
 Each parameter is a gene.
 Certain combinations of parameters/genes may be better than others.
 Evolution is a search in this multi-dimensional space. 


Allele - The value that a gene has.
 e.g. The gene for eye colour can take values {blue,brown,green}.

Chromosome - Genes are often grouped together on structures
 called chromosomes, so combinations of them can be easily
 reshuffled, especially by sex.



Inheritance

  1. Lamarckian inheritance - The discredited idea that what you learn in life can be passed on through your genes.

  2. Darwinian / Mendelian inheritance - What you learn can only affect whether you pass on your genes, not what is in them.



Adaptive Landscape

Certain gene combinations are fitter than others, and evolution is a search in the multi-dimensional space of gene combinations to find fitter solutions.

This was recognised by evolutionary biologists long before hill-climbing computer programs (or indeed any computer programs):

You will recognise from our previous hill-climbing/descending that this a form of "blind" search on a landscape where we only get feedback on particular points that we try out. We never get to see a picture of the landscape as above. And even worse than the case with Neural Networks, here we do not even get a measure of the slope. We are only told the fitness-value. We do not know the direction in which we should move the genes to improve matters.

No method that works like this can guarantee to find the global maximum without infinite exploration. The global maximum could be almost completely hidden, with almost no landscape leading towards it (it must have some landscape leading towards it if the landscape is continuous - it can't be a point).

But of course how do we know in advance that the landscape is continuous? If we know nothing about it, it may not be.




A population of hill-climbers

The main feature of evolution by hill-climbing is that we use a population of hill-climbers. We start the experiment by "dropping" our hill-climbers at random points scattered all over the landscape (hopefully giving good coverage). Many may be starting in useless places, but if we know nothing about the landscape what else can we do except drop them randomly? They then begin to move. (Of course, as before, they move by discrete jumps, not continuously.)

For instance, let the genotype encode a real numbered value x in the interval 1-10, and the "fitness" of x is:

f(x) = sin(x) + sin(2x) + sin(5x) + cos(x)
Then all we are doing is maximising this function over this interval. We start with a population of random x. Generation 1 drops them all over the interval:

Now, the fittest get to reproduce. Not the single fittest solution, but a number of the better solutions. This means that instead of following a single path up the landscape, we keep multiple solutions alive. Generation 7:

Why is keeping multiple solutions alive a good idea? Because there is no guarantee that the best solutions right now are on the slopes of the global maximum. We give many different areas of the landscape a chance to prove themselves.

It is a "noisy" search over an unknown landscape, where "noise" = step size / size of mutations that we make.

Should the noise decrease over time, as we converge on a solution? (Probability of or size of mutation should decrease over time.)



Feeds      HumphrysFamilyTree.com

Bookmark and Share           On Internet since 1987.