Dot paths example This project simulates the evolution of some simple creatures called Dots living inside a 2D plane. Dots have two attributes: a position in the plane and a set of genes. More precisely, a position is an x,y coordinate while a 'gene', as it were, is an x,y offset within some range. A Dot 'expresses' a gene by applying the offset value contained within to its current position. For example if a Dot starts at position 6,6 and has a gene containing the offset value of 5,-5 then expressing this gene will offset the Dot's position to 11,1. Every Dot has the same number of n genes and each Dot can express each of its genes exactly once. And every generation of Dots begins life at the same location: START - an x,y coordinate in the 2D plane. In this way, a Dot can move through its 2D world some finite distance from the START. A Dot's only purpose in life is to move from START and reach the GOAL - another x,y coordinate somewhere in the 2D plane. Since a Dot has n genes it only has n moves that it can make to reach the GOAL. The object of this simulation is to selectively breed Dots with a set of genes which take the Dots on the shortest path from the START to the GOAL. Broadly speaking, the program can be summarized as follows:
  1. breed initial Dots
  2. express all Dot genes
  3. evaluate fitness of current generation
  4. create mating pool
  5. breed new generation of Dots
  6. loop from step 2 n times

#Setup


To start things rolling, the program breeds an initial generation of Dots where the genes of each Dot are randomly generated x,y offset values within some range. The 'range' here refers to the minimum to maximum step distance we want the Dots to be able to make. We could give offset values to the Dots that are big enough to take them to the GOAL in a single step, but this would make things a bit trivial. In order to make things a bit more interesting, the simulation gives small offset values to Dots forcing the Dots to accrue as many of the best offset values as possible in order to make it an appreciable distance to the GOAL. Here is a example of a path a first generation Dot might take: Typical first generation Dot path In the above image the blue square represents the START position and the red line represents the path the Dot took from the START position. Each red line segment here connected to a red dot represents an expressed gene. Even the maximum distance the Dot traveled in a single step is relatively small here. The final positions of Dots from the first generation is, as you might expect, a pretty random distribution of squiggly lines radiating from START.

#Fitness


Once every Dot from the current generation has expressed all of its genes the program proceeds to evaluate the fitness of each Dot. Here, fitness is a measure of a Dot's proximity to the GOAL relative to each other Dot. The closer a Dot is to the GOAL compared with its peers the higher its fitness weighting. With this definition of fitness in mind let's consider the following comparison image of two Dot paths: Comparison of Dot paths As before, the START is represented by a blue square while the GOAL is represented by a red square. The dot path depicted in the left image would be considered fitter by our working definition compared with its right image counterpart because its final position is closer to the red square. Once all Dots from the current generation have been awarded a fitness weight, the program proceeds to the Selection phase.

#Selection


The selection phase involves generating a mating pool. Dots with higher fitness weights are more likely to be drawn from the mating pool than those with lower fitness weights. The program draws breeding pairs from the mating pool in order to generate new Dots that form the next generation. The fitter Dots can be drawn from the pool multiple times during the course of selection and reproduction.

#Reproduction


To breed a new Dot, the program draws two parents from the mating pool and takes a random 50% portion of each parents genes. The 50% portion of each parents genes are then shuffled together to form a new set of genes belonging to the offspring. Some mutation is introduced during this phase of the program to help maintain genetic diversity.

#Mutation


Over the generations, selection & reproduction on their own will cause the fittest available genes from the Dot gene pool to proliferate and replace the less fit genes. While this is beneficial in the short term, it has diminishing returns in the long term. This is because, on average, even the very best genes gathered together from the initial gene pool will typically allow the Dots to make it only a portion of the distance to the GOAL. On their own then, selection & reproduction will eventually lead to the species becoming a genetic monoculture which leads to evolutionary stagnation. To mitigate this, the program introduces some artificial mutation during the reproduction phase by occasionally splicing a mutant gene into an offspring's genome. A mutant gene takes the form of another x,y offset value within some predefined range. Introducing mutant genes into the gene pool means that potentially fitter genes (than those currently available) can arise within the species, undergo selection and gradually proliferate, helping the Dots get closer and closer to the GOAL Controlling the rate of mutation is key here as we don't want to introduce too many mutant genes too rapidly because this can overwhelm the gene pool with noise. Our object, rather, is to introduce mutant genes at a gradual pace, allowing enough time for the new genes to have some effect on the wider population. The program controls the rate of mutation via a constant called the MUTATION_RATE so that it can doll out mutant genes at a rate that is beneficial to the species.

#Rinse and repeat


Once the new generation has been bred, the old generation is callously discarded and the cycle repeats for the new generation, blissfully unaware of their fate. As the simulation runs, each new generation of Dots should, in theory, inherit fitter & fitter genes that, on average, take them closer and closer to the GOAL. To demonstrate this, below are four snapshots of the path taken by the fittest Dot from the 1st, 4th, 8th and 20th generation of Dots from an example simulation: Comparison of Dot paths In this particular simulation, each generation consisted of 200 Dots where each Dot had 30 genes. We can observe here a general improvement across the generations as the genes that take the Dots in whacky in-direct paths are slowly filtered out of the gene pool, replaced by genes that take the Dots in a straighter line from the START to the GOAL. If you're interested in learning more, check out the code here.

Some of the tech I worked with on this project: