10/23/11 7:50 PM
http://www.grasshopper3d.com/profiles/blogs/evolutionary-principles
Page 1 of 28
The term "Evolutionary Computing" may very well be widely known at this point in
time, but it is still very much a programmer's tool. 'By programmers for programmers', if you will. The applications out there that apply evolutionary logic are either
aimed at solving specific problems, or they are generic libraries that allow other programmers to piggyback along. It is my hope that Galapagos will provide a generic platform for the application of Evolutionary Algorithms to be used on a wide variety of
problems by non-programmers.
Before we dive into the subject matter too deeply though I feel it is important to highlight some of the (dis)advantages of this particular type of solver, just so you know what
to expect. Since we are not living in the best of all possible worlds there is often no such
thing as the perfect solution. Every approach has drawbacks and limitations. In the case
of Evolutionary Algorithms these are luckily well known and easily understood drawbacks, even though they are not trivial. Indeed, they may well be prohibitive for many a
particular problem.
Firstly, Evolutionary Algorithms are slow. Dead slow. It is not unheard of that a single
process may run for days or even weeks. Complicated set-ups that require a
long time to solve a single iteration will quickly get out of hand. A light/shadow or acoustic computation, for example, may easily take a minute per iteration. If we assume we'll need at least 50 generations of 50 individuals each (which is almost certainly
an underestimate unless the problem has a very obvious solution), we're already looking at a runtime of nearly two days.
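The estimate above is plain arithmetic, and it is worth redoing for your own set-up before you press the start button. Here is a minimal sketch in Python; the variable names are mine and the figures are simply the ones quoted above:

```python
# Back-of-the-envelope runtime estimate using the figures quoted above.
generations = 50
individuals_per_generation = 50
minutes_per_iteration = 1.0  # e.g. one light/shadow or acoustic computation

total_iterations = generations * individuals_per_generation
total_minutes = total_iterations * minutes_per_iteration
total_days = total_minutes / (60 * 24)

print(f"{total_iterations} iterations, about {total_days:.1f} days")
```

Swap in your own time per iteration to see how quickly the total grows.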
All is not bleak and dismal, however. Evolutionary Algorithms have strong benefits as
well, some of them rather unique amongst the plethora of computational methods. They
are remarkably flexible for example, able to tackle a wide variety of problems. There are
classes of problems which are by definition beyond the reach of even the best solver implementation and other classes that are very difficult to solve, but these are typically
rare in the province of the human meso-world. By and large the problems we encounter
on a daily basis fall into the 'evolutionary solvable' category.
Evolutionary Algorithms are also quite forgiving. They will happily chew on problems
that have been under- or over-constrained or otherwise poorly formulated. Furthermore, because the run-time process is progressive, intermediate answers can be harvested at practically any time. Unlike many dedicated algorithms, Evolutionary Solvers
spew forth a never ending stream of answers, where newer answers are generally of a
higher quality than older answers. So even a prematurely aborted run will yield something which could be called a result. It might not be a very good result, but it will be a
result of sorts.
Finally, Evolutionary Solvers allow -in principle- for a high degree of interaction with
the user. This too is a fairly unique feature, especially given the broad range of possible
applications. The run-time process is highly transparent and browsable, and there exists
a lot of opportunity for a dialogue between algorithm and human. The solver can be
coached across barriers with the aid of human intelligence, or it can be goaded into exploring sub-optimal branches and superficial dead-ends.
The Process
In this section I shall briefly outline the process of an Evolutionary Solver run. It is a
highly simplified version of the remainder of the blog post, and I'll skip over many interesting and even important details. I'll show the process as a series of image frames,
where each frame shows the state of the 'population' at a given moment in time. Before I
can start however, I need to explain what the image below means.
What you see here is the Fitness Landscape of a particular model. The model contains two
variables, meaning two values which are allowed to change. In Evolutionary Computing we refer to variables as genes. As we change Gene A, the state of the model changes
and it either becomes better or worse (depending on what we're looking for). So as Gene
A changes, the fitness of the entire model goes up or down. But for every value of A, we
can also vary Gene B, resulting in better or worse combinations of A and B. Every combination of A and B results in a particular fitness, and this fitness is expressed as the height
of the Fitness Landscape. It is the job of the solver to find the highest peak in this landscape.
Of course a lot of problems are defined by not just two but many genes, in which case
we can no longer speak of a 'landscape' in the strict sense. A model with 12 genes would
be a 12-dimensional fitness volume deformed in 13 dimensions instead of a two-dimensional fitness plane deformed in 3 dimensions. As this is impossible to visualize I shall
only use one and two-dimensional models, but note that when we speak of a "landscape", it might mean something terribly more complex than the above image shows.
As the solver starts it has no idea about the actual shape of the fitness landscape. Indeed, if we knew the shape we wouldn't need to bother with all this messy evolutionary
stuff in the first place. So the initial step of the solver is to populate the landscape (or
"model-space") with a random collection of individuals (or "genomes"). A genome is
nothing more than a specific value for each and every gene. In the above case, a genome
could for example be {A=0.2 B=0.5}. The solver will then evaluate the fitness for each
and every one of these random genomes, giving us the following distribution:
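As a rough sketch of this first step in Python: the `fitness` function below is a made-up two-gene landscape (the real one depends entirely on your model), and I am assuming that both genes range from 0 to 1.

```python
import math
import random

def fitness(genome):
    """Hypothetical two-gene landscape with several smooth peaks."""
    a, b = genome["A"], genome["B"]
    return math.sin(3 * math.pi * a) * math.sin(2 * math.pi * b) + 0.5 * a

def random_genome(rng):
    """A genome is just one value per gene, here in the range 0..1."""
    return {"A": rng.random(), "B": rng.random()}

rng = random.Random(1)
population = [random_genome(rng) for _ in range(50)]  # Generation 0
scores = [fitness(genome) for genome in population]   # one fitness per genome

print(f"best: {max(scores):.3f}  worst: {min(scores):.3f}")
```

Note that the solver never looks at the shape of `fitness` itself. It only ever sees the scores of the genomes it has tried.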
Once we know how fit every genome is (i.e., the elevation of the red dots), we can make
a hierarchy from fittest to lamest. We are looking for high-ground in the landscape and
it is a reasonable assumption that the higher genomes are closer to potential high-ground than the low ones. Therefore we can kill off the worst performing ones and focus on the remainder:
It is not good enough to simply pick the best performing genome from the initial population and call it quits. Since all the genomes in Generation 0 were picked at random, it
is actually quite unlikely that any of them will have hit the jackpot. What we need to do
is breed the best performing genomes in Generation 0 to create Generation 1. When we
breed two genomes, their offspring will end up somewhere in the intermediate model-space, thus exploring fresh ground:
We now have a new population, which is no longer completely random and which is already starting to cluster around the three fitness 'peaks'. All we have to do is repeat the
above steps (kill off the worst performing genomes, breed the best-performing
genomes) until we have reached the highest peak.
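The whole loop fits in a few lines of Python. This is a deliberately naive sketch and not how Galapagos is implemented: the single-peak `fitness` function is invented, parents are picked at random from the survivors, and I carry the best genome over unchanged (my own addition) so that the best fitness can never drop from one generation to the next.

```python
import math
import random

def fitness(genome):
    """Hypothetical landscape: a single smooth peak at A=0.7, B=0.3."""
    a, b = genome
    return math.exp(-((a - 0.7) ** 2 + (b - 0.3) ** 2) / 0.05)

def breed(mum, dad, rng):
    """Offspring lands somewhere between its parents in model-space."""
    t = rng.random()
    return tuple(m + t * (d - m) for m, d in zip(mum, dad))

def evolve(generations=30, size=50, keep=0.5, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        # Rank from fittest to lamest and kill off the worst performers.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: max(2, int(size * keep))]
        # Breed the survivors to create the next generation.
        children = [breed(*rng.sample(survivors, 2), rng) for _ in range(size - 1)]
        # Assumption of this sketch: the best genome survives unchanged.
        population = [survivors[0]] + children
    return max(population, key=fitness), history

best, history = evolve()
print(best, fitness(best))
```

There is no mutation in this sketch, so the population can only shrink towards the region its ancestors already occupied. That is fine for one friendly peak, but it is exactly the weakness the later sections deal with.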
In order to perform this process, an Evolutionary Solver requires five interlocking parts,
which I'll discuss in something resembling detail. We could call this the anatomy of the
Solver.
1. Fitness Function
2. Selection Mechanism
3. Coupling Algorithm
4. Coalescence Algorithm
5. Mutation Factory
Fitness Functions
In biological evolution, the quality known as "Fitness" is actually something of a stumbling block. Usually it is very difficult to say exactly what it means to be fit. It certainly
has little or nothing to do with being the strongest, or the fastest, or the most vicious.
The reason there are no flying dogs isn't that evolution hasn't gotten around to making
any yet, it is that the dog lifestyle is supremely incompatible with flying and the sacrifices required to equip a dog with flight would certainly detract more from the overall
fitness than flight would add to it. Fitness is the result of a million conflicting forces.
Evolutionary Fitness is the ultimate compromise.
A fit individual is on average able to produce more offspring than an unfit one, so we
could say that fitness equals the number of genetic children. A better measure yet
would be to count the number of grand-children. And a better measure yet would be to
count the allele frequency in the gene-pool of the genes that made up the individual in
question. But these are all rather ad-hoc definitions that cannot be measured on the spot.
Let's have a look at the fitness landscape again and let's imagine it represents a model
that seeks to encase an object in a minimum volume bounding-box. A minimum bounding-box is the smallest orthogonal box that completely contains any given shape. In the
image below, the green shape is encased by two bounding boxes. B has a smaller area
than A and is therefore fitter.
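To make this concrete, here is a two-dimensional sketch of such a fitness function in Python, with a single gene (one rotation angle) instead of two. The shape and the function names are invented for illustration. Because a smaller box is a fitter box, the fitness is simply the negated area.

```python
import math

def bounding_box_area(points, angle_deg):
    """Area of the axis-aligned box around `points` rotated by `angle_deg`."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    xs = [c * x - s * y for x, y in points]
    ys = [s * x + c * y for x, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def fitness(points, angle_deg):
    """Smaller boxes are fitter, so negate the area to get 'higher is better'."""
    return -bounding_box_area(points, angle_deg)

# Hypothetical shape: a 4 x 1 rectangle tilted by 30 degrees.
tilt = math.radians(30)
shape = [(x * math.cos(tilt) - y * math.sin(tilt),
          x * math.sin(tilt) + y * math.cos(tilt))
         for x, y in [(0, 0), (4, 0), (4, 1), (0, 1)]]

for angle in (0, -30, 60):
    print(angle, round(bounding_box_area(shape, angle), 3))
```

Rotating by -30 degrees undoes the tilt and gives the tightest box. Rotating by 60 degrees does just as well, because a quarter turn only swaps the width and the height of the box.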
Every individual tries to maximize its own fitness, as high fitness is rewarded by the
solver. And the steepest uphill climb is the fastest way towards high fitness. So if the
black sphere represents the location of the ancestral genome, the orange track represents
the pathway of its most successful offspring. We can repeat this exercise for a large
amount of sample points which will tell us something about how the Solver and the Fitness Landscape interact:
Since every genome is pulled uphill, every peak in the fitness landscape has a basin of attraction around it. This basin represents all the points in model-space that will converge
upon that specific peak. It is important to notice that the area of the basin is in no way
representative of the quality of the peak. Indeed, a very poor solution may have a large
basin of attraction while a good peak might have a small catchment area. Problems like
this are typically very difficult to solve, as the solution tends to get stuck in local optima.
But we'll have a look at problematic fitness functions later on.
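The mismatch between basin size and peak quality is easy to reproduce. The sketch below uses an invented one-gene landscape with a broad, low peak and a narrow, high one, lets a simple uphill walker start from 101 evenly spaced points, and counts where each one ends up. The walker is only a stand-in for "every genome is pulled uphill", not a model of the solver itself.

```python
import math

def fitness(x):
    """Hypothetical landscape: a broad low peak at 0.3, a narrow high peak at 0.8."""
    broad = 0.5 * math.exp(-((x - 0.3) / 0.20) ** 2)
    narrow = 1.0 * math.exp(-((x - 0.8) / 0.03) ** 2)
    return broad + narrow

def climb(x, step=0.001):
    """Walk uphill in small steps until neither neighbour is higher."""
    while True:
        best = max((x - step, x, x + step), key=fitness)
        if best == x:
            return x
        x = best

starts = [i / 100 for i in range(101)]
peaks = [climb(x) for x in starts]
low_basin = sum(1 for p in peaks if p < 0.6)   # ended on the poor peak
high_basin = len(peaks) - low_basin            # ended on the good peak
print(low_basin, high_basin)
```

Most starting points end up on the poor peak, even though the good peak is twice as high.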
First, let's have a closer look at the actual fitness landscape for our minimum bounding-box model. I'm afraid it's not quite as simple as the image we've been using so far. I was
actually quite surprised how organic and un-box-like the actual fitness landscape for
this problem is. Remember, the x-axis rotation is mapped along the Gene A direction and
the y-axis rotation along the Gene B direction. So every point on the AB plane represents
a unique rotation composed of two angles. The elevation of this point is a direct mapping of the volume of the bounding-box at those two rotation angles:
The first thing to notice is that the landscape is periodic. I.e., it repeats itself every 90 degrees in both directions. Also, this landscape is in fact inverted as we're looking for a
minimum volume, not a maximum one. Thus, the orange peaks in fact represent the
worst solutions to this problem. Note that there are 16 of these peaks in the entire range
and that they are rounded. When we look at the bottom of this fitness landscape, we get
a rather different view:
It would appear that the lowest points in this landscape (the minimum bounding-boxes)
are both fewer in number and of a different kind. We only get 8 optimal solutions and
they are all very sharp, indicating a somewhat more fragile state.
Still, on the whole we have nothing to complain about. All the solutions are of equal
quality and there are no local optima at all. We can generalize this landscape to a 2-dimensional graph:
This fitness landscape has two kinds of solutions: the high quality sharp ones near the bottom of the graph and the low quality flat ones near the top. The basin of attraction is given for both solutions (yellow for high quality, pink for low quality) and you can see that about half of the model space is attracted to the low quality solutions.
An even worse example (flipped upright again this time, so high values indicate good
solutions) would be the following fitness landscape:
Even worse than this though is a landscape that has a high degree of noise or chaos. A
landscape may be continuous and yet feature so much detail that it becomes impossible
to make any intelligible pronouncements regarding the fitness of a local patch:
Selection Mechanisms
Biological Evolution proceeds by Natural Selection, the ruthless force identified by Darwin as the arbiter of progress. Put simply, Natural Selection affects the direction of the
gene-pool over time by regulating who gets to mate. In extreme cases mating is prevented because a specific genome is so unfit that the bearer cannot survive until reproductive age. Another rather extreme case would be sterility. However, there are myriad
ways in which Natural Selection can make it difficult or impossible for certain individuals to pass on their genetic footprint.
However, Natural Selection isn't the only game in town. For a long time now humans
have been using Artificial Selection in order to breed specific characteristics into a
(sub)species. When we try to solve problems using an Evolutionary Solver, we always
use some form of artificial selection. There's no such thing as sex or gender in the computer. The process of selection is also much simpler than in nature, as there is basically
Allow me to enumerate the mechanisms for parent selection that are available in Galapagos. This is only a small subset of the selection algorithms that are possible, but they
seem to cover the basics rather well.
First off, we have Isotropic Selection, which is the simplest kind of algorithm you can
imagine. In fact, it is the absence of a selection algorithm. In Isotropic Selection everyone
gets to mate:
No matter where you find yourself on this fitness graph, your chances of ending up in a mating couple are constant. You might think that
this is a particularly pointless selection strategy
as it does nothing to further the evolution of the gene-pool. But it is not without precedent in nature. Take for example wind-pollination or coral spawning. If you're a sexually functioning member of such a species, you get to play ball come mating season. Another example would be females in a walrus colony. Every female in a colony gets to
breed with the dominant male, no matter how fit or unfit she is. Isotropic Selection is
certainly not without function either. For one, it dampens the speed with which a population runs uphill. It therefore acts as a safeguard against a premature colonization of a
local -and possibly inferior- optimum.
Another mechanism available in Galapagos is Exclusive Selection, where only the top N%
of the population get to mate:
Another common pattern in nature is Biased Selection, where the chance of mating increases as the fitness increases. This is something we typically see with species that form
stable couples. Everyone is basically capable of finding a mate, but the really attractive
individuals manage to get a lot of hanky-panky on the side, thus increasing their
chances of becoming genetic founders for future generations. Biased Selection can be amplified by using power functions, which have the effect of flattening or exaggerating the
curve.
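One way to picture the three mechanisms is as three different ways of handing out mating chances. The Python sketch below is my own reading of the descriptions above, not the Galapagos source. The function names, the 25% cut-off and the small floor in `biased_weights` are all assumptions made for illustration.

```python
import random

def isotropic_weights(fitnesses):
    """Everyone has the same chance of mating."""
    return [1.0] * len(fitnesses)

def exclusive_weights(fitnesses, top_fraction=0.25):
    """Only the top N% of the population get to mate (ties at the cut-off all pass)."""
    n_keep = max(1, round(len(fitnesses) * top_fraction))
    cutoff = sorted(fitnesses, reverse=True)[n_keep - 1]
    return [1.0 if f >= cutoff else 0.0 for f in fitnesses]

def biased_weights(fitnesses, power=1.0):
    """Chance of mating grows with fitness; power < 1 flattens, power > 1 exaggerates."""
    lo, hi = min(fitnesses), max(fitnesses)
    span = (hi - lo) or 1.0
    # Assumed small floor so that even the least fit genome keeps some chance.
    return [0.05 + ((f - lo) / span) ** power for f in fitnesses]

def select_parents(population, weights, count, rng):
    """Draw `count` parents, with replacement, in proportion to their weights."""
    return rng.choices(population, weights=weights, k=count)

fitnesses = [0.1, 0.4, 0.5, 0.9]
population = ["w", "x", "y", "z"]
rng = random.Random(0)
print(select_parents(population, biased_weights(fitnesses, power=2.0), 6, rng))
```

With `power=2.0` the fittest genome dominates the draw. With a power below 1 the curve flattens and the result drifts back towards Isotropic Selection.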
Coupling Algorithms
Coupling is the process of finding mates. Once a genome has been elected to mate by the
active Selection Algorithm, it has to pick a mate from the population to complete the act.
There are of course many ways in which mate selection could occur, but Galapagos at
the moment only allows one: selection by genomic distance. In order to explain this in
detail, I should first tell you how a Genome Map works. This
is a Genome Map. It displays all the genomes (individuals) in a certain population as dots on a
grid. The distance between two genomes on the
grid is roughly analogous with the distance between the genomes in gene-space. I say roughly
because it is in fact impossible to draw a map
with exact distances. A single genome is defined
by a number of genes. We assume that all the
genomes in a species have the same number of
genes (this is not technically a limitation of Evolutionary Algorithms, even though it is
currently a limitation of Galapagos). Therefore the distance between two genomes is an
N-Dimensional value, where N equals the number of genes. It is not possible to accurately display an N-Dimensional point cloud on a 2-Dimensional screen so the Genome
Map is only a coarse approximation. It also follows that the axes of this graph have no
meaning whatsoever, the only information a Genome Map conveys is which genomes
are more or less similar (close together) and which genomes are more or less different
(far apart).
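The distance itself is easy to compute even though it is hard to draw. The text above does not say which metric Galapagos uses, so the sketch below simply assumes the ordinary Euclidean distance over all N genes. The genomes are invented.

```python
import math

def genomic_distance(genome_a, genome_b):
    """Euclidean distance between two genomes with the same number of genes."""
    if len(genome_a) != len(genome_b):
        raise ValueError("genomes must have the same number of genes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(genome_a, genome_b)))

me = [0.2, 0.5, 0.9]
neighbour = [0.25, 0.5, 0.85]   # similar genome, should be drawn close by
stranger = [0.9, 0.1, 0.1]      # very different genome, should be drawn far away

print(genomic_distance(me, neighbour) < genomic_distance(me, stranger))  # True
```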
Imagine you are an individual that has been selected for mating (yay). The population is
well distributed and you are somewhere near the average (I'm sure you are a wildly
original and delightful person in real life, but for the time being try to imagine you are
in fact sort of average):
You could of course limit your search of potential partners to your immediate neighbourhood.
This means that you mate with individuals who
are very much like you and it means your offspring will also be very much like you.
When this is taken to extremes we call it incestuous mating behaviour and it can become detrimental pretty quickly. Biological incest has a
nasty habit of expressing unhealthy but recessive
genes, but in the digital world of Evolutionary
Solvers the biggest risk of incest is a rapid decline in population diversity. Low diversity decreases the chances of finding alternate solution
basins and thus it risks getting stuck in local optima.
The other extreme is to exclude everyone near you. You'll often hear it said that opposites attract, but that's true only up to a point. At some point the genomes at the other
end of the scale become so different as to be incompatible.
You definitely do not want to mate with a member in a different subspecies, as the offspring
would likely land somewhere in the middle. And since these two species are climbing different
peaks, "in the middle" actually puts you in a fitness valley.
It would seem that the best option is to balance in-breeding and out-breeding. To select
individuals that are not too close and not too far. In Galapagos you can specify an inbreeding factor (between -100% and +100%, total out-breeding vs. total in-breeding respectively) that allows you to guide this relative offset:
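How exactly Galapagos turns the in-breeding factor into a choice of mate is not spelled out here, so take the following Python sketch as one plausible reading only. It ranks all candidates by genomic distance (Euclidean, which is an assumption) and maps the factor linearly onto that ranking, with +1.0 standing for +100% and -1.0 for -100%.

```python
import math

def pick_mate(me, candidates, inbreeding):
    """Pick a mate by genomic distance.

    `inbreeding` runs from -1.0 (total out-breeding: the most distant genome)
    to +1.0 (total in-breeding: the nearest genome). The linear mapping onto
    the distance ranking is an assumption made for this sketch.
    """
    if not -1.0 <= inbreeding <= 1.0:
        raise ValueError("inbreeding must lie between -1.0 and +1.0")
    ranked = sorted(candidates, key=lambda other: math.dist(me, other))
    position = (1.0 - inbreeding) / 2.0  # 0.0 = nearest, 1.0 = farthest
    return ranked[round(position * (len(ranked) - 1))]

me = (0.5, 0.5)
others = [(0.52, 0.5), (0.6, 0.6), (0.8, 0.3), (0.0, 0.9), (1.0, 0.0)]

print(pick_mate(me, others, +1.0))  # nearest genome
print(pick_mate(me, others, 0.0))   # somewhere in between
print(pick_mate(me, others, -1.0))  # most distant genome
```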
Coalescence Algorithms
Once a mate has been selected, offspring needs to be generated. On the genetic level this
is anything but fun and games. The biological process of gene recombination is horrendously complicated and itself subject to evolution (meiotic drive for example). The digital
variant is much more basic. This is partially because genes in evolutionary algorithms
are not very similar to biological genes. Ironically, biological genes are far more digital
than programmatic genes. As Mendel discovered in the 1860s, genes are not continuously variable qualities. Instead they behave like on-off switches. Genes in evolutionary
solvers like Galapagos behave like floating point numbers that can assume all the values between two numerical extremes.
When we mate two genomes, we need to decide what values to assign to the genes of
the offspring. Again, Galapagos provides several mechanisms for achieving this.
Blend Coalescence will compute new values for genes based on both parents, basically averaging the values:
It is also possible to add a blending preference based on relative fitness. If mum is fitter than dad for example, her gene
values will be more prominent in the offspring:
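Both variants of blending can be sketched in a few lines of Python. The weighting rule below (each parent's share is proportional to its fitness) is my own assumption for the sake of illustration, and it only makes sense for positive fitness values. Galapagos may well weigh the parents differently.

```python
def blend(mum, dad, mum_fitness=None, dad_fitness=None):
    """Blend two genomes gene by gene.

    Without fitness values this is a plain average. With them, the fitter
    parent gets a proportionally larger weight (assumes positive fitness).
    """
    if mum_fitness is None or dad_fitness is None:
        weight = 0.5
    else:
        weight = mum_fitness / (mum_fitness + dad_fitness)
    return [weight * m + (1.0 - weight) * d for m, d in zip(mum, dad)]

mum, dad = [0.2, 0.8], [0.6, 0.4]
print(blend(mum, dad))            # plain average of both parents
print(blend(mum, dad, 3.0, 1.0))  # mum is fitter, so the offspring sits closer to her
```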
Mutation Factories
All the mechanisms we have discussed so far (Selection, Coupling and Coalescence) are
designed to improve the quality of solutions on a generation by generation basis. However, all of them have a tendency to reduce the bio-diversity in a population. The only
mechanism which can introduce diversity is mutation. Several types of mutation are
available in the Galapagos core, though the nature of the implementation in Grasshopper at the moment restricts the possible mutation to only Point mutations.
Before we get to mutations though, I'd like to talk briefly about Genome Graphs. A popular way to display multi-dimensional points on a two-dimensional medium is to draw
them as a series of lines that connect different values on a set of vertical bars. Each bar
represents a single dimension. This way we can quite easily display not just points with
any number of dimensions, but even points with a different number of dimensions in
the same graph:
Two examples of mutations that cannot be used on a species which requires a fixed
number of genes are Addition and Deletion mutations.
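The difference is easy to see when a genome is written as a plain list of numbers. In the Python sketch below (my own illustration, assuming gene values between 0 and 1) a Point mutation leaves the number of genes alone, whereas Addition and Deletion change it, which is why the latter two are off the table for a species with a fixed gene count.

```python
import random

def point_mutation(genome, rng, strength=0.1):
    """Nudge one gene; the number of genes stays the same."""
    mutant = list(genome)
    i = rng.randrange(len(mutant))
    mutant[i] = min(1.0, max(0.0, mutant[i] + rng.uniform(-strength, strength)))
    return mutant

def addition_mutation(genome, rng):
    """Insert a new random gene; the genome grows by one."""
    mutant = list(genome)
    mutant.insert(rng.randrange(len(mutant) + 1), rng.random())
    return mutant

def deletion_mutation(genome, rng):
    """Remove one gene; the genome shrinks by one."""
    mutant = list(genome)
    del mutant[rng.randrange(len(mutant))]
    return mutant

rng = random.Random(42)
genome = [0.2, 0.5, 0.9]
print(len(point_mutation(genome, rng)),
      len(addition_mutation(genome, rng)),
      len(deletion_mutation(genome, rng)))  # 3 4 2
```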
Conclusion
Galapagos is still a very young product and hasn't really had time to position itself firmly in any work-flow, provided that it could. It seems to be capable of solving relatively
small problems quite quickly, but it certainly needs a lot of work to make it more robust
and usable. It is likely that the most effective applications for a solver of this type and
capability are small or partial problems. To try and evolve anything complicated will almost certainly result in frustration.