Optimization
PREPRINT
(camera-ready)
0-7803-1899-4/94 $4.00
© 1994 IEEE
(camera-ready preprint) ICEC '94
© 1994 IEEE (pp. 82-87)
Abstract: Many, if not most, optimization problems have multiple objectives. Historically, multiple objectives have been combined ad hoc to form a scalar objective function, usually through a linear combination (weighted sum) of the multiple attributes, or by turning objectives into constraints. The genetic algorithm (GA), however, is readily modified to deal with multiple objectives by incorporating the concept of Pareto domination in its selection operator, and applying a niching pressure to spread its population out along the Pareto optimal tradeoff surface. We introduce the Niched Pareto GA as an algorithm for finding the Pareto optimal set. We demonstrate its ability to find and maintain a diverse "Pareto optimal population" on two artificial problems and an open problem in hydrosystems.

I. Introduction

Genetic algorithms (GAs) have been applied almost exclusively to single-attribute^1 problems. But a careful look at many real-world GA applications reveals that the objective functions are really multiattribute. Typically, the GA user finds some ad-hoc function of the multiple attributes to yield a scalar fitness function. Often-seen tools for combining multiple attributes are constraints, with associated thresholds and penalty functions, and weights for linear combinations of attribute values. But penalties and weights have proven to be problematic. The final GA solution is usually very sensitive to small changes in the penalty function coefficients and weighting factors [9].

A few studies have tried a different approach to multicriteria optimization with GAs: using the GA to find all possible tradeoffs among the multiple, conflicting objectives. Such solutions are non-dominated, in that there are no other solutions superior in all attributes. In attribute space, the set of non-dominated solutions lie on a surface known as the Pareto optimal frontier^2. The goal of a Pareto GA is to find a representative sampling of solutions all along the Pareto front.

The authors are with the Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, 117 Transportation Building, 104 South Mathews Ave., Urbana, IL 61801. Internet: jehorn@uiuc.edu, nick-n@uiuc.edu, goldberg@vmd.cso.uiuc.edu. Phone: 217/333-2346, Fax: 217/244-5705. The first author acknowledges support from NASA under contract number NGT-50873, while the remaining authors acknowledge support provided by the U.S. Army under Contract DASG60-90-C-0153.

^1 We use the terms "attribute", "objective", and "criteria" interchangeably to describe a scalar value to be maximized or minimized. "Decision variable" refers to the parameters of the problem encoded in the genome of the genetic algorithm.

^2 We assume familiarity with the concept of Pareto optimality, but note here that the Pareto front often goes by the names Pareto optimal set, non-dominated frontier, efficient points, and admissible points.

II. Previous Work

We assume the reader is familiar with the simple GA [3]. Here we review previous approaches to multiobjective optimization with GAs.

In his 1984 dissertation [10], and later in [11], Schaffer proposed his Vector Evaluated GA (VEGA) for finding multiple solutions to multiobjective (vector valued) problems. He created VEGA to find and maintain multiple classification rules in a set covering problem. VEGA tried to achieve this goal by selecting a fraction of the next generation using one of each of the attributes (e.g., cost, reliability). Although Schaffer reported some success, VEGA seems capable of finding only extreme points on the Pareto front, where one attribute is maximal, since it never selects according to tradeoffs among attributes.

In his review of GA history, including Schaffer's VEGA, Goldberg [3] suggested the use of non-domination ranking and selection to move a population toward the Pareto front in a multiobjective problem. He also suggested using some kind of niching to keep the GA from converging to a single point on the front. A niching mechanism, such as sharing [5], would allow the GA to maintain individuals all along the non-dominated frontier.

Fonseca and Fleming [2], and, independently, Horn and Nafpliotis [7], implemented Goldberg's two suggestions, and successfully applied the resulting algorithms to difficult, open problems. Fonseca and
Fleming found many good tradeoffs in a four-attribute gas turbine design problem. Horn and Nafpliotis concentrated on a series of two-attribute problems, which we describe later in this paper.

III. The Niched Pareto GA

The specifics of the Niched Pareto GA are localized to implementation of selection for the genetic algorithm. One of the most widely implemented selection techniques for GAs is tournament selection. In tournament selection a set of individuals is randomly chosen from the current population and the best of this subset is placed in the next population. By adjusting the size of the tournament we can exert some control over the amount of selection pressure and hence convergence speed. Thus the smallest tournament size of two (binary tournament) exhibits slower convergence than any larger tournament size.

Tournament selection assumes that we want a single answer to the problem. After a certain number of generations the population will converge to a uniform one. To avoid convergence and maintain multiple Pareto optimal solutions, we have altered tournament selection in two ways. First we added Pareto domination tournaments. Second, when we have a non-dominant tournament (i.e., a tie), sharing is implemented to determine the winner.

A. Pareto domination tournaments

The binary relation of domination leads naturally to a binary tournament in which two randomly selected individuals are compared. If one dominates the other, it wins. Initially, we used such a small local domination criterion, but we soon found that it produced insufficient domination pressure. There were too many dominated individuals in later generations. It seemed that a sample size of two was too small to estimate an individual's true "domination ranking"^3.

Because we wanted more domination pressure, and more control of that pressure, we implemented a sampling scheme as follows. Two candidates for selection are picked at random from the population. A comparison set of individuals is also picked randomly from the population. Each of the candidates is then compared against each individual in the comparison set. If one candidate is dominated by the comparison set, and the other is not, the latter is selected for reproduction. If neither or both are dominated by the comparison set, then we must use sharing to choose a winner, as we explain later. The sample size $t_{dom}$ (size of comparison set) gives us control over selection pressure, or what we call domination pressure. The performance of the Niched Pareto GA is somewhat sensitive to the amount of domination versus sharing pressure applied [7].

A problem will arise if both candidates are on the current non-dominated front since neither will be dominated. Even off the front, a small $t_{dom}$ could mean that neither appears dominated. And of course both could be dominated. How is a winner then chosen in such a "tie"? If we choose the winner at random, genetic drift will cause the population to converge to a single region of the Pareto front. To prevent this we implement a form of sharing when there is no preference between two individuals.

B. Sharing on the non-dominated frontier

Fitness sharing was introduced by Goldberg and Richardson [5], analyzed in detail by Deb [1], and applied successfully to a number of difficult and real world problems. The goal of fitness sharing is to distribute the population over a number of different peaks in the search space, with each peak receiving a fraction of the population in proportion to the height of that peak^4.

To achieve this distribution, sharing calls for the degradation of an individual's objective fitness $f_i$ by a niche count $m_i$ calculated for that individual. This degradation is obtained by simply dividing the objective fitness by the niche count to find the shared fitness: $f_i / m_i$. The niche count $m_i$ is an estimate of how crowded the neighborhood (niche) of individual $i$ is. It is calculated over all individuals in the current population: $m_i = \sum_{j \in Pop} Sh[d[i,j]]$, where $d[i,j]$ is the distance between individuals $i$ and $j$ and $Sh[d]$ is the sharing function. $Sh[d]$ is a decreasing function of $d[i,j]$, such that $Sh[0] = 1$ and $Sh[d \ge \sigma_{share}] = 0$. Typically, the triangular sharing function is used, where $Sh[d] = 1 - d/\sigma_{share}$ for $d \le \sigma_{share}$ and $Sh[d] = 0$ for $d > \sigma_{share}$. Here $\sigma_{share}$ is the niche radius, fixed by the user at some estimate of the minimal separation desired or expected between the goal solutions. Individuals within $\sigma_{share}$ distance of each other degrade each other's fitness, since they are in the same niche. Thus convergence occurs within a niche, but convergence of the full population is avoided. As one niche "fills up", its niche count increases to the point that its shared fitness is lower than that of other niches.

Fitness sharing was originally combined with fitness proportionate (a.k.a., roulette wheel) selection. When sharing is combined with the more popular

^3 Note that any partial order determines a unique ranking, in which maximal individuals are ranked first, then removed. The remaining individuals are reordered, and the maximal individuals of this set are ranked second, and removed, etc. This is the domination ranking scheme suggested by Goldberg [3].

^4 The authors sometimes refer to this form of niching as fitness proportionate sharing.
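
The domination tournament of Section III.A and the niche count of Section III.B can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the convention that every attribute is maximized, the particular dominance test, the use of Euclidean distance in attribute space, and the tie-breaking rule (prefer the candidate with the smaller niche count) are all choices made for the example.

```python
import math
import random
from typing import List, Sequence

Attributes = Sequence[float]  # one individual's attribute vector (assumed: maximize all)


def dominates(a: Attributes, b: Attributes) -> bool:
    """Assumed dominance test: a is no worse than b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def is_dominated(candidate: Attributes, comparison_set: List[Attributes]) -> bool:
    """True if any member of the comparison set dominates the candidate."""
    return any(dominates(other, candidate) for other in comparison_set)


def sharing(d: float, sigma_share: float) -> float:
    """Triangular sharing function: 1 - d/sigma_share within the niche radius, else 0."""
    return 1.0 - d / sigma_share if d <= sigma_share else 0.0


def niche_count(ind: Attributes, population: List[Attributes], sigma_share: float) -> float:
    """m_i: sum of Sh[d[i, j]] over the whole population (Euclidean distance assumed)."""
    return sum(sharing(math.dist(ind, other), sigma_share) for other in population)


def tournament(population: List[Attributes], t_dom: int, sigma_share: float,
               rng: random.Random) -> Attributes:
    """One selection event: two candidates judged against a random comparison set."""
    a, b = rng.sample(population, 2)
    comparison_set = rng.sample(population, t_dom)  # requires t_dom <= len(population)
    a_dom = is_dominated(a, comparison_set)
    b_dom = is_dominated(b, comparison_set)
    if a_dom != b_dom:
        # Exactly one candidate is dominated: the other one wins.
        return b if a_dom else a
    # Tie (neither or both dominated). Assumption: the less crowded candidate wins.
    m_a = niche_count(a, population, sigma_share)
    m_b = niche_count(b, population, sigma_share)
    return a if m_a <= m_b else b


# Hypothetical usage on a toy two-attribute population.
rng = random.Random(0)
pop = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (1.0, 1.0), (2.0, 2.0)]
winner = tournament(pop, t_dom=3, sigma_share=1.5, rng=rng)
```

Raising `t_dom` makes it more likely that a dominated candidate is detected, which corresponds to the domination pressure discussed above; `sigma_share` plays the role of the niche radius.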
[Figure: plot of Unitation (vertical axis, 00 to 12) against Pairs (horizontal axis, 0 to 12), with points marked "-" and "P".]
Figure 5: Schaffer's function F2, $P = \{x \mid 0 \le x \le 2\}$. [Plot of f21 and f22 against x.]
[Figure: plot panels; recoverable labels are "Generation 0", "VEGA", f21, f22, and X.]
attributes $\{0, 0\}$, where we detect no plumes and therefore have no volume of contaminant to clean up. If we maximize the number of detected plumes our volume of cleanup increases dramatically.

It is important to note that this problem is intractable. The search space is of size $\binom{w}{k}$. In our specific example we have $w = 396$ and $k = 20$. The whole search space is then $\binom{396}{20}$, which is $2.269 \times 10^{33}$. This makes it impossible to know the actual Pareto optimal front from enumeration.

Monte-Carlo simulation was used to develop a set of possible leak plumes, the set of wells that detect each plume, and the volume leaked when each well detected the contaminant plume. Using these data, we constructed a vector-valued fitness function to return the number of plumes and average volume detected by any given set of wells.

In our first few runs, $N = 2000$, $\sigma_{share} = 40$, $t_{dom} = 40$, $p_c = 0.8$, and no mutation. In Figure 7 one can see that the random initial population is distributed throughout the search space. Figure 8 shows that after 230 generations the Niched Pareto GA has found an apparent front that is indeed improved over the initial population. It is promising to see that even after a large number of generations we are maintaining diversity over most of an apparent front. There is definite improvement as to the location of the front and the decrease in the number of dominated individuals in the population.

Figure 8: Final distribution, problem 3.

We do not know yet whether this is the actual Pareto optimal front or a sub-optimal front. But our first few runs indicate that the equivalence-class sharing and dominance tournaments are working together. We have shown that a tradeoff curve better than a random sampling can be developed by the Niched Pareto GA on an open problem.

^9 This open problem was developed by Wayland Eheart and his colleagues at the Civil Engineering Department at the University of Illinois at Urbana-Champaign. We are grateful to Dr. Eheart and his students S. Ranjithan, P. Stork, and S. Cieniawski for helping us implement it.

V. Discussion

These preliminary results on the application of the niched, Pareto technique are encouraging. Fonseca and Fleming [2] have also reported initial success with a similar algorithm. But we have found that the performance of the Niched Pareto GA is sensitive to the settings of several parameters. In particular, it is important to have a large enough population to search effectively and to sample the breadth of the Pareto front. Both [7] and [2] discuss the setting of $\sigma_{share}$ and population size together to yield effective sampling.

But the behaviour of the Niched Pareto GA seems to be most affected by the degree of selection pressure applied. Just as tournament size $t_{size}$ is critical to selection pressure and premature convergence in a regular GA with tournament selection, so $t_{dom}$ directly affects the convergence of the Niched Pareto GA. Horn and Nafpliotis [7] illustrate the effects of too little and too much dominance pressure. Here, we summarize their empirically-derived, order-of-magnitude guidelines:

- $t_{dom} \approx 1\%$ of $N$: results in too many dominated solutions (a very fuzzy front).
- $t_{dom} \approx 10\%$ of $N$: yields a tight and complete distribution.
- $t_{dom} \gg 20\%$ of $N$: causes the algorithm to prematurely converge to a small portion of the front. Alternative tradeoffs were never even found.

We have not yet addressed the critical issue of search, but we have some intuitions. Our intuition in the case of Pareto optimization is that the diversity along the currently non-dominated frontier actually helps the search for new and improved tradeoffs, thus extending the frontier. Individuals from very different parts of the front might be crossed to produce offspring that dominate a portion of the front lying between their parents. That is, information from very different types of tradeoffs could be combined to yield other kinds of good tradeoffs. Indeed, we see some evidence for this in problem 3. Because equivalence class sharing cannot be expected to maintain more than one copy of an individual (i.e., niche counts are approximately 1 for all niches at steady state), and because we used high crossover rates (typically 0.7-0.9), the maintenance of the front over hundreds of generations was largely due to the constant generation and regeneration of individuals on the front from the crossover of two different parents. Therefore, most crosses of parents on or near the front yielded offspring also on or near the front. This behaviour is evidence that Pareto diversity helps Pareto search.

Finally, we point out that the domination tournament does not rely strictly on a domination relation, but rather on an antisymmetric, transitive relation. Similarly, equivalence class sharing is useful not only on the Pareto optimal frontier, but in any equivalence class in a partial order. Thus the Niched Pareto GA can be used to search any partially ordered space, not just those induced by the Pareto approach to multiobjective problems.

References

[1] Deb, K. (1989). Genetic algorithms in multimodal function optimization. MS thesis, TCGA Report No. 89002. University of Alabama.
[2] Fonseca, C. M., & Fleming, P. J. (1993). Genetic algorithms for multiobjective optimization: formulation, discussion and generalization. Proceedings of the Fifth International Conference on Genetic Algorithms. Morgan-Kaufmann, 416-423.
[3] Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley.
[4] Goldberg, D. E., Deb, K., & Horn, J. (1992). Massive multimodality, deception, and genetic algorithms. Parallel Problem Solving From Nature, 2, North-Holland, 37-46.
[5] Goldberg, D. E., & Richardson, J. J. (1987). Genetic algorithms with sharing for multimodal function optimization. Genetic Algorithms and Their Applications: Proceedings of the Second ICGA, Lawrence Erlbaum Associates, Hillsdale, NJ, 41-49.
[6] Horn, J. (1993). Finite Markov chain analysis of genetic algorithms with niching. Proceedings of the Fifth International Conference on Genetic Algorithms. Morgan-Kaufmann, 110-117.
[7] Horn, J., & Nafpliotis, N. (1993). Multiobjective optimization using the niched Pareto genetic algorithm. IlliGAL Report No. 93005. Illinois Genetic Algorithms Laboratory. University of Illinois at Urbana-Champaign.
[8] Oei, C. K., Goldberg, D. E., & Chang, S. J. (1991). Tournament selection, niching, and the preservation of diversity. IlliGAL Report No. 91011. Illinois Genetic Algorithms Laboratory. University of Illinois at Urbana-Champaign.
[9] Richardson, J. T., Palmer, M. R., Liepins, G., & Hilliard, M. (1989). Some guidelines for genetic algorithms with penalty functions. Proceedings of the Third International Conference on Genetic Algorithms. Morgan-Kaufmann, 191-197.
[10] Schaffer, J. D. (1984). Some experiments in machine learning using vector evaluated genetic algorithms. Unpublished doctoral dissertation, Vanderbilt University.
[11] Schaffer, J. D. (1985). Multiple objective optimization with vector evaluated genetic algorithms. In J. Grefenstette, ed., Proceedings of an International Conference on Genetic Algorithms and their Applications, 93-100.