Information Sciences 471 (2019) 233–251
A novel multi-objective evolutionary algorithm with fuzzy
logic based adaptive selection of operators: FAME
Alejandro Santiago a,b,c,∗, Bernabé Dorronsoro a, Antonio J. Nebro d, Juan J. Durillo e, Oscar Castillo f, Héctor J. Fraire c
a School of Engineering, University of Cádiz, Spain
b Polytechnic University of Altamira, Mexico
c Madero City Institute of Technology, Mexico
d University of Malaga, Spain
e Leibniz Supercomputing Center, Munich, Germany
f Tijuana Institute of Technology, Mexico
Article info
Article history:
Received 3 May 2018
Revised 28 August 2018
Accepted 1 September 2018
Available online 4 September 2018
Keywords:
Multi-objective optimization
Density estimation
Evolutionary algorithm
Adaptive algorithm
Fuzzy logic
Abstract
We propose a new method for multi-objective optimization, called Fuzzy Adaptive Multiobjective Evolutionary algorithm (FAME). It makes use of a smart operator controller that
dynamically chooses the most promising variation operator to apply in the different stages
of the search. This choice is guided by a fuzzy logic engine, according to the contributions of the different operators in the past. FAME also includes a novel effective density
estimator with polynomial complexity, called Spatial Spread Deviation (SSD). Our proposal
follows a steady-state selection scheme and includes an external archive implementing SSD
to identify the candidate solutions to be removed when it becomes full. To assess the performance of our proposal, we compare FAME with a number of state-of-the-art algorithms (MOEA/D-DE, SMEA, SMPSOhv, SMS-EMOA, and BORG) on a set of difficult problems. The
results show that FAME achieves the best overall performance.
© 2018 Elsevier Inc. All rights reserved.
1. Introduction
Nowadays, there is a plethora of metaheuristic techniques available for solving complex optimization problems. Some
popular examples are Evolutionary Algorithms (EAs), Particle Swarm Optimization (PSO), and Differential Evolution (DE),
just to mention a few. The field of multi-objective optimization, i.e., the optimization of problems involving two or more
conflicting objective functions, is not an exception [5,6,21]. Some of the most popular algorithms in the field, such as NSGA-II [8], SPEA2 [49], MOEA/D [47], or SMS-EMOA [3], are EAs. We can also find examples of highly competitive multi-objective
algorithms whose search engine is not EA-based but is a PSO instead (such as OMOPSO [36] and SMPSO [25]), or a DE [44],
or a Scatter Search algorithm (SS) (AbYSS [26]).
These algorithms, although inspired from many different sources, share most of their core concepts. Most multi-objective
optimization algorithms work on a set of candidate solutions (often called the population) to a target problem, which are
∗ Corresponding author at: Polytechnic University of Altamira, Mexico.
E-mail addresses: [email protected] (A. Santiago), [email protected] (B. Dorronsoro), [email protected] (A.J. Nebro),
[email protected] (J.J. Durillo), [email protected] (O. Castillo), [email protected] (H.J. Fraire).
https://doi.org/10.1016/j.ins.2018.09.005
0020-0255/© 2018 Elsevier Inc. All rights reserved.
evolved by the application of stochastic operators. The main differences between these methods are found in the properties
and search capabilities of their operators.
It has been reported that some operators perform better than others when dealing with different features of the search
space. We can find some examples in the multi-objective optimization domain. Deb et al. evaluated the behavior of some
operators in solving problems with variable linkages [9]. Their study revealed that the SBX crossover operator performs
poorly for this kind of problem, especially when compared to the DE and CPX operators. Iorio and Li [19] also analyzed the
suitability of a number of operators for solving rotated problems and those having epistatic interaction between parameters.
The search space of real-world optimization problems is not free of variable-linkage, epistasis, rotation, and other relations between their decision variables. Furthermore, these properties could change throughout the search space. Most
multi-objective metaheuristics use a static set of variation operators and parametrization. An efficient adaptive selection
mechanism to choose an appropriate operator according to the search progress could improve the quality of the Pareto
front approximation found. This idea has been explored by proposals such as AMALGAM [41] and Borg [15].
In addition to operator selection, a major issue influencing the quality of the computed results in multi-objective optimization is the use of density estimators to determine the solutions that must be kept (or discarded). Popular examples
are the crowding distance (CD) of NSGA-II [8] and the hypervolume (HV) contribution [3]. The former has limitations when
facing problems with more than two objectives, while the exponential complexity of the latter makes it unfeasible for
many-objective optimization problems (i.e., those having four or more objectives). The design of a low complexity density
estimator that works efficiently for multi- and many-objective problems is still an open issue. Decomposition-based algorithms, such as MOEA/D [47] and its derivatives, do not make use of density estimators. Instead, they require the use of a
set of weight vectors. Nevertheless, a uniform distribution of the set of weight vectors does not guarantee an even distribution over the Pareto front [35]. Again, finding an optimal set of weight vectors is an open issue [38]. The method proposed
in [14] computes it, but it requires the optimal Pareto front, making it unworkable for real-world problems.
The main contribution of this paper is the proposal of a new multi-objective algorithm, called FAME (Fuzzy Adaptive
Multi-objective Evolutionary algorithm). It implements a pioneering adaptive mechanism for the selection of the operators and
an innovative density estimator for an effective promotion of diversity of solutions in the Pareto front. The two methods
are also significant contributions to the literature. On the one hand, the new operator selection mechanism makes use of
a novel fuzzy logic based strategy to choose the most promising operator to be applied every time. This is the first time,
to the best of our knowledge, that fuzzy logic is used for a dynamic adaptation of variation operators in a multi-objective
optimization algorithm. In addition, our proposed density estimator, called Spatial Spread Deviation (SSD), has polynomial
complexity and offers highly accurate performance for the two- and three-dimensional problems studied: similar to the CD
method for two-objective problems, but without its limitations when dealing with three-objective ones. FAME is compared
with five representative algorithms from the state of the art, outperforming them at the 95% confidence level in most cases
in the selected set of difficult benchmark problems.
The structure of this paper is as follows. We first review the related literature in Section 2, including papers presenting
(i) novel multi-objective optimization algorithms with adaptive selection of operators and (ii) adaptive selection mechanisms
based on fuzzy logic. Sections 3 and 4 describe the fuzzy adaptive mechanism for the selection of the operators and the
spatial spread density estimator, respectively. Our proposed algorithm is described in Section 5. Section 6 briefly describes
the algorithms from the state of the art chosen for comparison. Section 7 presents the experiments performed, and the
results obtained are summarized in Section 8. The performance of the new SSD density estimator and the adaptive selection
of operators are evaluated in Sections 9 and 10, respectively. Our paper ends with our main conclusions and lines for future
research in Section 11.
2. Literature review
In this section, we discuss related work. In particular, we review in Section 2.1 the existing multi-objective optimization
algorithms implementing an adaptive mechanism for the selection of the operators. Section 2.2 briefly describes the relevant
literature on the use of fuzzy logic in evolutionary algorithms.
2.1. Adaptive multi-objective optimization algorithms
The existing adaptive multi-objective algorithms can be classified into two categories, which we refer to as parameter
adaptation and operator adaptation.
Parameter adaptation consists in modifying the control parameters of the operators used by the algorithm. They are
often known as self-adaptive algorithms, and usually include these control parameters as part of the search space. In this
category, Deb et al. [7] introduce a self-adaptive version of the SBX crossover operator (SA-SBX) that dynamically adjusts
the distribution index of the SBX. Another self-adaptive SBX operator is presented in [45], where the distribution index is
dynamically adjusted using feedback information from both a diversity performance metric and the crowding distance. Both
operators performed better than the original SBX for NSGA-II.
The idea behind operator adaptation is to provide the algorithm with a set of different (typically, variation) operators
from which it can choose, according to their expected contribution to the search process. Following this approach, Toscano
and Coello propose a micro-genetic algorithm (μGA2) [39] that runs three instances of the algorithm in parallel, using a
different crossover operator in each one. At some stages of the search, the worst-performing crossover operator is replaced
by the best-performing one. An important limitation for the adaptive capabilities of the algorithm is that crossover operators
cannot be used again after being discarded.
MOSaDE [16] is the multi-objective version of the state-of-the-art single-objective SaDE algorithm, and it falls into this
category too. Four different classical DE strategies are combined through a self-adaptive mechanism that sets their application probabilities according to their success rate in the previous 50 generations (solution A is considered to be better than
B if A dominates B, or if they are non-dominated and A is in a less crowded region). The performance of the algorithm
is far from the state-of-the-art ones. Indeed, an improved version of MOSaDE with object-wise learning strategies, called
OW-MOSaDE [17], obtained an average rank of 9.39 (out of 13 algorithms) for the 13 problems considered in the CEC2009
MOEA competition.
Vrugt and Robinson propose in [41] the so-called AMALGAM algorithm, which applies a number of different multi-objective metaheuristics to evolve the population. Each metaheuristic is adaptively used in order to favor those techniques
exhibiting the best performance in the last iteration. The algorithms used are NSGA-II, a PSO, a DE, and an adaptive metropolis search (AMS) approach. AMALGAM is compared with NSGA-II on the ZDT benchmark and some classical test problems
(Kursawe, Fonseca, Schaffer [6]), and it has been applied to some real-world problems too [18,43].
Finally, Borg [15] is an adaptive MO algorithm specifically designed for many-objective multimodal problems. As in the
previous case, it implements a set of recombination operators, and the choice of the operator to apply is made according
to their participation in the search process. To this end, it tracks the number of solutions every recombination operator
contributed to an ε-archive of non-dominated solutions.
2.2. Fuzzy evolutionary approaches
We can identify two main areas when combining fuzzy logic and EAs: fuzzy control system design (i.e., using EAs to
discover fuzzy rule sets) and dynamic adaptive parameters in metaheuristics. Our proposal is related to the latter kind, and
we survey the most relevant literature next. As we did not find any example in multi-objective optimization algorithms (our approach seems to be unique in this respect), we focus on related papers targeting single-objective optimization.
The most salient nature-inspired algorithms which use fuzzy logic for dynamic parameter adaptation are surveyed in
[40]. Among them, the work of Melin et al. [22] is an interesting approach, where fuzzy logic is used to dynamically adjust the weight
coefficients C1 and C2 in the velocity formula of a PSO. Three PSO versions were proposed, differing in the set of rules, two
of them outperforming the original algorithm.
In [30], a DE uses a fuzzy system to adapt the weighting factor F, while the crossover constant CR is fixed. Experimental
results show a statistically significant improvement over the fuzzy versions of two emerging metaheuristics, the harmony
search (inspired by musical improvisation, [33]) and the bat search algorithm (inspired by the echolocation behavior of bats,
[34]). The benefits of implementing fuzzy logic parameter adaptation have been recently demonstrated on other emerging
metaheuristics, such as the Gravitational Search Algorithm (GSA) [31], the Fireworks Algorithm (FWA) [2], and the Search
Group Algorithm (SGA) [29].
The first generation of fuzzy logic deals very satisfactorily with uncertainty. However, there has been severe criticism
about the lack of uncertainty inside its membership functions. In order to cope with this deficiency, a second generation of
fuzzy logic, called interval type-2 fuzzy systems, has been proposed. While type-1 membership functions map to a single
value, type-2 map to a range, called the “footprint of uncertainty” (FOU). An Ant Colony Optimization (ACO) using type-2
fuzzy logic was recently proposed in [32], in a very similar manner to [22], improving the results of a rank based ACO and
an ACO using type-1 fuzzy parameter adaptation. Nevertheless, the robustness of these fuzzy engines relies heavily on the
design of the rule set, requiring thorough knowledge and/or optimization to correctly configure the fuzzy engine.
None of these approaches are followed in the present paper. Our novel method identifies, at each search stage, which
variation operators are more (or less) promising for the evolution of the population, assigning them the right application
probability using type-1 fuzzy logic. In addition, our proposal is the only one that has been applied to multi-objective optimization algorithms.
3. The fuzzy adaptive mechanism for the selection of the operators
This section introduces a mechanism that allows multi-objective EAs to apply different recombination operators at different stages of the search process. In particular, the use of different operators is dynamically adjusted according to their
contribution to the search in the past. Intuitively, the idea is to favor the use of operators generating higher quality solutions
over the use of other operators. Section 3.1 details the Fuzzy Inference System (FIS) used in the proposed adaptive approach,
as well as the selection mechanism for the variation operators. After that, Section 3.2 presents the pool of variation operators
considered in this paper, and their strengths.
3.1. Fuzzy inference system
Our method is based on the Mamdani-Type FIS [37] to compute the probability of applying the different operators. Mamdani FISs are based on completely linguistic fuzzy models using linguistic variables both in the inputs and outputs. Fuzzy
Fig. 1. Mamdani fuzzy inference diagram.
sets defined by membership functions are used to represent the linguistic values that are used in the granulation of the
input and output spaces of the fuzzy model. Regarding the inference, we use the originally proposed approach by Mamdani
based on the “max min” composition: using the minimum operator for implication and maximum operator for aggregation.
Fig. 1 shows an example of a Mamdani FIS. In it, the consequents of the rules are aggregated into a single fuzzy set (output), to be defuzzified (mapped to a real value). A widely used defuzzification method, also adopted in this work, is the centroid calculation, which returns the center of the area under the curve.
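Centroid defuzzification over a sampled output domain can be computed as follows (an illustrative sketch of ours, not code from the paper; the sampling grid and the example output fuzzy set are our own assumptions):

```python
def centroid_defuzzify(xs, mu):
    """Centroid (center of area) defuzzification: the crisp output is the
    membership-weighted average of the sampled domain points."""
    num = sum(x * m for x, m in zip(xs, mu))
    den = sum(mu)
    return num / den if den > 0 else 0.0

# Example: an aggregated output set sampled on [0, 1], a triangle peaked at 0.7.
xs = [i / 100 for i in range(101)]
mu = [min(1.0, max(0.0, 1.0 - abs(x - 0.7) / 0.3)) for x in xs]
crisp = centroid_defuzzify(xs, mu)  # close to 0.7, since the triangle is symmetric
```
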
We use triangular shaped membership functions in all inputs and outputs,

    μA(x) = 0                  if x ≤ a,
            (x − a)/(b − a)    if a ≤ x ≤ b,
            (c − x)/(c − b)    if b ≤ x ≤ c,
            0                  if c ≤ x.                    (1)
The parameters a and c determine the “corners” of the triangle, and b determines the peak. A membership function μA maps real values of x to a degree of membership 0 ≤ μA(x) ≤ 1. The granularity levels used were: low (a = −0.4, b = 0.0, c = 0.4), mid (a = 0.1, b = 0.5, c = 0.9), and high (a = 0.6, b = 1.0, c = 1.4).
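Eq. (1) and the three granularity levels translate directly into code (an illustrative sketch; the function and variable names are ours):

```python
def tri_membership(x, a, b, c):
    """Triangular membership function of Eq. (1): zero outside [a, c],
    rising linearly on [a, b] and falling linearly on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# The three granularity levels used in the paper:
low  = lambda x: tri_membership(x, -0.4, 0.0, 0.4)
mid  = lambda x: tri_membership(x,  0.1, 0.5, 0.9)
high = lambda x: tri_membership(x,  0.6, 1.0, 1.4)
```
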
In the following, we refer to the set of available variation operators considered in FAME as Operators. The FIS monitors the search process by controlling the Stagnation and Utilization[Op] variables (Op ∈ Operators) in a time window (of size windowSize). Depending on the values of these variables, the FIS updates the probability of applying every operator, OpProb[Op]. The value of Stagnation is shared among all operators, and represents a measure of the evolution of the search. In particular, its value is incremented by 1.0/windowSize every time a non-successful solution is generated (i.e., one that does not contribute to the archive of non-dominated solutions). The value of Utilization[Op] is incremented by 1.0/windowSize every time the operator Op is used.
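This bookkeeping can be sketched in a few lines (our own illustration; the dictionary used for Utilization and the return convention are assumptions):

```python
def record_application(op, successful, window_size, utilization, stagnation):
    """One reproduction step's bookkeeping: each application of operator `op`
    adds 1/windowSize to Utilization[op]; each offspring that fails to enter
    the non-dominated archive adds 1/windowSize to the shared Stagnation."""
    utilization[op] = utilization.get(op, 0.0) + 1.0 / window_size
    if not successful:
        stagnation += 1.0 / window_size
    return stagnation
```
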
The size of the time window must be carefully tuned, and depends on the number of operators considered. Using a large size may lead to an unresponsive behavior of the algorithm, while a very low value leads to highly fluctuating probabilities. In both cases, the probabilities are not properly adapted to the different stages of the search.
Table 1 describes the set of AND rules we designed for the FIS to compute the likelihood of applying every operator. The Mamdani fuzzy system can be obtained with different methods, including the use of evolutionary or metaheuristic approaches, machine learning, or design based on expert knowledge. In this paper, the proposed method was elaborated based on the last approach, expert multi-objective knowledge [1,10,11,26,28]: the fuzzy rules were designed aiming at a known behavior (found by extensive previous experimentation), followed by an appropriate parametrization to obtain good performance. The other methods mentioned are outside the scope of this paper; they could be addressed in the future, generalizing the method proposed here. In particular, we believe that using evolutionary or metaheuristic approaches to design an optimal Mamdani fuzzy system, including the rules and parametrization, could improve the results even more. The alternative of using machine learning is interesting. However, we believe that the requirement of having input-output training data available will complicate achieving a good design of the
Table 1
Fuzzy inference system rules for operator Op.

AND Antecedents                  Consequent
Stagnation    Utilization[Op]    OpProb[Op]
High          High               Mid
High          Mid                Low
High          Low                Mid
Mid           High               Mid
Mid           Mid                Low
Mid           Low                Mid
Low           High               High
Low           Mid                Mid
Low           Low                Low
Fig. 2. Output surface of the Fuzzy inference system.
fuzzy system for real-world problems, where the optimal Pareto front is unknown. The fuzzy system is designed to favor smooth changes in the probabilities of applying the operators rather than abrupt ones, an idea represented by the
surface of the rules in Fig. 2. The surface of a fuzzy model is a graphical representation of the outputs produced by the
fuzzy model for different combinations of input data. The idea is similar to plotting a mathematical model, but instead of
computing the output by applying a mathematical equation to the input variables, in a fuzzy model the input values are
fuzzified, then the inference process is used, and the output values are finally calculated in the defuzzification process. The
output of the model is the probability of applying each operator, while the inputs are the usefulness of each operator and
the stagnation value in the defined time window. Once the FIS assigns a probability to each operator (within a membership degree [0, 1]), a roulette mechanism is used to select the next operator to apply. It first aggregates all probability values into a variable Sum, then it iterates by randomly selecting one operator and subtracting its probability from Sum, until Sum ≤ 0. The last visited operator is then selected.
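The roulette mechanism just described can be sketched as follows (an illustrative sketch; the dictionary representation of OpProb is our assumption):

```python
import random

def roulette(op_prob):
    """Operator selection as described above: aggregate all probabilities
    into a running total, then repeatedly pick a random operator and
    subtract its probability until the total drops to zero or below; the
    last operator visited is returned. Operators with larger probabilities
    tend to terminate the loop and so are selected more often."""
    ops = list(op_prob)
    total = sum(op_prob.values())
    chosen = ops[0]  # fallback if every probability is zero
    while total > 0:
        chosen = random.choice(ops)
        total -= op_prob[chosen]
    return chosen
```
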
3.2. Variation operators pool
The pool of operators used in FAME is composed of two recombination operators (Simulated Binary Crossover (SBX),
Differential Evolution (DE)) and two mutation operators (Polynomial mutation (PM), Uniform Mutation (UM)). They are
operators that are very well-known in the related literature. The idea behind considering the same number of recombination
and mutation operators is to attempt to find an equilibrium between exploitation and exploration. Adding more operators
could make the algorithm less responsive to the needs of the search process. We carefully selected the operators so that we
can cover both linear and nonlinear movements. As is commonly done in the field of continuous optimization, the operators
selected perform smooth changes in the variables.
The operators SBX and PM are used in NSGA-II, SPEA2, and in many other algorithms, such as scatter search [26]. UM has
been successfully applied in algorithms such as OMOPSO [36] and Borg [15]. In addition, DE is currently one of the most effective
techniques for solving continuous optimization problems in both the single- and multi-objective domains [4,20,23,46]. We
adopt here the scheme DE/rand/1/bin, widely used in the literature.
The UM implementation in this paper adds a random value from within [-0.05, 0.05] to the original variable: a linear
movement at the variable level, while PM uses a distribution index ηc > 1.0, producing a nonlinear movement at the variable
level.
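From that description, the two mutation operators can be sketched as follows (a sketch: the standard polynomial-mutation update, the default distribution index value, and the bound clipping are our assumptions, not details given by the paper):

```python
import random

def uniform_mutation(x, step=0.05):
    # Linear move at the variable level: add a random value from [-step, step].
    return x + random.uniform(-step, step)

def polynomial_mutation(x, low, high, eta=20.0):
    # Standard polynomial mutation (nonlinear move): the spread of the
    # perturbation is controlled by the distribution index eta.
    u = random.random()
    if u < 0.5:
        delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
    else:
        delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
    # Clip the mutated variable back into its bounds.
    return min(max(x + delta * (high - low), low), high)
```
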
4. The spatial spread density estimator
The proposed density estimator is based on the mathematical definition of moment (μn ), given in Eq. (2) for discrete
variables, as a measure of the shape of a set of points.
    μn = Σ_{i=1}^{∞} (xi − c)^n p(xi),                    (2)
where xi is one of the values that the random variable can take, p(xi) is the probability of xi, n is the order of the moment, and c is the reference point with respect to which the moment is defined. There are different definitions of moment, such as the raw moment or the central moment. The raw moment, also known as the absolute moment, or the moment of a function, is the moment with respect to the origin (c = 0). The most popular moment is the central moment, used in statistics as a measure of central tendency. The central moment substitutes the value of c with the expected value or arithmetic mean x̄. For a uniform random variable, when the order is 2, the central moment for n points is the variance:
    μ2 = (1/n) Σ_{i=1}^{n} (xi − x̄)².                    (3)
Note also that Eq. (3) is widely used to compute variance independently of the probability density functions. As with
the sample mean and sample central moments, where the data consists of a sample of size n without knowledge of the
probability density function, the sample moments are estimators of the moments. According to the law of large numbers,
the sample mean for a large number of observations is likely to be close to the expected value (a raw moment in the origin
of order 1). We make use of these concepts to develop the SSD, using the absolute moment or moment about a point.
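As a quick numeric check of the second central moment of Eq. (3) (a toy example of ours):

```python
def central_moment2(xs):
    # Second central moment of Eq. (3): squared deviations from the
    # arithmetic mean, averaged with a 1/n factor.
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# For the sample 1, 2, 3, 4: mean = 2.5, so mu_2 = (2.25+0.25+0.25+2.25)/4 = 1.25
```
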
The datapoints used are the normalized Euclidean distances between pairs of solutions in the objective space. We choose
the difference between the maximum (Dmax ) and minimum (Dmin ) distance in the set of points as our reference. The main
idea is that the distances between neighboring solutions in the objective space should be similar. Minimizing the moment of order 2 about (Dmax − Dmin) drives the distances between points toward equidistance. This is done for each solution i in SSD, as defined in Eq. (4).
    μ2 = (1/(n−1)) Σ_{j=1, j≠i}^{n} (D_{i,j} − (Dmax − Dmin))²,                    (4)
Minimizing μ2 may lead towards undesirable Pareto front approximations composed of clusters of similar solutions,
while leaving other areas uncovered. An example is shown in Fig. 3a, where we present the Pareto front approximation
found by a version of the steady-state NSGA-II [24], in which its CD density estimator was replaced by the μ2 metric. To
solve this problem, we include a penalization function in the SSD with the aim of avoiding solutions with similar fitness
values having a good SSD value. We define the SSD value of solution i as
    SSDi = (1/(n−1)) Σ_{j=1, j≠i}^{n} (D_{i,j} − (Dmax − Dmin))² + Σ_{j=2}^{k+1} (Dmax − Dmin)/D_{i,j},                    (5)
where k defines the neighbors to be considered in the penalization function. We set it to the number of objectives of
the problem. The diversity of solutions in the Pareto front approximation greatly improves with the penalization function
defined in SSD compared to using the μ2 metric itself. The differences can be visually appreciated in the example displayed
in Fig. 3. The procedure to compute SSD for a set of non-dominated solutions A is carefully described in Algorithm 1. Every
solution is assigned an SSD fitness value, which measures its contribution to the diversity of the solution set A. Initially, this
SSD fitness value is set to 0 (lines 3–5) for each solution except for those having the minimum and maximum values in each
objective, whose SSD fitness value is set to −∞ (lines 6–9). These represent the extremes of the Pareto front, and we are
Fig. 3. Pareto front approximations found by Steady-State NSGA-II implementing μ2 (left) or SSD (right) as density estimator. Results correspond to a representative run to solve the ZDT1 problem, after 25,000 function evaluations.
Algorithm 1 Computation of the Spatial Spread Deviation.
1:  function Compute_SSD(A)
2:    n = |A|
3:    for i ∈ A do
4:      A[i].SSD = 0
5:    end for
6:    for m = 1 to k do
7:      Sort(A, m)
8:      A[1].SSD = A[n].SSD = −∞
9:    end for
10:   Compute the matrix of normalized distances M
11:   Find Dmax and Dmin between all the pairs of solutions
12:   for i ∈ A do
13:     temp = 0
14:     for all j ≠ i ∈ A do
15:       temp = temp + (M[i][j] − (Dmax − Dmin))²
16:     end for
17:     temp = temp/(n − 1)
18:     A[i].SSD = A[i].SSD + √temp
19:   end for
20:   for i ∈ A do
21:     Sort(M[i])
22:     temp = 0
23:     for j = 2 to k + 1 do
24:       temp = temp + (Dmax − Dmin)/M[i][j]
25:     end for
26:     A[i].SSD = A[i].SSD + temp
27:   end for
28: end function
always interested in keeping them. After the initialization, the procedure computes the normalized Euclidean distances Di, j
between each pair of solutions in the objective space. These distances are stored in a matrix M (of size n², where n = |A|).
The maximum and minimum of these distances are stored in the variables Dmax and Dmin , respectively (line 11). The matrix
M and the variables Dmax and Dmin are then used to update the SSD fitness of every solution (lines 12–19), as indicated in
Eq. (4). Solutions with close SSD fitness values are penalized in lines (20–27), as defined in Eq. (5).
The complexity of the SSD method is determined by the loop in lines 20–27. In every iteration of the loop, a row of M is sorted. In addition, it requires performing a loop on the k nearest solutions for computing the penalty (k ≤ n). The complexity of the sorting (O(n log n)) is higher than that of this loop (O(k)). Therefore, the complexity of the whole procedure is O(n² log n).
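Under the assumptions that objective vectors are already normalized and that all solutions are distinct (so only self-distances are zero), Algorithm 1 can be transcribed into Python roughly as follows (a sketch of ours, not the authors' code):

```python
import math

def compute_ssd(points, k):
    """Sketch of Algorithm 1 (Spatial Spread Deviation). `points` are
    objective vectors assumed already normalized; k is the number of
    objectives. Lower SSD is better; the extreme solutions of each
    objective receive -inf so that they are always kept."""
    n = len(points)
    ssd = [0.0] * n
    # Protect the extreme solutions of every objective (lines 6-9).
    for m in range(k):
        order = sorted(range(n), key=lambda i: points[i][m])
        ssd[order[0]] = ssd[order[-1]] = float("-inf")
    # Pairwise distances and the (Dmax - Dmin) reference (lines 10-11).
    dist = [[math.dist(p, q) for q in points] for p in points]
    offdiag = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    d_ref = max(offdiag) - min(offdiag)
    for i in range(n):
        # Deviation of the distances from the reference (lines 12-19).
        dev = sum((dist[i][j] - d_ref) ** 2 for j in range(n) if j != i)
        ssd[i] += math.sqrt(dev / (n - 1))
        # Penalty over the k nearest neighbours (lines 20-27); row[0] is
        # the zero self-distance, so neighbours start at index 1 (distinct
        # points assumed, otherwise a division by zero would occur).
        row = sorted(dist[i])
        ssd[i] += sum(d_ref / row[j] for j in range(1, k + 1))
    return ssd
```
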
5. Fuzzy adaptive multi-objective evolutionary algorithm with Spatial Spread Deviation (FAME)
The main features of FAME can be summarized as:
• FAME applies different variation operators to generate new solutions, dynamically determining, during the search, the operator to use each time, employing the mechanism described in Section 3.
• FAME makes use of the new SSD density estimator, presented in Section 4, to maintain the diversity of an external archive with the best non-dominated solutions found during the search.
• FAME follows a steady-state population update scheme.
The reasons for the first two features are discussed in Sections 3 and 4, respectively. Regarding the use of the steady-state scheme, it has been shown to enhance the performance of several state-of-the-art multi-objective evolutionary algorithms [3,12,15,24,46,47]. Although it introduces some computational overhead (the density estimators have to be recalculated after adding every new solution to the population), the algorithm can benefit from up-to-date information to make the appropriate decisions [12,24].
Algorithm 2 includes the pseudo-code of FAME. The inputs of the algorithm are: the problem to solve, the termination
Algorithm 2 Pseudo-code of FAME.
    Step 1 Initialization
1:    Initialize(Pop)
2:    Archive.add(Pop)
3:    SetOperatorsProb(1.0)
4:    window = Stagnation = 0
    Step 2 Main loop
5:    while Stopping criterion not reached do
      Step 2.1 Parents selection
6:      for i = 1 to |Parents| do
7:        if RandomDouble(0, 1) ≤ β then
8:          Parents[i] ← TournamentSSD(Archive)
9:        else
10:         Parents[i] ← TournamentSSD(Pop)
11:       end if
12:     end for
      Step 2.2 Reproduction
13:     Operator ← Roulette(OpProb)
14:     Offspring ← Operator(Parents)
15:     Evaluate(Offspring)
16:     OpUse = UpdateUse(Operator)
17:     window++
      Step 2.3 Archive Update
18:     if !(Archive.add(Offspring)) then
19:       Stagnation = UpdateStagnation()
20:     end if
      Step 2.4 Call to the Fuzzy Inference System
21:     if window == windowSize then
22:       UpdateOpProb(OpUse, Stagnation)
23:       window = Stagnation = 0
24:     end if
      Step 2.5 Population Update
25:     NewPop ← Pop.add(Offspring)
26:     fast_non_dominated_sortSSD(NewPop)
27:     Pop ← RemoveWorstSolution(NewPop)
28:   end while
    Step 3 Output
29:   return Archive
criterion, the set of recombination operators (Operators), the population size (PopulationSize), and the size of the window
(windowSize) to be used by the selection mechanism (as presented in Section 3). The output of FAME is the computed
Pareto front approximation for the considered problem. The code of the algorithm is composed of three main steps.
Step 1 is the initialization phase. Firstly, it creates the main population of the algorithm (Pop) by inserting PopulationSize
randomly generated solutions, which are evaluated and assigned an SSD fitness of 0.0. All the non-dominated solutions just
generated are added to the external archive. The initial probability of applying each recombination operator is set to 1.0.
Step 2 is the main loop of FAME. It iterates until the termination condition is reached. This step is divided into smaller
substeps:
• In Step 2.1, the parents are chosen. They are selected from either the external archive (Archive) or the main population (Pop), based on a randomized process that is controlled by an external parameter β ∈ [0, 1] (it may be tuned to give some preference to elitism over diversity, or the opposite). Once this decision has been made, a tournament method similar to the one used in NSGA-II is applied: the choice is firstly made according to the ranking value, and if solutions are non-dominated, then the decision is driven by the SSD fitness value.
• Step 2.2 selects and applies a recombination operator. A roulette mechanism (as explained in Section 3) is used to select the operator, which is then used to generate an offspring from the parents chosen in Step 2.1. The use of the chosen operator is increased by 1.0/windowSize (line 16), and the window counter is incremented.
• Step 2.3 checks whether the offspring should be included in the external archive or not. In the latter case, the Stagnation variable is increased by 1.0/windowSize.
• Step 2.4 invokes the FIS (Section 3) when window == windowSize to update the application probability for all operators, according to their use and the value of Stagnation. It also resets the values of window and Stagnation.
• Step 2.5 updates the population with the solution created in Step 2.2. Then, fast_non_dominated_sort is applied to divide the population into different fronts. The solution in the last front with the worst SSD value (when considering only solutions in that front) is removed.
Finally, Step 3 returns the solutions contained in the external archive as the output computed by FAME.
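The β-controlled parent selection of Step 2.1 can be sketched as follows. This is a minimal sketch, not the authors' implementation: the helper names, the dictionaries holding ranks and SSD fitness, and the convention that a larger SSD fitness wins ties are our assumptions.

```python
import random

def tournament(candidates, size, rank, ssd):
    """Pick `size` random candidates; lower Pareto rank wins, and ties
    among non-dominated candidates are broken by the SSD fitness
    (assumed here: larger is better)."""
    pool = random.sample(candidates, size)
    return min(pool, key=lambda s: (rank[s], -ssd[s]))

def select_parent(archive, population, beta, size, rank, ssd):
    # With probability beta the parent comes from the archive (elitism),
    # otherwise from the main population (diversity)
    source = archive if random.random() < beta else population
    return tournament(source, min(size, len(source)), rank, ssd)
```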
6. Algorithms in the comparison
This section describes the algorithms that have been considered in our experiments.
6.1. BORG
Borg [15] is a steady-state multi-objective evolutionary algorithm for solving many-objective multimodal problems. Borg
uses an ε-box archive, based on the ε-dominance concept, to keep a diverse representation of all the non-dominated
solutions found during the search. Borg also makes use of a metric called ε-progress, based on the content of this archive,
for identifying when the search has stagnated. When stagnation is detected, Borg restarts the search by creating a new
population including random solutions as well as solutions taken from the ε-box archive. The size of this newly generated
population does not need to match the size of the previous one; instead, its size is determined based on the number of
solutions contained in the ε-box archive. Borg also considers several recombination operators, whose use is determined
adaptively based on how they have contributed to the search so far. More specifically, their contribution is measured in
terms of how many solutions they produced that are in the ε-box archive.
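The ε-box archive logic can be illustrated with a minimal sketch (minimization assumed, function names are ours; this simplification omits Borg's within-box tie-breaking by distance to the box corner):

```python
def eps_box(objs, eps):
    # Grid index of the epsilon-box containing an objective vector (minimization)
    return tuple(int(f // eps) for f in objs)

def pareto_dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def box_dominates(a, b):
    # A box dominates another if it is no worse in every coordinate and differs
    return all(x <= y for x, y in zip(a, b)) and a != b

def eps_archive_add(archive, sol, eps):
    """archive: dict mapping box index -> objective vector (all minimized).
    Returns True when `sol` is accepted; filling a previously empty box is
    the kind of event counted as epsilon-progress."""
    box = eps_box(sol, eps)
    if box in archive:
        # Same box: the newcomer replaces the incumbent only if it dominates it
        if pareto_dominates(sol, archive[box]):
            archive[box] = sol
            return True
        return False
    if any(box_dominates(ob, box) for ob in archive):
        return False
    # Accept the new box and discard any boxes it dominates
    for ob in [o for o in archive if box_dominates(box, o)]:
        del archive[ob]
    archive[box] = sol
    return True
```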
6.2. SMS-EMOA
SMS-EMOA [3] is a steady-state multi-objective EA designed to compute a Pareto front approximation with maximal
HV. The HV contribution is used during the search process to determine which solutions are kept or discarded. In every
generation, the current population, of size N, and a newly generated solution are considered to update the population. This
set of N + 1 solutions is partitioned into different subsets of non-dominated solutions, using the ranking method of NSGA-II. The solution from the last subset with the lowest HV contribution is removed. The remaining N solutions become the
population of the next generation.
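The environmental selection of SMS-EMOA can be sketched as follows, restricted to two objectives for simplicity (SMS-EMOA itself handles any number of objectives; the function names are ours):

```python
def hv_2d(front, ref):
    """Hypervolume of a non-dominated 2-D front (minimization) with
    respect to the reference point `ref`."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(front):        # ascending f1, hence descending f2
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

def worst_hv_contributor(front, ref):
    """Index of the solution whose removal loses the least hypervolume,
    i.e., the one discarded from the last-ranked subset."""
    total = hv_2d(front, ref)
    losses = [total - hv_2d(front[:i] + front[i + 1:], ref)
              for i in range(len(front))]
    return losses.index(min(losses))
```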
6.3. SMEA
SMEA [46] combines evolutionary search and machine learning. More specifically, self-organizing maps, an unsupervised machine learning method, are applied to determine neighboring relationships within the current population of each generation. These relationships are used to determine clusters within which the recombination operator can be applied. As
with the previous algorithms, SMEA performs a steady-state evolution. In each generation, one solution is created, and the
population of the next generation is built using a procedure similar to the one described for the SMS-EMOA algorithm.
6.4. SMPSOhv
SMPSOhv [27] is an extension of the SMPSO [25] multi-objective particle swarm algorithm. The main feature of SMPSO
is the use of a velocity constraint mechanism that avoids having the particles often fly beyond the limits of the search
space due to having high velocities. All the non-dominated solutions found during the search are inserted into an external
archive of limited size. SMPSOhv uses the HV quality indicator to keep the size of this archive within the specified limit. In
particular, every time a new insertion makes the archive grow beyond this size, the solution that contributes least to the
HV of the Pareto front approximation is removed, following a similar procedure to that in SMS-EMOA and SMEA.
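The velocity constraint at the core of SMPSO can be sketched as follows (based on the constriction coefficient described in the SMPSO paper [25]; the exact bounding rules in the reference implementation may differ in detail):

```python
import math

def constriction(c1, c2):
    # Clerc's constriction coefficient, applied when c1 + c2 > 4
    phi = c1 + c2
    if phi <= 4.0:
        return 1.0
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

def constrain_velocity(v, lower, upper):
    # Each velocity component is additionally bounded by half the
    # corresponding variable's range, delta_j = (upper_j - lower_j) / 2
    delta = (upper - lower) / 2.0
    return max(-delta, min(delta, v))
```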
Table 2
Parameter settings.

MOEA/D-DE:
  Population size: 101/210 for bi/tri-objective MOPs
  Differential evolution: pr = 1.0, F = 0.5, CR = 1.0
  Polynomial mutation: Pm = 1/n, nm = 20
  Neighborhood size: 20, Prob. mating: 0.9

SMEA:
  Population size: 100/200 for bi/tri-objective MOPs
  Differential evolution: pr = 1.0, F = 0.9, CR = 1.0
  Polynomial mutation: Pm = 1/n, nm = 20
  Neighborhood size: 5, Prob. mating: 0.9

SMS-EMOA:
  Population size: 100/200 for bi/tri-objective MOPs
  Polynomial mutation: Pm = 1/n, nm = 20
  Recombination: SBX: pr = 0.9, ηc = 20.0

SMPSOhv:
  Swarm/archive size: 100/200 for bi/tri-objective MOPs
  Polynomial mutation: Pm = 1/n, nm = 20
  Recombination: Max/Min C1,2 = 2.5/1.5, r1,2 = 1/0

BORG:
  Population size: Max/Min 10000/100
  Archive size: ∞
  Selection: Tournament size Max/Min: 200/2
  Recombination: pr = 1.0; DE: CR = 1.0, F = 0.5; SBX: ηc = 20.0; SPX: ε = 3.0; PCX: wζ = 0.1, wη = 0.1; UNDX: σζ = 0.5, ση = 0.35
  Polynomial mutation: pm = 1/n, nm = 20
  Uniform mutation: pm = 1/n
  Window size: Init/Max 200/2000

FAME:
  Population size: 25/100 for bi/tri-objective MOPs
  Archive size: 100/200 for bi/tri-objective MOPs
  Selection: Tournament size: 5, β = 0.9
  Recombination: pr = 1.0; DE: CR = 1.0, F = 0.5; SBX: ηc = 20.0
  Polynomial mutation: pm = 0.30, nm = 20
  Uniform mutation: pm = 0.30
  Window size: 14
6.5. MOEA/D-DE
MOEA/D [47] is a steady-state multi-objective evolutionary algorithm that employs an aggregation approach to decompose a multi-objective optimization problem into a set of single-objective optimization ones, called subproblems from now
on. Each solution in the population aims at optimizing one of these subproblems. To this end, a neighboring structure is
defined among all the solutions within the population. The resulting neighbors define the mating pools for the recombination operator and the subproblems the generated solution can optimize. The version used in this paper is MOEA/D-DE [20],
which employs differential evolution as the recombination operator and Tchebycheff as the decomposition method.
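The Tchebycheff aggregation used by MOEA/D-DE can be written compactly (a sketch of the standard formulation, where z* denotes the ideal point; handling of zero weights is omitted):

```python
def tchebycheff(objs, weights, ideal):
    """Tchebycheff scalarization: a subproblem minimizes the largest
    weighted deviation of its objective vector from the ideal point z*."""
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ideal))
```

For a given weight vector, the solution with the smaller aggregated value is the better one for that subproblem.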
7. Experimental setup
This section details our experimental setup. There is in the literature a large number of benchmark problems, proposed
to assess the performance of multi-objective algorithms. The most well-known ones are the ZDT, DTLZ, WFG, LZ [20], and
GLT [46] problem families. Among these, we have adopted the two latter families, which are the most recently proposed. They are characterized by having complex Pareto sets, which makes them harder to solve than the other problem sets. For
the evaluated algorithms, we took the implementations provided by the jMetal framework [28], except for BORG and SMEA,
for which the implementations of the original authors were used. All these algorithms were configured as suggested in the
original papers in which they were presented. Table 2 summarizes these configurations. The parameter values for FAME
were determined after a preliminary experimental phase. The population and non-dominated archive sizes were fixed to 25/100 and 100/200, respectively, for bi/tri-objective MOPs. The tournament size for selecting
parents was set to 5. The solutions for the tournament were chosen from the archive with 90% probability (i.e., β = 0.9).
The configuration of the recombination operators was the following: SBX uses a distribution index ηc = 20.0; DE uses a crossover constant CR = 1.0 and a weighting factor F = 0.5. Regarding the mutation operators, both PM and UM were applied to every variable with a probability of pm = 0.3. In addition, the former was configured to use ηm = 20.0.
The stopping condition was set to 45,000 and 150,000 function evaluations for the GLT and LZ problems, respectively, as
is commonly done in the literature. The Pareto front approximations produced by the different algorithms were compared
by means of three well-known indicators: Additive Epsilon (Iε+) [50], Generalized Spread (IGS) [48], and Hypervolume (IHV) [42], as metrics for the convergence of the solutions, their diversity, or both, respectively. The indicators Iε+ and IGS are to be minimized, while IHV must be maximized.
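The additive epsilon indicator, for instance, admits a compact definition (a direct transcription of I_eps+ for minimization problems; this is illustrative and not the implementation used in the experiments):

```python
def additive_epsilon(approx, reference):
    """Smallest eps such that every reference point is weakly dominated by
    some point of `approx` shifted by eps in every objective (minimization).
    Lower values indicate better convergence to the reference front."""
    return max(
        min(max(a_i - r_i for a_i, r_i in zip(a, r)) for a in approx)
        for r in reference
    )
```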
Each algorithm solved every problem for 100 independent runs, and the quality of the obtained fronts was computed
according to the three mentioned indicators. The results are presented in this paper as the median, x̃, and interquartile
range, IQR, of these values, as measures of location (or central tendency) and statistical dispersion, respectively. In the tables,
the best and second best values for every problem are emphasized with dark and light gray background, to allow easily
identifying the best performing algorithms. The Wilcoxon signed-rank test was applied to assess the statistical confidence in
the pairwise comparisons between FAME and the state-of-the-art algorithms, at the 95% confidence level: ‘’ is used when
FAME is statistically better than the corresponding algorithm, while ‘ ’ means the opposite. The symbol ‘–’ is used when
there is no statistical difference. Finally, the Friedman statistical test was applied to analyze the overall performance of the
algorithms across all considered problems [13].
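The average ranks underlying this analysis can be computed as follows (an illustrative sketch of Friedman average ranking with mean ranks for ties; the significance analysis itself follows the methodology of [13]):

```python
def friedman_ranks(scores, lower_is_better=True):
    """scores: dict algorithm -> list of indicator values, one per problem.
    Returns each algorithm's average rank across problems (rank 1 = best);
    tied values share the mean of the positions they occupy."""
    algs = list(scores)
    n_problems = len(next(iter(scores.values())))
    totals = {a: 0.0 for a in algs}
    for p in range(n_problems):
        vals = sorted((scores[a][p] for a in algs),
                      reverse=not lower_is_better)
        for a in algs:
            v = scores[a][p]
            positions = [i + 1 for i, x in enumerate(vals) if x == v]
            totals[a] += sum(positions) / len(positions)
    return {a: totals[a] / n_problems for a in algs}
```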
Fig. 4. Best Pareto fronts reported by MOEA/D-DE and FAME for GLT4, according to IHV.
8. Results
We show in Table 3 the median and interquartile range of the three considered quality indicators for all the evaluated
benchmark problems. Overall, it is possible to see at a glance that FAME stands out in our comparison, as it achieves the
best or second best figures in most of the cases.
When focusing on IHV , MOEA/D-DE stands as the best competitor to FAME. It outperforms our approach in five problems
(namely, LZ09F2, LZ09F4, LZ09F9, GLT1, and GLT4), but it is outperformed by FAME for seven other problems, with statistical
significance. In the comparison with the other four algorithms, FAME is only significantly outperformed by SMPSOhv on
LZ09F6. We can see how FAME is better than BORG and SMS-EMOA in all problems with statistical significance. This is also
the case for SMEA, except for one problem where no statistical difference can be guaranteed.
We now analyze one of the cases when FAME is outperformed by another algorithm in terms of IHV . Fig. 4 depicts the
best Pareto front approximations computed by MOEA/D-DE and FAME for the GLT4 problem, according to IHV . The figure
shows that the front computed by FAME provides a clearly better distribution of points than the one computed by MOEA/D-DE, in spite of the results in Table 3. Actually, the better performance of MOEA/D-DE for this problem is due to its robustness, as evidenced by its smaller IQR: 3.5E−4 against the 2.4E−1 obtained by FAME.
According to Iε+, FAME achieves the best values in nine out of fifteen problems, and four second-best values. It outperforms
SMS-EMOA and BORG in all problems with statistical significance. SMPSOhv and SMEA outperform FAME in two problems
each: LZ09F4 and GLT3, respectively, LZ09F6 and GLT5. MOEA/D-DE and SMEA are the best algorithms after FAME, each
providing two best results. If we consider the pairwise comparison of FAME with the rest of the algorithms (15 problems and
5 algorithms, making a total of 75 pairwise comparisons), our proposal computes better results with statistical significance
in 65 cases, and it is outperformed by another algorithm in only 6 cases. In the remaining four comparisons, no statistical
significance was found.
Focusing on IGS , FAME also stands out as the best performing algorithm, beating the others in 60 out of the 75 pairwise
comparisons. In this case, SMEA appears as its strongest competitor. In the comparison FAME versus SMEA, the former
obtains better IGS values in five problems while it is outperformed by the latter in five cases, all at the 95% significance
level. The reason for the performance of SMEA may be the fact that the neighborhood relationships are updated every time
a new solution is added to the population, therefore allowing the algorithm to gather information about the distribution of
the solutions at every search stage. The MOEA/D-DE algorithm, which was the second most competitive algorithm according
to IHV and I + , performs poorly in terms of IGS . The use of weight vectors, although providing accurate results, seems to
compromise the diversity of the solutions, as demonstrated by the fact that MOEA/D-DE cannot achieve the best result
for any of the evaluated problems. Remarkably, FAME performs poorly on the GLT3 problem in terms of this
indicator (it is statistically worse than all the evaluated algorithms). To further investigate this result, Fig. 5 shows the
Pareto fronts with the best IGS value for FAME, compared to the three best-performing algorithms for this problem. Despite its worse IGS values, FAME's computed approximations cover the whole Pareto front, whereas BORG only found nine solutions (although well distributed), the front found by MOEA/D-DE presents a highly dense region in the elbow of the front, and SMS-EMOA can only find two solutions on the right-hand side of the elbow.
If we focus on the three-objective problems studied (LZ09F6, GLT5, and GLT6), we can see that FAME also outperforms the
other algorithms in terms of convergence and diversity of the computed approximations to the Pareto front. Only SMPSOhv
beats FAME in terms of IHV and Iε+ on LZ09F6. An example of the fronts computed by FAME and SMEA is shown in Fig. 6. In
this case, the best Pareto fronts computed by each algorithm according to the IGS metric for the GLT6 problem are plotted.
The figure visually illustrates the clearly better Pareto front approximation computed by FAME. Finally, we also analyzed the
overall performance of the algorithms, for all studied problems. Table 4 shows the Friedman ranks of the six algorithms for
the three considered indicators, with 95% significance.
Table 3
Median and IQR of the 6 algorithms over 100 independent runs in terms of IHV, Iε+, and IGS. Dark/light gray emphasizes the best/second-best results.
Fig. 5. Best Pareto fronts provided by the three best performing algorithms (BORG, MOEA/D-DE, and SMS-EMOA), as well as FAME, for the GLT3 problem, according to IGS.
Fig. 6. Best Pareto fronts provided by SMEA and FAME for GLT6 according to IGS .
Table 4
Average rankings of the algorithms.

IHV: FAME (1.78), MOEA/D-DE (2.57), SMEA (3.32), SMPSOhv (3.90), SMS-EMOA (4.63), BORG (4.79)
Iε+: FAME (1.75), MOEA/D-DE (2.85), SMEA (3.22), SMPSOhv (3.52), BORG (4.77), SMS-EMOA (4.90)
IGS: FAME (2.09), SMEA (3.02), SMPSOhv (3.21), MOEA/D-DE (4.11), BORG (4.16), SMS-EMOA (4.42)
Table 5
Median and IQR of FAME using CD vs. SSD density estimators over 100 independent runs. Dark/light gray background emphasizes the best/second-best results.
We can see that FAME is clearly the best performing algorithm in terms of all studied indicators. The second-best algorithm in terms of IHV and Iε+ is MOEA/D-DE, followed by SMEA and SMPSOhv, in that order. In terms of IGS the ranking changes,
and SMEA stands as the second best algorithm, followed by SMPSOhv, MOEA/D-DE, BORG, and SMS-EMOA.
9. Performance analysis of SSD density estimator
Many multi-objective algorithms use density estimators for computing well distributed approximations to the Pareto
front. One commonly adopted option is based on the HV contribution of each solution to the computed approximation.
Although it often provides accurate results, its exponential complexity with the number of objectives [42] compromises
the scalability of the algorithm. Another option we can find in the literature is the Pareto strength and raw fitness from
SPEA2 [49]. It presents a lower complexity than HV, O(n³), but we still consider it high. The Crowding Distance (CD) used by NSGA-II [8] is a low-complexity alternative, O(k·n·log n), but its performance decreases for problems with three or more objectives. The SSD density estimator proposed in this paper is a suitable solution that can be adopted by other algorithms in the literature. Its complexity, O(n²·log n), is slightly higher than that of CD. However, our proposal is highly competitive with CD on two-dimensional problems, and more suitable for three-dimensional ones, as we show in this section.
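For reference, the CD baseline used in this comparison can be sketched as follows (the standard NSGA-II crowding distance; SSD itself is defined earlier in the paper and is not reproduced here):

```python
def crowding_distance(front):
    """NSGA-II crowding distance: for each objective, sort the front and
    accumulate the normalized gap between each solution's two neighbors.
    Boundary solutions get infinite distance; O(k n log n) overall."""
    n = len(front)
    if n < 3:
        return [float("inf")] * n
    k = len(front[0])
    dist = [0.0] * n
    for m in range(k):
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for j in range(1, n - 1):
            dist[order[j]] += (front[order[j + 1]][m]
                               - front[order[j - 1]][m]) / (hi - lo)
    return dist
```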
We now compare the SSD and CD density estimators. To this end, we configure FAME in two ways: one with the SSD technique and the other with the CD technique, to manage the diversity of solutions in the archive/population. The latter configuration will be referred to as FAME-CD, while the former is called FAME (as before). The results obtained by these two
configurations are summarized in Table 5 using the median and IQR values after 100 independent runs for the indicators
IHV, Iε+, and IGS. As in the previous section, the symbol ‘’ is used to show that our algorithm statistically outperforms the
compared one (in this study, FAME-CD), with 95% significance. The opposite case is indicated with ‘ ’, and no statistical
difference is indicated with ‘ - ’.
In terms of the IHV metric, we can see that both algorithms perform similarly, statistically outperforming the other one
in four problems. FAME-CD outperforms FAME in three problems from the LZ benchmark and one from the GLT benchmark,
whereas FAME is superior in two problems from the LZ benchmark and two more from the GLT benchmark. No statistical
differences were found in seven problems. It is important to note that FAME offers statistically better performance than
FAME-CD in all the three-objective problems (LZ09F6, GLT5, GLT6).
In terms of the Iε+ metric, a similar behavior can be observed. FAME-CD is superior in three problems (LZ09F1, GLT1,
and GLT2), while FAME statistically outperforms it in three problems too: LZ09F6, GLT5, and GLT6. No statistical differences
Table 6
Global percentage of failed contributions per operator (one independent run).

Problem   DE       SBX      PM       UM
LZ09F1    26.93%   14.42%   12.10%   17.17%
LZ09F2     3.50%    5.69%   11.19%   20.15%
LZ09F3     3.70%    8.21%    9.68%   23.86%
LZ09F4     5.06%   10.26%    9.39%   24.84%
LZ09F5     4.30%   11.22%    7.72%   21.12%
LZ09F6     6.21%   18.09%   20.60%   25.11%
LZ09F7     6.01%    4.74%    0.92%    0.90%
LZ09F8     2.83%    4.78%    0.97%    0.79%
LZ09F9     2.83%    4.82%   12.97%   23.30%
GLT1       8.69%    4.32%    3.69%    8.46%
GLT2       5.38%    4.34%    8.48%   17.52%
GLT3      30.96%   13.63%    7.45%   12.88%
GLT4       9.34%    4.05%    4.78%    8.82%
GLT5      14.88%   16.14%   32.70%   42.86%
GLT6      16.44%   19.09%   29.48%   38.38%
Average    9.80%    9.59%   12.37%   18.81%
were found in the other nine problems studied. Using the SSD density estimator also led to statistically better performance for all three-objective problems, according to Iε+.
The tie between FAME and FAME-CD in terms of IHV and Iε+ is broken when considering IGS. In terms of the diversity
of solutions in the Pareto front approximations, FAME-CD can only outperform FAME in one single bi-objective problem,
namely LZ09F7, whereas our proposed algorithm was found to provide statistically better results in seven problems: four
bi-objective problems from the LZ benchmark (LZ09F1, LZ09F2, LZ09F4, LZ09F5) and three from the GLT (GLT2, GLT5, GLT6).
FAME statistically outperformed FAME-CD in the GLT5 and GLT6 three-objective problems, and no statistical differences were
found for the other three-dimensional problem, LZ09F6.
To summarize, the experimental results show a significant improvement in the distribution of the solutions found when
using SSD, as measured by the IGS metric. The results are similar regarding the other two metrics. The meaningful improvements in IGS , together with the similar performance in accuracy, and the superior performance for three-objective problems,
justify our decision to design SSD and implement it in FAME. Although the complexity of SSD is greater than that of CD, it
is polynomial, and therefore lower and more scalable than other common alternatives, such as those based on HV contributions.
We conclude that SSD stands out as a highly competitive new player for the estimation of the density of the solutions
in the Pareto front approximation of multi-objective algorithms. The experimental results from Section 8 include algorithms
using different environmental selection schemes, such as hypervolume contributions (SMS-EMOA), weight vectors (MOEA/D-DE), and ε-box (Borg), besides SSD (FAME), which is a strong competitor.
10. Analysis of the adaptation of the operators in FAME
A key issue in the design of FAME is the implementation of an adaptive mechanism for the choice of the variation
operators to be applied each time. In this section, we gain insight into the choices made by the algorithm during the evolution, and their impact on the overall performance. We have kept track of the contribution of the operators to the archive
of non-dominated solutions in every time window when solving all the problems considered in this research. Although
this obviously changes for different runs due to the stochastic behavior of the algorithm, it is possible to observe a similar
pattern in each problem and independent run. We monitor the contributions (i.e., the number of solutions generated that
are incorporated in the archive of non-dominated solutions) of every operator in the time window, before calling FIS. We
compute, for every operator, the percentage of failed offspring solutions, i.e., the share of the solutions it generated that did not contribute to the archive. These values are shown in Table 6 for every operator in all problems,
after one single run.
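The failure percentages of Table 6 can be reproduced from the raw counters as follows (a sketch; normalizing by each operator's own number of generated solutions, as done here, is our reading of the table):

```python
def failure_percentage(generated, contributed):
    """Share of an operator's offspring that did NOT enter the archive of
    non-dominated solutions (per-operator normalization is an assumption)."""
    if generated == 0:
        return 0.0
    return 100.0 * (generated - contributed) / generated
```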
The adaptive choice of operators is not an easy task: an optimal adaptation would mean that all solutions generated during the search by every operator contribute to the archive. FAME achieves a remarkable adaptation in GLT1 and GLT4, where
the percentage of failed offspring solutions is less than 10% (a desirable behavior) for all operators. The average percentage
of failed offspring solutions of all operators for all problems is roughly between 10% and 19%, meaning that over 80% of all
generated solutions during the search contributed to the archive. The evolution of the contribution of the operators is also
graphically studied for some representative problems. In order to graphically present the adaptive behavior, only a sample
of 100 time windows is considered (out of over 10,000 time windows for the LZ benchmark, and 3000 for GLT). Therefore,
we have plotted one value every 107 (32) time windows for the LZ (GLT) problems.
We now graphically analyze the contribution to the performance of FAME of the mechanism for selecting operators. The
plots show the use of the different operators and the evolution of the HV of the solutions in the archive. The y axis shows
both the percentage of use of the operator (solid gray line) and the percentage of use of the operator leading to an update
in the archive (dashed line). If the values of the two lines are the same, then 100% of the generated offspring contributed
Fig. 7. Contribution of operators in FAME for problem LZ09F8 and evolution of IHV .
to the archive. The x axis presents the time step of the evolution, discretizing all the evaluations performed into 100 steps
(from 1 to 101). The circles represent the IHV values, normalized by the extreme points of the optimal Pareto fronts.
Fig. 7 shows the case of problem LZ09F8. For this problem, the number of offspring solutions that do not contribute to
the archive is less than 1% for the UM and PM operators, and never higher than 5% for the others. The good predictions
made for the UM and PM operators are captured in the plots at the bottom, showing an almost perfect match between the
predictions and the real contributions of the two operators, all along the run. In the case of DE and SBX (upper plots), there
is also an accurate match between predictions and actual contributions, with only a few gaps in the early stages of the
search. We would like to emphasize how the contribution of SBX is null at the beginning, until time step 5 (i.e., generation
280), when it starts contributing. FAME is able to detect the change, accurately predicting its contribution in the next steps.
These accurate predictions allow the algorithm to keep improving the Pareto front approximation over the whole run.
As in the case of LZ09F8, the selection mechanism used in FAME also offers a highly accurate performance for GLT1,
as depicted in Fig. 8. The percentage of generated solutions that do not contribute to the archive is less than 10% for all
operators. In this case, we can draw the same conclusions as before: the hypervolume value gradually improves through the
search, driven by the capacity of FAME to accurately predict the contributions of the considered operators in the different
stages of the search. We would like to emphasize how FAME detects, at around time step 50, that the contribution of
DE suddenly changes from almost 0% to over 60%, increasing at the same time the contribution of PM, and significantly
reducing it in the case of SBX and UM, down to around 0%. FAME is able to predict all these simultaneous changes at
that point, showing a highly adaptive behavior. We now analyze the cases in which FAME obtains the worst percentages
in Table 6. They correspond to LZ09F1, GLT3, GLT5, and GLT6. For all of them we can observe a similar behavior: a quick
convergence of the Pareto front approximation to highly accurate HV values, which then stalls for the rest of the run, inflating the prediction errors (only a few of the newly generated solutions from that point on contribute to the
archive). Although the prediction errors are high for these four problems, FAME is extremely competitive with the compared
state-of-the-art algorithms, significantly outperforming them in 44 out of the 60 cases (comparisons against five algorithms,
for three performance metrics and four problems), as shown in Table 3. A representative example is shown in Fig. 9 for
the GLT5 problem. It can be seen how the archive converges to less than 10% error from the optimum in a few time steps,
making it extremely difficult to have new solutions contribute.
Fig. 8. Contribution of the operators in FAME for problem GLT1 and evolution of IHV .
Fig. 9. Contribution of the operators in FAME for problem GLT5 and evolution of IHV .
11. Conclusions and Future Research
In this paper we have proposed a new multi-objective EA, called FAME (standing for Fuzzy Adaptive Multi-objective
Evolutionary algorithm). It was shown to clearly outperform five algorithms from the state of the art on the well-known
LZ and GLT benchmarks, containing both two- and three-dimensional problems. The comparison was made in terms of
three quality metrics: the well-known hypervolume, additive epsilon, and generalized spread indicators. For every metric, FAME stands out as the best of the algorithms included in the comparison.
FAME implements two innovative components that notably contribute to its excellent performance on the studied benchmarks. The first is an adaptive selection of operators based on a fuzzy logic engine that predicts, among a set of recombination and mutation operators, those that are most likely to improve the current Pareto front approximation, assigning them different probabilities of being employed according to the predicted values. These predictions are based on the contributions of the different operators to the search during the previous generations. The second is a novel, effective density estimator with polynomial complexity, called Spatial Spread Deviation (SSD), which is used to manage the non-dominated solutions of the approximate Pareto front. Thorough studies were made to analyze the contribution of the two novel components to the performance of the algorithm; they can be adopted by other multi-objective optimization algorithms with the expectation of improving their performance. This will be addressed in future research. In addition, we plan to evaluate the performance of the algorithms when solving combinatorial problems, as well as considering real-world applications.
Acknowledgments
A. Santiago and H. Fraire thank Consejo Nacional de Ciencia y Tecnología and Tecnológico Nacional de México for contracts [360199, 6002.16-P] and [280081] (Redes Temáticas). B. Dorronsoro acknowledges the Spanish Ministerio de Economía y Competitividad and European Regional Development Fund for the support provided under contracts [TIN2014-60844-R] (SAVANT project) and [RYC-2013-13355].
References
[1] E. Alba, B. Dorronsoro, Cellular Genetic Algorithms, Springer–Verlag, 2008.
[2] J. Barraza, P. Melin, F. Valdez, C. González, Fireworks algorithm (FWA) with adaptation of parameters using fuzzy logic, in: Nature-Inspired Design of
Hybrid Intelligent Systems, Springer–Verlag, 2017, pp. 313–327.
[3] N. Beume, B. Naujoks, M. Emmerich, SMS-EMOA: Multiobjective selection based on dominated hypervolume, Eur. J. Oper. Res. 181 (3) (2007)
1653–1669.
[4] B. Bošković, J. Brest, Protein folding optimization using differential evolution extended with local search and component reinitialization, Inf. Sci. 454–455
(2018) 178–199.
[5] P. Chakraborty, S. Das, G.G. Roy, A. Abraham, On convergence of the multi-objective particle swarm optimizers, Inf. Sci. 181 (8) (2011) 1411–1425.
[6] C.A.C. Coello, G.B. Lamont, D.A.V. Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems, Springer–Verlag, 2006.
[7] K. Deb, S. Karthik, T. Okabe, Self-adaptive simulated binary crossover for real-parameter optimization, in: Genetic and Evolutionary Computation
Conference, ACM, 2007, pp. 1187–1194.
[8] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput. 6 (2) (2002) 182–197.
[9] K. Deb, A. Sinha, S. Kukkonen, Multi-objective test problems, linkages, and evolutionary methodologies, in: Genetic and Evolutionary Computation
Conference, ACM, 2006, pp. 1141–1148.
[10] B. Dorronsoro, G. Danoy, A.J. Nebro, P. Bouvry, Achieving super-linear performance in parallel multi-objective evolutionary algorithms by means of
cooperative coevolution, Comput. Oper. Res. 40 (6) (2013) 1552–1563.
[11] J. Durillo, A. Nebro, C. Coello, J. García-Nieto, F. Luna, E. Alba, A study of multiobjective metaheuristics when solving parameter scalable problems,
IEEE Trans. Evol. Comput. 14 (4) (2010) 618–635.
[12] J. Durillo, A. Nebro, F. Luna, E. Alba, On the effect of the steady-state selection scheme in multi-objective genetic algorithms, in: Evolutionary Multi-Criterion Optimization, Springer–Verlag, 2009, pp. 183–197.
[13] S. García, D. Molina, M. Lozano, F. Herrera, A study on the use of non-parametric tests for analyzing the evolutionary algorithms’ behaviour: A case
study on the CEC’2005 special session on real parameter optimization, J. Heuristics 15 (6) (2008) 617–644.
[14] I. Giagkiozis, R.C. Purshouse, P.J. Fleming, Generalized decomposition, in: International Conference on Evolutionary Multi-Criterion Optimization,
Springer–Verlag, 2013, pp. 428–442.
[15] D. Hadka, P. Reed, Borg: An auto-adaptive many-objective evolutionary computing framework, Evol. Comput. 21 (2) (2013) 231–259.
[16] V.L. Huang, A.K. Qin, P.N. Suganthan, M.F. Tasgetiren, Multi-objective optimization based on self-adaptive differential evolution algorithm, in: IEEE
Congress on Evolutionary Computation, 2007, pp. 3601–3608.
[17] V.L. Huang, S.Z. Zhao, R. Mallipeddi, P.N. Suganthan, Multi-objective optimization using self-adaptive differential evolution algorithm, in: IEEE Congress
on Evolutionary Computation, 2009, pp. 190–194.
[18] J. Huisman, J. Rings, J. Vrugt, J. Sorg, H. Vereecken, Hydraulic properties of a model dike from coupled Bayesian and multi-criteria hydrogeophysical
inversion, J. Hydrol. 380 (1) (2010) 62–73.
[19] A.W. Iorio, X. Li, Solving rotated multi-objective optimization problems using differential evolution, in: Australian Conference on Artificial Intelligence,
2004, pp. 861–872.
[20] H. Li, Q. Zhang, Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II, IEEE Trans. Evol. Comput. 13 (2) (2009) 284–302.
[21] L. Martí, J. García, A. Berlanga, J.M. Molina, A stopping criterion for multi-objective optimization evolutionary algorithms, Inf. Sci. 367–368 (2016) 700–718.
[22] P. Melin, F. Olivas, O. Castillo, F. Valdez, J. Soria, M. Valdez, Optimal design of fuzzy classification systems using PSO with dynamic parameter adaptation
through fuzzy logic, Expert Syst. Appl. 40 (8) (2013) 3196–3206.
[23] M. Ming, R. Wang, Y. Zha, T. Zhang, Pareto adaptive penalty-based boundary intersection method for multi-objective optimization, Inf. Sci. 414 (2017) 158–174.
[24] A. Nebro, J. Durillo, On the effect of applying a steady-state selection scheme in the multi-objective genetic algorithm NSGA-II, in: Nature-Inspired
Algorithms for Optimization, Springer–Verlag, 2009, pp. 435–456.
[25] A. Nebro, J. Durillo, J. García-Nieto, C. Coello, F. Luna, E. Alba, SMPSO: A new PSO-based metaheuristic for multi-objective optimization, in: Multi-Criteria Decision-Making, 2009, pp. 66–73.
[26] A. Nebro, F. Luna, E. Alba, B. Dorronsoro, J.J. Durillo, A. Beham, AbYSS: Adapting scatter search to multiobjective optimization, IEEE Trans. Evol. Comput. 12 (4) (2008) 439–457.
[27] A.J. Nebro, J.J. Durillo, C.A.C. Coello, Analysis of leader selection strategies in a multi-objective particle swarm optimizer, in: Congress on Evolutionary
Computation, 2013, pp. 3153–3160.
[28] A.J. Nebro, J.J. Durillo, M. Vergne, Redesigning the jMetal multi-objective optimization framework, in: Genetic and Evolutionary Computation Conference, ACM, 2015, pp. 1093–1100.
[29] S.F.H. Noorbin, A. Alfi, Adaptive parameter control of search group algorithm using fuzzy logic applied to networked control systems, Soft Comput.
(2017) 1–22.
[30] P. Ochoa, O. Castillo, J. Soria, Differential evolution using fuzzy logic and a comparative study with other metaheuristics, in: Nature-Inspired Design of
Hybrid Intelligent Systems, Springer-Verlag, 2017, pp. 257–268.
[31] F. Olivas, F. Valdez, O. Castillo, Gravitational search algorithm with parameter adaptation through a fuzzy logic system, in: Nature-Inspired Design of
Hybrid Intelligent Systems, Springer–Verlag, 2017, pp. 391–405.
[32] F. Olivas, F. Valdez, O. Castillo, C.I. Gonzalez, G. Martinez, P. Melin, Ant colony optimization with dynamic parameter adaptation based on interval type-2 fuzzy logic systems, Appl. Soft Comput. 53 (2017) 74–87.
[33] C. Peraza, F. Valdez, O. Castillo, An improved harmony search algorithm using fuzzy logic for the optimization of mathematical functions, in: Design
of Intelligent Systems Based on Fuzzy Logic, Neural Networks and Nature-Inspired Optimization, Springer–Verlag, 2015, pp. 605–615.
[34] J. Pérez, F. Valdez, O. Castillo, A new bat algorithm with fuzzy logic for dynamical parameter adaptation and its applicability to fuzzy control design,
in: Fuzzy Logic Augmentation of Nature-Inspired Optimization Metaheuristics: Theory and Applications, Springer, 2015, pp. 65–79.
[35] Y. Qi, X. Ma, F. Liu, L. Jiao, J. Sun, J. Wu, MOEA/D with adaptive weight adjustment, Evol. Comput. 22 (2) (2014) 231–264.
[36] M. Reyes, C. Coello, Improving PSO-based multi-objective optimization using crowding, mutation and ε-dominance, in: Evolutionary Multi-Criterion Optimization Conference, Springer-Verlag, 2005, pp. 509–519.
[37] S. Roy, U. Chakraborty, Introduction to Soft Computing: Neuro-Fuzzy and Genetic Algorithms, Dorling Kindersley, 2013.
[38] A. Santiago, H.J. Fraire Huacuja, B. Dorronsoro, J.E. Pecero, C. Gómez Santillan, J.J. González Barbosa, J.C. Soto Monterrubio, A survey of decomposition
methods for multi-objective optimization, in: Recent Advances on Hybrid Approaches for Designing Intelligent Systems, in: Studies in Computational
Intelligence, Vol. 547, Springer-Verlag, 2014, pp. 453–465.
[39] G. Toscano Pulido, C. Coello Coello, The micro genetic algorithm 2: Towards online adaptation in evolutionary multiobjective optimization, in: Evolutionary Multi-Criterion Optimization, LNCS, Vol. 2632, Springer-Verlag, 2003.
[40] F. Valdez, P. Melin, O. Castillo, A survey on nature-inspired optimization algorithms with fuzzy logic for dynamic parameter adaptation, Expert Syst. Appl. 41 (14) (2014) 6459–6466.
[41] J. Vrugt, B. Robinson, Improved evolutionary optimization from genetically adaptive multimethod search, in: Proc. of the National Academy of Sciences of the USA, Vol. 104, 2007, pp. 708–711.
[42] L. While, L. Bradstreet, L. Barone, A fast way of calculating exact hypervolumes, IEEE Trans. Evol. Comput. 16 (1) (2012) 86–95.
[43] T. Wöhling, J. Vrugt, Multi-response multi-layer vadose zone model calibration using Markov chain Monte Carlo simulation and field water retention data, Water Resour. Res. 47 (4) (2011) 1–19.
[44] X. Yu, X. Yu, Y. Lu, G.G. Yen, M. Cai, Differential evolution mutation operators for constrained multi-objective optimization, Appl. Soft Comput. 67
(2018) 452–466.
[45] F. Zeng, M. Low, J. Decraene, S. Zhou, W. Cai, Self-adaptive mechanism for multi-objective evolutionary algorithms, in: Int. Conf. on Artificial Intelligence and Applications, 2010, pp. 7–12.
[46] H. Zhang, A. Zhou, S. Song, Q. Zhang, X.Z. Gao, J. Zhang, A self-organizing multiobjective evolutionary algorithm, IEEE Trans. Evol. Comput. 20 (5)
(2016) 792–806.
[47] Q. Zhang, H. Li, MOEA/D: A multiobjective evolutionary algorithm based on decomposition, IEEE Trans. Evol. Comput. 11 (6) (2007) 712–731.
[48] A. Zhou, Y. Jin, Q. Zhang, B. Sendhoff, E. Tsang, Combining model-based and genetics-based offspring generation for multi-objective optimization using
a convergence criterion, in: Congress on Evolutionary Computation, IEEE, 2006, pp. 3234–3241.
[49] E. Zitzler, M. Laumanns, L. Thiele, SPEA2: Improving the Strength Pareto Evolutionary Algorithm, Technical Report, ETH, Switzerland, 2001.
[50] E. Zitzler, L. Thiele, M. Laumanns, C.M. Fonseca, V.G. da Fonseca, Performance assessment of multiobjective optimizers: An analysis and review, IEEE
Trans. Evol. Comput. 7 (2) (2003) 117–132.