Article info

Article history: Available online 25 April 2018

Keywords: Solar energy; Soil reuse; Fuzzy c-means clustering; Differential evolution; Genetic algorithm; Particle swarm optimization

Abstract

Abandoned areas with substances potentially harmful to the environment or human health have raised numerous concerns around the world. The objective of the present study is to analyze the possibility of building solar power plants capable of capturing solar energy in unproductive areas, both contaminated and uncontaminated. For this purpose, American data from the National Solar Radiation Database (NSRDB) were used, a collection of hourly solar radiation measurements and meteorological data, as well as data from the RE-Powering America's Land project, run by the United States Environmental Protection Agency (EPA). In the analysis, the variables “mapped area”, “distances to the transmission lines”, “solar direct normal irradiance on a utility scale” and “off-grid direct normal irradiance” were considered. To define the best locations, the data were initially pre-processed. A new hybrid fuzzy c-means (HFCM) algorithm was then applied for the data clustering, initialized, comparatively, by three metaheuristics: Differential Evolution (DE), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The number of clusters obtained was validated by three metrics: Calinski-Harabasz Criterion, Davies-Bouldin Criterion and Silhouette Coefficient, with all three unanimously indicating two clusters as the ideal number: one cluster for locations with greater potential for allocating facilities to capture solar energy, and another for locations with a lower potential. With the new approach, an increase of 23.3% in the training speed of the HFCM algorithm was identified, which required fewer iterations to achieve the same value of the objective function. In addition, a round of experiments was conducted with six different datasets (instances from literature) and the results showed that the proposed method can achieve better results (faster convergence and smaller solution cost) than the classic FCM. Visually, the predominance of the allocation of facilities was perceived in states with a greater average incidence of solar radiation. Therefore, this was the predominant factor in the convergence of the algorithm, which is in accordance with expectations for solar energy. Finally, the social, economic and environmental gains were considered with the revitalization of unproductive land with the possibility of implementing solar power plants in these areas.
© 2018 Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.jclepro.2018.04.207
0959-6526/© 2018 Elsevier Ltd. All rights reserved.
D.G.B. Franco, M.T.A. Steiner / Journal of Cleaner Production 191 (2018) 445-457
Meyfroidt, 2011; Morio et al., 2013), leading to demands for greater efficiency in territorial occupation, especially the reuse of abandoned areas, posing one of the greatest challenges (Morio et al., 2013; Nuissl and Schroeter-Schlaack, 2009; U.S. Government Publishing Office, 2002). This situation is further aggravated if such areas are large and contaminated, thus causing, in addition to risks to health and the environment, economic risks (Apostolidis and Hutton, 2006; Cao and Guan, 2007; de Sousa, 2003; Kaufman et al., 2005; Morio et al., 2013).

The aim of the present work is to analyze the use of clustering analysis (CA) to determine the best locations to install solar power plants on unproductive areas (whether contaminated or not). By using public information on the main unused areas in the continental United States, a new hybrid clustering methodology was developed. This was composed of the pre-processing of data and the application of a hybrid fuzzy c-means (HFCM) algorithm, initialized comparatively by three metaheuristics: Differential Evolution (DE), Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), with a view to classifying these areas as appropriate or inappropriate for the implementation of these power plants. The main innovation of this new approach consists in speeding up the FCM algorithm while achieving equal or better results in clustering problems. Subsequently, the technique can be exploited and expanded to other domains.

As increased energy consumption clashes with the implications of fossil fuel consumption and the consequent emission of toxic and greenhouse gases, investments are required for research and development in new clean and renewable energy sources (Almeida et al., 2017; Baños et al., 2011; Cadez and Czerny, 2016; Manan et al., 2017; Manzano-Agugliaro et al., 2012; Perea-Moreno et al., 2017). Thus, renewable energy sources, such as solar energy, have become strong candidates in the new race for productivity and environmental and social well-being (González et al., 2017; Lima et al., 2013). They are also in vogue in political, business and social discourse in general (Onat et al., 2014; Simas and Pacca, 2013). In this scenario, solar energy is considered an abundant, free and clean energy source (Fernández-García et al., 2015).

The present study is organized as follows. After this introduction, Section 2 looks at the theoretical framework regarding the process of knowledge discovery in databases, FCM, DE, GA and PSO, including a brief literature review and the state of the art of CA techniques. In Section 3, the data used in the case study are introduced with their respective pre-processing, followed by a description of the proposed methodology. The results are presented and discussed in Section 4, with a comparison with instances from the literature in order to check the efficiency of the HFCM. Finally, the study is summarized and concluded in Section 5.

2. Theoretical framework

A literature review is presented below regarding knowledge discovery in databases (KDD), data mining and clustering.

KDD is a field of knowledge dedicated to identifying the extraction of significant patterns of information from databases (Fayyad et al., 1996). KDD is applied in multiple stages, beginning with the selection, pre-processing and transformation of data, which may include removing outliers, substituting missing data, normalization, principal component analysis (PCA) and other techniques, depending on the algorithm of the subsequent stage, data mining (or learning), and ending with the interpretation of results and the generation of knowledge (Fayyad et al., 1996; Gamarra et al., 2016; Orriols-Puig et al., 2013).

The data mining stage may involve one or several algorithms in search of patterns, trends and structures in the database, which can take a variety of forms, such as equations, networks, graphs or sets of rules (Roiger, 2017; Witten et al., 2017). In this learning stage, two distinct approaches can be used. In the first approach, supervised learning, previously defined structures and patterns are considered a priori. In the second approach, unsupervised learning, these possible structures and patterns are not considered. It is left to the algorithm to identify these relationships between the variables (Orriols-Puig et al., 2013).

Clustering belongs to the latter approach, unsupervised learning, in which the instances are clustered based on some rule inherent to the structure, such as the distance between them (Bramer, 2016; Roiger, 2017). Put simply, once the number of clusters and their respective centers (so-called centroids) have been defined, there is a first stage of assigning the instances to the closest centroid, followed by the optimization of the centroid locations, minimizing the objective function (Aggarwal, 2015).

There are many types of clustering algorithms (Fahad et al., 2014; Halkidi et al., 2001; Nanda and Panda, 2014; Xu and Wunsch, 2005); some of them are highlighted in Table 1, with the respective references from which they originated.

2.1. State of the art of clustering analysis

For the analysis of the state of the art of CA algorithms, the most frequently cited articles in the Elsevier database for each year from 2010 to 2017 were used.

In 2010, Deng et al. presented the Enhanced Soft Subspace Clustering (ESSC) algorithm for inserting between-cluster information (cluster separation) in the objective function of the fuzzy clustering algorithms, as well as within-cluster information (cluster compactness), improving performance in real and synthetic high dimensional datasets. In 2011, Kalogeratos and Likas addressed the issue of positioning the initial centroids. Their proposal, k-synthetic prototypes, consisted of selecting representative centroids of the characteristics of the cluster, applied to text documents, achieving better results in small datasets.

In 2012, a new methodology was proposed to identify a priori the number of clusters that adapt better to the data and which produce robust solutions using the FCM algorithm and graph partitioning (Mok et al., 2012). Continuing on the initialization of the centroids, Khan and Ahmad (2013) used the points that minimize the dissimilarity between clusters (based on the most significant attributes) as initial centroids, leading to a more rapid convergence of the K-modes algorithm in categorical datasets.

Again in relation to the a priori definition of the number of clusters, Shahbaba and Beheshti (2014) proposed the MACE-means, which uses the minimization of the initial Average Central Error (ACE) of the K-means algorithm (estimated from the within-cluster distance) to define the correct number of clusters in real and synthetic datasets. In 2015, Bruneau et al. introduced a new approach, the Cluster Sculptor, an interactive system that allows the user to improve the results of the clustering with the aid of a two-dimensional data projection.

In 2016, Saki and Kehtarnavaz proposed an online frame-based clustering algorithm (OFC) consisting of three phases: density-based removal of outliers, generation of new clusters and updating of clusters. Tests with real and synthetic datasets showed that the authors' proposal outperformed the CluStream, DenStream and SVStream algorithms, which are also online clustering algorithms. Finally, Ramon-Gonen and Gelbard (2017) presented a new temporal (evolutionary) clustering algorithm, Cluster Evolution Analysis (CEA), which addresses three aspects of the problem that vary over time: changes in the number of clusters, evolution of cluster
number $C$ of clusters is in the interval $[2, N]$. It falls to the researcher to determine (with or without the aid of metrics) the ideal number of clusters for the problem in question. The error $\varepsilon$ is also determined by the researcher in accordance with the application.

2.3. DE algorithm

DE is an evolutionary algorithm for stochastic search and global optimization that maintains, at each iteration $G$, a population of $N$ candidates (called target vectors) $x_{i,G} = [x_{j,i}]$, where $j = 1, 2, \ldots, D$ indexes the parameters of the model ($D$ is the dimension of the problem, $\mathbb{R}^D$) and $i = 1, 2, \ldots, N$. The candidates are submitted to successive stages of selection, mutation and recombination. The mutation expands the search space, creating a new solution based on the weighted difference between two members of the previous population added to a third member, as shown in Equation (2), where $v_{i,G+1}$ is called the donor vector (Brownlee, 2011; Storn and Price, 1997).

$$v_{i,G+1} = x_{r_1,G} + F \cdot (x_{r_2,G} - x_{r_3,G}) \qquad (2)$$

In iteration $G$ the indices $r_1, r_2, r_3 \in \{1, 2, \ldots, N\}$ are selected randomly and are mutually distinct, as well as different from index $i$, which imposes a population $N \geq 4$. $F \in [0, 2]$ is the real weighting constant that controls the amplitude of the differential variation. With a view to increasing the diversity of solutions generated in the mutation stage and reusing fit individuals, a recombination is executed in accordance with Equations (3) and (4). The vector $u_{i,G+1}$ is called the test vector.

$$u_{i,G+1} = (u_{1,i,G+1}, u_{2,i,G+1}, \ldots, u_{D,i,G+1}) \qquad (3)$$

belonging to the interval $[0, 1]$, and $rand_i$ is a whole number chosen at random in the interval $[1, 2, \ldots, D]$, with a view to guaranteeing $v_{i,G+1} \neq x_{i,G}$.

The target vector $x_{i,G}$ is compared with the test vector $u_{i,G+1}$ and the one with the lowest value is selected for the next generation (survival of the fittest), as shown in Equation (5).

$$x_{i,G+1} = \begin{cases} u_{i,G+1} & \text{if } f(u_{i,G+1}) \leq f(x_{i,G}) \\ x_{i,G} & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, N \qquad (5)$$

Fig. 2 presents the DE algorithm (Brownlee, 2011; Chakraborty, 2008; Price et al., 2005; Storn and Price, 1997; Xing and Gao, 2014).

In addition to the schema presented above, numerous others are useful. The classification of variants is given by the notation DE/x/y/z, where x specifies the mode of mutation, which may be rand (random) or best (the element with the lowest cost for the current population), y is the number of difference vectors and z is the mode of recombination (bin).

For this notation, the classic method is DE/rand/1/bin (Storn and Price, 1997), but it should be said that an interesting variant (in terms of population diversification, if it is sufficiently large) is DE/best/2/bin, whose mutation is given by Equation (6) (Storn, 1996).

$$v_{i,G+1} = x_{best,G} + F \cdot (x_{r_1,G} + x_{r_2,G} - x_{r_3,G} - x_{r_4,G}) \qquad (6)$$

2.4. GA algorithm
To a population of $N$ elements, initialized randomly, are applied selection, crossover and mutation operations that, over a previously defined number of iterations (generations), create new solutions from the previous population and preserve the fittest elements (in terms of maximization or minimization) according to the value of the objective function, also known as fitness (Steiner et al., 2015).

Fig. 3 presents the pseudocode of the GA algorithm, where $D$ represents the number of parameters of the model, or dimension of the problem ($\mathbb{R}^D$), and MR and CR are the probabilities of mutation and crossover, respectively. The stop criteria have to do with the maximum number of iterations and the target value to be achieved for fitness (Brownlee, 2011; Engelbrecht, 2007; Kruse et al., 2016).

One of the selection models of the parents is stochastic universal sampling, also known as roulette wheel selection, where each individual is given a probability of selection proportional to its fitness. In other words, the more adequate the individual is to the solution of the problem, the greater the probability of being selected.

In the recombination phase, or crossover, the parents are partitioned at cut-off points determined by CR and have their coordinates (for the clustering problem) exchanged, giving birth to children, which then undergo mutation at points determined by MR, with a view to increasing the diversity of available solutions.

2.5. PSO algorithm

PSO is a global optimization algorithm characterized by sweeping the search space using a swarm of particles. It belongs to the fields of swarm intelligence and collective intelligence, branches of computational intelligence. It was originally inspired by the social behavior of some animals, such as flocks of birds and shoals of fish (Kruse et al., 2016).

The search that the algorithm makes consists of $N$ individuals exploring the neighborhoods of the swarm up to a certain distance and returning information to the members on the discoveries being made. It can be viewed as a method that combines gradient-based search and population-based search, which requires that the function to be optimized should be of the type $f: \mathbb{R}^D \rightarrow \mathbb{R}$ (Brownlee, 2011; Engelbrecht, 2007; Kruse et al., 2016).

Fig. 4 shows the pseudocode of the PSO algorithm, where $w$ corresponds to the inertia factor, $C_1$ and $C_2$ are two constants and $\varphi_1$ and $\varphi_2$ are random values in the interval $[0, 1]$. $P_{pbest}$ and $P_{gbest}$ hold the best solution achieved, as of the previous iteration, by each particle and by the swarm, respectively.

2.6. Selection of the number of clusters

The number of clusters was validated by three metrics, considered the most frequently used in the literature (Arbelaitz et al., 2013): the Calinski-Harabasz Criterion, Davies-Bouldin Criterion and Silhouette Coefficient.

The Calinski-Harabasz criterion, also known as the Variance Ratio Criterion (VRC), may be defined as the ratio between the mean between-cluster variance and the mean within-cluster variance (Equation (7)). The higher its value, the better the data partitioning.

$$CH_k = \frac{\sum_{i=1}^{k} n_i \lVert c_i - \bar{x} \rVert^2}{\sum_{i=1}^{k} \sum_{x \in k_i} \lVert x - c_i \rVert^2} \cdot \frac{(N - k)}{(k - 1)} \qquad (7)$$

where $n_i$ is the number of elements in cluster $i$; $c_i$ is the centroid of cluster $i$; $\bar{x}$ is the mean of the data set; $N$ is the number of elements in the data set; $k$ is the number of clusters; $x$ is an element of the data set belonging to cluster $k_i$; and $\lVert \cdot \rVert$ is the Euclidean distance.

The Davies-Bouldin criterion is based on the ratio between the within-cluster and between-cluster distances, in accordance with Equation (8). The lower its value, the better the result.

$$DB_k = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} \left\{ \frac{d_i + d_j}{d_{i,j}} \right\} \qquad (8)$$

where $k$ is the number of clusters; $d_i$ is the average distance between each point of the $i$th cluster and the centroid of the respective cluster (analogously for $d_j$); and $d_{i,j}$ is the Euclidean distance between the centroids of clusters $i$ and $j$.

The Silhouette Coefficient, for each point of the data set, is the measurement of how similar one point is to the other points in the same cluster compared with the points of another cluster. Its value, $S_i \in [-1, 1]$, for the $i$th point is defined in accordance with Equation (9).
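As an illustration of Equations (7) and (8), the following minimal Python sketch computes the Calinski-Harabasz and Davies-Bouldin criteria for a crisp partition. It is not the authors' implementation: the function names, the use of NumPy and the four-point toy data set are assumptions made here for illustration only.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """Variance Ratio Criterion following Equation (7); higher is better."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    N = X.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    x_bar = X.mean(axis=0)          # mean of the whole data set
    between, within = 0.0, 0.0
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        # n_i * ||c_i - x_bar||^2
        between += members.shape[0] * np.sum((centroid - x_bar) ** 2)
        # sum over the cluster of ||x - c_i||^2
        within += np.sum((members - centroid) ** 2)
    # Undefined when within == 0 (every point sits on its centroid) or k == 1.
    return (between / within) * (N - k) / (k - 1)

def davies_bouldin(X, labels):
    """Davies-Bouldin criterion following Equation (8); lower is better."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    k = clusters.size
    centroids = np.array([X[labels == c].mean(axis=0) for c in clusters])
    # d_i: average distance from the points of cluster i to its centroid
    d = np.array([
        np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(clusters)
    ])
    total = 0.0
    for i in range(k):
        ratios = [
            (d[i] + d[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(k) if j != i
        ]
        total += max(ratios)
    return total / k

# Hypothetical toy data: two compact, well-separated clusters.
X_toy = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
labels_toy = np.array([0, 0, 1, 1])
ch_value = calinski_harabasz(X_toy, labels_toy)
db_value = davies_bouldin(X_toy, labels_toy)
```

In practice both functions would be evaluated for several candidate values of $k$, keeping the value that maximizes the first criterion and minimizes the second.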
Fig. 6. Direct normal irradiance and contaminated areas in the continental United States.
contribute little to the variance of the set (Theodoridis and Koutroumbas, 2009).

For the variables tested, only “off-grid direct normal irradiance” was removed, as it contributed almost nothing to the variance of the data set (only 0.5829%, compared with 50.5270% for “mapped area”, 25.9960% for “distance to the transmission lines” and 22.8941% for “direct normal irradiance on a utility scale”). Therefore, the final set used in the tests had three variables. This care is necessary to avoid correlated independent variables being included in the model (collinearity). Fig. 7 provides details of the correlation analysis for the four variables in the original data set and shows the strong correlation (Pearson's linear correlation coefficient equal to 0.98) between the variables “direct normal irradiance on a utility scale” and “off-grid direct normal irradiance”.

3.2. Proposed algorithm

As the classic FCM algorithm (Section 2.2) assumes the random initialization of the fuzzy partition matrix, a higher number of iterations is required to achieve the final solution. With this limitation in mind, here the initialization of the fuzzy partition matrix is proposed using three distinct metaheuristics (as detailed in Sections 2.3, 2.4 and 2.5): DE (Storn, 1996; Storn and Price, 1997) and GA (Goldberg, 1989), considered evolutionary strategies, and PSO (Kennedy et al., 2001; Kennedy and Eberhart, 1995; Poli, 2008), one of the swarm strategies.

The pseudocode of the resulting HFCM algorithm is shown in Fig. 8. Line 1 is the initialization of the fuzzy partition matrix ($U_0$) by one of the three metaheuristics: DE, GA or PSO.

Only one iteration was performed for the initialization phase and a population of 4 individuals was used (the minimum number of individuals for the DE algorithm).

For the visual presentation of the results (Fig. 10), simple rounding of the fuzzy partition matrix was used, which stores the degrees of membership of each instance in each cluster. Thus, the values greater than or equal to “0.5” were rounded up to “1” and values below “0.5” were rounded down to “0”. The intermediate degrees of membership (between 0.4 and 0.6, for example) could have been representatives of a third intermediate cluster. However, the aim of this work (supported by three distinct metrics) was to have only two clusters.

As in the initial stage (metaheuristic and partitional) there is no
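The rounding rule described above (memberships of at least 0.5 become 1, all others become 0) can be sketched in a few lines of Python. This is only an illustration: the function name `harden_partition` and the small membership matrix are hypothetical and do not come from the study's data.

```python
import numpy as np

def harden_partition(U, threshold=0.5):
    """Round a fuzzy partition matrix (instances x clusters) to 0/1 values."""
    U = np.asarray(U, dtype=float)
    return (U >= threshold).astype(int)

# Hypothetical memberships of three instances in two clusters (rows sum to 1).
U_example = np.array([
    [0.92, 0.08],
    [0.45, 0.55],   # an "intermediate" instance, still forced into one cluster
    [0.30, 0.70],
])
crisp = harden_partition(U_example)
cluster_of_instance = crisp.argmax(axis=1)   # index of the cluster set to 1
```

With exactly two clusters, a row equal to (0.5, 0.5) would be rounded to 1 in both columns, so a tie-breaking rule would be needed for that edge case.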
Fig. 7. Analysis of the correlation for the four variables of the data set.
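A correlation screening of the kind summarized in Fig. 7 can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic stand-ins generated here (not the NSRDB or EPA data), and the column names, the helper `correlated_pairs` and the 0.9 threshold are hypothetical choices.

```python
import numpy as np

def correlated_pairs(data, names, threshold=0.9):
    """Return (name_a, name_b, r) for column pairs with |Pearson r| >= threshold."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    flagged = []
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            if abs(corr[a, b]) >= threshold:
                flagged.append((names[a], names[b], float(corr[a, b])))
    return flagged

# Synthetic stand-in columns (assumed values, for illustration only).
t = np.arange(50)
mapped_area = 100.0 + 40.0 * np.cos(1.7 * t)
distance_to_lines = 5.0 + 3.0 * np.sin(0.9 * t + 1.0)
dni_utility = np.linspace(3.0, 8.0, 50)
# Built to be nearly a rescaled copy of dni_utility, mimicking collinearity.
dni_off_grid = 0.9 * dni_utility + 0.05 * np.sin(t)

names = ["mapped_area", "distance_to_lines", "dni_utility", "dni_off_grid"]
data = np.column_stack([mapped_area, distance_to_lines, dni_utility, dni_off_grid])
flagged = correlated_pairs(data, names)
```

A pair flagged this way is a candidate for dropping one of its two variables before clustering, which is the effect the removal of “off-grid direct normal irradiance” had in the study.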
the statistic for each of them. It should be noted that the instances
allocated to Cluster 1 have a higher mean for the variables “mapped
area” (Var. 1), “direct normal irradiance on a utility scale” (Var. 3)
and “off-grid direct normal irradiance” (Var. 4). For “distance to the
transmission lines” (Var. 2), Cluster 2 had the lowest mean, a dif-
ference of 11.6% in relation to Cluster 1. It is interesting that Cluster
2, even with five times as many instances as Cluster 1, has lower
standard deviations and variance for all the variables, one of the
indicators of quality (compactness) in clustering (Fahad et al.,
2014). Therefore, Cluster 1 has the greatest potential for the
installation of solar power plants. When choosing the best locations
among the 836 possible alternatives in Cluster 1, economic, tech-
nical and even political aspects should be taken into account.
Table 4
Statistic for the clusters found.
The "bold" means the best values for "Means" (the highest values for Cluster 1, variables (Var.) 1, 3 and 4, and smallest value for Cluster 2, variable 2) and "Standard Deviation"
(the smallest values).
Table 5
Results with Aggregation dataset.
Table 6
Results with Compound dataset.
Table 7
Results with D31 dataset.
Table 8
Results with T4.8k dataset.
Table 9
Results with Credit Card dataset.
Table 10
Results with Wine Quality dataset.
References Guha, S., Rastogi, R., Shim, K., 2000. Rock: a robust clustering algorithm for cate-
gorical attributes. Inf. Syst. 25 (5), 345e366.
Guha, S., Rastogi, R., Shim, K., 1998. CURE: an efficient clustering algorithm for large
Aggarwal, C.C., 2015. Data Mining. Springer, Cham.
databases. In: Proceedings of the 1998 ACM SIGMOD International Conference
Almeida, C.M.V.B., Agostinho, F., Huisingh, D., Giannetti, B.F., 2017. Cleaner Pro-
on Management of Data - SIGMOD '98. ACM, New York, pp. 73e84.
duction towards a sustainable transition. J. Clean. Prod. 142, 1e7.
Halkidi, M., Batistakis, Y., Vazirgiannis, M., 2001. On clustering validation tech-
Apostolidis, N., Hutton, N., 2006. Integrated water management in brownfield sites:
niques. J. Intell. Inf. Syst. 17 (2e3), 107e145.
more opportunities than you think. Desalination 188 (1e3), 169e175.
Hartmann, B., To €ro
€k, S., Bo€rcso € k, E., Ola
hne
Groma, V., 2014. Multi-objective method
Arbelaitz, O., Gurrutxaga, I., Muguerza, J., Pe rez, J.M., Perona, I., 2013. An extensive
for energy purpose redevelopment of brownfield sites. J. Clean. Prod. 82,
comparative study of cluster validity indices. Pattern Recogn. 46 (1), 243e256.
202e212.
Babu, G.P., Murty, M.N., 1994. Clustering with evolution strategies. Pattern Recogn.
Hathaway, R.J., Bezdek, J.C., 1991. Grouped coordinate minimization using Newton's
27 (2), 321e329.
method for inexact minimization in one vector coordinate. J. Optim. Theor.
Babuska, R., 1998. Fuzzy Modeling for Control. Springer, New York.
~ os, R., Manzano-Agugliaro, F., Montoya, F.G., Gil, C., Alcayde, A., Go mez, J., 2011. Appl. 71 (3), 503e516.
Ban
Hinneburg, A., Keim, D.A., 2003. A general approach to clustering in large databases
Optimization methods applied to renewable and sustainable energy: a review.
with noise. Knowl. Inf. Off. Syst. 5 (4), 387e415.
Renew. Sustain. Energy Rev. 15 (4), 1753e1766.
Hinneburg, A., Keim, D.A., 1998. An efficient approach to clustering in large
Ben-Hur, A., Horn, D., Siegelmann, H.T., Vapnik, V., 2001. Support vector clustering.
multimedia databases with noise. In: Proceedings of 4th International Confer-
J. Mach. Learn. Res. 2, 125e137.
€ ence on Knowledge Discovery and Data Mining - KDD '98. AAAI Press, New
Bergius, K., Oberg, T., 2007. Initial screening of contaminated land: a comparison of
York, pp. 58e65.
US and Swedish methods. Environ. Manag. 39 (2), 226e234.
Huang, Z., 1998. Extensions to the k-means algorithm for clustering large data sets
Bezdek, J.C., 1981. Pattern Recognition with Fuzzy Objective Function Algorithms.
with categorical values. Data Min. Knowl. Discov. 2 (3), 283e304.
Springer, New York.
Kalogeratos, A., Likas, A., 2011. Document clustering using synthetic cluster pro-
Bezdek, J.C., Ehrlich, R., Full, W., 1984. FCM: the fuzzy c-means clustering algorithm.
totypes. Data Knowl. Eng. Times 70 (3), 284e306.
Comput. Geosci. 10 (2e3), 191e203.
Karray, F.O., Silva, C. De, 2004. Soft Computing and Intelligent Systems Design:
Bezdek, J.C., Hathaway, R.J., Howard, R.E., Wilson, C.A., Windham, M.P., 1987. Local
Theory, Tools, and Applications. Addison-Wesley, Harlow.
convergence analysis of a grouped variable version of coordinate descent.
Karypis, G., Han, Eui-Hong, Kumar, V., 1999. Chameleon: hierarchical clustering
J. Optim. Theor. Appl. 54 (3), 471e477.
using dynamic modeling. Computer 32 (8), 68e75 (Long. Beach. Calif).
Bird, L., Heeter, J., Kreycik, C., 2011. Solar Renewable Energy Certificate (SREC)
Kaufman, M.M., Rogers, D.T., Murray, K.S., 2005. An empirical model for estimating
Markets: Status and Trends.
remediation costs at contaminated sites. Water Air Soil Pollut. 167 (1e4),
Bramer, M., 2016. Principles of data mining. In: Undergraduate Topics in Computer
365e386.
Science, third ed. Springer, London.
Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. In: Proceedings of ICNN
Brownlee, J., 2011. Clever Algorithms. LuLu.
'95-international Conference on Neural Networks. IEEE, Perth, Australia,
Bruneau, P., Pinheiro, P., Broeksema, B., Otjacques, B., 2015. Cluster Sculptor, an
pp. 1942e1948.
interactive visual clustering system. Neurocomputing 150 (B), 627e644.
Kennedy, J., Eberhart, R.C., Shi, Y., 2001. Swarm Intelligence, Science. Morgan
Cadez, S., Czerny, A., 2016. Climate change mitigation strategies in carbon-intensive
Kaufmann.
firms. J. Clean. Prod. 112, 4132e4143.
Khan, S.S., Ahmad, A., 2013. Cluster center initialization algorithm for K-modes
Cao, K., Guan, H., 2007. Brownfield redevelopment toward sustainable urban land
clustering. Expert Syst. Appl. 40 (18), 7444e7456.
use in China. Chin. Geogr. Sci. 17 (2), 127e134.
Kovacs, H., Szemmelveisz, K., 2017. Disposal options for polluted plants grown on
Chakraborty, U.K. (Ed.), 2008. Advances in Differential Evolution, Studies in
heavy metal contaminated brownfield lands: a review. Chemosphere 166,
Computational Intelligence. Springer, Berlin, Heidelberg.
8e20.
Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J., 2009. Modeling wine prefer-
Kruse, R., Borgelt, C., Braune, C., Mostaghim, S., Steinbrecher, M., 2016. Computa-
ences by data mining from physicochemical properties. Decis. Support Syst. 47
tional intelligence: a methodological introduction. In: Texts in Computer Sci-
(4), 547e553.
ence, Texts in Computer Science, second ed. Springer, London.
de Sousa, C.A., 2003. Turning brownfields into green space in the city of toronto.
Lambin, E.F., Meyfroidt, P., 2011. Global land use change, economic globalization,
Landsc. Urban Plan. 62 (4), 181e198.
and the looming land scarcity. Proc. Natl. Acad. Sci. U. S. A 108 (9), 3465e3472.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete
Li, X., Jiao, W., Xiao, R., Chen, W., Liu, W., 2017. Contaminated sites in China:
data via the EM algorithm. J. R. Stat. Soc. Ser. B - Methodol. 39 (1), 1e38.
countermeasures of provincial governments. J. Clean. Prod. 147, 485e496.
Deng, Z., Choi, K.-S., Chung, F.-L., Wang, S., 2010. Enhanced soft subspace clustering
Lima, F., Ferreira, P., Vieira, F., 2013. Strategic impact management of wind power
integrating within-cluster and between-cluster information. Pattern Recogn. 43
projects. Renew. Sustain. Energy Rev. 25, 277e290.
(3), 767e781.
Lu, S., Shang, Y., Li, Y., 2017. A research on the application of fuzzy iteration clus-
Desarbo, W.S., 1982. Gennclus: new models for general nonhierarchical clustering
tering in the water conservancy project. J. Clean. Prod. 151, 356e360.
analysis. Psychometrika 47 (4), 449e475.
MacQueen, J., 1967. Some methods for classification and analysis of multivariate
Dunn, J.C., 1974. A fuzzy relative of the ISODATA process and its use in detecting
observations. In: Proceedings of the Fifth Berkeley Symposium on Mathemat-
compact well-separated clusters. J. Cybern. 3 (3), 32e57.
ical Statistics and Probability. University of California Press, Berkeley,
Eberhart, R.C., Shi, Y., 2007. Computational Intelligence: Concepts to Implementa-
pp. 281e297.
tions. Morgan Kaufmann, Burlington.
Manan, Z.A., Mohd Nawi, W.N.R., Wan Alwi, S.R., Klemes, J.J., 2017. Advances in
Engelbrecht, A.P., 2007. Computational Intelligence: an Introduction, second ed.
Process Integration research for CO 2 emission reduction e a review. J. Clean.
Wiley.
Prod. 167, 1e13.
EPA, 2016. RE-powering America's Land Initiative: Benefits Matrix [WWW Docu-
Manzano-Agugliaro, F., Sanchez-Muros, M.J., Barroso, F.G., Martínez-Sa nchez, A.,
ment]. https://goo.gl/XMov1T. (Accessed 3 March 2017).
rez-Ban
Rojo, S., Pe ~o n, C., 2012. Insects for biodiesel production. Renew. Sustain.
Ester, M., Kriegel, H.-P., Sander, J., Xu, X., 1996. A density-based algorithm for
Energy Rev. 16 (6), 3744e3753.
discovering clusters in large spatial databases with noise. In: Proceedings of the
Mok, P.Y., Huang, H.Q., Kwok, Y.L., Au, J.S., 2012. A robust adaptive clustering
2nd International Conference on Knowledge Discovery and Data Mining - KDD
analysis method for automatic identification of clusters. Pattern Recogn. 45 (8),
'96. AAAI Press, Portland, pp. 226e231.
3017e3033.
Fahad, A., Alshatri, N., Tari, Z., Alamri, A., Khalil, I., Zomaya, A.Y., Foufou, S.,
Bouras, A., 2014. A survey of clustering algorithms for big data: taxonomy and empirical analysis. IEEE Trans. Emerg. Top. Comput. 2 (3), 267–279.
Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., 1996. From data mining to knowledge discovery in databases. AI Mag. 37–54.
Fernández-García, A., Rojas, E., Pérez, M., Silva, R., Hernandez-Escobedo, Q., Manzano-Agugliaro, F., 2015. A parabolic-trough collector for cleaner industrial process heat. J. Clean. Prod. 89, 272–285.
Gamarra, C., Guerrero, J.M., Montero, E., 2016. A knowledge discovery in databases approach for industrial microgrid planning. Renew. Sustain. Energy Rev. 60, 615–630.
Gionis, A., Mannila, H., Tsaparas, P., 2007. Clustering aggregation. ACM Trans. Knowl. Discov. Data 1 (1), 1–30.
Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesley, Boston.
González, M.O.A., Gonçalves, J.S., Vasconcelos, R.M., 2017. Sustainable development: case study in the implementation of renewable energy in Brazil. J. Clean. Prod. 142, 461–475.
Greenberg, M., Lewis, M.J., 2000. Brownfields redevelopment, preferences and public involvement: a case study of an ethnically mixed neighbourhood. Urban Stud. 37 (13), 2501–2514.
Morio, M., Schädler, S., Finkel, M., 2013. Applying a multi-criteria genetic algorithm framework for brownfield reuse optimization: improving redevelopment options based on stakeholder preferences. J. Environ. Manag. 130, 331–346.
Nanda, S.J., Panda, G., 2014. A survey on nature inspired metaheuristic algorithms for partitional clustering. Swarm Evol. Comput. 16, 1–18.
Ng, R.T., Han, J., 1994. Efficient and effective clustering methods for spatial data mining. In: Proceedings of the 20th International Conference on Very Large Data Bases - VLDB '94. Morgan Kaufmann, San Francisco, pp. 144–155.
Nuissl, H., Schroeter-Schlaack, C., 2009. On the economic approach to the containment of land consumption. Environ. Sci. Pol. 12 (3), 270–280.
Onat, N.C., Kucukvar, M., Tatari, O., 2014. Integrating triple bottom line input-output analysis into life cycle sustainability assessment framework: the case for US buildings. Int. J. Life Cycle Assess. 19 (8), 1488–1505.
Orriols-Puig, A., Martínez-López, F.J., Casillas, J., Lee, N., 2013. Unsupervised KDD to creatively support managers' decision making with fuzzy association rules: a distribution channel application. Ind. Market. Manag. 42 (4), 532–543.
Ozkan, I., Turksen, I.B., 2007. Upper and lower values for the level of fuzziness in FCM. Inf. Sci. 177 (23), 5143–5152.
Pal, N.R., Bezdek, J.C., Tsao, E.C.-K., 1993. Generalized clustering networks and Kohonen's self-organizing scheme. IEEE Trans. Neural Networks 4 (4), 549–557.
Perea-Moreno, A.-J., García-Cruz, A., Novas, N., Manzano-Agugliaro, F., 2017. Rooftop analysis for solar flat plate collector assessment to achieving sustainability energy. J. Clean. Prod. 148, 545–554.
Poli, R., 2008. Analysis of the publications on the applications of particle swarm optimisation. J. Artif. Evol. Appl. 2008, 1–10.
Price, K.V., Storn, R.M., Lampinen, J.A., 2005. Differential Evolution, Natural Computing Series. Springer, Berlin.
Ramon-Gonen, R., Gelbard, R., 2017. Cluster evolution analysis: identification and detection of similar clusters and migration patterns. Expert Syst. Appl. 83, 363–378.
Roiger, R.J., 2017. Data Mining: a Tutorial-based Primer, second ed. CRC Press.
Rong, L., Zhang, C., Jin, D., Dai, Z., 2017. Assessment of the potential utilization of municipal solid waste from a closed irregular landfill. J. Clean. Prod. 142, 413–419.
Saki, F., Kehtarnavaz, N., 2016. Online frame-based clustering with unknown number of clusters. Pattern Recogn. 57, 70–83.
Shahbaba, M., Beheshti, S., 2014. MACE-means clustering. Signal Process. 105, 216–225.
Sheikholeslami, G., Chatterjee, S., Zhang, A., 1998. Wavecluster: a multi-resolution clustering approach for very large spatial databases. In: Proceedings of 24th International Conference on Very Large Data Bases - VLDB '98. Morgan Kaufmann, San Francisco, pp. 428–439.
Simas, M., Pacca, S., 2013. Energia eólica, geração de empregos e desenvolvimento sustentável. Estud. Avançados 27 (77), 99–116.
Steiner, M.T.A., Datta, D., Steiner Neto, P.J., Scarpin, C.T., Rui Figueira, J., 2015. Multi-objective optimization in partitioning the healthcare system of Paraná State in Brazil. Omega 52, 53–64.
Storn, R., 1996. On the usage of differential evolution for function optimization. In: Proceedings of North American Fuzzy Information Processing. IEEE, pp. 519–523.
Storn, R., Price, K., 1997. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11 (4), 341–359.
Theodoridis, S., Koutroumbas, K., 2009. Pattern Recognition, fourth ed. Academic Press.
U.S. Government Publishing Office, 2015. 42 U.S.C. 9601-9628-Hazardous Substances Releases, Liability, Compensation [WWW Document]. United States Code, 2012 Ed. Suppl. 3, Title 42-Public Heal. Welfare, Subchapter I. https://goo.gl/y0ki6N. (Accessed 3 March 2017).
U.S. Government Publishing Office, 2011. 40 C.F.R. 239-282-Solid Wastes [WWW Document], annual ed. Code Fed. Regul. https://goo.gl/UBCLDF. (Accessed 3 March 2017).
U.S. Government Publishing Office, 2002. Public Law 107-118-Small Business Liability Relief and Brownfields Revitalization Act [WWW Document]. H.R. 2869. https://goo.gl/UK19n2. (Accessed 3 March 2017).
van Straalen, N.M., 2002. Assessment of soil contamination: a functional perspective. Biodegradation 13 (1), 41–52.
Veenman, C.J., Reinders, M.J.T., Backer, E., 2002. A maximum variance cluster algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 24 (9), 1273–1280.
Wang, W., Yang, J., Muntz, R.R., 1997. STING: a statistical information grid approach to spatial data mining. In: Proceedings of 23rd International Conference on Very Large Data Bases - VLDB '97. Morgan Kaufmann, San Francisco, pp. 186–195.
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J., 2017. Data Mining: Practical Machine Learning Tools and Techniques, fourth ed. Morgan Kaufmann.
Wu, K.-L., 2012. Analysis of parameter selections for fuzzy c-means. Pattern Recogn. 45 (1), 407–415.
Xing, B., Gao, W.-J., 2014. Innovative Computational Intelligence: a Rough Guide to 134 Clever Algorithms, Intelligent Systems Reference Library. Springer, Cham.
Xu, R., Wunsch II, D., 2005. Survey of clustering algorithms. IEEE Trans. Neural Networks 16 (3), 645–678.
Yeh, I.-C., Lien, C., 2009. The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Syst. Appl. 36 (2), 2473–2480.
Zahn, C.T., 1971. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comput. C 20 (1), 68–86.
Zgurovsky, M.Z., Zaychenko, Y.P., 2017. The Fundamentals of Computational Intelligence: System Approach, Studies in Computational Intelligence. Springer, Cham.
Zhang, T., Ramakrishnan, R., Livny, M., 1996. BIRCH: an efficient data clustering method for very large databases. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data - SIGMOD '96. ACM, New York, pp. 103–114.
Zhou, K., Fu, C., Yang, S., 2014. Fuzziness parameter selection in fuzzy c-means: the perspective of cluster validation. Sci. China Inf. Sci. 57 (11), 1–8.
Zhou, K., Yang, S., Shao, Z., 2017. Household monthly electricity consumption pattern mining: a fuzzy clustering-based model and a case study. J. Clean. Prod. 141, 900–908.