Abstract: This paper presents a new approach to forecast the behavior of time series based on the similarity of pattern sequences. First, clustering techniques are used with the aim of grouping and labeling the samples from a dataset. The prediction of a data point is then provided as follows. First, the pattern sequence prior to the day to be predicted is extracted. Then, this sequence is searched in the historical data and the prediction is calculated by averaging all the samples immediately after the matched sequence. The main novelty is that only the labels associated with each pattern are considered to forecast the future behavior of the time series, avoiding the use of real values of the time series until the last step of the prediction process. Results from several energy time series are reported and the performance of the proposed method is compared to that of recently published techniques, showing a remarkable improvement in the prediction.

Keywords: Time series, forecasting, patterns.

Francisco Martínez Álvarez, Alicia Troncoso and Jesús S. Aguilar Ruiz are with the Pablo de Olavide University, Seville, Spain (e-mail: fmaralv@upo.es; ali@upo.es; aguilar@upo.es) and José C. Riquelme is with the Department of Computer Science, University of Seville, Seville, Spain (e-mail: riquelme@lsi.us.es). The financial support from the Spanish Ministry of Science and Technology, project TIN2007-68084-C-00, and from the Junta de Andalucía, project P07-TIC-02611, is acknowledged.

I. Introduction

The analysis of temporal data and the prediction of future values of time series are among the most important problems that data analysts face in many fields, ranging from finance and economics to production operations management or telecommunications.

A forecast is a prediction of some future event(s). Forecasting problems are often classified as short-term, medium-term and long-term. Short-term forecasting problems involve predicting events only a few time periods (days, weeks, months) into the future. Medium-term forecasts extend from one to two years and long-term forecasting problems can extend beyond that by many years.

Time series data can be defined as a chronological sequence of observations on a variable of interest. Most forecasting problems imply the use of such data, whose analysis has traditionally been done by means of classical statistical tools. Nowadays, data mining techniques are acquiring great relevance due to the large number of samples forming the time series in multiple areas.

A new approach, called Pattern Sequence-based Forecasting (PSF), is presented here in order to forecast time series. This work can be considered a generalization of the algorithm introduced in [33], which is based on nearest neighbors techniques. Nevertheless, the new approach makes predictions using only labels generated by means of clustering techniques. This discretizes and simplifies the process of prediction, since during the whole process the PSF algorithm deals with sequences of labels instead of sets of real values. Although a naive use of labels to predict time series was presented in [21], PSF includes a new methodology to automate the obtaining of the labels, providing rules to assign them to the samples of real values. The sensitivity of the key parameter involved in the selection of the number of underlying patterns is also analyzed in order to study the robustness of the method. The number of labels comprising the pattern sequence, used in each prediction process, is systematically determined in this work.

The PSF algorithm aims to be a general-purpose forecasting procedure. However, electricity-related problems are addressed in this work. To be precise, two major groups of time series are forecasted: electricity prices and electricity demand. These groups belong to three different markets: the Spanish Electricity Market Operator (OMEL), the New York Independent System Operator (NYISO) and Australia's National Electricity Market (ANEM). Therefore, the overall experimentation consists of six independent time series, thus showing the adaptability of PSF to miscellaneous time series. Moreover, in order to facilitate the comparison of the obtained results, all the data sets analyzed are available on-line [19], [25], [26].

The rest of the paper is organized as follows. Section II presents an exhaustive review of the state of the art on electricity prices and demand time series forecasting. Section III introduces the proposed methodology and the description of the PSF algorithm, which can be applied to time series of any nature. Section IV shows the results obtained by the PSF approach in the electric energy markets of Spain, Australia and New York for the whole year 2006, including measures of their quality. In Section V comparisons between the proposed method and other techniques are provided. Finally, Section VI summarizes the main conclusions achieved and gives clues for future work.

II. Related work

The forecasting of energy time series has been widely studied in the literature, as described in Sections II-A and II-B.

A. Electricity prices time series forecasting

The electric power markets have become competitive markets due to the deregulation carried out in recent years, allowing the participation of all producers, investors, traders or qualified buyers. Thus, the price of electricity is determined on the basis of this buying/selling system. Consequently, interest in obtaining optimized bidding strategies has arisen among electricity-producer companies [29], which need both insight into future electricity prices and an assessment of the risk of trusting predicted prices.

Electricity price time series present some peculiarities, such as nonconstant mean and variance as well as the presence of outliers, that turn forecasting into an especially difficult task. For this reason, accurate forecasting has motivated research by many authors [2], [37].

The authors in [7] used the wavelet transform and autoregressive integrated moving average (ARIMA) models to predict the day-ahead electricity price. They first used the wavelet transform to split the available historical data into constitutive series. Then, specific ARIMA models were applied to these series and the forecasts were obtained by applying the inverse wavelet transform to the [...] were proposed to obtain the forecasts of the prices. In addition, the work analyzed the optimal number of samples used to build the prediction models. Aggarwal et al. [3] divided each day into segments and applied a multiple linear regression to the original series or to the constitutive series obtained by the wavelet transform, depending on the segment. Moreover, the regression model used different input variables for each segment.

Equally noticeable was the approach proposed by García et al. [14], in which a forecasting technique based on a generalized autoregressive conditional heteroskedasticity (GARCH) model was presented. That paper focused on the day-ahead forecast of electricity prices in high-volatility periods.

Transfer function models based on past electricity prices and demand were proposed to forecast day-ahead electricity prices by Nogales et al. in [24], but the prices of all 24 hours of the previous day were not known. They used the median as a measure due to the presence of outliers and stated that the model in which the demand was considered presented better forecasts.

Weron et al. [38] presented twelve parametric and semiparametric time series models to predict electricity prices for the next day. Moreover, in this work forecasting intervals were provided and evaluated taking into account the conditional and unconditional coverage. They concluded that the intervals obtained by semiparametric models are better than those of parametric models.

A hybrid model that combined artificial neural networks (ANN) and fuzzy logic was introduced in [4]. The neural network presented had a feed-forward architecture and three layers, where the hidden nodes of the proposed fuzzy neural network performed the fuzzification process. Along the same lines, another neural network-based approach was introduced in [5], in which multiple combinations were considered. These combinations consisted of networks with different numbers of hidden layers, different numbers of units in each layer and several types of transfer functions. Recently, Pindoriya et al. [28] proposed an artificial neural network in which the output of the hidden layer neurons was based on wavelets that adapted their shape to the training data.

A modification of the weighted nearest neighbors (WNN) methodology is proposed in [33]. To be precise, the approach weighted the nearest neighbors in order to improve the prediction accuracy.

The occurrence of outliers (also called spike prices), that is, prices significantly higher than the expected values, is a usual feature of these time series. With the aim of dealing with this feature, the authors in [41] proposed a data mining framework based on both support vector machines (SVM) and a probability classifier.

Recently, a fuzzy inference system, adopted due to its transparency and interpretability, combined with traditional time series methods was proposed for day-ahead electricity price forecasting [18].

B. Electricity demand time series forecasting

The process of forecasting the quantity of electricity required for a specific geographical area during a time period is called load forecasting or demand forecasting. This process is key since current technology allows only small amounts of electricity to be stored in batteries. Therefore, demand forecasting plays an important role for electricity suppliers, because both excess and insufficient energy production may lead to large costs and a significant reduction of benefits.

Load forecasting has been widely studied [31], [37]. The existing procedures are usually divided into two main groups [13]. The first one gathers traditional approaches such as regression, data smoothing techniques or Box-Jenkins models. Thus, the authors in [27] focused on the one-year-ahead prediction for winter seasons by defining a new Bayesian hierarchical model. They provide the marginal posterior distributions of demand peaks. Bayesian models are also used to forecast electricity demand in [8]. Moreover, a multiple linear regression model to forecast electricity consumption using input variables such as the gross domestic product, the price of electricity and the population was proposed in [23].

Taylor et al. [32] compared six univariate time series methods to forecast electricity load for the Rio de Janeiro and the England and Wales markets. These methods were an ARIMA model and an exponential smoothing (both for double seasonality), an artificial neural network, a regression model with a previous principal component analysis and two naive approaches as reference methods. The best method was the proposed exponential smoothing, and the regression model showed a good performance for the England and Wales demand.

The second main group gathers artificial intelligence techniques, among which expert systems, neural networks and fuzzy theory are the most popular [22]. In [11], the authors discussed and presented results obtained by using an ANN to forecast the Jordanian electricity demand, which is trained by a particle swarm optimization
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
technique. They also showed the performance obtained by using a back-propagation algorithm and autoregressive moving average (ARMA) models. An ANN-based forecasting technique can also be found in [30]. Another proposal can be found in [36], where a forecasting algorithm based on Grey models was introduced to predict the load of Shanghai. In the Grey model the original data series was transformed to reduce its noise and the accuracy was improved by using Markov chain techniques. Fan et al. [12] proposed a hybrid machine learning model based on Bayesian classifiers and SVM. First, Bayesian clustering techniques were used to split the input data into 24 subsets. Then, SVM methods were applied to each subset to obtain the forecasts of the hourly electricity load. In [34], the authors proposed a methodology based on WNN techniques. The proposed approach was applied to the 24-hour load forecasting problem and an alternative model was built by means of a conventional dynamic regression technique to perform a comparative analysis. In [1] the performance of ANN, fuzzy networks and ARIMA models was evaluated to forecast the electricity demand time series in Victoria, and the results showed that the fuzzy neural network outperformed the plain ANN and ARIMA models. Finally, [35] proposed a new prediction approach based on SVM techniques with a previous selection of features from the data sets by using an evolutionary method. The creation of hybrid methods that combine most of the strengths of each technique is currently the most popular line of work among researchers. Among all hybrid methods, the combination of ANN and fuzzy set theory has become a new tool to be explored.

[Fig. 1 flowchart: data, normalization, selection of the clustering (K-means) with K obtained from the Silhouette, Dunn and Davies-Bouldin indices, labeled data, prediction; the predicted sample is inserted into the data and the loop repeats while more days remain.]
Fig. 1. Illustration of the proposed methodology. The clustering and prediction stages are further detailed.

III. The proposed methodology

The proposed methodology is divided into two clearly differentiated phases. In a first step, a clustering technique is performed and, secondly, the forecasting phase is applied by using the information provided by this clustering. The PSF algorithm is focused on predicting samples framed in a time series, either one-dimensional or multi-dimensional, previously labeled with clustering techniques. As soon as the clustering is applied, the algorithm only processes the number of the cluster (the label) associated with each pattern assigned to the samples, regardless of whether they had more than one feature.

With the PSF method, the horizon of prediction can be as long as desired. Hence, more than one sample can be predicted, making predictions of non-restricted length. This is possible because it is implemented with a closed loop that feeds the prediction of a sample back into the data set in order to predict the following sample. As a consequence, the PSF approach is able to insert the predicted samples in the data set with the aim of forecasting further samples. Therefore, in case the horizon of prediction is longer than one day, every predicted sample is inserted into the data set and considered to be a regular sample. This feature is especially useful when the prediction has to cover several days or a long-term prediction is required.
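The closed loop described above can be illustrated with a minimal Python sketch. It assumes that some one-step predictor is available as a function; the names `forecast_horizon`, `predict_next` and `mean_of_last_two` are hypothetical, and the toy predictor merely stands in for the actual PSF prediction step.

```python
from typing import Callable, List, Sequence

Day = List[float]  # one sample, e.g. the 24 hourly values of a day


def forecast_horizon(history: Sequence[Day],
                     predict_next: Callable[[Sequence[Day]], Day],
                     horizon: int) -> List[Day]:
    """Predict `horizon` consecutive samples with a closed loop."""
    data = [list(day) for day in history]  # work on a copy of the data set
    predictions: List[Day] = []
    for _ in range(horizon):
        nxt = predict_next(data)
        predictions.append(nxt)
        data.append(nxt)  # the prediction is fed back as a regular sample
    return predictions


def mean_of_last_two(data: Sequence[Day]) -> Day:
    """Toy stand-in predictor (an assumption, not PSF): average of the last two days."""
    return [(a + b) / 2 for a, b in zip(data[-2], data[-1])]


history = [[1.0, 2.0], [3.0, 4.0]]
print(forecast_horizon(history, mean_of_last_two, horizon=2))
# → [[2.0, 3.0], [2.5, 3.5]]
```

The second forecast depends on the first one, which is exactly the feedback that lets the horizon grow without requiring new real observations.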
Fig. 1 shows the basic idea behind the proposed methodology. All the steps composing this methodology are described in the following subsections.

A. Data normalization

The first task to be completed is the normalization of the data, which is only used for the clustering process. It can be assumed that the prices increase along the year following a tendency in accordance with the intra-annual inflation. The normalization removes this trend from the initial data. The transformation applied is

\[ x_j \leftarrow \frac{x_j}{\frac{1}{N}\sum_{i=1}^{N} x_i} \tag{1} \]

where $x_j$ is the price/demand of the $j$-th hour of a day and $N$ is equal to 24, since each value represents one hour of the day.

B. Clustering technique

Given the database of hourly prices/demand, the clustering problem consists of identifying K groups or clusters such that the prices/demand curves of the days belonging to a cluster are similar to each other and dissimilar to the curves of the days belonging to other clusters, according to a distance. Clustering is a difficult task due to the great number of possible geometric shapes for the clusters and of distances that can be considered.

As a consequence of the clustering, the dimensionality of the database is drastically reduced from its initial 24 features (equivalent to the 24 hours of the day) to only one dimension (the label of the cluster to which the day belongs).

To achieve this, two questions should be answered: which clustering technique should be chosen? And, if appropriate, how many clusters should be created?

These two topics have been widely discussed in the literature [39]. Nevertheless, there seems to be no unique answer because it depends on sensitive factors. Crisp and fuzzy clustering are the two main branches of non-supervised classification. The discussion of choosing one technique or another can be found in [20], in which the well-known K-means algorithm was the optimal method to classify this kind of data set. For this reason, the K-means algorithm is the clustering technique used in this work during the whole process of prediction.

The K-means algorithm requires that the user provide the number of clusters to be created. However, this number is a priori unknown, and its selection and the later evaluation of the results obtained by the clustering are crucial for most engineering applications. Thus, the most challenging problem of the clustering realm has been to select the right number of clusters for data sets.

For all these reasons, three well-known validity indices have been applied to the data in order to decide how many groups the original data set has to be split into: the silhouette index [16], the Davies-Bouldin index [9] and the Dunn index [10]. The three of them share a common feature: the new data structure obtained by the clustering algorithm is evaluated to test the validity of the partition.

The procedure to select the number of clusters to be generated, K, is now discussed. From the application of these three indices (see subsections III-B.1, III-B.2, III-B.3), two possible situations can appear: at least two indices (the majority) coincide in selecting the same K (the K eventually chosen), or none of them coincide. When the second situation occurs, the second best values of the three indices are also considered (together with the first best values). The K selected is, then, the one pointed to by the majority of all the cases. Further best values are included and analyzed until one K has more votes than the others.

B.1 The silhouette index

The silhouette function provides a quality measure of the separation among the clusters obtained by using a clustering technique. The average distance of the object $i$ belonging to the cluster $A$ to all the objects in $A$ is denoted by $a(i)$, and the average distance of $i$ to all objects of a cluster $C \neq A$ is called $d(i, C)$. For every cluster $C \neq A$, $d(i, C)$ is computed and the smallest one is selected as follows,

\[ b(i) = \min_{C \neq A} d(i, C), \quad i \in A \tag{2} \]

The value $b(i)$ represents the dissimilarity of the object $i$ to its nearest neighbor cluster. Thus, the silhouette values $silh(i)$ are given by the following equation,

\[ silh(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}} \tag{3} \]

The value $silh(i)$ can range from $-1$ to $+1$, where $+1$ and $-1$ mean that the object $i$ belongs to an adequate or inadequate cluster, respectively. If the silhouette value of the object $i$ belonging to the cluster $A$ is close to zero, the object $i$ could also be in the nearest neighbor cluster to $A$. If cluster $A$ is a set with only one element, the silhouette value of the object $i$ is not defined and, in this case, it is set to zero by convention. The objective function is the average of $silh(i)$ over the number of objects to be classified, and the best clustering is reached when this function is maximized.

B.2 The Dunn index

One of the most cited indices was proposed in [10]. The Dunn index (DU) aims to identify clusters with high inter-cluster distance and low intra-cluster distance. The Dunn index for K clusters $C_i$ with $i = 1, \ldots, K$ is defined by,

\[ DU_K = \min_{i} \, \min_{j \neq i} f_{i,j} \tag{4} \]

where

\[ f_{i,j} = \frac{d(C_i, C_j)}{\max_{m} diam(C_m)} \tag{5} \]
[Figure: example of a labeled sequence, 4 1 1 1 2 6 3 3 … 3 1 1 2 6 3 1 … 4 1 1 2 6 3 X, where X marks the day to be predicted; two daily curves (hours 1 to 24) and their AVERAGE.]
$d(C_i, C_j)$ is the dissimilarity between clusters $C_i$ and $C_j$ defined by,

\[ d(C_i, C_j) = \min_{x \in C_i,\; y \in C_j} \lVert x - y \rVert \tag{6} \]

[...] clusters is guaranteed if the Dunn index reaches high values. Therefore, the maximum is observed for the most [...]

[...] with $n_i$ the number of points and $z_i$ the centroid of cluster $C_i$.

The existence of high-quality clusters is guaranteed if the Davies-Bouldin index reaches small values. Therefore, the optimal number of clusters is found when this index is minimized for the dataset.

[...]

\[ X(i) = [x_1, x_2, \ldots, x_{24}] \tag{11} \]

[...] exactly equals to $S_{W-1}^{d-1}$ and thus successively. That is, the length of the window composed of the sequence of labels is decreased by one unit. This strategy guarantees that at least some sequences will be found when W is equal to one.

According to the PSF approach, the 24 hourly values of the time series for the day $d$ are predicted by averaging the values of the days following those in $ES_d$,

\[ \hat{X}(d) = \frac{1}{size(ES_d)} \cdot \sum_{j \in ES_d} X(j + 1) \tag{14} \]

The optimal number of labels comprising the window (parameter W) that will be used as a search pattern to find all equal sequences of labels in the dataset is determined by minimizing the forecasting error when the PSF method is applied to a training set. Mathematically, that means finding the value of W that minimizes the following function:

\[ \sum_{d \in TS} \lVert \hat{X}(d) - X(d) \rVert \tag{15} \]
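A minimal Python sketch of the prediction step in (14) and the window selection in (15) may help to fix ideas. It rests on stated assumptions: days are given as lists of hourly values with one cluster label each, the norm in (15) is taken to be Euclidean, and the function names (`find_matches`, `psf_predict`, `select_window`) are illustrative rather than the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def find_matches(labels: Sequence[int], w: int) -> List[int]:
    """Return ES_d: indices j where an earlier window ending at day j
    equals the last w labels (the pattern preceding the day to predict)."""
    pattern = list(labels[-w:])
    last = len(labels) - 1
    # Day j + 1 must be known, so j stops before the final index.
    return [j for j in range(w - 1, last)
            if list(labels[j - w + 1:j + 1]) == pattern]


def psf_predict(days: Sequence[Sequence[float]], labels: Sequence[int],
                w: int) -> Tuple[List[float], int]:
    """Predict the next day as in (14), shrinking the window when no match exists.
    Returns the forecast and the window length actually used."""
    if len(days) != len(labels) or len(days) < 2:
        raise ValueError("need at least two labeled days")
    w = min(w, len(labels) - 1)
    while w >= 1:
        matches = find_matches(labels, w)
        if matches:
            following = [days[j + 1] for j in matches]
            hours = range(len(following[0]))
            forecast = [sum(day[h] for day in following) / len(following)
                        for h in hours]
            return forecast, w
        w -= 1  # no equal sequence found: retry with W - 1 labels
    raise ValueError("no earlier occurrence of the most recent label")


def select_window(days: Sequence[Sequence[float]], labels: Sequence[int],
                  first_day: int, max_w: int) -> int:
    """Choose W minimizing the summed error of (15) over days first_day..end.
    Assumption: Euclidean norm; each day is predicted from the days before it."""
    best_w, best_err = 1, math.inf
    for w in range(1, max_w + 1):
        err = 0.0
        for d in range(first_day, len(days)):
            forecast, _ = psf_predict(days[:d], labels[:d], w)
            err += math.dist(forecast, days[d])
        if err < best_err:
            best_w, best_err = w, err
    return best_w


# Toy data (invented for illustration): labels repeat 1, 2, 1, 3 and each
# "day" holds two values equal to its label.
labels = [1, 2, 1, 3, 1, 2, 1, 3, 1, 2, 1, 3]
days = [[float(k), float(k)] for k in labels]
print(psf_predict(days, labels, 2))       # → ([1.0, 1.0], 2)
print(select_window(days, labels, 6, 3))  # → 2
```

In the toy series a single label is ambiguous (label 1 is followed by either 2 or 3), whereas two consecutive labels determine the next day, so the error in (15) is minimized at W = 2.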
[Figures: mean silhouette, Dunn index and Davies-Bouldin index plotted against the number of clusters (2 to 20), three panels per time series.]
[...] on its global maximum (0.0003 and 0.0002 respectively).

With regard to the Australian electricity demand, the situation shown in Fig. 6(b) is conclusive. Hence, the three methods agree in selecting K = 5, since both the silhouette and the Dunn index reach their maximum values (0.5673 and 0.0178 respectively) while the Davies-Bouldin index reaches its minimum value (0.9800).
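The majority-vote rule for choosing K from the three indices (Section III-B) can be sketched in Python as follows. This is only an illustration under assumptions: each index is supplied as a mapping from K to its value, the function names are hypothetical, and the numbers in the example are invented rather than taken from the figures.

```python
from collections import Counter
from typing import Dict, List, Optional


def ranked_ks(scores: Dict[int, float], higher_is_better: bool) -> List[int]:
    """Candidate values of K ordered from best to worst for one index."""
    return sorted(scores, key=scores.get, reverse=higher_is_better)


def vote_for_k(silhouette: Dict[int, float], dunn: Dict[int, float],
               davies_bouldin: Dict[int, float]) -> Optional[int]:
    """Majority vote: silhouette and Dunn are maximized, Davies-Bouldin minimized.
    If the best values disagree, second best values are added, and so on,
    until one K has more votes than the others."""
    rankings = [ranked_ks(silhouette, True),
                ranked_ks(dunn, True),
                ranked_ks(davies_bouldin, False)]
    votes: Counter = Counter()
    for rank in range(max(len(r) for r in rankings)):
        for ranking in rankings:
            if rank < len(ranking):
                votes[ranking[rank]] += 1
        (top_k, top_votes), *rest = votes.most_common()
        unique_winner = not rest or rest[0][1] < top_votes
        if top_votes >= 2 and unique_winner:
            return top_k
    return None  # the indices never single out one K


# Invented values: all three indices point to K = 5.
agree = vote_for_k({3: 0.40, 4: 0.52, 5: 0.57},
                   {3: 0.010, 4: 0.012, 5: 0.018},
                   {3: 1.20, 4: 1.05, 5: 0.98})
# Invented values: best choices are 4, 5 and 3; second best values settle on 4.
split = vote_for_k({3: 0.30, 4: 0.50, 5: 0.45},
                   {3: 0.010, 4: 0.015, 5: 0.020},
                   {3: 0.90, 4: 1.00, 5: 1.10})
print(agree, split)  # → 5 4
```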
[Figures: mean silhouette, Dunn index and Davies-Bouldin index plotted against the number of clusters (2 to 20), three panels per time series.]
B.2 Selecting the length of the window

Once the number of clusters is decided, the next step is to select the optimal length of the window W. Thus, this step is focused on finding the W that obtains the minimum prediction error in the training set.

Therefore, it is required to evaluate the performance of the PSF algorithm when W varies, according to the methodology presented in Section III-D.

Table II shows how the prediction error varies in accordance with the number of patterns considered in the window. Note that the symbol '-' means that similar sequences of length W were not found when K clusters were considered in the training set. Finally, the W that yields the lowest prediction error is the value chosen for further forecasting on real data. It can be concluded that the optimal lengths of the windows are W = 5, W = 3 and W = 6 for the OMEL, NYISO and ANEM price time series, since they reach the lowest prediction errors (2.23%, 3.27% and 5.81%, respectively), and W = 2, W = 5 and W = 3 for the OMEL, NYISO and ANEM demand time series (2.87%, 4.99% and 3.43%, respectively).

TABLE II
MER obtained with the PSF algorithm on all the markets.

Market                  W=1     W=2    W=3    W=4    W=5    W=6     W=7     W=8    W=9    W=10  Selected W
OMEL price (K = 4)      10.32%  8.44%  8.21%  4.39%  2.23%  2.89%   -       -      -      -     5
OMEL demand (K = 8)     3.11%   2.87%  -      -      -      -       -       -      -      -     2
NYISO price (K = 5)     7.09%   5.98%  3.27%  6.98%  4.45%  13.20%  10.31%  -      -      -     3
NYISO demand (K = 4)    5.16%   6.21%  5.68%  5.02%  4.99%  6.23%   7.14%   6.90%  8.91%  -     5
ANEM price (K = 3)      9.58%   7.91%  6.26%  6.17%  7.33%  5.81%   6.04%   9.12%  -      -     6
ANEM demand (K = 5)     3.45%   4.17%  3.43%  6.10%  5.89%  4.02%   7.11%   -      -      -     3

[...] be noticed that the mean of the standard deviation of the MER in the Spanish market is lower than that of the New York market (0.27% versus 1.94%). This means that the maximum errors in the OMEL prices time series are closer to the average errors than the maximum errors obtained when the prices are predicted in the NYISO market. The standard deviation for the Australian market is the highest (4.40%) due to the many peak prices, considered as outliers, that occur in this prices time series.

TABLE III
Performance of the PSF algorithm for the year 2006 in OMEL time series.

        PRICES                        DEMAND
Month   MER (σ)        MAE (σ)        MER (σ)        MAE (σ) in MW
Jan.    7.26% (0.25)   0.53 (0.07)    3.12% (1.86)   744.32 (82.12)
Feb.    4.93% (0.19)   0.36 (0.04)    4.21% (2.26)   1033.97 (109.21)
Mar.    5.88% (0.22)   0.43 (0.05)    5.07% (4.17)   1001.71 (98.85)
Apr.    3.62% (0.18)   0.28 (0.03)    4.18% (1.28)   1006.30 (107.58)
May     8.11% (0.21)   0.64 (0.05)    5.90% (2.33)   1129.76 (96.37)
Jun.    3.76% (0.24)   0.29 (0.05)    2.89% (1.81)   693.60 (84.60)
Jul.    4.30% (0.23)   0.33 (0.04)    2.34% (1.19)   585.88 (63.17)
Aug.    5.37% (0.34)   0.42 (0.06)    3.61% (2.17)   792.21 (86.94)
Sep.    6.41% (0.31)   0.50 (0.06)    3.15% (1.55)   757.02 (75.39)
Oct.    7.89% (0.29)   0.58 (0.08)    2.89% (3.40)   1121.43 (149.75)
Nov.    8.30% (0.40)   0.64 (0.05)    4.72% (2.39)   982.19 (120.52)
Dec.    8.02% (0.36)   0.59 (0.07)    6.21% (3.82)   1503.44 (198.49)
Mean    6.15% (0.27)   0.47 (0.05)    4.02% (2.35)   945.99 (106.08)
TABLE IV
Performance of the PSF algorithm for the year 2006 in [...]

        PRICES                        DEMAND
Month   MER (σ)        MAE (σ)        MER (σ)        MAE (σ)
Jan.    4.45% (2.07)   2.25 (0.34)    5.05% (1.95)   53.21 (6.03)
Feb.    5.53% (1.52)   3.02 (0.28)    6.88% (2.62)   83.76 (9.19)
Mar.    6.30% (2.52)   3.97 (0.43)    5.31% (2.42)   59.63 (6.13)
Apr.    4.94% (1.47)   3.51 (0.61)    4.97% (2.22)   52.18 (7.21)
May     7.59% (2.13)   4.63 (0.43)    6.18% (2.39)   61.12 (5.74)
Jun.    3.34% (1.92)   2.31 (0.29)    3.75% (2.06)   44.17 (4.86)
Jul.    3.93% (1.68)   2.28 (0.20)    3.41% (1.78)   37.54 (4.01)
Aug.    5.37% (1.87)   3.49 (0.41)    3.99% (2.13)   39.86 (5.52)
Sep.    6.24% (1.74)   4.49 (0.53)    4.83% (2.16)   54.14 (6.87)
Oct.    7.43% (2.33)   4.23 (0.49)    5.37% (2.25)   65.08 (8.01)
Nov.    5.19% (2.09)   3.53 (0.30)    4.86% (1.99)   50.25 (5.11)
Dec.    6.04% (1.99)   3.08 (0.33)    6.80% (2.40)   82.55 (9.97)
Mean    5.53% (1.94)   3.40 (0.39)    5.97% (2.20)   56.96 (6.55)

TABLE V
Performance of the PSF algorithm for the year 2006 in [...]

        PRICES                          DEMAND
Month   MER (σ)          MAE (σ)        MER (σ)        MAE (σ)
Jan.    5.58% (1.34)     1.51 (0.52)    4.74% (3.54)   412.38 (58.02)
Feb.    8.59% (3.24)     5.15 (2.21)    4.98% (2.98)   445.12 (43.29)
Mar.    7.84% (2.98)     1.73 (0.34)    5.02% (5.27)   430.00 (38.90)
Apr.    9.92% (3.90)     1.98 (0.63)    6.03% (7.46)   519.73 (57.71)
May     12.85% (4.03)    3.21 (1.02)    4.17% (2.72)   373.22 (43.28)
Jun.    22.04% (12.34)   6.81 (2.89)    5.67% (3.84)   561.44 (60.32)
Jul.    17.11% (10.58)   8.16 (3.42)    4.91% (5.84)   481.18 (53.67)
Aug.    11.71% (5.08)    3.32 (0.40)    5.88% (6.01)   555.07 (65.05)
[...]

[Figure: best and worst predictions for the Spanish electricity market. Panels: Spanish electricity price (price in cE/KWHr versus hour) and Spanish electricity demand (MW versus hour); curves: real values and forecasting.]

[Figure: best and worst predictions for the New York electricity price (price in $/MWHr versus hour) and demand (MW versus hour); curves: real values and forecasting.]
Fig. 8. Best and worst predictions for the New York electricity market in 2006.

[Figure: best and worst predictions for the Australian electricity price (price in $/MWHr versus hour) and demand (MW versus hour); curves: real values and forecasting.]
Fig. 9. Best and worst predictions for the Australian electricity market in 2006.
[...] and MAE equal to 10.56% and 638.92 MW, respectively [...] predictions in the demand time series.

D. Sensitivity to the parameter K

[...] of clusters is equal to some of these three values. Nevertheless, for this time series, the global minimum is reached at the number of clusters selected by the proposed system based on majority votes.

TABLE VI
MER for some weeks of the year 2002 (OMEL price).

Week                 Naive    ANN      ARIMA    Mixed models   WNN      PSF
18th-24th Feb 2002   7.68%    5.23%    6.32%    6.15%          5.15%    5.98%
20th-26th May 2002   7.27%    6.36%    6.36%    4.46%          4.34%    4.51%
19th-25th Aug 2002   27.30%   11.40%   13.39%   14.90%         10.89%   9.11%
18th-24th Nov 2002   19.98%   13.65%   13.78%   11.68%         11.83%   10.07%
Average              15.56%   9.16%    9.96%    9.30%          8.05%    7.42%
The algorithm has been successfully applied to electricity prices and demand time series of the Spanish, Australian and New York markets, providing very competitive results. The performance was accurate in all of them, thus showing the robustness and adaptability of the proposed approach for time series of different nature. This is especially remarkable since the approaches found in the literature are usually focused on only one specific time series.

References

[8] R. Cottet and M. Smith. Bayesian modeling and forecasting of [...]
[9] D. L. Davies and D. W. Bouldin. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelli[...]
[10] [...] Journal of Cybernetics, 4:95-104, 1974.
[11] M. El-Telbany and F. El-Karmi. Short-term forecasting of Jordanian electricity demand using particle swarm optimization. Electric Power Systems Research, 78:425-433, 2008.
[12] S. Fan, C. Mao, J. Zhang, and L. Chen. Forecasting electricity demand by hybrid machine learning model. Lecture Notes in Computer Science, 4233:952-963, 2006.
[13] E. A. Feinberg and D. Genethliou. Applied Mathematics for Restructured Electric Power Systems, Chapter 12. Springer, 2005.
[14] R. C. García, J. Contreras, M. van Akkeren, and J. B. García. A GARCH forecasting model to predict day-ahead electricity prices. IEEE Transactions on Power Systems, 20(2):867-874, 2005.
[15] C. García-Martos, J. Rodríguez, and M. J. Sánchez. Mixed models for short-run forecasting of electricity prices: Application for the Spanish market. IEEE Transactions on Power Systems, 22(2):544-552, 2007.
[16] L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: an Introduction to Cluster Analysis. Wiley, 1990.
[17] R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In Proceedings of Interna[...] 1143, 1995.
[18] G. Li, C. C. Liu, C. Mattson, and J. Lawarrée. Day-ahead elec[...] methods. IEEE Transactions on Power Systems, 22(1):376-385, [...]
[20] F. Martínez-Álvarez, A. Troncoso, J. C. Riquelme, and J. M. Riquelme. Partitioning-clustering techniques applied to the electricity price time series. Lecture Notes in Computer Science, 4881:990-999, 2007.
[21] F. Martínez-Álvarez, A. Troncoso, J. C. Riquelme, and J. S. Aguilar Ruiz. LBF: A labeled-based forecasting algorithm and its application to electricity price time series. In Proceedings of the Eighth IEEE International Conference on Data Mining, pages 453-461, 2008.
[22] K. Metaxiotis, A. Kagiannas, D. Askounis, and J. Psarras. Ar[...]
[...]ison of univariate methods for forecasting electricity demand up [...]
[33] A. Troncoso, J. C. Riquelme, J. M. Riquelme, J. L. Martínez, and A. Gómez. Electricity market price forecasting based on [...]
[34] A. Troncoso, J. M. Riquelme, J. C. Riquelme, A. Gómez, and J. L. Martínez. Time-series prediction: Application to the short term electric energy demand. Lecture Notes in Artificial Intelligence, 3040:577-586, 2004.
[35] J. Wang and L. Wang. A new method for short-term electricity load forecasting. Transactions of the Institute of Measurement and Control, 30(3):331-344, 2008.
[36] X. Wang and M. Meng. Forecasting electricity demand using grey-markov model. In Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, pages 1244-1248, 2008.
[37] R. Weron. Modeling and Forecasting Electricity Loads and Prices. Wiley, 2006.
[38] R. Weron and A. Misiorek. Forecasting spot electricity prices: A comparison of parametric and semiparametric time series models. International Journal of Forecasting, 24:744-763, 2008.
[39] R. Xu and D. C. Wunsch II. Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3):645-678, 2005.
[40] Z. Xu, Z. Y. Dong, and W. Liu. Neural Networks Applications in Information Technology and Web Engineering, Chapter 22. Borneo Publishing, 2005.