
International Journal of Computer Science and Technology (IJCST), Vol. 3, Issue 1, Jan. - March 2012

ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

Long Term Electric Load Forecasting using Neural Networks and Support Vector Machines

Renuka Achanta
Dept. of CSE, AKRG Engineering College, Nallagerla, AP, India
Abstract
Electric load forecasting is an important task in the electric power industry: it is essential to determine the future demand for power as far in advance as possible. Based on the foreseen load, a company makes investments, decides how much energy to buy from the generating companies, and plans maintenance and expansion. Some knowledge of future power consumption is therefore indispensable, and electric power distributors require a tool that predicts the load in order to support management and make planning more efficient. Accurate prediction of electric load is difficult, and many classical prediction models are inappropriate for this task because they assume linearity and seasonality. This paper presents an application of Artificial Neural Networks (ANN) and Support Vector Machines (SVM) to electric load prediction. The results obtained with the two techniques are compared, and the performance of SVM is found to be consistently better.

Keywords
Electric Load Forecasting, Neural Networks, Support Vector Machines

I. Introduction
Electric load is one of the key variables for electric power companies [14], since it determines their main source of income, particularly in the case of distributors. According to the foreseen load, the company makes investments and takes decisions regarding planning, maintenance and expansion. It is therefore necessary to have some knowledge of the future power load. Accurate prediction of electric load is difficult, because the load is determined largely by variables that involve uncertainty and whose relation to the final load cannot be deduced directly [18-19]. The load is also a nonlinear and non-stationary process that can undergo rapid changes due to weather, seasonal and macroeconomic variations.
A large number of classical prediction models are inappropriate for this task because of their assumptions of linearity and seasonality [10]. Neural network techniques have the potential to handle complex, nonlinear problems better than traditional techniques. Support vector regression differs from conventional regression techniques in that it follows the Structural Risk Minimization (SRM) principle rather than Empirical Risk Minimization (ERM): it minimizes an upper bound on the generalization error instead of the training error. Owing to this property, it is expected to outperform conventional techniques, which may suffer from overfitting. To solve the forecasting problem, various neural network and support vector machine models were analyzed and applied to a real-world dataset for the state of Alaska published on the web by the EIA department of America [20].

A. Long Term Load Forecasting
Long term load forecasting covers horizons of one year or more. The main factor that affects long term load forecasting is the load of previous years. The forecasting procedure depends on the manner in which the historical time series data is analyzed and on the type of information available at the time the forecast is prepared. Various techniques have been applied to the problem of long term load prediction. The main traditional models are the regression model, the grey forecast model and combination models. During recent decades, numerous intelligent methods have been proposed to improve forecasting accuracy, such as expert systems, fuzzy inference, Particle Swarm Optimisation (PSO), Artificial Neural Networks (ANN) and Support Vector Machines (SVM).

II. Artificial Neural Network
An Artificial Neural Network (ANN) is a machine learning approach inspired by the way the brain performs a particular learning task [1, 17]. ANNs are modeled on the human brain and consist of a number of artificial neurons, although neurons in ANNs tend to have fewer connections than biological neurons. Each neuron in an ANN receives a number of inputs; an activation function applied to these inputs determines the activation level of the neuron. There are three different classes of network architecture: single-layer feed-forward, multi-layer feed-forward, and recurrent.

A. Multi Layer Perceptron
A Multi Layer Perceptron (MLP) is a class of neural network that consists of a set of sensory units constituting the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. A nonlinear activation function, typically the sigmoid function, is used to generate the output activation in the computation nodes. In general, MLPs are trained with the back propagation algorithm to develop classification and regression systems.

B. Back Propagation Algorithm
A back propagation network consists of at least three layers of units:
1. An input layer,
2. At least one intermediate hidden layer, and
3. An output layer.
Typically, units are connected in a feed-forward fashion, with input units fully connected to units in the hidden layer and hidden units fully connected to units in the output layer. When a back propagation network is cycled, an input pattern is propagated forward to the output units through the intervening input-to-hidden and hidden-to-output weights, and the output is interpreted as a classification decision. With back propagation networks, learning occurs during a training phase. The steps followed during learning are: each input pattern in a training set is applied to the input units and then propagated forward.
The pattern of activation arriving at the output layer is compared with the correct (associated) output pattern to calculate an error signal. The error signal for each such target output pattern is then backpropagated from the outputs to the inputs in order to appropriately
adjust the weights in each layer of the network. After a back propagation network has learned the correct classification for a set of inputs, it can be tested on a second set of inputs to see how well it classifies untrained patterns.

III. Support Vector Machines
Support Vector Machines (SVM) are a comparatively recent development that serves as an alternative technique for dealing with complex classification and regression problems [11-12]. The Support Vector Machine algorithm was developed by Vapnik and is based on statistical learning theory. The basic idea of SVM is to map the original data X into a feature space F of high dimensionality through a nonlinear mapping function and to construct an optimal hyperplane in the new space [5]. SVM can be applied to both classification and regression. In the case of classification, an optimal hyperplane is found that separates the data into two classes, whereas in the case of regression a hyperplane is constructed that lies close to as many points as possible [16].

A. Support Vector Regression
Regression is the problem of estimating a function from a given dataset. Consider a dataset G = \{(x_i, d_i)\}_{i=1}^{N}, where x_i is the input vector, d_i is the desired output and N is the size of the dataset. The general form of the support vector regression estimating function is [3-4]

F(x) = w \cdot \phi(x) + b \qquad (1)

where \phi(x) represents the input mapped into the feature space, and w and b are coefficients that have to be estimated from the data. A parameter \varepsilon represents the allowed deviation between the actual values and the regression function, and can be treated as a tube around the regression function; points outside the tube are considered training errors. By minimizing the regularized risk function represented by Equation (2), the coefficients w and b are determined.
R(F) = \frac{1}{2}\|w\|^2 + C \, \frac{1}{N} \sum_{i=1}^{N} L_\varepsilon(d_i, F_i) \qquad (2)

where C and \varepsilon are user-determined parameters. The term L_\varepsilon(d_i, F_i) is the \varepsilon-insensitive loss function,

L_\varepsilon(d_i, F_i) = \begin{cases} |d_i - F_i| - \varepsilon, & |d_i - F_i| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad (3)

and the term \frac{1}{2}\|w\|^2 is used as a measure of the flatness of the function. C is a regularization constant determining the trade-off between the training error and the flatness of the model. Two positive slack variables, \xi_i and \xi_i^*, represent the distance from the actual values to the corresponding boundary values of the \varepsilon-tube; both slack variables are zero when the data point falls within the \varepsilon-tube. After introducing the slack variables, the risk function can be expressed in the following constrained form:

minimize \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) \qquad (4)

subject to

d_i - w \cdot \phi(x_i) - b \le \varepsilon + \xi_i, \quad i = 1, 2, \ldots, N
w \cdot \phi(x_i) + b - d_i \le \varepsilon + \xi_i^*, \quad i = 1, 2, \ldots, N
\xi_i, \, \xi_i^* \ge 0, \quad i = 1, 2, \ldots, N

This constrained optimisation problem is solved using the following primal Lagrangian form:

minimize \quad L = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \xi_i^*) - \sum_{i=1}^{N} \alpha_i \left[ \varepsilon + \xi_i - d_i + w \cdot \phi(x_i) + b \right] - \sum_{i=1}^{N} \alpha_i^* \left[ \varepsilon + \xi_i^* + d_i - w \cdot \phi(x_i) - b \right] - \sum_{i=1}^{N} (\eta_i \xi_i + \eta_i^* \xi_i^*) \qquad (5)

The last term is included to ensure that the optimality constraints on the Lagrange multipliers assume variable forms. Equation (5) is minimized with respect to the primal variables w, b, \xi_i and \xi_i^*, and maximized with respect to the nonnegative Lagrange multipliers \alpha_i, \alpha_i^*, \eta_i and \eta_i^*. Finally, the Karush-Kuhn-Tucker conditions are applied and the dual Lagrangian form is derived:

maximize \quad \sum_{i=1}^{N} d_i (\alpha_i - \alpha_i^*) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) \qquad (6)

subject to the constraints

\sum_{i=1}^{N} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \, \alpha_i^* \le C, \quad i = 1, 2, \ldots, N

The Lagrange multipliers in Equation (6) satisfy the equality \alpha_i \alpha_i^* = 0. The multipliers \alpha_i and \alpha_i^* are calculated, and the optimal weight vector of the regression hyperplane is expressed as

w = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) \phi(x_i) \qquad (7)

Therefore the regression function is

F(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) K(x_i, x) + b \qquad (8)

where K(x_i, x_j) is the kernel function, whose value equals the inner product of the two vectors x_i and x_j in the feature space, i.e. K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j). For a nonlinear regression problem, the data are mapped to a high-dimensional feature space; the kernel function allows this mapping to be used implicitly. Any function that satisfies Mercer's theorem can be used as a kernel function. Some functions that satisfy this condition are shown in Table 1.

Table 1: Some kernel functions that satisfy Mercer's theorem

Kernel       Function                                              Comment
Polynomial   [1 + (x \cdot x_i)]^p                                 The power p is specified by the user
RBF          \exp\!\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right)   The width \sigma is specified by the user
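As an illustrative sketch (the function names and sample values are mine, not the authors'), the \varepsilon-insensitive loss and the two Table 1 kernels can be written in plain Python:

```python
import math

def eps_insensitive_loss(d, f, eps):
    """Epsilon-insensitive loss: zero inside the eps-tube, linear outside it."""
    return max(abs(d - f) - eps, 0.0)

def poly_kernel(x, xi, p=2):
    """Polynomial kernel from Table 1: [1 + (x . xi)]^p."""
    return (1.0 + sum(a * b for a, b in zip(x, xi))) ** p

def rbf_kernel(x, xi, sigma=1.0):
    """RBF kernel from Table 1: exp(-||x - xi||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq / (2.0 * sigma ** 2))

# A prediction inside the tube contributes no loss; outside, loss grows linearly.
print(eps_insensitive_loss(1.0, 1.05, 0.1))  # 0.0
print(eps_insensitive_loss(1.0, 1.30, 0.1))  # about 0.2
```

This makes the role of \varepsilon visible: only residuals larger than the tube width are penalized, which is what keeps the SVR solution sparse.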

SVM is fast, accurate and less prone to overfitting than many other methods, and it can handle high-dimensional data efficiently. SVMs have been applied successfully in applications that deal with numerical attributes, such as handwritten character recognition, object recognition and speaker identification. The success of the model depends on proper setting of the parameters [6-7].

IV. Methodology
Real-world datasets are highly susceptible to noise and missing data. The data can be preprocessed to improve its quality and thereby improve the prediction results. In this work, data transformation has been applied to the data. Data transformation

improves the accuracy, speed and efficiency of the algorithms used. The data is normalized using Z-score normalization [9], in which the values of an attribute A are normalized based on the mean \mu_A and standard deviation \sigma_A of A. The normalized value V' of a value V is obtained as

V' = \frac{V - \mu_A}{\sigma_A}

In this work, electric load consumption is forecast from historical time series data. The available data is divided into training, validation and test sets: the training set is used to build the model, the validation set is used for parameter optimisation, and the test set is used to evaluate the model. Separate models are developed using SVM and an MLP trained with the back propagation algorithm. A nonlinear support vector regression method is used to train the SVM. A kernel function must be selected from the functions that satisfy Mercer's theorem; the polynomial kernel is adopted in this study. The polynomial kernel function requires setting the parameter p in addition to the regular parameters C and \varepsilon. As there are no general rules for determining these free parameters, the optimum values are set by a grid search [2, 8, 13], performed to identify the best combination of parameters. After experimentation, it was observed that the model with parameters C = 1, \varepsilon = 0.001 and p = 2 gives the least error. The back propagation algorithm is used to develop the MLP model. Several models [15] were developed and tested, and the best model was identified based on the mean absolute error, which is the performance measure used in this work. The sigmoid activation function is used in the models.

V. Results and Discussion
Electric load is predicted using both neural networks and support vector machines. The errors of both models are presented below.

Table 2: Comparison of NN and SVM

Model                     Mean Absolute Error
Neural Networks           0.1088
Support Vector Machines   0.0916
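The preprocessing and evaluation steps described in Section IV can be sketched in plain Python. This is a minimal illustration with hypothetical helper names and a toy series of my own; the paper's EIA dataset and the actual SVM/MLP training are not reproduced here:

```python
import statistics

def zscore_normalize(values):
    """Z-score normalization: v' = (v - mean) / stdev, as in Section IV."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(v - mu) / sigma for v in values]

def split_series(values, train_frac=0.6, val_frac=0.2):
    """Split a time series, in order, into training, validation and test sets."""
    n = len(values)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return values[:n_train], values[n_train:n_train + n_val], values[n_train + n_val:]

def mean_absolute_error(actual, predicted):
    """Mean absolute error, the performance measure used in this paper."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy illustration: normalize a small load series and split it.
load = [10.0, 12.0, 11.0, 13.0, 14.0, 12.5, 13.5, 15.0, 14.5, 16.0]
train, val, test = split_series(zscore_normalize(load))
print(len(train), len(val), len(test))  # 6 2 2

# A grid search as in Section IV would loop over parameter combinations,
# e.g. for C in (0.1, 1, 10), eps in (0.001, 0.01), p in (1, 2, 3): fit on
# `train`, select the combination by MAE on `val`, and report MAE on `test`.
```

The split fractions and the grid values above are illustrative assumptions; the paper only states that a grid search was used and that C = 1, \varepsilon = 0.001, p = 2 gave the least error.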

From Table 2 it can be observed that SVMs give better performance than neural networks. Fig. 1 and Fig. 2 show the performance of both models on the test set.

Fig. 1: Actual and predicted values using the SVM model

Fig. 2: Actual and predicted values using the NN model

VI. Conclusion
An application of support vector regression and multi layer perceptron neural networks to electric load forecasting has been presented in this paper. The performance of SVM was compared with that of MLP for various models. The results obtained show that SVM performs better than neural networks trained with the back propagation algorithm. It was also observed that parameter selection has a significant effect on the performance of the SVM model. It can be concluded that, with proper selection of the parameters, Support Vector Machines can replace some of the neural network based models for electric load forecasting.

References
[1] Haykin, S., "Neural Networks - A Comprehensive Foundation", Prentice Hall, 1999.
[2] Jae H. Min, Young-Chan Lee, "Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters", Expert Systems with Applications, 28, pp. 603-614, 2005.
[3] Ronan Collobert, Samy Bengio, "SVMTorch: Support Vector Machines for Large-Scale Regression Problems", Journal of Machine Learning Research, 1, 2001, pp. 143-160.
[4] Smola, A.J., Scholkopf, B., "A Tutorial on Support Vector Regression", NeuroCOLT Technical Report NC-TR-98-030, Royal Holloway College, University of London, UK, 1998.
[5] Hsu, C.W., Chang, C.C., Lin, C.J., "A Practical Guide to Support Vector Classification", Technical report, Department of Computer Science and Information Engineering, National Taiwan University, 2008.
[6] Radhika, Y., Shashi, M., "Atmospheric Temperature Prediction using Support Vector Machines", International Journal of Computer Theory and Engineering, Vol. 1, No. 1, pp. 55-58, 2009.
[7] Hand, D.J., Heikki Mannila, Padhraic Smyth, "Principles of Data Mining", The MIT Press, 2001.
[8] Radhika, Y., Shashi, M., "A Recursive Algorithm for Parameter Optimisation in Support Vector Regression", International Journal of Computer Applications in Engineering, Technology and Sciences, Vol. 1, Issue 2, April 2009, pp. 451-454.
[9] Han, J., Kamber, M., "Data Mining: Concepts and Techniques", Morgan Kaufmann Publishers, 2002.
[10] Samson, D.C., Downs, T., Saha, T.K., "Evaluation of Support Vector Machine Regression Based Forecasting Tool in Electricity Price Forecasting for Australian National Electricity Market Participants", J. Elect. Electron. Eng. Australia, Vol. 22, 2002, pp. 227-234.
[11] Scholkopf, B., Burges, C.J.C., Smola, A.J., "Using Support Vector Machine for Time Series Prediction", Advances in Kernel Methods, Cambridge, MA: MIT Press, pp. 242-253, 1999.
[12] Muller, K.R., Smola, A.J., Ratsch, G., Scholkopf, B., Kohlmorgen, J., Vapnik, V., "Predicting Time Series with Support Vector Machines", Proc. Int. Conf. Artificial Neural Networks, ICANN '97.
[13] Lamamra, K., Belarbi, K., Mokhtari, F., "Optimisation of the Structure of a Neural Network by Multi Objective Genetic Algorithms", ICGST-ARAS Journal, April 2007, pp. 1-4.
[14] Khan, M.R., Ondrusek, C., "Short Term Electric Demand Prognosis Using Artificial Neural Networks", Electr. Eng., 2000.


[15] Snehashish Chakraverty, Pallavi Gupta, "Comparison of Neural Network Configurations in the Long-Range Forecast of Southwest Monsoon Rainfall over India", Neural Computing and Applications, 2007.
[16] Kim, K.J., "Financial Time Series Forecasting Using Support Vector Machines", Neurocomputing, 2003, pp. 307-319.
[17] Sivanandam, S.N., Sumathi, S., Deepa, S.N., "Introduction to Neural Networks Using MATLAB 6.0", Tata McGraw Hill, 2006.
[18] Changyin Sun, Jinya Song, Linfeng Li, Ping Ju, "Implementation of Hybrid Short Term Load Forecasting System with Analysis of Temperature Sensitivities", Soft Computing, 2008, pp. 633-638.
[19] Qiudan Li, Stephen Shaoyi Liao, Dandan Li, "A Clustering Model for Mining Consumption Patterns from Imprecise Electric Load Time Series Data", Lecture Notes in Computer Science, Fuzzy Systems and Knowledge Discovery, Springer-Verlag Berlin Heidelberg, FSKD 2006, LNAI 4223, pp. 1217-1220, 2006.
[20] [Online] Available: http://www.eia.gov/cneaf/electricity/epa/epa_sprdshts.html

Renuka Achanta received her B.E. (Computer Science Engineering) and M.Tech (Computer Science and Technology). She is currently an Assistant Professor in the Department of Computer Science Engineering at A.K.R.G. Engineering College.
