
IJCST Vol. 3, Issue 1, Jan. - March 2012
ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

Long Term Electric Load Forecasting using Neural Networks and Support Vector Machines

Renuka Achanta
Dept. of CSE, AKRG Engineering College, Nallagerla, AP, India


adjust the weights in each layer of the network. After a back-propagation network has learned the correct classification for a set of inputs, it can be tested on a second set of inputs to see how well it classifies untrained patterns.

III. Support Vector Machines
The Support Vector Machine (SVM) is a comparatively recent development that serves as an alternative technique for dealing with complex classification and regression problems [11-12]. The SVM algorithm was developed by Vapnik and is based on statistical learning theory. The basic idea of Support Vector Machines is to map the original data X into a feature space F of high dimensionality through a nonlinear mapping function and to construct an optimal hyperplane in the new space. SVM can be applied to both classification and regression: in classification, an optimal hyperplane is found that separates the data into two classes, whereas in regression a hyperplane is constructed that lies close to as many points as possible.

A. Support Vector Regression
Regression is the problem of estimating a function from a given dataset. Consider a dataset $G = \{(x_i, d_i)\}_{i=1}^{N}$, where $x_i$ is the input vector, $d_i$ is the desired result, and $N$ is the size of the dataset. The general form of the Support Vector Regression estimating function [3-4] is

$$F(x) = w \cdot \varphi(x) + b \qquad (1)$$

where $\varphi(x)$ represents the input feature mapping, and $w$ and $b$ are the coefficients that have to be estimated from the data. A parameter $\varepsilon$ represents the allowed deviation between the actual values and the regression function; it can be viewed as a tube around the regression function, and points outside the tube are treated as training errors. The coefficients $w$ and $b$ are determined by minimizing the regularized risk function

$$R(F) = \frac{1}{2}\|w\|^2 + C\,\frac{1}{N}\sum_{i=1}^{N} L_\varepsilon(d_i, F_i) \qquad (2)$$

where $C$ and $\varepsilon$ are user-determined parameters. The term $L_\varepsilon(d_i, F_i)$ is the $\varepsilon$-insensitive loss function

$$L_\varepsilon(d_i, F_i) = \begin{cases} |d_i - F_i| - \varepsilon, & |d_i - F_i| \ge \varepsilon \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

and the term $\frac{1}{2}\|w\|^2$ is used as a measure of the flatness of the function. $C$ is a regularization constant determining the trade-off between the training error and the model flatness. Two positive slack variables $\xi_i$ and $\xi_i^*$ represent the distances from the actual values to the corresponding boundary values of the $\varepsilon$-tube; both are zero when the data point falls within the tube. After introducing the slack variables, the risk minimization can be expressed in the following constrained form:

$$\text{minimize} \quad \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*) \qquad (4)$$

subject to

$$d_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w \cdot \varphi(x_i) + b - d_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0, \qquad i = 1, 2, \dots, N$$

This constrained optimization problem is solved using the following primal Lagrangian form:

$$\text{minimize} \quad L = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}(\xi_i + \xi_i^*) - \sum_{i=1}^{N}\alpha_i(\varepsilon + \xi_i - d_i + w \cdot \varphi(x_i) + b) - \sum_{i=1}^{N}\alpha_i^*(\varepsilon + \xi_i^* + d_i - w \cdot \varphi(x_i) - b) - \sum_{i=1}^{N}(\beta_i \xi_i + \beta_i^* \xi_i^*) \qquad (5)$$

The last term is included to enforce the nonnegativity of the slack variables, so that the optimality constraints on the Lagrange multipliers assume the appropriate forms. Equation (5) is minimized with respect to the primal variables $w$, $b$, $\xi_i$ and $\xi_i^*$, and maximized with respect to the nonnegative Lagrange multipliers $\alpha_i$, $\alpha_i^*$, $\beta_i$ and $\beta_i^*$. Finally, the Karush-Kuhn-Tucker conditions are applied to Equation (4) and the dual Lagrangian form

$$\text{maximize} \quad \sum_{i=1}^{N} d_i(\alpha_i - \alpha_i^*) - \varepsilon\sum_{i=1}^{N}(\alpha_i + \alpha_i^*) - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\,K(x_i, x_j) \qquad (6)$$

is derived, subject to the constraints

$$\sum_{i=1}^{N}(\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C, \qquad i = 1, 2, \dots, N$$

The Lagrange multipliers in Equation (6) satisfy the equality $\alpha_i \alpha_i^* = 0$. Once the multipliers $\alpha_i$ and $\alpha_i^*$ are calculated, the optimal weight vector of the regression hyperplane is expressed as

$$w = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\,\varphi(x_i) \qquad (7)$$

Therefore the regression function is

$$F(x) = \sum_{i=1}^{N}(\alpha_i - \alpha_i^*)\,K(x_i, x) + b \qquad (8)$$

where $K(x_i, x_j)$ is the kernel function, whose value equals the inner product of the two vectors $x_i$ and $x_j$ in the feature space, $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$. For a nonlinear regression problem, the data are mapped to a high-dimensional feature space, and the kernel function allows this mapping to be used implicitly. Any function that satisfies Mercer's theorem can be used as a kernel function; some functions that satisfy this condition are shown in Table 1.

Table 1: Some kernel functions that satisfy Mercer's theorem

| Kernel | Function | Comment |
| --- | --- | --- |
| Polynomial | $[1 + (X \cdot X_i)]^p$ | The power $p$ is specified by the user |
| RBF | $\exp\!\left(-\frac{1}{2\sigma^2}\|X - X_i\|^2\right)$ | The width $\sigma$ is specified by the user |
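As a concrete illustration of Equations (3) and (8) and the kernels of Table 1, the following Python sketch (not from the paper) implements the ε-insensitive loss, both kernel functions, and the resulting prediction function, assuming the multipliers α_i, α_i* and the bias b have already been obtained by solving the dual problem:

```python
import math

def eps_insensitive_loss(d, f, eps):
    """Eq. (3): zero inside the eps-tube, linear outside it."""
    err = abs(d - f)
    return err - eps if err > eps else 0.0

def poly_kernel(x, xi, p):
    """Polynomial kernel from Table 1: (1 + x . xi)^p."""
    dot = sum(a * b for a, b in zip(x, xi))
    return (1.0 + dot) ** p

def rbf_kernel(x, xi, sigma):
    """RBF kernel from Table 1: exp(-||x - xi||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq / (2.0 * sigma ** 2))

def svr_predict(x, support, alphas, alpha_stars, b, kernel):
    """Eq. (8): F(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b."""
    return sum((a - a_s) * kernel(xi, x)
               for xi, a, a_s in zip(support, alphas, alpha_stars)) + b
```

Only points with nonzero (α_i − α_i*) contribute to the sum in Eq. (8); these are the support vectors, which is why the prediction is sparse in the training data.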
SVM is fast, accurate, and less prone to overfitting than many other methods, and it can handle high-dimensional data efficiently. SVMs have been applied successfully in applications that deal with numerical attributes, such as handwritten character recognition, object recognition, and speaker identification. The success of the model depends on proper setting of its parameters [6-7].

IV. Methodology
Real-world datasets are highly susceptible to noise and missing data. The data can be preprocessed to improve its quality and thereby improve the prediction results. In this work, data transformation has been applied to the data. Data transformation

improves the accuracy, speed, and efficiency of the algorithms used. The data is normalized using Z-score normalization, in which the values of an attribute A are normalized based on the mean (μ) and standard deviation (σ_A) of A; the normalized value V′ of a value V is obtained as V′ = (V − μ)/σ_A.

In this work, electric load consumption is forecast from historical time-series data. The available data is divided into training, validation, and test sets: the training set is used to build the model, the validation set for parameter optimization, and the test set to evaluate the model. Separate models are developed using SVM and an MLP trained with the back-propagation algorithm.

A nonlinear support vector regression method is used to train the SVM. A kernel function must be selected from the functions that satisfy Mercer's theorem; the polynomial kernel is adopted in this study. The polynomial kernel requires setting the parameter p in addition to the regular parameters C and ε. As there are no general rules for determining these free parameters, the optimal values are found by a grid search [2, 8, 13], performed to identify the best combination of parameters. After experimentation it was observed that the model with parameters C = 1, ε = 0.001 and p = 2 gives the least error.

The back-propagation algorithm is used to develop the MLP model. Several models were developed and tested, and the best model was identified based on the mean absolute error, which is the performance measure used in this work. The sigmoid activation function is used in the models.

V. Results and Discussion
Electric load is predicted using both neural networks and support vector machines. The errors of both models are presented below.

Table 2: Comparison of NN and SVM

| Model | Mean Absolute Error |
| --- | --- |
| Neural Networks | 0.1088 |
| Support Vector Machines | 0.0916 |
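The preprocessing and evaluation steps described above (Z-score normalization, the chronological three-way split, and the mean absolute error measure) can be sketched in Python. This is an illustrative sketch, not the paper's code; the split proportions are assumed values, not figures from the paper:

```python
import statistics

def z_score_normalize(values):
    """Z-score normalization: v' = (v - mean) / stdev, as in Section IV."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

def chronological_split(series, train=0.6, val=0.2):
    """Split a time series, in order, into training, validation, and test sets."""
    n = len(series)
    n_train = int(n * train)
    n_val = int(n * val)
    return (series[:n_train],
            series[n_train:n_train + n_val],
            series[n_train + n_val:])

def mean_absolute_error(actual, predicted):
    """The performance measure used to compare the MLP and SVM models."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

The validation slice would drive the grid search over C, ε and p, with the held-out test slice reserved for the final model comparison.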

From the table it can be observed that SVMs give better performance than neural networks. Fig. 1 and Fig. 2 show the performance of both models on the test set.

VI. Conclusion
An application of support vector regression and multilayer perceptron neural networks to electric load forecasting is presented in this paper. The performance of the SVM was compared with that of the MLP for various models. The results obtained show that the SVM performs better than neural networks trained with the back-propagation algorithm. It was also observed that parameter selection has a significant effect on the performance of the SVM model. It can be concluded that, with proper selection of the parameters, Support Vector Machines can replace some of the neural-network-based models for electric load forecasting.

References
[1] Haykin, S., "Neural Networks: A Comprehensive Foundation", Prentice Hall, 1999.
[2] Jae H. Min, Young-Chan Lee, "Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters", Expert Systems with Applications, 28, 2005, pp. 603-614.
[3] Ronan Collobert, Samy Bengio, "SVMTorch: Support Vector Machines for Large-Scale Regression Problems", Journal of Machine Learning Research, 1, 2001, pp. 143-160.
[4] Smola, A.J., Scholkopf, B., "A Tutorial on Support Vector Regression", NeuroCOLT Technical Report NC-TR-98-030, Royal Holloway College, University of London, UK, 1998.
[5] Hsu, C.W., Chang, C.C., Lin, C.J., "A Practical Guide to Support Vector Classification", Technical Report, Department of Computer Science and Information Engineering, National Taiwan University, 2008.
[6] Radhika, Y., Shashi, M., "Atmospheric temperature prediction using Support Vector Machines", International Journal of Computer Theory and Engineering, Vol. 1, No. 1, 2009, pp. 55-58.
[7] Hand, D.J., Heikki Mannila, Padhraic Smyth, "Principles of Data Mining", The MIT Press, 2001.
[8] Radhika, Y., Shashi, M., "A Recursive Algorithm for Parameter Optimisation in Support Vector Regression", International Journal of Computer Applications in Engineering, Technology and Sciences, Vol. 1, Issue 2, April 2009, pp. 451-454.
[9] Han, J., Kamber, M., "Data Mining: Concepts and Techniques", Morgan Kaufmann Publishers, 2002.
[10] Samson, D.C., Downs, T., Saha, T.K., "Evaluation of Support Vector Machine Regression Based Forecasting Tool in Electricity Price Forecasting for Australian National Electricity Market Participants", J. Elect. Electron. Eng. Australia, Vol. 22, 2002, pp. 227-234.
[11] Scholkopf, B., Burges, C.J.C., Smola, A.J. (Eds.), "Using Support Vector Machines for Time Series Prediction", in Advances in Kernel Methods, MIT Press, Cambridge, MA, 1999, pp. 242-253.
[12] Muller, K.R., Smola, A.J., Ratsch, G., Scholkopf, B., Kohlmorgen, J., Vapnik, V., "Predicting Time Series with Support Vector Machines", Proc. Int. Conf. on Artificial Neural Networks (ICANN '97), 1997.
[13] Lamamra, K., Belarbi, K., Mokhtari, F., "Optimisation of the Structure of a Neural Network by Multi-Objective Genetic Algorithms", ICGST-ARAS Journal, April 2007, pp. 1-4.
[14] Khan, M.R., Ondrusek, C., "Short-Term Electric Demand Prognosis Using Artificial Neural Networks", Electrical Engineering, 2000.


[15] Snehashish Chakraverty, Pallavi Gupta, "Comparison of Neural Network Configurations in the Long-Range Forecast of Southwest Monsoon Rainfall Over India", Neural Computing and Applications, 2007.
[16] Kim, K.J., "Financial Time Series Forecasting Using Support Vector Machines", Neurocomputing, 2003, pp. 307-319.
[17] Sivanandam, S.N., Sumathi, S., Deepa, S.N., "Introduction to Neural Networks Using MATLAB 6.0", Tata McGraw-Hill, 2006.
[18] Changyin Sun, Jinya Song, Linfeng Li, Ping Ju, "Implementation of Hybrid Short-Term Load Forecasting System with Analysis of Temperature Sensitivities", Soft Computing, 2008, pp. 633-638.
[19] Qiudan Li, Stephen Shaoyi Liao, Dandan Li, "A Clustering Model for Mining Consumption Patterns from Imprecise Electric Load Time Series Data", Fuzzy Systems and Knowledge Discovery (FSKD 2006), Lecture Notes in Computer Science, LNAI 4223, Springer-Verlag, Berlin Heidelberg, 2006, pp. 1217-1220.
[20] [Online] Available: http://www.eia.gov/cneaf/electricity/epa/epa_sprdshts.html

Renuka Achanta received her B.E. (Computer Science Engineering) and M.Tech (Computer Science and Technology). She is currently an Assistant Professor in the Department of Computer Science Engineering at A.K.R.G. Engineering College.
