This paper presents the results of a study aimed at characterizing the time evolution of the probability distribution of the aggregate residential ... when ... the load ...

E. Carpaneto and G. Chicco are with the Dipartimento di Ingegneria Elettrica, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy (e-mail: enrico.carpaneto@polito.it, gianfranco.chicco@polito.it).

... successive statistical study. Different types of days (working days and weekend days) and periods (summer and winter) were considered. For space limitations, only the results obtained for winter working days are presented here.
9th International Conference on Probabilistic Methods Applied to Power Systems KTH, Stockholm, Sweden - June 11-15, 2006
III. PROBABILITY DENSITY FUNCTIONS FOR THE STATISTICAL TESTS
Various probability distributions have been used for the goodness-of-fit statistical tests [10], including two one-parameter distributions (Exponential and Rayleigh), six two-parameter distributions (Gamma, Gumbel, Weibull, Normal, Log-normal and Inverse Normal), and the three-parameter Beta distribution (whose third parameter has been set to the maximum value of the sample data). Table I contains some details on the probability distributions tested, with the corresponding expressions of the Probability Density Function (PDF) and Cumulative Distribution Function (CDF). The Chi-square, Kolmogorov-Smirnov (KS) and geometrical adaptation statistical tests have been used for investigating the goodness-of-fit of the various probability distributions as a function of the load power P.

TABLE I
PROBABILITY DISTRIBUTIONS USED FOR THE GOODNESS-OF-FIT TESTS
(for each distribution: PDF f(P); CDF F(P); parameter limits)
Beta: $f(P) = \frac{\Gamma(a+b)\,P^{a-1}(c-P)^{b-1}}{\Gamma(a)\,\Gamma(b)\,c^{a+b-1}}$; $F(P)$ = Beta incomplete$(P/c, a, b)$; $0 < P < c$, $a > 0$, $b > 0$
Exponential: $f(P) = \frac{1}{b}\,e^{-P/b}$; $F(P) = 1 - e^{-P/b}$; $P > 0$, $b > 0$
Gamma: $f(P) = \frac{P^{a-1}\,e^{-P/b}}{b^{a}\,\Gamma(a)}$; $F(P)$ = Gamma incomplete$(P/b, a)$; $P > 0$, $a > 0$, $b > 0$
Gumbel: $f(P) = \frac{1}{b}\,e^{(P-a)/b}\,e^{-e^{(P-a)/b}}$; $F(P) = 1 - e^{-e^{(P-a)/b}}$; $b > 0$
Inverse Normal: ...; $a > 0$, $b > 0$
Log-normal: $f(P) = \frac{1}{a P \sqrt{2\pi}}\,e^{-\frac{(\ln P - b)^2}{2a^2}}$; $F(P) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\ln P - b}{a\sqrt{2}}\right)\right]$; $P > 0$
Normal: $f(P) = \frac{1}{a\sqrt{2\pi}}\,e^{-\frac{(P - b)^2}{2a^2}}$; $F(P) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{P - b}{a\sqrt{2}}\right)\right]$; $a > 0$
Rayleigh: ...; $P > 0$
Weibull: $f(P) = \frac{a}{b}\left(\frac{P}{b}\right)^{a-1} e^{-(P/b)^{a}}$; $F(P) = 1 - e^{-(P/b)^{a}}$; $a > 0$, $b > 0$

A. Chi-square test
The parametric chi-square test has been used, adopting the same data sample for the statistical test. The Yates correction [11] has been introduced in order to better estimate the significance level. The results of the test depend on the pre-specified number and structure (uniform or non-uniform) of the classes, and on the level of significance, p%. If the value of p% is specified in advance, the observed value is compared to the critical range of values (min, max), depending on the number of degrees of freedom, so that the test is successful if the observed value is lower than the minimum, unsuccessful if it is higher than the maximum, whereas for observed values between the minimum and the maximum the result is undefined. Alternatively, the maximum level of significance p_max% corresponding to an observed value equal to the minimum sets the limits of acceptance of the chi-square test results.

B. Kolmogorov-Smirnov (KS) test
The error of the KS test is given by the maximum mismatch between the Empirical CDF (ECDF) obtained from the set of data under analysis and the CDF of the probability distribution under test. The error is compared to a critical value, and the test is successful if the error is lower than the critical value. If the CDF under test is fully specified by assigning all its parameters, the result of the test is independent of the distribution, and the generic critical values are found in specific tables as a function of the level of significance (see [11] p.797 and Table 1 of [12]). However, if the CDF parameters are estimated from the data, these critical values are no longer valid and must be determined by simulation or by specific tables. Specific tables have been found for the Exponential distribution (p.798 of [11] and Table 1 of [13]) and for the Normal distribution (p.799 of [11] and Table I of [14]). For the other cases in which the distribution parameters are extracted from the data, the critical values corresponding to a generic distribution can be seen only as upper bounds of the actual critical values. The assessment of the critical values is then performed by using a Monte Carlo simulation. At first, a set of m = 1, ..., M values is specified, at which the CDF under test is calculated. Then, a specified number K of Monte Carlo simulations is performed. At each simulation, a vector of length H is filled with H random values extracted from the CDF under test. The extraction is carried out by using H random extractions in the interval (0,1) from a uniform probability distribution; these are considered as values of the CDF under test, and the corresponding abscissae of the CDF under test are computed. Then, the simulated empirical CDF is built at the M predefined locations, and the KS error is computed as the absolute value of the maximum difference between the points of the simulated empirical CDF and of the CDF under test referred to the same values m = 1, ..., M. The K errors of the KS test are then used to build the related CDF, and the critical value is evaluated for a given level of significance. The Monte Carlo simulations performed in this paper assume M = 1000, K = 5000 and H set to the sample set size.

C. Geometrical adaptation tests
The assessment of the fitness of the Empirical CDF (ECDF) to a reference CDF has been performed by using two graphical representations, with suitable functions of the generic CDF value F. The first representation transforms each value F into

$\alpha_E = -\ln(1 - F)$   (1)

and plots $\alpha_E$ versus the power P, so that an exponential CDF would be represented by a straight line in the $(P, \alpha_E)$ plane. The second representation transforms each value F into

$\alpha_W = \ln(-\ln(1 - F))$   (2)

and plots $\alpha_W$ versus ln(P), so that a Weibull CDF would be represented by a straight line in the $(\ln(P), \alpha_W)$ plane. In addition, plotting $\alpha_W$ versus P would represent a Gumbel CDF as a straight line in the $(P, \alpha_W)$ plane.
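The two transformations (1) and (2) can be tried numerically. The following is a minimal sketch and not the authors' procedure: the ECDF plotting positions i/(n+1) and the use of a correlation coefficient as a numerical stand-in for "looks like a straight line" are assumptions made here for illustration, and all function names are hypothetical.

```python
import math

def alpha_e(f):
    """Transform (1): -ln(1 - F), linear in P for an Exponential CDF."""
    return -math.log(1.0 - f)

def alpha_w(f):
    """Transform (2): ln(-ln(1 - F)), linear in ln(P) for Weibull, in P for Gumbel."""
    return math.log(-math.log(1.0 - f))

def ecdf_levels(n):
    """ECDF levels i/(n+1) (an assumed plotting position that avoids F = 0 and F = 1)."""
    return [(i + 1) / (n + 1) for i in range(n)]

def linearity(x, y):
    """Pearson correlation coefficient between x and y (1.0 means a perfect rising line)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def geometrical_adaptation(sample):
    """Straight-line scores of a positive sample in the three planes described in the text."""
    p = sorted(sample)
    f = ecdf_levels(len(p))
    a_e = [alpha_e(v) for v in f]
    a_w = [alpha_w(v) for v in f]
    return {
        "Exponential": linearity(p, a_e),                       # (P, alpha_E) plane
        "Weibull": linearity([math.log(v) for v in p], a_w),    # (ln P, alpha_W) plane
        "Gumbel": linearity(p, a_w),                            # (P, alpha_W) plane
    }
```

A score close to 1 in one of the three planes suggests that the corresponding CDF is a reasonable candidate; the visual inspection described in the text remains the reference.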
IV. NUMERICAL TESTS AND RESULTS
A. Tests for a 100-customer case
A first set of tests has been performed on the data obtained from the Monte Carlo simulation at the same time instant. An example is presented for hour 12:00 of a winter working day with N = 100 customers. The number of Monte Carlo simulations carried out to obtain the ECDF for this situation is 100. The characteristics of the complete data sample include the minimum value 23.69 kW, maximum value 51.69 kW, average value 39.21 kW, standard deviation 5.77 kW (14.7%), and skewness -0.0004.

The results of the KS test with level of significance 5% are shown in Table II. The critical values of the KS test have been computed by running 5000 Monte Carlo simulations for each probability distribution (other than Exponential and Normal), resulting in values more restrictive (lower) than the generic critical value 0.1360. In particular, in this case the tests are not accepted only for the Exponential and Rayleigh probability distributions, whereas all the other distributions exhibit acceptable goodness-of-fit. Fig. 1 reports the various CDFs.

More details are reported for the Gamma CDF with average value and standard deviation equal to the ones of the data sample (shape factor a = 46.2 and scale factor b = 848.5 W according to Table I). Fig. 2 shows the details of the KS test. Fig. 3 reports the results of the chi-square test with 7 degrees of freedom (maximum acceptable error 14.07). Applying the Yates correction, the observed error (6.85) is acceptably low and the test is passed with a maximum level of significance of 44.5%, and with a non-excessive ...

TABLE II
KS TEST ERRORS FOR THE POWER AT HOUR 12:00 WITH N = 100 (LEVEL OF SIGNIFICANCE 5%)
CDF              KS test error   critical value   result
Beta             0.0832          ...              accepted
Exponential      0.5101          ...              rejected
Gamma            0.0800          0.1160           accepted
Gumbel           ...             0.1225           accepted
Inverse-Normal   0.0659          0.1230           accepted
Log-Normal       0.0887          ...              accepted
Normal           0.0653          0.0886           accepted
Rayleigh         0.3434          ...              rejected
Weibull          0.0855          0.1306           accepted

Fig. 1. ECDF of the load at hour 12:00 for N = 100 and CDFs of various probability distributions with the same average value and standard deviation (horizontal axis: power).
Fig. 2. KS test for the Gamma CDF.
Fig. 3. Results of the chi-square test for the Gamma CDF for N = 100.
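The Gamma parameters quoted for this case can be checked by moment matching. The sketch below relies only on the standard relations for a Gamma distribution with shape a and scale b (mean = a b, variance = a b^2); the function name is hypothetical and the inputs are the rounded sample statistics reported for hour 12:00 with N = 100.

```python
def gamma_from_moments(mean, std):
    """Shape a and scale b of the Gamma distribution with the given mean and standard deviation."""
    if mean <= 0 or std <= 0:
        raise ValueError("mean and std must be positive")
    shape = (mean / std) ** 2   # a = mean^2 / variance
    scale = std ** 2 / mean     # b = variance / mean, same unit as the mean
    return shape, scale

# Sample statistics at hour 12:00 for N = 100 (kW), as reported in the text.
a, b_kw = gamma_from_moments(39.21, 5.77)
print(f"shape a = {a:.1f}, scale b = {1000.0 * b_kw:.1f} W")
```

With these rounded inputs the shape factor comes out at about 46.2 and the scale factor at roughly 849 W, close to the quoted 848.5 W; the small gap is presumably due to the rounding of the published mean and standard deviation.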
Fig. 4 shows the results of the geometrical adaptation tests. The acceptability of the Normal distribution is also confirmed by the high value of the shape factor of the Gamma distribution. As indicated in Fig. 5, for the Gamma CDF the KS observed error during the day never exceeds the 5% acceptance threshold.
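The acceptance threshold used in these KS tests comes from the Monte Carlo procedure of Section III.B, which might be sketched as follows. This is an illustrative reading of that description, not the authors' code: the placement of the M abscissae (evenly spaced between two extreme quantiles), the function name and its arguments, and the Exponential example with a hypothetical scale of 40 are all assumptions. The reduced values of M and K in the example only keep the run short; the paper uses M = 1000 and K = 5000.

```python
import bisect
import math
import random

def ks_critical_value(cdf, quantile, H, M=1000, K=5000, significance=0.05,
                      p_lo=0.001, p_hi=0.999, seed=0):
    """Monte Carlo estimate of the KS critical value for a fully specified CDF.

    cdf(x) is the CDF under test and quantile(u) its inverse for u in [0, 1).
    """
    rng = random.Random(seed)
    # M predefined abscissae (assumed here: evenly spaced between two extreme quantiles).
    x_lo, x_hi = quantile(p_lo), quantile(p_hi)
    grid = [x_lo + (x_hi - x_lo) * m / (M - 1) for m in range(M)]
    cdf_grid = [cdf(x) for x in grid]
    errors = []
    for _ in range(K):
        # H uniform extractions in (0, 1), mapped to abscissae through the inverse CDF.
        sample = sorted(quantile(rng.random()) for _ in range(H))
        # Simulated ECDF at the M locations, compared with the CDF under test.
        err = max(abs(bisect.bisect_right(sample, x) / H - f)
                  for x, f in zip(grid, cdf_grid))
        errors.append(err)
    # The K errors define an empirical CDF; take its (1 - significance) quantile.
    errors.sort()
    idx = max(0, min(K - 1, math.ceil((1.0 - significance) * K) - 1))
    return errors[idx]

# Example: Exponential CDF with a hypothetical scale b = 40, sample size H = 100.
b = 40.0
value = ks_critical_value(lambda x: 1.0 - math.exp(-x / b),
                          lambda u: -b * math.log(1.0 - u),
                          H=100, M=200, K=500)
print(f"estimated 5% critical value: {value:.4f}")
```

Because the parameters are held fixed in every simulation, this sketch reproduces the fully specified case; handling parameters estimated from the data would additionally require refitting the distribution to each simulated sample, which is not shown here.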
Fig. 5. Results of the KS 5% test with the Gamma CDF for N = 100 (horizontal axis: time (min)).

The results of the same tests indicated in the previous subsection are shown here on the data obtained from 100 Monte Carlo simulations referred to hour 12:00 of a winter working day with N = 10 customers. The characteristics of the complete data sample include the minimum value 1.508 kW, maximum value 11.011 kW, average value 3.805 kW, standard deviation 1.618 kW (42.5%), and skewness 0.1109. Table III shows the results of the KS test with level of significance 5%. The Log-Normal and Inverse Normal distributions exhibit the better goodness-of-fit. In this case, the Normal probability distribution no longer fits the data. The Gamma CDF has shape factor 5.53 (much lower than in the previous case) and scale factor 688 W.

TABLE III
KS TEST ERRORS FOR THE POWER AT HOUR 12:00 WITH 10 CUSTOMERS (LEVEL OF SIGNIFICANCE 5%)
(columns: CDF, KS test error, critical value, result)

The results obtained for hour 12:00 can be generalized by considering the results of the KS test with the various CDFs for the 1440 minutes of the day. Fig. 6 shows that the Exponential and Rayleigh distributions do not fit the ECDF satisfactorily. A zoom into a specific time interval (from hour 11:40 to hour 12:20, Fig. 7) allows for identifying the Log-Normal, Inverse Normal and Gamma CDFs as the ones exhibiting the best values of goodness-of-fit, compared to the corresponding thresholds. However, as indicated for the Gamma CDF in Fig. 8, in some hours of the day the KS observed error could exceed the 5% acceptance threshold.

Fig. 7. Observed errors of the KS test with significance level 5% from hour 11:40 to hour 12:20 for N = 10 (horizontal axis: time (min)).
Fig. 8. Results of the KS 5% test with the Gamma CDF for N = 10 (horizontal axis: time (min)).
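The minute-by-minute comparison of KS errors against their thresholds can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: only three candidate CDFs with closed-form expressions are used (Normal, Log-Normal, Exponential), their parameters are matched to the sample mean and standard deviation, the critical values passed in are placeholders rather than values from the paper, and the synthetic data and all function names are hypothetical.

```python
import math
import random
import statistics

def ks_error(sample, cdf):
    """Maximum mismatch between the ECDF of the sample and the given CDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))

def moment_matched_cdfs(sample):
    """Candidate CDFs sharing the mean (and, where possible, the std) of a positive sample."""
    mean = statistics.fmean(sample)
    std = statistics.stdev(sample)
    normal = statistics.NormalDist(mean, std)
    s2 = math.log(1.0 + (std / mean) ** 2)   # variance of ln(P) for a Log-Normal
    log_normal = statistics.NormalDist(math.log(mean) - s2 / 2.0, math.sqrt(s2))
    return {
        "Normal": normal.cdf,
        "Log-Normal": lambda x: log_normal.cdf(math.log(x)),
        "Exponential": lambda x: 1.0 - math.exp(-x / mean),
    }

def best_fit_per_minute(samples_by_minute, critical):
    """For each minute, the CDF with the lowest ratio of KS error to critical value."""
    winners = {}
    for minute, sample in samples_by_minute.items():
        cdfs = moment_matched_cdfs(sample)
        ratios = {name: ks_error(sample, cdf) / critical[name] for name, cdf in cdfs.items()}
        winners[minute] = min(ratios, key=ratios.get)
    return winners

# Synthetic example: 100 simulated load values (kW) at three minutes around 12:00.
rng = random.Random(0)
samples = {m: [rng.lognormvariate(1.25, 0.4) for _ in range(100)] for m in (700, 710, 720)}
placeholder_critical = {"Normal": 0.1, "Log-Normal": 0.1, "Exponential": 0.1}
print(best_fit_per_minute(samples, placeholder_critical))
```

A ratio below 1 means that the test is passed at that minute; counting how often each CDF has the lowest ratio over the day gives a simple ranking of the candidates.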
Fig. 6. Observed error to critical value ratio of the KS test with significance level 5% for a winter working day with N = 10 (horizontal axis: time (min)).

V. EXTENDED TESTS AND RESULTS
A. Extended tests for variable numbers of customers
The same set of tests specified in the previous subsections has been carried out for different numbers of customers (from 10 to 300), for all the 1440 minutes of the day, and for the 4 types of days considered (working day and weekend day in summer and winter seasons). Each customer has a contract power of 3 kW. Some significant results are summarized in the sequel.

B. Time evolution of the aggregated load patterns and standard deviations (winter working days)
A first result can be achieved by comparing the time evolution of the aggregated load power for different numbers of customers. Fig. 9, Fig. 10 and Fig. 11 show the load patterns for a winter weekday with N = 10, N = 100 and N = 300, respectively. The internal filled band represents the regions ... deviation of the data concerning each minute. The upper and lower lines represent the maximum and minimum values obtained in the Monte Carlo simulation. It is evident that, when N increases, there is a reduction in the range of variation of the aggregated load power, as well as a trend to obtain more symmetrical probability distributions. In particular, Fig. 12 shows how the uncertainty of the aggregated load (represented by the standard deviation in per cent of the average value) depends on the number of aggregated customers and varies during the day.
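The reduction of the relative uncertainty with N can be illustrated with a small simulation. The sketch below is not based on the load model used in the paper: it assumes, purely for illustration, that at a given minute the customers draw independent Gamma-distributed powers with hypothetical shape 0.6 and scale 650 W, and it uses 100 runs per case as in the text.

```python
import random
import statistics

def aggregate_stats(n_customers, runs=100, shape=0.6, scale=650.0, seed=1):
    """Statistics of the aggregated load (W) at one time instant over a set of simulation runs."""
    rng = random.Random(seed)
    # Each run sums the (assumed independent) powers of all customers.
    totals = [sum(rng.gammavariate(shape, scale) for _ in range(n_customers))
              for _ in range(runs)]
    mean = statistics.fmean(totals)
    std = statistics.stdev(totals)
    return {
        "mean": mean,
        "std": std,
        "min": min(totals),                # lower line of the load pattern plots
        "max": max(totals),                # upper line of the load pattern plots
        "std_pct": 100.0 * std / mean,     # standard deviation in per cent of the average
    }

for n in (10, 100, 300):
    s = aggregate_stats(n)
    print(f"N = {n:3d}: std = {s['std_pct']:.1f}% of the average value")
```

Under the independence assumption the standard deviation in per cent of the average value falls roughly as the inverse square root of N; correlated customer behaviour, which a real load model may include, would slow this decrease.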
C. Evaluations at specific hours
Further evaluations have been carried out by comparing the evolution of the load as a function of the number of customers. A first case is presented in Fig. 13, considering the CDF of the load at hour 12:00. When the number of customers increases, the CDFs move from left to right, but the standard deviation does not increase at the same rate as the average value. This fact is well highlighted by the representation of the specific power (W/customer) shown in Fig. 14, where it is clear that when N varies the average value of the specific power remains within a narrow range, whereas the standard deviation varies considerably. This fact is important to establish a reference value of specific power that can be used to make good estimates of the consumption of the group of customers tested. Extending the calculation to all the time instants shows that the specific load power profiles reported in Fig. 15 remain very similar.
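As a worked example of the specific power, the hour 12:00 statistics reported in Section IV for N = 10 and N = 100 can be divided by the number of customers. The helper name is hypothetical; the input figures are the ones quoted in the text.

```python
def specific_power(mean_kw, std_kw, n_customers):
    """Average value and standard deviation of the specific power, in W per customer."""
    return 1000.0 * mean_kw / n_customers, 1000.0 * std_kw / n_customers

# Hour 12:00 of a winter working day: (average kW, standard deviation kW, N).
cases = [(3.805, 1.618, 10), (39.21, 5.77, 100)]
for mean_kw, std_kw, n in cases:
    mean_w, std_w = specific_power(mean_kw, std_kw, n)
    print(f"N = {n:3d}: average {mean_w:.1f} W/customer, std {std_w:.1f} W/customer")
```

The average specific power changes by only about 3% between the two cases (about 380 and 392 W/customer), while the standard deviation of the specific power drops from about 162 to about 58 W/customer.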
Fig. 12. Time evolution of the standard deviation of the load power in per cent ... (curves for N = 20, N = 80, N = 150 and N = 300; horizontal axis: time (min)).
Fig. 13. ... load at hour 12:00 for N ... 10 to 100 (horizontal axis: load [kW]).
Fig. 15. Specific load power profiles (winter working day) (horizontal axis: time (min)).

In order to assess the most suitable probability distribution, a comparison has been made by taking as parameter the ratio between the observed error of the KS test and the KS error threshold for the corresponding probability distribution. For each number of customers, the probability distributions for which this ratio is the lowest at the various time instants have been identified. The results are summarized in Table IV, showing the percentage of winning time instants for the various probability distributions. From this point of view, the Gamma distribution emerges as the most promising one for the various numbers of customers. Only the Inverse Normal distribution could be a viable alternative for a low number of customers.

TABLE IV
PERCENTAGE OF WINNING TIME INSTANTS FOR THE VARIOUS PROBABILITY DISTRIBUTIONS
(rows: number of customers N = 10, 20, ..., 100; columns: probability distributions)

... time evolution of the aggregated residential load, ... to be used into any tool for probabilistic simulation (e.g., based on analytical or Monte Carlo simulations). The results shown are specifically referred to groups of residential customers in extra-urban areas. It has to be stressed that these values cannot be generally applied to all residential customers, regardless of their location. In fact, for urban areas generally lower values of specific power are expected for the same number of customers, due to the reduced size of the houses, reduced number of persons per house, and different types of activity.

The whole analysis could be repeated with a different data set, with initial parameters concerning the composition of the residential customer set (e.g., including new appliances, customer preferences, or customer willingness to participate in tariff-driven programs), to perform scenario studies and assess the effects of the penetration of new technologies on the time evolution of the aggregated residential consumption. Examples are the assessment of the distributed generation impact on residential districts, distribution system load forecasting, and simulation of unbalanced loads. The Gamma probability distribution has clearly emerged as the one with the best goodness-of-fit. This result is particularly interesting, since the simple relationships between the Gamma parameters and the average value and standard deviation, as well as the existence and easy formulation of its characteristic function, make the Gamma distribution particularly flexible and powerful for many applications.

VII. REFERENCES
[1] C.F.Walker and J.L.Pokoski, Residential load shape modeling based on ...
[2] ... use load shape estimation from whole-house metered data, IEEE Trans. on Power Systems, Vol.3, No.3, August 1988, pp.986-991.
[3] A.Capasso, W.Grattieri, R.Lamedica and A.Prudenzi, A bottom-up approach to residential load modeling, IEEE Trans. on Power Systems, Vol.9, No.2, May 1994, pp.957-964.
[4] A.Cagni, E.Carpaneto, G.Chicco and R.Napoli, Characterisation of the aggregated load patterns for extra-urban residential customer groups, Proc. IEEE Melecon 2004, Dubrovnik, Croatia, May 12-15, 2004, Vol.3, pp.951-954.
[5] ISTAT, 14th general census of the population and of the houses (in Italian), 2001. Available: http://dawinci.istat.it/MD/
[6] R.Herman and J.J.Kritzinger, The statistical description of grouped ...
[7] J.P.Ross and A.Meier, Whole-house measurements of standby power consumption, International Conference on Energy Efficiency in Appliances, Report LBNL-45967, Berkeley Lab, September 2000. Available: http://standby.lbl.gov/articles.html
[8] K.Rosen and A.Meier, Energy use of televisions and videocassette recorders in the U.S., Report LBNL-42393, Berkeley Lab, March 1999. Available: http://eetd.lbl.gov/EA/Reports
[9] B.Lebot, A.Meier and A.Anglade, Global implications of standby power use, ACEEE Summer Study on Energy Efficiency in Buildings, Report LBNL 46019, Berkeley Lab, August 2000. Available: http://standby.lbl.gov/articles.html
[10] ... Dekker, New York, 1996.
[11] K.S.Trivedi, Probability and Statistics With Reliability, Queuing and Computer Science Applications, Wiley, New York (2002).
[12] L.H.Miller, Table of percentage points of Kolmogorov statistics, J. Am. ...
[13] ... mean unknown, J. Am. ...
[14] H.W.Lilliefors, On Kolmogorov-Smirnov test for normality with mean and variance unknown, J. Am. Stat. Assoc. 62 (1967) 400.