
Board of the Foundation of the Scandinavian Journal of Statistics

On Location, Scale, Skewness and Kurtosis of Univariate Distributions


Author(s): Hannu Oja
Reviewed work(s):
Source: Scandinavian Journal of Statistics, Vol. 8, No. 3 (1981), pp. 154-168
Published by: Wiley-Blackwell on behalf of Board of the Foundation of the Scandinavian Journal of
Statistics
Stable URL: http://www.jstor.org/stable/4615828 .
Accessed: 25/09/2012 20:27

Scand J Statist 8: 154-168, 1981

On Location, Scale, Skewness and Kurtosis of Univariate Distributions

HANNU OJA
University of Oulu

Received November 1978, in final form December 1980

ABSTRACT. This paper deals with the concepts of location, scale, skewness, and kurtosis of univariate distributions. Some old and some new partial orderings are presented for comparing the properties of distributions. The works of Bickel & Lehmann (1975, 1976) and van Zwet (1964) are brought together by using the notion of convexity of order k. Measures of location, scale, skewness and kurtosis are introduced which include the classical measures as particular cases.

Key words: generalized convexity, crossings of cumulative distribution functions, kurtosis, location, scale, shift function, skewness

1. Introduction and motivation

In this paper we study the properties location, scale, skewness and kurtosis of univariate distributions. Why then are these properties important? The usefulness of location and scale is more obvious since they have a clear interpretation, and there is an extensive literature concerning them. Skewness and kurtosis are often considered as secondary statistics indicating the stability of the primary statistics, location and scale.

Historically, the need to study skewness and kurtosis first came up when it was found out that the normal curve often failed to give an adequate representation for actual data. In Pearson's curve system the model for data was then selected on grounds of the observed standardized third and fourth moments, $\beta_1$ and $\beta_2$. Since then many tests for normality have used these statistics. Distributions of three or more parameters, some possibly indicating skewness and kurtosis, have often been looked upon as models of non-normality. Definitions and orderings concerning skewness and kurtosis have been used in studying robustness, in selecting location parameters, in reliability theory, in nonparametrical statistics, etc.

How should one begin to study the notions location, scale, skewness and kurtosis? Usually statisticians work operationally, defining e.g. location as the property which is measured by the mean, or scale as the property which is measured by the standard deviation. More generally, there is often a set of functions $U$ such that $Eu(X)$ measures the required property of the distribution of the random variable $X$ for all $u \in U$. Then it is possible to base an ordering on this by defining

$$X \le Y \quad\text{if}\quad Eu(X) \le Eu(Y), \ \forall u \in U. \tag{1.1}$$

Sometimes $U$ can be considered as the set of "utility functions".

This kind of an approach has been used in statistics quite often; this is reflected by questions such as "Is kurtosis really 'peakedness'?" (Darlington, 1970), instead of the question "What does the classical measure of kurtosis really measure?". It is as if psychologists were to study from a single IQ what intelligence really is. We think that it is reasonable first to agree on what we mean by location, scale, skewness and kurtosis, and only then try to find measures for them.

This, therefore, constitutes an alternative starting point, which may be termed "analysis of meaning". We try to answer the questions: What kind of concepts are location, scale, skewness and kurtosis? Can we define them operationally? When is it possible to say that "G possesses the property P more strongly than F does"? Can these properties be measured?

Carnap (1962) writes: "By an explication we understand the transformation of an inexact, prescientific concept, the explicandum, into an exact concept, the explicatum. The explicatum must fulfil the requirements of similarity to the explicandum, exactness, fruitfulness, and simplicity. Three kinds of concepts are distinguished: classificatory (e.g. Warm), comparative (e.g. Warmer), and quantitative concepts (e.g. Temperature)". He continues: "The historical development of the language is often as follows: a certain feature of events observed in nature is first described with the help of a classificatory concept; later a comparative concept is used instead of or in addition to the classificatory concept; and still later,

a quantitative concept is introduced. (These three stages of development do, of course, not always occur in this temporal order.)" It seems that the development of the explication of the concepts location, scale, skewness and kurtosis has not taken place in that temporal order.

Bickel & Lehmann (1975, 1976) answered the question "How should one make precise measures of vague features of the shape of a population?" in the case of location, dispersion and spread, and van Zwet (1964) introduced the orderings for skewness and kurtosis. Our aim is to bring these works together and make them more complete. This is done with the help of the notions of convexity of order k and the shift function $\Delta(x) = G^{-1}F(x) - x$ introduced in Section 2.

In Section 3 we study location and see that $\Delta$ is convex of order 0 (i.e. positive) if and only if G is stochastically larger than F. The statement that a distribution G is more spread out than a distribution F (in the sense of Bickel & Lehmann (1976)) coincides with the statement that $\Delta$ is convex of order 1 (i.e. increasing), as is seen in Section 4. van Zwet's (1964) convex ordering $F \le_c G$ means that $\Delta$ is convex of order 2 (i.e. convex in the usual sense) and this notion is discussed in Section 5 under the headline "Skewness". In Section 6 we discuss kurtosis.

Measures of all the properties are also introduced, and among these there are the classical ones. This is, of course, one of the most important criteria for our work, since the classical measures are generally accepted as measures of the corresponding properties and they have, to some extent, influenced the ideas people usually have. Some new partial orderings and measures are also proposed. Some applications will be discussed in forthcoming papers.

2. Some preliminaries

We say that the cumulative distribution function (c.d.f.) F is strictly increasing if it is strictly increasing in $S_F = \{x: 0 < F(x) < 1\}$. In this paper we study only absolutely continuous and strictly increasing c.d.f.'s. By a model we then mean a collection of such distributions.

First we introduce some notations:

Notation 2.1. Let f be a real valued function defined in $I \subset R$ and let

$$S(f) = S(f(\cdot)) = \sup S[f(x_1), f(x_2), \ldots, f(x_n)]$$

where the supremum is extended over all sets $x_1 < x_2 < \ldots < x_n$ ($x_i \in I$), n is arbitrary but finite, and $S(y_1, y_2, \ldots, y_n)$ is the number of sign changes of the indicated sequence, zero terms being discarded (Karlin, 1968, p. 20).

We say that the functions f and g cross each other k times if $S(f - g) = k$, k = 0, 1, 2, ... They cross each other at most k times if $S(f - g) \le k$. If f has an integral function F, then $S(F) \le S(f) + 1$ (Karlin, 1968, pp. 310-311). In this article we are interested in the crossings of distribution functions. Karlin & Novikoff (1963), Marshall & Proschan (1970) and Barlow & Proschan (1975), for example, have earlier characterized distributions in this way.

We now define the important concept of convexity of order k, k = 0, 1, ... (Karlin, 1968, p. 23; Karlin & Ziegler, 1976):

Definition 2.1. We say that the function $f: I \to R$ is convex (concave) of order 0 if

$$f(x) \ge (\le)\ 0 \quad\text{for all } x \in I,$$

convex (concave) of order 1 if

$$\begin{vmatrix} 1 & 1 \\ f(x_1) & f(x_2) \end{vmatrix} \ge (\le)\ 0 \quad\text{for all } x_1 < x_2,\ x_1, x_2 \in I,$$

and convex (concave) of order k, k = 2, 3, ..., if

$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_{k+1} \\ \vdots & \vdots & & \vdots \\ x_1^{k-1} & x_2^{k-1} & \cdots & x_{k+1}^{k-1} \\ f(x_1) & f(x_2) & \cdots & f(x_{k+1}) \end{vmatrix} \ge (\le)\ 0 \quad\text{for all } x_1 < x_2 < \cdots < x_{k+1},\ x_1, \ldots, x_{k+1} \in I. \tag{2.1}$$

If the kth derivative of f exists, then f is convex (concave) of order k if and only if $f^{(k)}(x) \ge (\le)\ 0$. Another characterization of this concept is as follows: f is convex of order k or concave of order k if and only if $S(f(\cdot) - P_{k-1}(\cdot)) \le k$ for every polynomial $P_{k-1}$ of order k - 1. Clearly, f is convex of order 1 means that f is increasing, and second-order convexity coincides with the usual assertion that f is convex (Karlin, 1968, pp. 280-283).

We define some functions which are often needed in this paper:

Notation 2.2. Let F and G be c.d.f.'s. We then write

$$R(x) = R(x; F, G) = G^{-1}F(x), \quad x \in S_F,$$
$$\Delta(x) = R(x) - x, \quad x \in S_F,$$
$$\Delta^{*}(x) = R^{-1}(x) - x = F^{-1}G(x) - x, \quad x \in S_G, \text{ and}$$
$$r(x) = r(x; F, G) = \frac{f(x)}{g(R(x))}, \quad x \in S_F. \tag{2.2}$$
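The quantities of Notations 2.1 and 2.2 and the finite-order convexity of Definition 2.1 can be explored numerically. The following is a minimal sketch, not part of the paper: the function names, the grid and the example pair of distributions (F standard exponential, G Weibull with shape 1/2, so that $R(x) = x^2$) are illustrative assumptions. It uses the fact that the determinant in (2.1) equals the k-th divided difference of f times a positive Vandermonde factor, and it checks only consecutive grid points, so a positive result is numerical evidence rather than a proof.

```python
import math


def sign_changes(values, tol=1e-12):
    """Number of sign changes S(y_1, ..., y_n), with (near-)zero terms discarded."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def shift_function(F, G_inv):
    """Return (R, Delta) with R(x) = G^{-1}(F(x)) and Delta(x) = R(x) - x."""
    def R(x):
        return G_inv(F(x))

    def delta(x):
        return R(x) - x

    return R, delta


def divided_difference(f, xs):
    """Divided difference of f over the points xs (order len(xs) - 1)."""
    if len(xs) == 1:
        return f(xs[0])
    return (divided_difference(f, xs[1:]) - divided_difference(f, xs[:-1])) / (xs[-1] - xs[0])


def looks_convex_of_order(f, k, grid, tol=1e-9):
    """Check the sign condition of Definition 2.1 on consecutive (k+1)-tuples of the grid."""
    return all(
        divided_difference(f, grid[i:i + k + 1]) >= -tol
        for i in range(len(grid) - k)
    )


# Illustrative assumption: F is Exp(1) and G is Weibull with shape 1/2,
# so that G^{-1}(u) = (-log(1 - u))**2 and R(x) = x**2.
def F(x):
    return 1.0 - math.exp(-x)


def G_inv(u):
    return (-math.log(1.0 - u)) ** 2


R, delta = shift_function(F, G_inv)
grid = [0.1 * i for i in range(1, 31)]

for k in (0, 1, 2):
    print(k, looks_convex_of_order(delta, k, grid))
print("sign changes of Delta on the grid:", sign_changes([delta(x) for x in grid]))
```

In this example the shift function is $\Delta(x) = x^2 - x$, which is neither non-negative nor increasing on the grid but has non-negative second divided differences, so only the order-2 check succeeds.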

Among others, van Zwet (1964) and Barlow & Proschan (1975) have studied the function R. The quantile probability plot (Q-Q-plot) introduced by Wilk & Gnanadesikan (1968) for two theoretical c.d.f.'s is just a graphical presentation of R(x). If X is distributed according to F (write $X \sim F$), then $Y = R(X) \sim G$. If G is the exponential distribution with $\lambda = 1$ then R is the hazard function, $R(x) = -\log(1 - F(x)) = -\log \bar{F}(x)$, and r is the failure rate function of F, that is, $r(x) = R'(x) = f(x)/\bar{F}(x)$. The alternative representation of r(x),

$$r(x) = \frac{f(x)}{1 - F(x)} \cdot \frac{1 - G(R(x))}{g(R(x))} = \frac{f(x)/\bar{F}(x)}{g(R(x))/\bar{G}(R(x))} \tag{2.3}$$

shows that in general r is the quotient of the failure rate functions of F and G, evaluated at the same quantile ($F(\cdot) = G(R(\cdot))$). It is also remarkable that

$$r(F^{-1}(u)) = \frac{fF^{-1}(u)}{gG^{-1}(u)}, \quad u \in (0, 1). \tag{2.4}$$

Parzen (1979) discussed "the density-quantile function" $fF^{-1}$, stating, e.g., that

$$\mathrm{Entr}(F) = -\int_0^1 \log fF^{-1}(u)\, du. \tag{2.5}$$

Tarter & Kowalski (1972) used the function $\varphi\Phi^{-1}F(x)/f(x)$ to test normality. Clearly this function coincides with $(r(x))^{-1}$ if $G = \Phi$.

Doksum & Sievers (1976) used the response function $\Delta(x) = R(x) - x$ in graphical comparisons of two populations. Doksum (1974) called $\Delta$ also a shift function since X, when shifted by $\Delta(X)$, has the same distribution as Y, i.e., $X + \Delta(X) \sim G$.

Now we are ready to define useful relations some of which are studied in Sections 3-6:

Definition 2.2. Let X be a random variable with the c.d.f. F and let Y be a random variable with the c.d.f. G. Then we write $F \le_k G$ or, equivalently, $X \le_k Y$ if $\Delta$ is convex of order k, k = 0, 1, ...

3. Location

Location has been discussed thoroughly in many articles (see Bickel, 1976; Bickel & Lehmann, 1975; Huber, 1972; etc.) and there is nothing new concerning it in this paper. Our use of language may, however, differ from that of the earlier writers. We think, like Bickel & Lehmann (1975), that it is possible to speak about location in asymmetrical models, too.

We first give some intuitive definitions regarding location:

Definition 3.1. We say that F and G are strongly location comparable if $\Delta$ or $\Delta^{*}$ is convex of order 0 (i.e., non-negative); we then write $F\, c_0\, G$. A model $\mathcal{F}$ is a location model if $F, G \in \mathcal{F} \Rightarrow F\, c_0\, G$. We say that G is to the right of F if $\Delta$ is convex of order 0, i.e., if $F \le_0 G$.

The relation $c_0$ is symmetrical and reflexive, but not transitive. It is clear that $F\, c_0\, G$ if and only if F and G do not cross each other anywhere. The model $\mathcal{F} = \{F(\cdot + a): a \in R\}$ is a location model, but not every location model is of this kind.

The shift function $\Delta$ is convex of order 0 ($\Delta(x) \ge 0$) if and only if $F(x) \ge G(x)$, $\forall x$. Therefore, the ordering $\le_0$ coincides with the well-known "stochastic ordering". The relation $\le_0$ is a partial ordering in every model $\mathcal{F}$, i.e., it is reflexive, transitive, and antisymmetrical. If $\mathcal{F}$ is a location model, then $\langle \mathcal{F}, \le_0 \rangle$ is an ordered set, since any two elements in $\mathcal{F}$ are $\le_0$-comparable.

If we study the crossings of densities we can define a stronger relation than $\le_0$ by requiring that f and g cross each other exactly once, the sign of f - g changing from + to -, where f and g are the respective densities of F and G. This relation is, however, not transitive.

If we want to find other orderings which seem suitable for considering location, we can study orderings of the type (1.1) and use as "test functions" the classes

$$C_1 = \{u: u \text{ increasing and continuous}\},$$
$$C_2 = \{\max(\cdot, t): t \in R\}$$

and

$$C_3 = \{\min(\cdot, t): t \in R\}.$$

It is well known (see, for example, Stoyan (1977)) that

$$F \le_0 G \iff F \le_{C_1} G \implies F \le_{C_2} G \ \text{and}\ F \le_{C_3} G. \tag{3.1}$$

The orderings $\le_{C_2}$ and $\le_{C_3}$ measure not only location, but also scale, as is seen in Section 4. Functions $u \in C_1$ can sometimes be interpreted as utility functions of monetary rewards.

Let $\mathcal{F}$ be the selected model, not necessarily a location model, and $\le$ the selected ordering of location ($\le_0$, $\le_{C_2}$, or $\le_{C_3}$). We shall now define when a function $T: \mathcal{F} \to R$ is a measure of location. (If $X \sim F$ then we denote $aX + b \sim a \times F + b$):

Definition 3.2. The function $T: \mathcal{F} \to R$ is a measure of location in $\mathcal{F}$ if

(L1) $T(a \times F + b) = aT(F) + b$, $a, b \in R$, $\forall F \in \mathcal{F}$,

and

(L2) $T(F) \le T(G)$ if $F, G \in \mathcal{F}$ and $F \le G$.

It is easy to see that if $T: \mathcal{F} \to R$ satisfies the conditions (L1) and (L2) and if F is symmetrical about $\mu$, then $T(F) = \mu$. Furthermore, if $\mathcal{F}$ is a symmetrical model, i.e., it includes symmetrical distributions only, and if $T_1: \mathcal{F} \to R$ and $T_2: \mathcal{F} \to R$ are two measures of location in $\mathcal{F}$, then $T_1(F) = T_2(F)$ for all $F \in \mathcal{F}$.

Remark 3.1. Bickel & Lehmann (1975) stated the conditions (L1) and (L2) in a different form. Doksum (1975) discussed, for all increasing and continuous distributions F, "the location interval" $[\underline{\theta}_F, \overline{\theta}_F]$ which contains all the values T(F) of the different measures of location (w.r.t. $\le_0$).

Suppose now that $\le_0$ is the selected ordering of location. It is then easy to see that among the classical measures of location the mean

$$\mu_1(F) = \int x\, dF(x) = \int_0^1 F^{-1}(u)\, du$$

satisfies the requirements (L1) and (L2) in models $\mathcal{F}$ such that $\mu_1(F)$ is finite for all $F \in \mathcal{F}$. Similarly, the median

$$\mu_2(F) = F^{-1}(\tfrac{1}{2}) = \inf\{x: F(x) \ge \tfrac{1}{2}\}$$

satisfies (L1) and (L2) if the model $\mathcal{F}$ is such that F is strictly increasing at $\mu_2(F)$ for all $F \in \mathcal{F}$. The mode is valid as a measure of location in any model which contains symmetrical and unimodal distributions only. An important large class of measures of location is the set of the symmetrically weighted quantile averages

$$\mu_K(F) = \int_0^1 F^{-1}(u)\, dK(u),$$

where K is any distribution function on (0, 1), symmetrical relative to the point $\tfrac{1}{2}$, and $F^{-1}(u) = \inf\{x: F(x) \ge u\}$, 0 < u < 1. In addition to these measures, a great deal of work has been done regarding the (R)- and (M)-location parameters (Hodges & Lehmann, 1963; Huber, 1964; etc.).

4. Scale

Scale has usually been discussed in symmetrical models only. We try to avoid this restriction. But when and how is the comparison of scale possible? This question can be answered in many ways, one of which is the following:

Definition 4.1. We say that F and G are strongly scale comparable if $\Delta$ or $\Delta^{*}$ is convex of order 1 (i.e., increasing); we then write $F\, c_1\, G$. A model $\mathcal{F}$ is a location-scale model if $F, G \in \mathcal{F} \Rightarrow F\, c_1\, G$. We say that F has scale not larger (or is not more spread out) than G if $\Delta$ is convex of order 1 (increasing), i.e., if $F \le_1 G$.

The relation $c_1$ is clearly reflexive and symmetrical, but not transitive. It can be easily seen that $F\, c_1\, G$ if and only if $S(\Delta(\cdot) - a) \le 1$ for all $a \in R$, i.e., if and only if $F(\cdot)$ and $G(\cdot + a)$ cross each other at most once for any $a \in R$. The model $\mathcal{F} = \{F(a \cdot + b): a, b \in R, a > 0\}$ is a location-scale model, but not every location-scale model is of that kind.

The ordering $\le_1$ is identical with the "spread"-ordering introduced by Bickel & Lehmann (1976). They stated for possibly asymmetrical F and G that G is more spread out than F if

$$G^{-1}(v) - G^{-1}(u) \ge F^{-1}(v) - F^{-1}(u) \quad\text{for all } 0 < u < v < 1, \tag{4.1}$$

i.e., if any two quantiles of G are at least as far apart as the corresponding quantiles of F. The inequalities in (4.1) hold if and only if $\Delta$ is increasing. Doksum (1969) used the same ordering to indicate heavy tails. We think, however, like Bickel & Lehmann, that tailweight should be specified by a scale-free ordering.

We now study some properties of $\le_1$. (We write $F \cong_k G$ if $F \le_k G$ and $G \le_k F$):

Theorem 4.1. The relation $\le_1$ is transitive and

$$F \cong_1 G \iff \exists a: F(\cdot) = G(\cdot + a).$$

Proof. Consider F, G and H such that $F \le_1 G$ and $G \le_1 H$. Then $\Delta_1(x) = R_1(x) - x = G^{-1}F(x) - x$ and $\Delta_2(x) = R_2(x) - x = H^{-1}G(x) - x$ are increasing, i.e.,

$$R_1(x_2) - R_1(x_1) \ge x_2 - x_1 \quad\text{for all } x_1 < x_2,$$

and

$$R_2(y_2) - R_2(y_1) \ge y_2 - y_1 \quad\text{for all } y_1 < y_2.$$

Set now $R_3 = R_2R_1 = H^{-1}F$. If $x_1 < x_2$, then

$$R_3(x_2) - R_3(x_1) = R_2R_1(x_2) - R_2R_1(x_1) \ge R_1(x_2) - R_1(x_1) \ge x_2 - x_1.$$

Thus $\Delta_3(x) = R_3(x) - x$ is increasing, and $F \le_1 H$.

Secondly, $\Delta(x) = G^{-1}F(x) - x$ is increasing if and only if $\Delta^{*}(x) = F^{-1}G(x) - x = -\Delta(F^{-1}G(x))$ is decreasing. Hence $F \cong_1 G \iff \Delta(x) = a$ for some a $\iff F(\cdot) = G(\cdot + a)$ for some a. □

In symmetrical models scale is usually understood as a measure of "average deviation" from the symmetry centre. Bickel & Lehmann (1976) defined in this spirit an ordering $\le_{disp}$ by

$$F \le_{disp} G \quad\text{if}\quad F^{*} \le_0 G^{*}, \tag{4.2}$$

("G is more dispersed than F"), where $F^{*}(x) = F(x + \mu_F) - F(-x + \mu_F)$, $x \ge 0$; $G^{*}(x) = G(x + \mu_G) - G(-x + \mu_G)$, $x \ge 0$, and $\mu_F$ and $\mu_G$ are the symmetry centres of F and G. They expressed (4.2) also in the equivalent form

$$G^{-1}(v) - G^{-1}(\tfrac{1}{2}) \gtreqless F^{-1}(v) - F^{-1}(\tfrac{1}{2}) \quad\text{respectively for } v \gtreqless \tfrac{1}{2}. \tag{4.3}$$

Yet another form of this condition is obtained by setting v = F(x):

$$\Delta(x) \gtreqless \mu_G - \mu_F \quad\text{resp. for } x \gtreqless \mu_F. \tag{4.4}$$

Birnbaum (1948) discussed the "peakedness"-ordering of a similar form. He said that Y is more peaked about $y_1$ than Z about $z_1$ if $|Y - y_1| \le_0 |Z - z_1|$. Birnbaum did not separate scale from peakedness, as we do.

We now define two new orderings of scale, $\le_1^{*}$ and $\le_1^{**}$, which are weaker than $\le_1$ and maybe easier to use. In a location-scale model all three orderings, $\le_1$, $\le_1^{*}$, and $\le_1^{**}$, coincide. Also, $\le_1^{*}$ is strong enough to order the variances of any two distributions.

Definition 4.2. We say that

$$F \le_1^{**} G \quad\text{if}\quad \exists a, b: \Delta(x) \gtreqless b \ \text{resp. for } x \gtreqless a. \tag{4.5}$$

If F and G have finite first moments $\mu_F$ and $\mu_G$, we say (replacing b above by $\mu_G - \mu_F$) that

$$F \le_1^{*} G \quad\text{if}\quad \exists a: \Delta(x) \gtreqless \mu_G - \mu_F \ \text{resp. for } x \gtreqless a. \tag{4.6}$$

The ordering $\le_1^{**}$ can be used as follows: $F \le_1^{**} G$ or $G \le_1^{**} F$ if there is $b \in R$ such that $S(F(\cdot) - G(\cdot + b)) = 1$, i.e., if for some b the c.d.f.'s $F(\cdot)$ and $G(\cdot + b)$ cross each other exactly once. Furthermore, $F \le_1^{*} G$ or $G \le_1^{*} F$, if $F(\cdot + \mu_F)$ and $G(\cdot + \mu_G)$ cross each other exactly once. This is illustrated by the next figure, where $F \le_1^{*} G$ holds but $F \le_1 G$ does not. Yet another sufficient condition for $F \le_1^{*} G$ is that the densities $f(\cdot + \mu_F)$ and $g(\cdot + \mu_G)$ cross each other exactly twice and the last sign of $f(\cdot + \mu_F) - g(\cdot + \mu_G)$ is -.

Fig. 1. A situation in which $F \le_1^{*} G$, but not $F \le_1 G$.

In the next theorem we compare the given definitions supposing that the expectations are finite:

Theorem 4.2. The following implications hold, where the implications inside the brackets refer to symmetrical distributions only:

1° $F \le_1 G \Rightarrow (F \le_{disp} G \Rightarrow)\ F \le_1^{*} G \Rightarrow F \le_1^{**} G$.

2° If F and G are strongly scale comparable then $F \le_1 G \iff (F \le_{disp} G \iff)\ F \le_1^{*} G \iff F \le_1^{**} G$.

Proof. 1° follows easily from definitions. 2°. It is enough to prove that $F \le_1 G$ holds if F and G are strongly scale comparable and $F \le_1^{**} G$. If $F\, c_1\, G$, i.e., if $\Delta$ is monotone, and

$$\Delta(x) \gtreqless b \quad\text{resp. for } x \gtreqless a$$

for some a and b, then $\Delta$ must be increasing. Thus, $\Delta$ is convex of order 1, and $F \le_1 G$. □

Remark 4.1. It is clear that even if we do not assume that the expectations are finite, it is still true that $F \le_1 G \Rightarrow (F \le_{disp} G \Rightarrow)\ F \le_1^{**} G$. Moreover, if $F\, c_1\, G$ then $F \le_1 G \iff (F \le_{disp} G \iff)\ F \le_1^{**} G$.

Corollary 4.1. If $\mathcal{F}$ is a location-scale model, then $\le_1^{**}$ is transitive in $\mathcal{F}$ and any two distributions in $\mathcal{F}$ are $\le_1^{**}$-comparable. Moreover, if $F, G \in \mathcal{F}$, then

$$F \cong_1^{**} G \iff \exists a: F(\cdot) = G(\cdot + a).$$

Proof. The proof easily follows from Theorems 4.1 and 4.2. □

Further possible orderings of the type (1.1) follow from using the function classes

$$S_1 = \{u: u \text{ increasing and convex}\} \tag{4.7}$$

and

$$S_2 = \{u: u \text{ convex}\}. \tag{4.8}$$

According to Whitt (1980), the ordering $\le_{S_1}$ "combines monotonicity and variability", and the strictly stronger ordering $\le_{S_2}$ contains only "variability". It can be shown that

$$F \le_{C_2} G \iff F \le_{S_1} G. \tag{4.9}$$

For this and more references see Stoyan (1977). A function $u \in S_1$ can be interpreted as a risk taker's utility function in a game context.

Let now $\mathcal{F}$ be any selected model and $\le$ the selected ordering of scale. We define what we mean by a measure of scale:

Definition 4.3. The function $T: \mathcal{F} \to R$ is a measure of scale if

(SC1) $T(a \times F + b) = |a|\, T(F)$, $\forall a, b \in R$, $\forall F \in \mathcal{F}$,

and

(SC2) $T(F) \le T(G)$ if $F, G \in \mathcal{F}$, and $F \le G$.

It follows from the conditions (SC1) and (SC2) that $T(F) \ge 0$, $\forall F \in \mathcal{F}$, and $T(I_{[c,\infty)}) = 0$ for all constants c, if $\le$ is such that $I_{[c,\infty)} \le F$ for all $F \in \mathcal{F}$. If T is a measure of scale and a > 0, then so is the function $a \cdot T: F \to a \cdot T(F)$.

Remark 4.2. Bickel & Lehmann (1976) have similar definitions for "dispersion" (symmetrical models) and "spread" (asymmetrical models).

The next theorem answers the question about how one can find measures of scale when using $\le_1^{*}$:

Theorem 4.3. Let F and G be the c.d.f.'s with finite expectations $\mu_F$ and $\mu_G$. Then

$$F \le_1^{*} G \implies F(\cdot + \mu_F) \le_{S_1} G(\cdot + \mu_G) \implies F(\cdot + \mu_F) \le_{S_2} G(\cdot + \mu_G). \tag{4.10}$$

Proof. Suppose $X \sim F$, $Y \sim G$ and $F \le_1^{*} G$. Then there exists $x_0$ such that

$$F(x + \mu_F) \gtreqless G(x + \mu_G) \quad\text{resp. for } x \gtreqless x_0.$$

(See Fig. 1.) If $t \ge x_0$, then clearly $E\max(X - \mu_F, t) \le E\max(Y - \mu_G, t)$. If $t < x_0$, then $E\max(X - \mu_F, t) = t - E\min(X - \mu_F, t) \le t - E\min(Y - \mu_G, t) = E\max(Y - \mu_G, t)$. Therefore, $X - \mu_F \le_{C_2} Y - \mu_G$ or $F(\cdot + \mu_F) \le_{C_2} G(\cdot + \mu_G)$. From (4.9) it follows that $F(\cdot + \mu_F) \le_{S_1} G(\cdot + \mu_G)$. The second implication is proved in Stoyan (1972). □

Remark 4.3. If in Definition 4.2 we use some other measures of location (e.g. medians) instead of expectations $\mu_F$ and $\mu_G$, then Theorem 4.2 still holds but Theorem 4.3 is not true. We do not present measures of scale for these orderings in this paper.

Corollary 4.2. The standard deviation

$$\sigma_F = \left[\int (x - \mu_F)^2\, dF(x)\right]^{1/2} \tag{4.11}$$

is a measure of scale for $\le_1^{*}$ (and thus, by Theorem 4.2, also for $\le_1$ and $\le_{disp}$).

It is easy to find cases in which $F \le_1^{**} G$, but $\sigma_F > \sigma_G$. Let, for example, F be the c.d.f. of U(0, 6) and G the c.d.f. of N(0, 1). (See Fig. 2.)

Fig. 2. A situation in which $F \le_1^{**} G$, but $\sigma_F = \sqrt{3} > \sigma_G = 1$.

Yet other measures of scale for $\le_1$ can be found by using the next theorem:

Theorem 4.4. Suppose that $F \le_1 G$. If $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ are random samples from F and G, respectively, and

$$D_i = X_{(i)} - X_{(i-1)}, \quad i = 2, \ldots, n,$$

and

$$D_i' = Y_{(i)} - Y_{(i-1)}, \quad i = 2, \ldots, n,$$

where $X_{(i)}$ and $Y_{(i)}$ are the respective i'th order statistics, i = 1, ..., n, then

$$D_i \le_0 D_i', \quad i = 2, \ldots, n. \tag{4.12}$$

(For the notation see Definition 2.2.)

Proof. Suppose that $F \le_1 G$ and $X_1, X_2, \ldots, X_n$ is a random sample from F. Since $\Delta$ is increasing, $\Delta(X_{(i)}) \ge \Delta(X_{(i-1)})$, i = 2, ..., n, or

$$R(X_{(i)}) - R(X_{(i-1)}) \ge X_{(i)} - X_{(i-1)}, \quad i = 2, \ldots, n.$$

Therefore,

$$P(R(X_{(i)}) - R(X_{(i-1)}) > y) \ge P(X_{(i)} - X_{(i-1)} > y), \quad \forall y,\ i = 2, \ldots, n,$$

i.e., $X_{(i)} - X_{(i-1)} \le_0 R(X_{(i)}) - R(X_{(i-1)})$, i = 2, ..., n. The proof follows from the fact that $R(X_1), R(X_2), \ldots, R(X_n)$ are independent and distributed according to G. □

Corollary 4.3. If T is a measure of location and $\psi$ transforms the c.d.f. F of X to the c.d.f. of $|X_1 - X_2|$, where $X_1, X_2 \sim F$ are independent, then $T\psi$ is a measure of scale.

From Corollary 4.3 we get as special cases, e.g., the measures of scale

$$\sigma_1(F) = E(|X_1 - X_2|)$$

and

$$\sigma_F = \left[\tfrac{1}{2} E(X_1 - X_2)^2\right]^{1/2},$$

where $X_1$ and $X_2$ are independent and distributed according to F. For other measures of this kind see Bickel & Lehmann (1976).

Finally notice that the function

$$\sigma_2(F) = e^{\mathrm{Entr}(F)} \tag{4.13}$$

is a measure of scale for $\le_1$ because

$$\Delta(x) \text{ is increasing} \iff r(x) \ge 1 \iff fF^{-1}(u) \ge gG^{-1}(u) \iff \log fF^{-1}(u) \ge \log gG^{-1}(u) \implies \mathrm{Entr}(F) \le \mathrm{Entr}(G).$$

The function in (4.13) therefore measures simultaneously scale and deviation from normality. For this see, for example, Rao (1965, pp. 131-132) and Vasicek (1976).

5. Skewness

How may one distinguish skewness from other properties of distributions? It is clear that the distributions F and $a \times F + b$, a > 0, should have the same "degree of skewness". Therefore, we think that it is reasonable to study the properties location, scale and skewness together. We propose the following definition which is in line with the earlier Definitions 3.1 and 4.1:

Definition 5.1. We say that F and G are strongly skewness comparable if $\Delta$ or $\Delta^{*}$ is convex of order 2 (i.e., convex in the usual sense); we denote this by $F\, c_2\, G$. A model $\mathcal{F}$ is a location-scale-skewness model if $F, G \in \mathcal{F} \Rightarrow F\, c_2\, G$. We say that F is not more skew to the right than G if $\Delta$ is convex of order 2, i.e., if $F \le_2 G$.

The function $\Delta$ is convex (of order 2) if and only if R is convex (of order 2). On the other hand, R is convex if and only if $R^{-1}$ is concave. Therefore, $\Delta$ is convex if and only if $\Delta^{*}$ is concave. From this follows that $F\, c_2\, G \iff \Delta$ is convex or concave $\iff S(\Delta(\cdot) - a \cdot - b) \le 2$ for all $a, b \in R \iff F(\cdot)$ and $G(a \cdot + b)$ cross each other at most twice, $\forall a > 0, b$.

The relation $c_2$ is clearly symmetrical but not transitive. It is also reflexive, and in fact something more:

$F(\cdot)\, c_2\, F(a \cdot + b)$, for all a > 0, b and F, since by choosing $G(\cdot) = F(a \cdot + b)$ we get $R(x) = (x - b)/a$.

Let C be such a collection of increasing convex or increasing concave functions that

$$c_1, c_2 \in C \implies c_1^{-1}c_2 \text{ or } c_2^{-1}c_1 \text{ is increasing convex or increasing concave.}$$

Then the model

$$\mathcal{F} = \{Fc: c \in C\} \tag{5.1}$$

is a location-scale-skewness model. For example, if $C = \{(\lambda x)^{\alpha}: \lambda > 0, \alpha > 0\}$ and $F(x) = 1 - e^{-x}$, x > 0, then $\mathcal{F}$ in (5.1) is the Weibull distribution family

$$\{1 - e^{-(\lambda x)^{\alpha}}: \lambda > 0, \alpha > 0\} \tag{5.2}$$

where $\alpha$ is called the shape parameter and $\lambda$ is a scale parameter (Barlow & Proschan, 1975, p. 73). Another interesting example is the Box-Cox model (1964), where now C is a family of transformations indexed by $\lambda, \mu, \sigma > 0$ (its defining expression is not legible in the source) and $F = \Phi$.
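The Weibull example (5.2) lends itself to a quick numerical illustration. For two members with parameters $(\lambda_1, \alpha_1)$ and $(\lambda_2, \alpha_2)$ one gets $R(x) = (\lambda_1 x)^{\alpha_1/\alpha_2}/\lambda_2$, a power function that is convex or concave, so the two members are strongly skewness comparable. The sketch below is only an illustration under stated assumptions: the helper names, the parameter values and the grid are hypothetical choices, and the sign of second differences on an equally spaced grid is used as a numerical stand-in for convexity.

```python
import math


def weibull_cdf(x, lam, alpha):
    """C.d.f. 1 - exp(-(lam * x)**alpha) of the family (5.2), for x > 0."""
    return 1.0 - math.exp(-((lam * x) ** alpha))


def weibull_quantile(u, lam, alpha):
    """Inverse of weibull_cdf for 0 < u < 1."""
    return (-math.log(1.0 - u)) ** (1.0 / alpha) / lam


def transform(x, source, target):
    """R(x) = G^{-1}(F(x)), with F = source and G = target given as (lam, alpha) pairs."""
    return weibull_quantile(weibull_cdf(x, *source), *target)


def curvature(f, grid, tol=1e-9):
    """Classify f on an equally spaced grid by the sign of its second differences."""
    second = [f(c) - 2.0 * f(b) + f(a) for a, b, c in zip(grid, grid[1:], grid[2:])]
    non_negative = all(d >= -tol for d in second)
    non_positive = all(d <= tol for d in second)
    if non_negative and non_positive:
        return "linear"
    if non_negative:
        return "convex"
    if non_positive:
        return "concave"
    return "neither"


# Hypothetical parameter choices: a shape-2 Weibull and an exponential (shape 1).
shape_two = (1.0, 2.0)
exponential = (0.5, 1.0)
grid = [0.1 * i for i in range(1, 21)]

# Here R(x) = 2 * x**2, which is convex.
print(curvature(lambda x: transform(x, shape_two, exponential), grid))
# The inverse transformation is a square-root function, which is concave.
print(curvature(lambda x: transform(x, exponential, shape_two), grid))
```

With equal shape parameters the transformation reduces to a straight line through the origin, which is the pure scale change within the family.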

On the other hand, every location-scale-skewness model can be presented in the form (5.1) by choosing $F \in \mathcal{F}$ and setting $C = \{F^{-1}G: G \in \mathcal{F}\}$.

The ordering $\le_2$ coincides with "the convex ordering" proposed by van Zwet (1964) and was also used by Bickel & Doksum (1969). It is clear that if $\mathcal{F}$ is a location-scale-skewness model, then any two elements in $\mathcal{F}$ are $\le_2$-comparable. Some other properties of $\le_2$ are contained in the following:

Lemma 5.1. The distribution F is not more skew to the right than the distribution G ($F \le_2 G$) if and only if

$$\frac{R(x_4) - R(x_3)}{R(x_2) - R(x_1)} \ge \frac{x_4 - x_3}{x_2 - x_1} \quad\text{for all } x_1 \le x_3 < x_4 \text{ and } x_1 < x_2 \le x_4. \tag{5.3}$$

Proof. Notice that $F \le_2 G$ if and only if R is convex (of order 2). Let us suppose first that (5.3) holds. Then $(R(x_4) - R(x_2))(x_2 - x_1) \ge (R(x_2) - R(x_1))(x_4 - x_2)$ for all $x_1 < x_2 < x_4$ (set $x_3 = x_2$ in (5.3)), i.e., $R(x_1)(x_4 - x_2) - R(x_2)(x_4 - x_1) + R(x_4)(x_2 - x_1) \ge 0$ for all $x_1 < x_2 < x_4$. Thus

$$\begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_4 \\ R(x_1) & R(x_2) & R(x_4) \end{vmatrix} \ge 0 \quad\text{for all } x_1 < x_2 < x_4,$$

which means that R is convex.

Suppose now that R is convex. Then $R(x_1)(x_3 - x_2) - R(x_2)(x_3 - x_1) + R(x_3)(x_2 - x_1) \ge 0$ for all $x_1 < x_2 < x_3$, i.e., $R(x_3)(x_3 - x_1) - R(x_2)(x_3 - x_1) \ge R(x_3)(x_3 - x_2) - R(x_1)(x_3 - x_2)$ for all $x_1 < x_2 < x_3$. Therefore

$$\frac{R(x_3) - R(x_2)}{R(x_3) - R(x_1)} \ge \frac{x_3 - x_2}{x_3 - x_1} \quad\text{for all } x_1 < x_2 < x_3,$$

and

$$\frac{R(x_3) - R(x_2)}{R(x_2) - R(x_1)} \ge \frac{x_3 - x_2}{x_2 - x_1} \quad\text{for all } x_1 < x_2 < x_3.$$

If $x_1 < x_2 < x_3 < x_4$ then

$$\frac{R(x_4) - R(x_3)}{R(x_2) - R(x_1)} = \frac{R(x_4) - R(x_3)}{R(x_3) - R(x_2)} \cdot \frac{R(x_3) - R(x_2)}{R(x_2) - R(x_1)} \ge \frac{x_4 - x_3}{x_3 - x_2} \cdot \frac{x_3 - x_2}{x_2 - x_1} = \frac{x_4 - x_3}{x_2 - x_1}$$

and if $x_1 < x_3 < x_2 < x_4$ then

$$\frac{R(x_4) - R(x_3)}{R(x_2) - R(x_1)} = \frac{(R(x_4) - R(x_2)) + (R(x_2) - R(x_3))}{R(x_2) - R(x_1)} \ge \frac{(x_4 - x_2) + (x_2 - x_3)}{x_2 - x_1} = \frac{x_4 - x_3}{x_2 - x_1}.$$

If $x_1 = x_3$ or $x_2 = x_3$ or $x_2 = x_4$ we directly get the same inequalities. Thus (5.3) holds. □

Theorem 5.1. The relation $\le_2$ is transitive and

$$F \cong_2 G \iff \exists a > 0, b: F(\cdot) = G(a \cdot + b).$$

Proof. Consider F, G and H such that $F \le_2 G$ and $G \le_2 H$. Then $R_1 = G^{-1}F$ and $R_2 = H^{-1}G$ are convex, i.e.,

$$\frac{R_1(x_4) - R_1(x_3)}{R_1(x_2) - R_1(x_1)} \ge \frac{x_4 - x_3}{x_2 - x_1} \quad\text{for all } x_1 \le x_3 < x_4 \text{ and } x_1 < x_2 \le x_4,$$

and similarly for $R_2$. Set now $R_3 = R_2R_1 = H^{-1}F$. If $x_1 \le x_3 < x_4$ and $x_1 < x_2 \le x_4$, then

$$\frac{R_3(x_4) - R_3(x_3)}{R_3(x_2) - R_3(x_1)} = \frac{R_2(R_1(x_4)) - R_2(R_1(x_3))}{R_2(R_1(x_2)) - R_2(R_1(x_1))} \ge \frac{R_1(x_4) - R_1(x_3)}{R_1(x_2) - R_1(x_1)} \ge \frac{x_4 - x_3}{x_2 - x_1}.$$

Thus, $R_3$ is convex and $F \le_2 H$.

Secondly, R is convex if and only if $R^{-1}$ is concave. Hence $F \cong_2 G \iff R(x) = ax + b$ for some a > 0 and b $\iff F(\cdot) = G(a \cdot + b)$ for some a > 0 and b.

For an alternative proof see also van Zwet (1964, p. 49). □

In reliability theory one often studies "skewness" properties of distributions. The distribution F of a nonnegative random variable is said to be increasing failure rate (IFR) if $R = G^{-1}F$ is convex and decreasing failure rate (DFR) if $R = G^{-1}F$ is concave, where $G(x) = 1 - e^{-x}$. In other words, F is IFR if $F \le_2 G$ and DFR if $G \le_2 F$, where G is the exponential distribution. A property related to this is that F is said to have an increasing failure rate average (F is IFRA) if R(x)/x is increasing and a decreasing failure rate average (F is DFRA) if R(x)/x is decreasing. (For definitions and references, see Barlow & Proschan, 1975.) More generally, we could say that F is IFR with respect to G if R is convex (i.e., convex ordering) or F is IFRA with respect to G if R(x)/x is increasing ("star-shaped ordering" discussed, e.g., in Barlow & Proschan (1966), Doksum (1969) and Lawrence (1975)).

The $\le_2$-ordering is thus a natural extension of IFR-ordering for random variables which are not necessarily positive. A generalization for IFRA-ordering would be the $\le_{star}$-ordering defined as follows:

Definition 5.2. We say that

$$F \le_{star} G \quad\text{if}\quad \exists y: \frac{R(x) - R(y)}{x - y} \text{ is increasing in } S_F \setminus \{y\}. \tag{5.4}$$

Doksum (1975) also discussed skewness. He defined the "symmetry function" $\theta_F(x) = \tfrac{1}{2}[x - \bar{F}^{-1}F(x)]$ where $\bar{F}(x) = 1 - F(-x)$, i.e., $X \sim F \iff -X \sim \bar{F}$. The function $\theta_F$ applies to a comparison of the distributions F and $\bar{F}$, since $\theta_F$ is concave if $F \le_2 \bar{F}$ and convex if $\bar{F} \le_2 F$ ($\theta_F(x) = -\tfrac{1}{2}\Delta(x)$ if $G = \bar{F}$). Doksum defined skewness qualitatively as follows: F is skew to the right if $\theta_F$ attains its minimum value at the median of F, and F is strongly skew to the right if $\theta_F$ is U-shaped.

Analogously with scale, we now introduce two new orderings of skewness, $\le_2^{*}$ and $\le_2^{**}$. They turn out to be weaker than the convex ordering $\le_2$, and coincide with it in a location-scale-skewness model. The standardized third moment preserves the ordering determined by $\le_2^{*}$.

Definition 5.3. We say that

$$F \le_2^{**} G \quad\text{if}\quad \exists a, b, x_1 < x_2: \ \Delta(x) \lesseqgtr ax + b \quad\text{resp. for}\quad \begin{cases} x_1 \le x \le x_2 \\ x < x_1 \text{ or } x > x_2. \end{cases} \tag{5.5}$$

If F and G have finite expectations $\mu_F$ and $\mu_G$ and variances $\sigma_F^2$ and $\sigma_G^2$, we say (replacing above a by $(\sigma_G - \sigma_F)/\sigma_F$ and b by $\mu_G - (\sigma_G/\sigma_F)\mu_F$) that

$$F \le_2^{*} G \quad\text{if}\quad \exists x_1 \le \mu_F \le x_2: \ \Delta(x) \lesseqgtr \frac{\sigma_G - \sigma_F}{\sigma_F}\, x + \mu_G - \frac{\sigma_G}{\sigma_F}\, \mu_F \quad\text{resp. for}\quad \begin{cases} x_1 \le x \le x_2 \\ x < x_1 \text{ or } x > x_2. \end{cases} \tag{5.6}$$

The ordering $\le_2^{**}$ can now be used as follows: $F \le_2^{**} G$ or $G \le_2^{**} F$ if there are a > 0 and b such that $F(\cdot)$ and $G(a \cdot + b)$ cross each other exactly twice. The relations $F \le_2^{*} G$ or $G \le_2^{*} F$, on the other hand, hold true if the "standardized" distributions $F(\sigma_F \cdot + \mu_F)$ and $G(\sigma_G \cdot + \mu_G)$ cross each other exactly twice. (See Fig. 3.) A sufficient condition for $F \le_2^{*} G$ is also that the densities $\sigma_F f(\sigma_F \cdot + \mu_F)$ and $\sigma_G g(\sigma_G \cdot + \mu_G)$ cross each other exactly three times and the last sign of $\sigma_F f(\sigma_F \cdot + \mu_F) - \sigma_G g(\sigma_G \cdot + \mu_G)$ is -. It is also notable that in applying $\le_2^{*}$ we just compare $\Delta$ with the linear shift function $\Delta_L$ such that $E(X + \Delta(X)) = E(X + \Delta_L(X))$ and $\mathrm{Var}(X + \Delta(X)) = \mathrm{Var}(X + \Delta_L(X))$ where $X \sim F$.

Fig. 3. A case in which $F \le_2^{*} G$.

In the next theorem we state the relationships between the given definitions. We assume that the respective expectations and variances are finite.

Theorem 5.2.

1° $F \le_2 G \Rightarrow F \le_{star} G \Rightarrow F \le_2^{**} G$ and $F \le_2 G \Rightarrow F \le_2^{*} G \Rightarrow F \le_2^{**} G$.

2° If F and G are strongly skewness comparable then $F \le_2 G \iff F \le_{star} G \iff F \le_2^{*} G \iff F \le_2^{**} G$.

Proof. 1° From the definitions it follows directly that $F \le_2 G \Rightarrow F \le_2^{*} G \Rightarrow F \le_2^{**} G$.

Let us suppose that $F \le_2 G$, i.e., $\Delta$ and R are convex. Choose $y \in S_F$. If z > y, then

$$h_z(x) = R(y) + (x - y)\, \frac{R(z) - R(y)}{z - y}$$

is the straight line which contains the points (y, R(y)) and (z, R(z)). Now if y < x < z then $R(x) \le h_z(x)$, and thus the inequality

$$\frac{R(x) - R(y)}{x - y} \le \frac{R(z) - R(y)}{z - y}$$

holds true. If x < y < z, then $R(x) \ge h_z(x)$ and we get the same inequality since x - y is negative. Lastly, if x < z < y then

$$R(x) \ge R(y) + (x - y)\, \frac{R(y) - R(z)}{y - z}$$

and the required inequality follows again. Therefore, $F \le_{\mathrm{star}} G$.
Now suppose that

$$\frac{R(x) - R(y)}{x - y}$$

is an increasing function of $x$ for some $y$. Choose $z > y$. Then

$$R(x) \le R(y) + (x - y)\,\frac{R(z) - R(y)}{z - y} \ \text{ for } y \le x \le z, \qquad R(x) \ge R(y) + (x - y)\,\frac{R(z) - R(y)}{z - y} \ \text{ for } x < y \text{ or } x > z.$$

If we set

$$a = \frac{R(z) - R(y)}{z - y} - 1$$

and

$$b = R(y) - y\,\frac{R(z) - R(y)}{z - y}$$

we get the condition

$$\Delta(x) \le ax + b \ \text{ for } y \le x \le z, \qquad \Delta(x) \ge ax + b \ \text{ for } x < y \text{ or } x > z,$$

required for $F \le_2^{**} G$.
2° It is enough to prove that if $F \le_2^{**} G$ and $F \mathrel{c_2} G$ then $F \le_2 G$. Suppose $F \mathrel{c_2} G$ and $F \le_2^{**} G$. Then $\Delta$ is convex or concave. If $\Delta$ is convex, there is nothing to prove. If $\Delta$ is concave then it must be a straight line, and therefore $F \le_2 G$. □

Corollary 5.1. If $\mathcal{F}$ is a location-scale-skewness model then $\le_2^{**}$ is transitive in $\mathcal{F}$ and any two distributions in $\mathcal{F}$ are $\le_2^{**}$-comparable. If $F, G \in \mathcal{F}$, then

$$F \simeq_2^{**} G \Leftrightarrow \exists a > 0, b: F(\cdot) = G(a\cdot + b).$$

Proof. The proof follows from Theorems 5.1 and 5.2. □

Remark 5.1. It is clear that if we do not suppose that the expectations and the variances are finite, it is still true that $F \le_2 G \Rightarrow F \le_{\mathrm{star}} G \Rightarrow F \le_2^{**} G$, and if $F \mathrel{c_2} G$ then $F \le_2^{**} G \Rightarrow F \le_2 G$.

Assume now that $\mathcal{F}$ is the selected model and $\le$ the selected skewness ordering. We now state what we mean by saying that $T: \mathcal{F} \to R$ is a measure of skewness:

Definition 5.4. The function $T: \mathcal{F} \to R$ is a measure of skewness in $\mathcal{F}$ if

(SK1) $T(aF + b) = \operatorname{sign}(a)\,T(F)$, $\forall a, b$, $F \in \mathcal{F}$,

and

(SK2) $T(F) \le T(G)$ if $F, G \in \mathcal{F}$ and $F \le G$.

If $T$ is a measure of skewness and if $v: R \to R$ is an increasing odd function ($v(-a) = -v(a)$, $\forall a$), then the function $v \circ T: F \to v(T(F))$ is also a measure of skewness. It is therefore possible to find an infinite number of measures of skewness from a single such measure. From conditions (SK1) and (SK2) follows that $T(F) = 0$, if $F$ is symmetrical and $\le$ is reflexive.

Remark 5.2. As far as the author knows, there are no definitions of this kind in the literature, for only the properties of the already existing "measures of skewness" are studied.

The following theorem helps us to find measures of skewness for $\le_2^{*}$.

Theorem 5.3. Let $\mathcal{T} = \{t: t(x) = u(\max(x, 0)) - u(-\min(x, 0)),\ u \text{ is an increasing convex function on } (0, \infty)\}$. If $EX = EY = 0$ and $\operatorname{Var} X = \operatorname{Var} Y = 1$ then

$$X \le_2^{*} Y \Rightarrow X^2 \operatorname{sign}(X) \le_{\mathcal{T}} Y^2 \operatorname{sign}(Y). \tag{5.7}$$

Proof. Suppose that $X \sim F$, $Y \sim G$, $\mu_F = \mu_G = 0$, $\sigma_F = \sigma_G = 1$ and $X \le_2^{*} Y$. Then there exist $x_1 < 0 \le x_2$ such that

$$F(x) \le G(x) \ \text{ for } x_1 \le x \le x_2, \qquad F(x) \ge G(x) \ \text{ for } x < x_1 \text{ or } x > x_2.$$

(See Fig. 3.) If we denote $X_+ = \max(X, 0)$, $X_- = -\min(X, 0)$, $Y_+ = \max(Y, 0)$ and $Y_- = -\min(Y, 0)$, we get the equalities

$$EX_+ - EX_- = EY_+ - EY_- = 0 \tag{5.8}$$

and

$$EX_+^2 + EX_-^2 = EY_+^2 + EY_-^2 = 1. \tag{5.9}$$

We first show that $X_+^2 \le_{S_2} Y_+^2$ and $Y_-^2 \le_{S_2} X_-^2$. Suppose that $EX_+ < EY_+$. If $t > x_2$, we immediately see that

$$E\max(X_+, t) \le E\max(Y_+, t).$$

If $t < x_2$ then $E\max(X_+, t) = EX_+ + t - E\min(X_+, t) \le EX_+ + t - E\min(Y_+, t) < EY_+ + t - E\min(Y_+, t) = E\max(Y_+, t)$, since $\max(Z, t) + \min(Z, t) = Z + t$ for any random variable $Z$ and $\min(Y_+, t) \le_0 \min(X_+, t)$. Thus $E\max(X_+, t) \le E\max(Y_+, t)$ for all $t$, i.e., $X_+ \le_{S_2} Y_+$, and so also $X_+^2 \le_{S_2} Y_+^2$. Therefore $E(X_+^2) \le E(Y_+^2)$, and consequently by (5.9) $E(Y_-^2) \le E(X_-^2)$. Now we can apply the above reasoning for $Y_-^2$ and $X_-^2$ instead of $X_+$ and $Y_+$, and see in exactly the same way that

If we, on the otherhand,supposethat EX+> EY+, Thus (5.12) holds, since R(X1), R(X2), ..., R(X,,) is a
then EX_ > EY_. As earlier we then see that Y_ s s2X_ randomsamplefrom G. El
and y2 S X In particular E(Y) E(X-
< ). There-
fore, by (5.9) E(X4) < E(Yf) and consequently If Di and Di, i =2, ..., n, are as in Theorem 4.4
X+IS S2.Y and Fs 2G, then it can be easily shown that also
Thus, 4 <y2
S2 and Y2 SS2 X2 and if t Eg then
Et (X2 sign (X)) - Eu (X+)- Eu (X2) ai sign (Dj - bi, Di):) : ad1sign (D; - bUD;),
(5.13)
< Eu (Y+) -Eu (Y!)
whereai2> 0, bij> 0, 2 < i <j < n, and
= Et (Y2 sign (Y)). O
ai isO2E,
Remark5.3. If in Definition5.3 we use some other 0 '(5.14)
measures of location and scale in place of expecta-
tions jF and G and standarddeviationsaF and aG, where 0 <a2
a3,... ?a,. Results similar to (5.14)
then Theorems5.2 and 5.3 are not true. We do not are discussedin Barlow & Proschan(1966, Theorem
study these orderings. 3.12) and Bickel & Doksum (1969, Remark 2.3).
Many of the test statisticsfor testing exponentiality
Corollary 5.2. If X s~* Y then against IFR or DFR alternativesare of the form
(5.13) or (5.14) (see, for example, Gail & Gastwirth
IX-EX \2k+1 (Y-EY 2k+1 (1978), Lin & Mudholkar(1980), and Proschan &
(5.10)
E/Vr) <EV-r Pyke (1967)).Thus,if v is a measureof location then

k = 1, 2, ..., if the expectations exist. y1(F) = T( ai1 sign (D; - bjD)), (5.15)
i<1

Proof. The function u(x) =x(2k+l)/2, k =1, 2, ..., is whereaij> 0, bi2>0, 2 < i <j <n, and
increasing and convex on [0, oo). Thus t(x)=
u(max (x, 0)) - u ( - min (x, 0)) - x I(2k+1)/2 sign (x) EV7
On the otherhand, t(x2 sign (x)) = Ixl2k+l sign (x) = y(F) = l(2jD) (5.16)
1
.2k+

Remark5.4. van Zwet(1964,pp. 10-15) provedthe where 0 < a2 < a3 < ... < an, preserve the ordering de-
correspondingresult for his convex ordering < 2. He terminedby S 2. A simple special case of (5.16) is
also found (pp. 16-17) that the Pearsonmeasureof
skewness -
(X(3) X(a)) (5.17)
X3) - X(1)
EX-Md(X) (5.11)
l/Var X where X(l), X(2), X(3) is an ordered sample from F.
From Lemma5.1 follows also that if FS 2G then
does not preservethe ordering s 2. Therefore,it is
not the measureof skewnessfor S * or s ** either.
Othermeasuresof skewnessfor s 2 can be found
by using the next theorem. A(xl) AA(x x2) (X2 2)A(X)

Theorem 5.4. Let Di and D', i=2, ..., n, be as in for all x1<x
Theorem4.4. If Fs 2G then or

R(x1)+R(x2) - 2R(xi 2) > 0, Vxl <


x2-
__ s-o D 2 <i<j?sn (5.12)

Proof. Suppose that Fs 2G. Let X1, X2, ..., X,, be From this follows that the measures
a randomsamplefromF. Thenif i <j, by Lemma5.1
F-1(c) + F'(1 - a) - 2F 1(?)
_ lJR(X,i))- R(X,i_l ) o E(0 51
\ X,i - X,i_ 1\ .
(5.18)

are nonnegative when $F$ is skew to the right (i.e., $G \le_2 F$ for some symmetrical $G$), and nonpositive when $F$ is skew to the left (i.e., $F \le_2 G$ for some symmetrical $G$). Hinkley (1975) used the symmetry condition $\gamma_\alpha(F) = 0$ to obtain an estimate for $\lambda$ in the Box-Cox model.

6. Kurtosis

There does not seem to be universal agreement about the meaning and interpretation of the term kurtosis. In symmetrical models one often means by kurtosis the 'sharpness' or 'peakedness' of the density function about the symmetry point. This interpretation has been criticized by many authors. Kaplansky (1945), for example, discussed four examples and showed that in the case of symmetrical and standardized distributions the fourth moment does not preserve the ordering determined by peakedness. Finucan (1964) interprets the standardized fourth moment $\beta_2$ as an indicator of "a prominent peak and a prominent tail". Ali (1974) states that $\beta_2$ measures only tailedness. Darlington (1970) writes that "kurtosis" is best described not as a measure of peakedness versus flatness but as a measure of unimodality versus bimodality. Chissom (1970) points out that the tails of the distribution affect drastically the kurtosis value. The common characteristic in these studies is that kurtosis is seen only through $\beta_2$.

We now approach the problem from a different viewpoint. Through this section we study only symmetrical distributions. (Kurtosis in asymmetrical models is briefly discussed in Section 7.) We offer the following definition where $F^*$ and $G^*$ are as in (4.2). The ordering $\le_s$ was first defined by van Zwet (1964).

Definition 6.1. We say that $F$ and $G$ are strongly kurtosis comparable if $F^* \mathrel{c_2} G^*$; we denote this by $F \mathrel{c_s} G$. A model $\mathcal{F}$ is a location-scale-kurtosis model if $F, G \in \mathcal{F} \Rightarrow F \mathrel{c_s} G$. We say that $F$ does not have more kurtosis than $G$ if $F^* \le_2 G^*$; we then write $F \le_s G$.

The functions $\Delta$ and $R$ are concave-convex or convex-concave about the symmetry point of $F$ if and only if $F \mathrel{c_s} G$. Yet another characterization of $F \mathrel{c_s} G$ is that $F(\cdot)$ and $G(a\cdot + b)$ cross each other at most three times, $\forall a > 0, b$. The relation $c_s$ is symmetrical but not transitive. Moreover, $F(\cdot) \mathrel{c_s} F(a\cdot + b)$, for all $a > 0$, $b$ and $F$. These results follow directly from the respective properties of $c_2$.

Location-scale-kurtosis models can be found easily with the help of location-scale-skewness models. For example, if $\mathcal{F}_0$ is the Weibull distribution family then the model

$$\mathcal{F} = \{F: F^* \in \mathcal{F}_0\} = \left\{ \tfrac{1}{2} e^{-(\lambda(\mu - \cdot))^{\alpha}} I_{(-\infty, \mu)}(\cdot) + \left(1 - \tfrac{1}{2} e^{-(\lambda(\cdot - \mu))^{\alpha}}\right) I_{[\mu, \infty)}(\cdot) : \lambda > 0, \alpha > 0, \mu \right\} \tag{6.1}$$

is the respective location-scale-kurtosis model, the double Weibull distribution family.

If $\mathcal{F}$ is a location-scale-kurtosis model, then any two distributions in $\mathcal{F}$ are $\le_s$-comparable. The next theorem follows directly from the definitions.

Theorem 6.1. The relation $\le_s$ is transitive and

$$F \simeq_s G \Leftrightarrow \exists a > 0, b: F(\cdot) = G(a\cdot + b).$$

Proof. See van Zwet (1964, pp. 64-66). □

Lawrence (1975) studied a weaker ordering than "the s-ordering" $\le_s$. He called it the r-ordering and defined it as follows: $F \le_r G$ if $R(x)/x$ is increasing (decreasing) for $x > 0$ ($x < 0$), and $F(0) = G(0) = \tfrac{1}{2}$. We generalize this definition allowing $F$ and $G$ to be symmetrical about $\mu_F$ and $\mu_G$, respectively:

Definition 6.2. We write $F \le_r G$ if

$$\frac{R(x) - \mu_G}{x - \mu_F} \ \text{ is increasing (decreasing) for } x > \mu_F \ (x < \mu_F). \tag{6.2}$$

We propose two more orderings for kurtosis, denoted by $\le_s^{*}$ and $\le_s^{**}$. They are weaker than $\le_s$ and $\le_r$, and coincide with them in location-scale-kurtosis models. Furthermore, $F \le_s^{*} G$ implies that the corresponding standardized fourth moments satisfy $\beta_2(F) \le \beta_2(G)$.

Definition 6.3. We write $F \le_s^{**} G$ if $\exists a, b, x_1 < x_2 < x_3$:

$$\Delta(x) \le ax + b \ \text{ for } x < x_1 \text{ or } x_2 \le x < x_3, \qquad \Delta(x) \ge ax + b \ \text{ for } x_1 \le x < x_2 \text{ or } x > x_3. \tag{6.3}$$

If $F$ and $G$ have finite expectations $\mu_F$ and $\mu_G$ and variances $\sigma_F^2$ and $\sigma_G^2$, we say (replacing above $a$ by $(\sigma_G - \sigma_F)/\sigma_F$ and $b$ by $\mu_G - (\sigma_G/\sigma_F)\mu_F$) that $F \le_s^{*} G$ if $\exists x_1 \le \mu_F \le x_2$:

$$\Delta(x) \le \frac{\sigma_G - \sigma_F}{\sigma_F}\,x + \mu_G - \frac{\sigma_G}{\sigma_F}\,\mu_F \ \text{ for } x < x_1 \text{ or } \mu_F \le x < x_2, \qquad \Delta(x) \ge \frac{\sigma_G - \sigma_F}{\sigma_F}\,x + \mu_G - \frac{\sigma_G}{\sigma_F}\,\mu_F \ \text{ for } x_1 \le x < \mu_F \text{ or } x > x_2. \tag{6.4}$$


We comment briefly on two related notions introduced by Finucan (1964) and Karlin (1968). Finucan proved the following result. If there are two distributions with densities $f$ and $g$, each being symmetrical with mean zero and variance $v$, and if

$$f(x) < g(x) \ \text{ for } a < |x| < b$$

while

$$f(x) > g(x) \ \text{ for } |x| < a \text{ or } |x| > b$$

then the fourth moment $\mu_4$ is greater for the $f$ distribution than for the $g$ distribution. This ordering implies our ordering $\le_s^{*}$ but the converse is not necessarily true. Karlin (1968, p. 326) introduced a concept of peakedness of order $k$: a random variable $X$ is less peaked of order $k$ than another random variable $Y$ if

$$q(u) = P(|X| < u) - P(|Y| < u)$$

either changes sign $k$ times and is nonpositive for sufficiently large $u$, or changes sign less than $k$ times. If both $X$ and $Y$ are symmetrical about 0, the statement "$X$ is less peaked of order 0 than $Y$" is identical with $Y \le_{\mathrm{disp}} X$, and from the assertion "$X$ is less peaked of order 1 than $Y$" follows that either $X \le_{\mathrm{disp}} Y$ or $Y \le_{\mathrm{disp}} X$ or $X \le_s^{**} Y$.

The ordering $\le_s^{**}$ works as follows: $F \le_s^{**} G$ or $G \le_s^{**} F$ if there exist $a > 0$ and $b$ such that $F(\cdot)$ and $G(a\cdot + b)$ cross each other exactly three times. Similarly, $F \le_s^{*} G$ or $G \le_s^{*} F$ if $F(\sigma_F\cdot + \mu_F)$ and $G(\sigma_G\cdot + \mu_G)$ cross each other exactly three times. A typical case in which $F \le_s^{*} G$ is presented in Fig. 4.

Fig. 4. A situation in which $F \le_s^{*} G$.

In the next theorem, where we compare the given definitions, we assume that the respective expectations and variances are finite:

Theorem 6.2.
1° $F \le_s G \Rightarrow F \le_r G \Rightarrow F \le_s^{*} G \Rightarrow F \le_s^{**} G$.
2° If $F$ and $G$ are strongly kurtosis comparable then $F \le_s G \Leftrightarrow F \le_r G \Leftrightarrow F \le_s^{*} G \Leftrightarrow F \le_s^{**} G$.

Proof. 1° Suppose that $F \le_s G$, i.e., $R$ is concave-convex about $\mu_F$. Then the function $R(\cdot + \mu_F) - \mu_G$ is concave-convex about the origin and

$$h(x) = \frac{R(x + \mu_F) - \mu_G}{x}$$

is increasing (decreasing) for $x > 0$ ($x < 0$). Thus $h(x - \mu_F)$ is increasing (decreasing) for $x > \mu_F$ ($x < \mu_F$), i.e.,

$$\frac{R(x) - \mu_G}{x - \mu_F}$$

is increasing (decreasing) for $x > \mu_F$ ($x < \mu_F$). Then by Definition 6.2 $F \le_r G$.

Secondly, suppose that $F \le_r G$. Then $R(x) = h(x - \mu_F)(x - \mu_F) + \mu_G$ and the straight line $L(x) = a(x - \mu_F) + \mu_G$ cross each other at most three times for all $a$. Select $a = \sigma_G/\sigma_F$. If now

$$R(x) \le \frac{\sigma_G}{\sigma_F}(x - \mu_F) + \mu_G \quad \text{when } x \ge \mu_F$$

then

$$\left(\frac{\sigma_G}{\sigma_F}(x - \mu_F)\right)^2 \ge (R(x) - \mu_G)^2, \quad \forall x \ne \mu_F,$$

which implies that

$$\sigma_G^2 = E_F\left(\frac{\sigma_G}{\sigma_F}(X - \mu_F)\right)^2 > E_F(R(X) - \mu_G)^2 = \sigma_G^2.$$

Therefore, there must be $x_1 \le \mu_F \le x_2$ such that $(x_1 + x_2)/2 = \mu_F$ and

$$\Delta(x) \le \frac{\sigma_G - \sigma_F}{\sigma_F}\,x + \mu_G - \frac{\sigma_G}{\sigma_F}\,\mu_F \ \text{ for } x < x_1 \text{ or } \mu_F \le x < x_2, \qquad \Delta(x) \ge \frac{\sigma_G - \sigma_F}{\sigma_F}\,x + \mu_G - \frac{\sigma_G}{\sigma_F}\,\mu_F \ \text{ for } x_1 \le x < \mu_F \text{ or } x > x_2,$$

that is, $F \le_s^{*} G$. From Definition 6.3 follows directly that $F \le_s^{*} G \Rightarrow F \le_s^{**} G$.

2° It is sufficient to demonstrate that if $F \le_s^{**} G$ and $F \mathrel{c_s} G$ then $F \le_s G$. Suppose that $F \mathrel{c_s} G$ and $F \le_s^{**} G$. Then $\Delta$ is concave-convex or convex-concave about $\mu_F$. If $\Delta$ is concave-convex, there is nothing to prove. If $\Delta$ is convex-concave and, as is supposed,

$$\Delta(x) \le ax + b \ \text{ for } x < x_1 \text{ or } x_2 \le x < x_3, \qquad \Delta(x) \ge ax + b \ \text{ for } x_1 \le x < x_2 \text{ or } x > x_3$$

for some $a$, $b$ and $x_1 < x_2 < x_3$, then $\Delta$ must be a straight line. Therefore $F \le_s G$. □

Corollary 6.1. If $\mathcal{F}$ is a location-scale-kurtosis model then $\le_s^{**}$ is transitive in $\mathcal{F}$ and any two distributions in $\mathcal{F}$ are $\le_s^{**}$-comparable. If $F, G \in \mathcal{F}$ then

$$F \simeq_s^{**} G \Leftrightarrow \exists a > 0, b: F(x) = G(ax + b).$$

Proof. The proof follows from Theorems 6.1 and 6.2. □

Fig. 5. The transformation $R$ for which $F \le_s^{**} G$, but not $F \le_s G$.

In Fig. 5 we present an interesting situation. There the transformation $R$ moves the probability mass from the "shoulders" to the centre. This is just the property of unimodality-bimodality or "lack of shoulders" introduced by Darlington (1970). Therefore, the ordering $\le_s^{**}$ contains not only peakedness versus tailedness, or peakedness versus flatness, but also unimodality versus bimodality.

Assume now that $\mathcal{F}$ is the selected model which contains only symmetrical distributions and $\le$ is the selected kurtosis ordering. We introduce a definition for measuring kurtosis:

Definition 6.3. The function $T: \mathcal{F} \to R$ is a measure of kurtosis in $\mathcal{F}$ if

(K1) $T(aF + b) = T(F)$, $\forall a \ne 0, b$ and $\forall F \in \mathcal{F}$,

and

(K2) $T(F) \le T(G)$ if $F, G \in \mathcal{F}$ and $F \le G$.

Consequently, if $h: R \to R$ is an increasing function and $T$ is a measure of kurtosis, then $h \circ T$ is also a measure of kurtosis.

We now try to answer the question: How to find such measures of kurtosis for $\le_s^{*}$?

Theorem 6.2. If $X$ and $Y$ are symmetrical, and $EX = EY = 0$ and $\operatorname{Var} X = \operatorname{Var} Y = 1$, then

$$X \le_s^{*} Y \Rightarrow X^2 \le_1^{*} Y^2 \Rightarrow X^2 \le_{S_2} Y^2. \tag{6.5}$$

Proof. It follows directly from the definitions that $X \le_s^{*} Y \Rightarrow X^2 \le_1^{*} Y^2$. The proof follows now from Theorem 4.3. See also Finucan (1964). □

Corollary 6.2. If $X$ and $Y$ are symmetrical and $X \le_s^{*} Y$ then

$$E\left(\frac{X - EX}{\sqrt{\operatorname{Var} X}}\right)^{2k} \le E\left(\frac{Y - EY}{\sqrt{\operatorname{Var} Y}}\right)^{2k} \tag{6.6}$$

$k = 1, 2, \ldots$, if the expectations exist.

Remark 6.2. van Zwet (1964, pp. 20-21) proved the corresponding result for his s-ordering.

Other measures of kurtosis for $\le_s$ can be easily found as follows: If $\psi$ is a measure of skewness for $\le_2$ and $F^*(x) = F(x + \mu_F) - F(-x + \mu_F)$, $x > 0$, then

$$T(F) = \psi(F^*)$$

is the respective measure of kurtosis for $\le_s$.

7. Concluding remarks

As we saw in Section 4, in the case of scale, the "spread"-ordering $\le_1$ of Bickel & Lehmann (1976) implies the dispersion ordering $\le_{\mathrm{disp}}$ in symmetrical models. The analogous question for kurtosis is: Is there an ordering of kurtosis, which applies also in asymmetrical models and which implies $\le_s$ in symmetrical models?

At least in one case there is an obvious way to do this: For unimodal distributions, whose modes are located at the same quantile, kurtosis can be compared essentially as in symmetrical models by saying that $F \le_s G$ whenever $R$ is concave-convex about the mode of $F$. In general, the fact that $R$ is concave-convex does not necessarily correspond to our image of "$F$ does not have more kurtosis than $G$". This

kind of a transformation $R$ may, for example, not preserve unimodality.

The ordering $\le_3$ is an alternative ordering of kurtosis. This is supported by the following: By Definition 2.2, $F \le_3 G$ if $\Delta$ is convex of order 3. Thus $F \le_3 G$ if and only if $\Delta'$ is convex (in the usual sense). Furthermore, from $F \le_3 G$ follows that $F$ and $GP_2$ cross each other at most three times for every polynomial $P_2: S_F \to S_G$ of order 2 ($GP_2$ is not necessarily a distribution function). The ordering $\le_3$ also implies $\le_s$ in symmetrical models. It is, however, not always transitive.

An open question is also the interpretation of the orderings $\le_4, \le_5, \ldots$. If $\Delta$ is convex of order 5, for example, then a typical case is that $\Delta$ (and $R$) is concave-convex-concave-convex. This could be interpreted so that $G$ has more tendency to bimodality than $F$.

Acknowledgements

Thanks are due to referees for their comments. The suggestions and criticisms of Professor Elja Arjas have improved the manuscript considerably.

References

Ali, M. M. (1974). Stochastic ordering and kurtosis measure. J. Am. Statist. Assoc. 69, 543-545.
Barlow, R. E. & Proschan, F. (1966). Inequalities for linear combinations of order statistics from restricted families. Ann. Math. Statist. 37, 1574-1592.
Barlow, R. E. & Proschan, F. (1975). Statistical theory of reliability and life testing. Holt, Rinehart and Winston, New York.
Bickel, P. J. (1976). Another look at robustness: A review of reviews and some new developments. Scand. J. Statist. 3, 145-168.
Bickel, P. J. & Doksum, K. A. (1969). Tests for monotone failure rate based on normalized spacings. Ann. Math. Statist. 40, 1216-1235.
Bickel, P. J. & Lehmann, E. L. (1975). Descriptive statistics for nonparametric models. I. Introduction. II. Location. Ann. Statist. 3, 1038-1069.
Bickel, P. J. & Lehmann, E. L. (1976). Descriptive statistics for nonparametric models. III. Dispersion. Ann. Statist. 4, 1139-1158; IV. Spread. Manuscript.
Birnbaum, Z. W. (1948). On random variables with comparable peakedness. Ann. Math. Statist. 37, 1593-1601.
Box, G. E. P. & Cox, D. R. (1964). An analysis of transformations. J. R. Statist. Soc., Ser. B, 26, 211-257.
Carnap, R. (1962). Logical foundations of probability. The University of Chicago Press, Chicago.
Chissom, B. S. (1970). Interpretation of the kurtosis statistics. Amer. Statist. 24 (4), 19-23.
Darlington, R. B. (1970). Is kurtosis really peakedness? Amer. Statist. 24 (2), 19-20.
Doksum, K. A. (1969). Starshaped transformations and the power of rank tests. Ann. Math. Statist. 40, 1167-1176.
Doksum, K. A. (1974). Empirical probability plots and statistical inference for nonlinear models in the two-sample case. Ann. Statist. 2, 267-277.
Doksum, K. A. (1975). Measures of location and asymmetry. Scand. J. Statist. 2, 11-22.
Doksum, K. A. & Sievers, G. L. (1976). Plotting with confidence: Graphical comparisons of two populations. Biometrika 63, 421-434.
Finucan, H. M. (1964). A note on kurtosis. J. R. Statist. Soc., Ser. B, 5, 360-361.
Gail, M. H. & Gastwirth, J. L. (1978). A scale-free goodness-of-fit test for the exponential distribution based on the Gini statistic. J. R. Statist. Soc., Ser. B, 40, 350-357.
Hinkley, D. V. (1975). On power transformations to symmetry. Biometrika 62, 101-111.
Hodges, J. L. & Lehmann, E. L. (1963). Estimates of location based on rank tests. Ann. Math. Statist. 34, 598-611.
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35, 73-101.
Huber, P. J. (1972). Robust statistics: a review. Ann. Math. Statist. 43, 1041-1067.
Kaplansky, I. (1945). A common error concerning kurtosis. J. Amer. Statist. Ass. 40, 259.
Karlin, S. (1968). Total positivity. Stanford University Press, Stanford.
Karlin, S. & Novikoff, A. (1963). Generalized convex inequalities. Pacific J. Math. 13, 1251-1279.
Karlin, S. & Ziegler, Z. (1976). Some applications to inequalities of the method of generalized convexity. J. Anal. Math. 30, 281-303.
Lawrence, M. J. (1975). Inequalities of s-ordered distributions. Ann. Statist. 3, 413-428.
Lin, C. & Mudholkar, G. S. (1980). A test of exponentiality based on the bivariate F distribution. Technometrics 22, 79-82.
Marshall, A. W. & Proschan, F. (1970). Mean life of series and parallel systems. J. Appl. Prob. 7, 167-174.
Parzen, E. (1979). Nonparametric statistical data modeling. J. Am. Statist. Assoc. 74, 105-121.
Proschan, F. & Pyke, R. (1967). Tests for monotone failure rate. Proc. 5th Berkeley Symp., vol. 3 (ed. L. LeCam and J. Neyman). University of Berkeley Press, Berkeley.
Rao, C. R. (1965). Linear statistical inference and its applications. Wiley, New York.
Stoyan, D. (1972). Über einige Eigenschaften monotoner stochastischer Prozesse. Math. Nachr. 52, 21-34.
Stoyan, D. (1977). Qualitative Eigenschaften und Abschätzungen stochastischer Modelle. Akademie-Verlag, Berlin.
Tarter, M. E. & Kowalski, C. J. (1972). A new test for and class of transformations to normality. Technometrics 14, 735-744.
Vasicek, O. (1976). A test for normality based on sample entropy. J. R. Statist. Soc., Ser. B, 38, 54-59.
Whitt, W. (1980). The effect of variability in the GI/G/S queue. J. Appl. Prob. 17, 1062-1071.
Wilk, M. B. & Gnanadesikan, R. (1968). Probability plotting methods for the analysis of data. Biometrika 55, 1-17.
van Zwet (1964). Convex transformations of random variables. Math. Centrum, Amsterdam.

Hannu Oja
Department of Applied Mathematics and Statistics
The Faculty of Science
University of Oulu
SF-90570 Oulu 57
Finland
