
Maximum Likelihood-Like Estimators for

the Gamma Distribution

Zhi-Sheng YE and Nan CHEN

Department of Industrial & Systems Engineering

National University of Singapore, Singapore, 117576

Abstract

It is well-known that maximum likelihood (ML) estimators of the two parameters

in a Gamma distribution do not have closed forms. This poses difficulties in some ap-

plications such as real-time signal processing using low-grade processors. The Gamma

distribution is a special case of a generalized Gamma distribution. Surprisingly, two out

of the three likelihood equations of the generalized Gamma distribution can be used as

estimating equations for the Gamma distribution, based on which simple closed-form

estimators for the two Gamma parameters are available. Intuitively, the performance of the
ML-like estimators should be close to that of the ML estimators. This study confirms the
conjecture by establishing the asymptotic behaviour of the new estimators. In addi-

tion, the closed forms enable bias-corrections to these estimators. The bias-correction

significantly improves the small-sample performance.

Keywords: Estimating equations; bias-correction; generalized Gamma distribution; asymp-

totic efficiency.

1 Introduction

The Gamma distribution is a two-parameter distribution with probability density function (PDF)

f_{\mathrm{gam}}(x) = \frac{x^{k-1}}{\theta^{k}\,\Gamma(k)} \exp(-x/\theta), \qquad x > 0,    (1)

where k > 0 is the shape parameter and θ > 0 is the scale parameter. Due to the moderate

skewness, the Gamma distribution is a useful model in many areas of statistics when the

normal distribution is not appropriate. For example, it is often used to model frailty and

random effects. In queueing theory, the Gamma distribution is often used as a distribution

for waiting times and service times (Whitt 2000). It is widely used in environmetrics such

as environmental monitoring and rainfall size (Bhaumik and Gibbons 2006; Krishnamoorthy

and Tian 2008). The Gamma distribution is a useful model for lifetime data (Meeker and Escobar

1998, Chapter 5.2). It is also a popular model in signal processing (Vaseghi 2008), and

physical and biological sciences (e.g., Bhaumik et al. 2009).

The most popular method in the estimation of the two parameters in the Gamma dis-

tribution is the maximum likelihood (ML) method. Nevertheless, there are no closed-form

expressions for the ML estimators. This poses difficulties in real-time data/signal process-

ing using battery-constrained, memory- and CPU-deficient mobile hand-held devices (Song
2008). Although the moment estimators for the two Gamma parameters have closed forms,
they are not efficient under either small or large samples; see Figures 1,

2 and 3 below. In order to obtain simple yet efficient estimators for the Gamma parameters,

we need to think outside the box of the two conventional inference methods.

A model outside the box of the Gamma distribution is the generalized Gamma distribution. It is a useful extension of the Gamma distribution with PDF

f_{\mathrm{gg}}(x) = \frac{\alpha\, x^{\alpha k-1}}{\theta^{\alpha k}\,\Gamma(k)} \exp\{-(x/\theta)^{\alpha}\}, \qquad x > 0,    (2)

where α > 0 is a parameter and Γ(·) is the Gamma function. This distribution proposed by

Stacy (1962) is a flexible model that contains the Gamma and Weibull distributions as special cases and the lognormal distribution as a limiting case. Many studies have focused on parameter inference for the generalized

Gamma distribution. See Lawless (1980) and Song (2008), among others. Inference in this

distribution is generally hard. The Gamma distribution is a special case of the generalized

Gamma when α = 1. Surprisingly, two estimating equations for the Gamma distribution

can be obtained by first treating the Gamma-distributed data as if they are generalized

Gamma distributed and then obtaining the three likelihood equations based on the gener-

alized Gamma distribution. Estimators based on the two estimating equations have simple

closed forms. We show that both the small-sample performance and the asymptotic efficiency of the estimators are almost the same as those of the ML estimators.

In addition, the closed forms enable bias-corrections to these estimators, which significantly

improves the performance in terms of bias and mean squared errors (MSEs).

The paper is organized as follows. Section 2 derives the ML-like estimators for the Gamma

distribution by looking to the generalized Gamma distribution. Large-sample properties of the new estimators are investigated in Section 3. Section 4 studies bias-corrections to the new estimators and assesses their small-sample performance through simulation.

2 The New Estimators

Let X ∼ gam(k, θ) and let X_1, X_2, . . . , X_n be n i.i.d. copies of X, where k and θ are the parameters of interest and need estimation. Obviously, X ∼ gg(k, θ, α) with α = 1. For now, let us pretend that X follows the above generalized Gamma distribution with α unknown. Then

the log-likelihood function based on the observed X_1, X_2, . . . , X_n is

l_{\mathrm{gg}}(k, \theta, \alpha) = \ln\alpha - \alpha k\ln\theta - \ln\Gamma(k) + \frac{1}{n}\sum_{i=1}^{n}\left[(\alpha k - 1)\ln X_i - (X_i/\theta)^{\alpha}\right].

The likelihood equations are obtained by taking the partial derivatives of l_{\mathrm{gg}} with respect to k, θ and α, respectively:

0 = -\psi(k) - \alpha\ln\theta + \frac{\alpha}{n}\sum_{i=1}^{n}\ln X_i,    (3)

0 = -k + \frac{1}{n}\sum_{i=1}^{n}(X_i/\theta)^{\alpha},    (4)

0 = 1/\alpha + \frac{k}{n}\sum_{i=1}^{n}\ln(X_i/\theta) - \frac{1}{n}\sum_{i=1}^{n}(X_i/\theta)^{\alpha}\ln(X_i/\theta),    (5)

where \psi(x) = d\ln\Gamma(x)/dx is the digamma function. Solving the above system of equations

gives the ML estimators of (k, θ, α). In particular, from (4), we can express θ as a function of k and α:

\theta(k, \alpha) = \left(\frac{\sum X_i^{\alpha}}{nk}\right)^{1/\alpha}.

Substitute the above display into (5) to give

k(\alpha) = \frac{n\sum X_i^{\alpha}}{\alpha\left(n\sum X_i^{\alpha}\ln X_i - \sum\ln X_i\sum X_i^{\alpha}\right)}.

Now, return to the Gamma distribution. We already know that α = 1. Use this fact in the above two displays to obtain the ML-like estimators for k and θ as

\hat{k} = \frac{n\sum X_i}{n\sum X_i\ln X_i - \sum\ln X_i\sum X_i},    (6)

and

\hat{\theta} = \frac{1}{n^{2}}\left(n\sum X_i\ln X_i - \sum\ln X_i\sum X_i\right).    (7)

From the viewpoint of estimating equations, k̂ and θ̂ are obtained based on the two estimating equations (4) and (5), while the two estimating equations originate from the likelihood equations of the generalized Gamma distribution.

Another common parametrization of the Gamma distribution is to replace θ by a rate parameter β = 1/θ. Under this parametrization, we can go through the above procedure again to obtain an estimator for β as

\hat{\beta} = \frac{n^{2}}{n\sum X_i\ln X_i - \sum\ln X_i\sum X_i},    (8)

which is simply the inverse of θ̂. On the other hand, the estimator for k remains the same as (6).
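For readers who want to compute these closed-form estimators directly, the following is a minimal sketch in Python. The function name gamma_ml_like and the use of NumPy are choices made here for illustration and are not part of the paper.

```python
import numpy as np

def gamma_ml_like(x):
    """Closed-form ML-like estimators (k_hat, theta_hat, beta_hat) from i.i.d. Gamma data."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = x.sum()
    # common denominator: n * sum(X ln X) - sum(ln X) * sum(X)
    d = n * np.sum(x * np.log(x)) - np.log(x).sum() * s
    k_hat = n * s / d          # shape estimator, equation (6)
    theta_hat = d / n**2       # scale estimator, equation (7)
    beta_hat = n**2 / d        # rate estimator, equation (8)
    return k_hat, theta_hat, beta_hat

# example usage with simulated data (shape 2, scale 3)
x = np.random.default_rng(0).gamma(shape=2.0, scale=3.0, size=200)
print(gamma_ml_like(x))
```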

Since the two estimating equations for the Gamma parameters are essentially likelihood

equations of the generalized Gamma distribution, it is expected that the performance of the

proposed estimators should be similar to that of the ML estimators. In the next section, we show that the asymptotic efficiency of the proposed estimators is almost the same as that of the ML estimators.

3 Large Sample Properties

In this section, we first show that the new estimators are strongly consistent in Theorem

1. Then, the asymptotic normality is established and the asymptotic covariance matrix is

derived in Theorem 2.

Theorem 1  The estimators k̂, θ̂ and β̂ given in (6), (7) and (8) are strongly consistent estimators of k, θ and β, respectively.

Proof  Given the n i.i.d. copies of X ∼ gam(k, θ), let X̄, Ȳ, Z̄ be the empirical means of X, ln X, X ln X, respectively. The mean of X is kθ. Based on the moment generating function of ln X,

M_{\ln X}(z) = \frac{\Gamma(k+z)\,\theta^{z}}{\Gamma(k)},    (9)

the mean of ln X is ψ(k) + ln θ. To obtain E[X ln X], note that


E[X\ln X] = \int_{0}^{\infty}\frac{x^{k}\ln x}{\theta^{k}\Gamma(k)}\exp(-x/\theta)\,dx = \frac{\theta\,\Gamma(k+1)}{\Gamma(k)}\int_{0}^{\infty}\frac{x^{k}\ln x}{\theta^{k+1}\Gamma(k+1)}\exp(-x/\theta)\,dx.

The second integral is the mean of ln X under gam(k+1, θ), so the above formula implies

E[X\ln X] = k\theta\left[\psi(k+1) + \ln\theta\right].

According to the strong law of large numbers,

(\bar{X}, \bar{Y}, \bar{Z}) \to (k\theta,\; \psi(k)+\ln\theta,\; k\theta[\psi(k+1)+\ln\theta]) \quad \text{almost surely}.

Define two functions

g_1(x, y, z) = \frac{x}{z - xy}, \qquad g_2(x, y, z) = z - xy.

Both g_1 and g_2 are continuous at (x, y, z) = (kθ, ψ(k) + ln θ, kθ[ψ(k+1) + ln θ]). An application of the continuous-mapping theorem yields that

\hat{\theta} = g_2(\bar{X}, \bar{Y}, \bar{Z}) \to k\theta[\psi(k+1) - \psi(k)] \quad \text{almost surely}.

For the arguments on the right-hand side of the above display,

\psi(k+1) - \psi(k) = \frac{d}{dk}\left[\ln\Gamma(k+1) - \ln\Gamma(k)\right] = \frac{d}{dk}\ln\frac{\Gamma(k+1)}{\Gamma(k)} = \frac{d}{dk}\ln k = 1/k,

so we have θ̂ → θ almost surely. By the continuous-mapping theorem again, k̂ = g_1(X̄, Ȳ, Z̄) → k almost surely. Since β̂ = 1/θ̂, its strong consistency is immediate based on the continuous-mapping theorem.

Theorem 2  When n → ∞, the two estimators k̂ and θ̂ in (6) and (7) are asymptotically normally distributed as

\sqrt{n}\,(\hat{k} - k,\; \hat{\theta} - \theta)' \to_{d} N\!\left(\begin{pmatrix}0\\0\end{pmatrix},\; \begin{pmatrix} k^{2}[1 + k\psi_{1}(k+1)] & -k\theta[1 + k\psi_{1}(k+1)] \\ -k\theta[1 + k\psi_{1}(k+1)] & \theta^{2}[1 + k\psi_{1}(k)] \end{pmatrix}\right).    (10)

Proof  Continue with the proof of Theorem 1 and let X ∼ gam(k, θ). Then E[X] = kθ and E[X²] = kθ² + k²θ². Based on the moment generating function (9) of ln X, define two quantities:

v_k \equiv E[\ln X] = \psi(k) + \ln\theta,

u_k \equiv E[(\ln X)^{2}] = \psi_1(k) + \psi^{2}(k) + 2\psi(k)\ln\theta + \ln^{2}\theta,

where \psi_1(x) is the trigamma function, equal to d\psi(x)/dx. By making use of these two quantities, we obtain E[X\ln X] = k\theta v_{k+1}, E[(X\ln X)^{2}] = \theta^{2}k(k+1)u_{k+2}, E[X\ln^{2} X] = k\theta u_{k+1}, and E[X^{2}\ln X] = \theta^{2}k(k+1)v_{k+2}. Based on the above expectations, we can show

after tedious calculations that


\sqrt{n}\left[(\bar{X}, \bar{Y}, \bar{Z}) - (k\theta,\; v_k,\; k\theta v_{k+1})\right] \to_{d} N(\mathbf{0}_3, \Lambda),

where \mathbf{0}_3 is a zero vector with 3 elements, and

\Lambda = \begin{pmatrix}
k\theta^{2} & \theta & \theta^{2}k(1 + v_{k+1}) \\
\theta & \psi_{1}(k) & \theta[k\psi_{1}(k+1) + v_{k+1}] \\
\theta^{2}k(1 + v_{k+1}) & \theta[k\psi_{1}(k+1) + v_{k+1}] & \theta^{2}k[(k+1)u_{k+2} - k v_{k+1}^{2}]
\end{pmatrix}.

Because k̂ = g_1(X̄, Ȳ, Z̄) and θ̂ = g_2(X̄, Ȳ, Z̄), the matrix of partial derivatives of (g_1, g_2) with respect to the three arguments (x, y, z), evaluated at (x, y, z) = (kθ, ψ(k) + ln θ, kθ[ψ(k+1) + ln θ]), is

A = \begin{pmatrix}
\partial g_1/\partial x & \partial g_1/\partial y & \partial g_1/\partial z \\
\partial g_2/\partial x & \partial g_2/\partial y & \partial g_2/\partial z
\end{pmatrix}
= \begin{pmatrix}
k v_{k+1}/\theta & k^{2} & -k/\theta \\
-v_k & -k\theta & 1
\end{pmatrix}.


An application of the delta method yields that \sqrt{n}\,(\hat{k} - k,\; \hat{\theta} - \theta)' is asymptotically normally distributed with mean \mathbf{0}_2 and covariance matrix A\Lambda A'. After tedious simplifications, we can show that

A\Lambda A' = \begin{pmatrix}
k^{2}[1 + k\psi_{1}(k+1)] & -k\theta[1 + k\psi_{1}(k+1)] \\
-k\theta[1 + k\psi_{1}(k+1)] & \theta^{2}[1 + k\psi_{1}(k)]
\end{pmatrix}.

Therefore, the theorem follows.
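The "tedious simplifications" above can also be checked numerically. The following sketch evaluates Λ, A and the matrix in (10) at one arbitrary pair (k, θ) and verifies that AΛA′ agrees with (10); the specific test values and the use of SciPy's digamma/polygamma routines are illustrative choices, not part of the proof.

```python
import numpy as np
from scipy.special import digamma, polygamma

k, theta = 2.5, 1.7                                    # arbitrary test values
v = lambda a: digamma(a) + np.log(theta)               # v_a = E[ln X] under gam(a, theta)
u = lambda a: polygamma(1, a) + v(a) ** 2              # u_a = E[(ln X)^2] under gam(a, theta)

# covariance matrix Lambda of sqrt(n) * (Xbar, Ybar, Zbar)
Lam = np.array([
    [k * theta**2, theta, theta**2 * k * (1 + v(k + 1))],
    [theta, polygamma(1, k), theta * (k * polygamma(1, k + 1) + v(k + 1))],
    [theta**2 * k * (1 + v(k + 1)),
     theta * (k * polygamma(1, k + 1) + v(k + 1)),
     theta**2 * k * ((k + 1) * u(k + 2) - k * v(k + 1) ** 2)],
])

# Jacobian A of (g1, g2) at the limiting point
A = np.array([
    [k * v(k + 1) / theta, k**2, -k / theta],
    [-v(k), -k * theta, 1.0],
])

# closed-form covariance matrix stated in (10)
c = 1 + k * polygamma(1, k + 1)
Sigma = np.array([
    [k**2 * c, -k * theta * c],
    [-k * theta * c, theta**2 * (1 + k * polygamma(1, k))],
])

assert np.allclose(A @ Lam @ A.T, Sigma)               # delta-method product matches (10)
```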

We compare the asymptotic efficiency of the new estimators, the ML estimators and the

moment estimators. The ML estimators of k and θ have to be obtained by solving the likelihood equations numerically. The moment estimators of k and θ are

\hat{k}_m = \frac{\left(\sum X_i\right)^{2}}{n\sum X_i^{2} - \left(\sum X_i\right)^{2}}, \qquad \hat{\theta}_m = \frac{n\sum X_i^{2} - \left(\sum X_i\right)^{2}}{n\sum X_i}.

The asymptotic variance matrix, which is also the Cramér-Rao lower bound, for the ML estimators of (k, θ) is obtained by first deriving the Fisher information matrix and then inverting it, which gives

\frac{1}{k\psi_{1}(k) - 1}\begin{pmatrix} k & -\theta \\ -\theta & \theta^{2}\psi_{1}(k) \end{pmatrix}.    (11)

The asymptotic variance matrix for the moment estimators can be obtained through the

delta method. Figure 1 shows the asymptotic variances of the three different estimators for k and θ. Because the asymptotic variances of k̂ and θ̂/θ do not depend on θ, we fix θ = 1 and vary k over the interval [0.1, 3], as shown in Figure 1. The asymptotic variances of the moment

estimators are much higher than the others. On the other hand, the two variance curves

of the proposed estimators and the ML estimators are almost the same. Simulation in the

next section shows the same conclusion under small samples. Nevertheless, due to the simple

closed forms, the proposed estimators can be calibrated to yield smaller biases under small

samples, as shown in the next section.
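A short numerical sketch of this comparison, restricted to the new estimators and the Cramér-Rao bound (11) (the moment estimators are omitted here), can be written as follows; the grid of k values and SciPy's polygamma are again illustrative choices.

```python
import numpy as np
from scipy.special import polygamma

theta = 1.0
for k in np.linspace(0.1, 3.0, 7):
    psi1_k, psi1_k1 = polygamma(1, k), polygamma(1, k + 1)
    avar_k_new = k**2 * (1 + k * psi1_k1)            # (10): asymptotic variance of k_hat
    avar_t_new = theta**2 * (1 + k * psi1_k)         # (10): asymptotic variance of theta_hat
    crlb_k = k / (k * psi1_k - 1)                    # (11): Cramer-Rao bound for k
    crlb_t = theta**2 * psi1_k / (k * psi1_k - 1)    # (11): Cramer-Rao bound for theta
    print(f"k={k:4.2f}  k_hat: {avar_k_new:8.3f} vs CRLB {crlb_k:8.3f}   "
          f"theta_hat: {avar_t_new:6.3f} vs CRLB {crlb_t:6.3f}")
```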

Figure 1: Asymptotic variances of the new estimators, ML estimators and moment estimators under different values of k. The left panel is for k and the right panel is for θ.

4 Small Sample Properties

In this section, an unbiased estimator for the scale parameter θ is obtained by calibrating the ML-like estimator θ̂. Unbiased estimators for the rate and the shape parameters are not available. Nevertheless, we give a method to calibrate the corresponding ML-like estimators by comparing the exact covariance and the asymptotic covariance between each of them and θ̂. A Monte Carlo simulation is used to show the good performance of the calibrated estimators in terms of bias and MSEs.

4.1 Bias correction

Theorem 3  An unbiased estimator for the scale parameter θ is

\tilde{\theta} = \frac{n}{n-1}\hat{\theta} = \frac{1}{n(n-1)}\left(n\sum X_i\ln X_i - \sum\ln X_i\sum X_i\right),

while an unbiased estimator for 1/k is

\tilde{k}^{-1} = \frac{n}{n-1}\hat{k}^{-1} = \frac{n\sum X_i\ln X_i - \sum\ln X_i\sum X_i}{(n-1)\sum X_i}.

Proof  First, express θ̂ as

\hat{\theta} = \frac{1}{n^{2}}\left[(n-1)\sum_{i=1}^{n} X_i\ln X_i - \sum_{i\neq j} X_i\ln X_j\right].

Note that the X_i are i.i.d. gam(k, θ), and that X_i and ln X_j are independent when i ≠ j. According to the proof of Theorem 1, E[X ln X] = kθ[ψ(k+1) + ln θ], E[X] = kθ and E[ln X] = ψ(k) + ln θ. Direct calculation yields

E[\hat{\theta}] = \frac{1}{n^{2}}\left\{(n-1)nk\theta[\psi(k+1) + \ln\theta] - n(n-1)k\theta[\psi(k) + \ln\theta]\right\}.

Simplify the above display to give

E[\hat{\theta}] = \frac{n-1}{n}\theta.

Therefore, an unbiased estimator for θ is \tilde{\theta} = \frac{n}{n-1}\hat{\theta}.

On the other hand, note that k̂ in (6) can be expressed as

\hat{k} = \frac{n}{\,n\sum_{i}\frac{X_i}{\sum_{j} X_j}\ln\frac{X_i}{\sum_{j} X_j} - \sum_{i}\ln\frac{X_i}{\sum_{j} X_j}\,}.

This expression shows that k̂ depends on the data only through the scale-free ratios X_i/Σ_j X_j, so k̂ is independent of the scale parameter θ. Based on the results in Pitman (1937, Section 6), k̂ is independent of Σ_i X_i. Therefore,

E\!\left[\frac{n\sum X_i}{\hat{k}}\right] = E\!\left[n\sum X_i\right]E[\hat{k}^{-1}] = n^{2}k\theta\, E[\hat{k}^{-1}].

But based on (6), the left-hand side of the above display is equal to E\!\left[n\sum X_i\ln X_i - \sum\ln X_i\sum X_i\right], which is equal to n(n-1)\theta. Therefore, E[\hat{k}^{-1}] = \frac{n-1}{n}k^{-1}, and an unbiased estimator for k^{-1} is then \frac{n}{n-1}\hat{k}^{-1}.

Next, we will show that the estimator k̂ can be calibrated to yield a smaller bias. First note that, since k̂θ̂ = X̄,

\mathrm{cov}(\hat{k}, \hat{\theta}) = E[\hat{k}\hat{\theta}] - E[\hat{k}]E[\hat{\theta}] = k\theta - \frac{n-1}{n}\theta\, E[\hat{k}].

On the other hand, Theorem 2 suggests that the asymptotic covariance between k̂ and θ̂ is

\mathrm{Acov}(\hat{k}, \hat{\theta}) = -k\theta[1 + k\psi_1(k+1)]/n.

Equate the previous two displays to yield

E[\hat{k}] = \frac{nk + k[1 + k\psi_1(k+1)]}{n-1}.

If we expand \psi_1(\cdot) as a Laurent series (Abramowitz and Stegun 1972, Eqn. 6.4.12) and keep the first term only, the right-hand side can be approximated by \frac{n+2}{n-1}k. Therefore, a bias-corrected estimator for k can be

\tilde{k} = \frac{n-1}{n+2}\hat{k} = \frac{n(n-1)\sum X_i}{(n+2)\left[n\sum X_i\ln X_i - \sum\ln X_i\sum X_i\right]}.

Figure 2: Absolute values of the biases (thin lines) and the rMSEs (bold lines) of the new estimators, the calibrated estimators, the ML estimators and the moment estimators when the sample size is n = 20. The left panel is for k, the middle is for θ and the right for β.

Similarly, by looking into the covariance and the asymptotic covariance between β̂ and θ̂, a bias-corrected estimator for the rate parameter can be obtained as

\tilde{\beta} = \frac{n-1}{n+2}\hat{\beta} = \frac{n^{2}(n-1)}{(n+2)\left[n\sum X_i\ln X_i - \sum\ln X_i\sum X_i\right]}.
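In the same spirit as the earlier sketch, the bias-corrected estimators of this section can be coded directly; the function name is again an illustrative choice and not from the paper.

```python
import numpy as np

def gamma_ml_like_corrected(x):
    """Bias-corrected closed-form estimators (k_tilde, theta_tilde, beta_tilde)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = n * np.sum(x * np.log(x)) - np.log(x).sum() * x.sum()
    theta_tilde = d / (n * (n - 1))                    # unbiased for theta (Theorem 3)
    k_tilde = n * (n - 1) * x.sum() / ((n + 2) * d)    # bias-corrected shape estimator
    beta_tilde = n**2 * (n - 1) / ((n + 2) * d)        # bias-corrected rate estimator
    return k_tilde, theta_tilde, beta_tilde
```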

4.2 Simulation

A simulation is used to assess the performance of the proposed estimators and the effects of calibration. Because the variance of k̂ and the asymptotic variance of θ̂/θ are independent of θ, we set θ = 1 in the simulation and vary k from 0.2 to 5. We consider two sample sizes, n = 20 and n = 50. The results under different sample sizes give the same conclusion. Under each sample size, the absolute biases and root MSEs (rMSEs) of the different estimators of k, θ and β are obtained based on 100,000 simulation replications.
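A compact sketch of this Monte Carlo comparison for the shape parameter is given below; it uses far fewer replications than the 100,000 used for the figures, and scipy.stats.gamma.fit with floc=0 as the ML routine, both of which are choices made here for illustration.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(1)
k_true, n, reps = 2.0, 20, 2000                      # theta fixed at 1
est = {"new": [], "calibrated": [], "MLE": []}
for _ in range(reps):
    x = rng.gamma(k_true, 1.0, size=n)
    d = n * np.sum(x * np.log(x)) - np.log(x).sum() * x.sum()
    est["new"].append(n * x.sum() / d)                               # k_hat, equation (6)
    est["calibrated"].append(n * (n - 1) * x.sum() / ((n + 2) * d))  # bias-corrected k
    est["MLE"].append(gamma.fit(x, floc=0)[0])                       # ML estimate of the shape
for name, vals in est.items():
    vals = np.asarray(vals)
    print(f"{name:>10}: |bias| = {abs(vals.mean() - k_true):.3f}, "
          f"rMSE = {np.sqrt(np.mean((vals - k_true) ** 2)):.3f}")
```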

Figure 3: Absolute values of the biases (thin lines) and the rMSEs (bold lines) of the new estimators, the calibrated estimators, the ML estimators and the moment estimators when the sample size is n = 50. The left panel is for k, the middle is for θ and the right for β.

The results are shown in Figures 2 and 3. According to the results, the performance of the proposed estimators k̂ and θ̂, in terms of biases and rMSEs, is almost the same as that of the ML estimators. The bias calibration to k̂, θ̂ and β̂ significantly reduces

their biases and improves the performance of these estimators. On the other hand, the

moment estimators always have larger biases and rMSEs. It is interesting to observe that

the unbiased estimator θ̃ has a larger rMSE compared with θ̂. This is because the weight n/(n−1) used in the calibration of θ̂ is larger than 1. The calibration decreases the bias

but increases the variance. The increase in the variance overtakes the decrease in the bias,

leading to an increase in the rMSE.

References

Abramowitz, M. and Stegun, I. A. (1972), Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables, no. 55, Courier Dover Publications.

Bhaumik, D. K. and Gibbons, R. D. (2006), "One-sided approximate prediction intervals for at least p of m observations from a gamma population at each of r locations," Technometrics, 48(1), 112-119.

Bhaumik, D. K., Kapur, K., and Gibbons, R. D. (2009), "Testing parameters of a gamma distribution for small samples," Technometrics, 51(3), 326-334.

Krishnamoorthy, K. and Tian, L. (2008), "Inferences on the difference and ratio of the means of two inverse Gaussian distributions," Journal of Statistical Planning and Inference, 138(7), 2082-2089.

Lawless, J. F. (1980), "Inference in the generalized gamma and log gamma distributions," Technometrics, 22(3), 409-419.

Meeker, W. Q. and Escobar, L. A. (1998), Statistical Methods for Reliability Data, John Wiley & Sons.

Pitman, E. J. (1937), "The closest estimates of statistical parameters," in Mathematical Proceedings of the Cambridge Philosophical Society, Cambridge Univ Press, vol. 33, pp. 212-222.

Song, K.-S. (2008), "Globally convergent algorithms for estimating generalized gamma distributions in fast signal and image processing," IEEE Transactions on Image Processing, 17(8), 1233-1250.

Stacy, E. W. (1962), "A generalization of the gamma distribution," The Annals of Mathematical Statistics, 1187-1192.

Vaseghi, S. V. (2008), Advanced Digital Signal Processing and Noise Reduction, John Wiley & Sons.

Whitt, W. (2000), "The impact of a heavy-tailed service-time distribution upon the M/GI/s waiting-time distribution," Queueing Systems, 36(1-3), 71-87.
