STATISTICS IN MEDICINE
Statist. Med. 18, 2749-2761 (1999)
A WALD TEST COMPARING MEDICAL COSTS BASED ON LOG-NORMAL DISTRIBUTIONS WITH ZERO VALUED COSTS
WANZHU TU AND XIAO-HUA ZHOU*

Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202-2859,
U.S.A.
Regenstrief Institute for Health Care, 4th Floor, 1001 West Tenth Street, Indianapolis, IN 46202-2859, U.S.A.
SUMMARY
Medical cost data often exhibit strong skewness and sometimes contain large proportions of zero values.
Such characteristics prevent the analysis of variance (ANOVA) F-test and other frequently used standard
tests from providing the correct inferences when the comparison of means is of interest. One solution to the
problem is to introduce a parametric structure based on log-normal distributions with zero values and then
construct a likelihood ratio test. While such a likelihood ratio test possesses excellent type I error control
and power, its implementation requires a rather complicated iterative optimization program. In this paper,
we propose a Wald test with simple computation. We then conduct a Monte Carlo simulation to compare
the type I error rates and powers of the proposed Wald test with those of the likelihood ratio test. Our
simulation study indicates that although the likelihood ratio test slightly outperforms the Wald test, the
performance of the Wald test is also satisfactory, especially when the sample sizes are reasonably large.
Finally, we illustrate the use of the proposed Wald test by analysing a clinical study assessing the effects of
a computerized prospective drug utilization intervention on in-patient charges. Copyright 1999 John
Wiley & Sons, Ltd.
1. INTRODUCTION
In medical cost data analysis, it is often of interest to compare the mean costs of different
treatment or patient groups. One difficulty involved in such a comparison is that the cost data are
often skewed due to the relatively high costs inflicted by a few patients. In the presence of strong
skewness, the most frequently used tests, such as the ANOVA F-test, often fail to provide correct
inferences. In many cases, the skewed cost data can be modelled by log-normal distributions.
Besides the skewness, another complication that often accompanies the cost data is the inclusion
of zero values that prevents the use of the logarithmic transformation. Therefore, how to test the
mean equality of several independent populations containing both log-normal and zero observations
becomes a question of great practical importance.

* Correspondence to: Xiao-Hua Zhou, Division of Biostatistics, Department of Medicine, Indiana University School of
Medicine, Regenstrief Health Center, 4th Floor, 1001 West Tenth Street, Indianapolis, IN 46202-2859, U.S.A. E-mail:
azhou@iupui.edu
Contract/grant sponsor: AHCPR
Contract/grant numbers: R29HS08559, R03HS09543
Contract/grant sponsor: NIH
Contract/grant number: R01MH58875
CCC 0277-6715/99/202749-13$17.50
Received May 1998; Accepted December 1998
Copyright 1999 John Wiley & Sons, Ltd.

To address this question, Zhou and Tu examined the currently available methods, including the
ANOVA F-tests (on both the original scale and log scale), the Brown-Forsythe test, the Welch test,
and the non-parametric Kruskal-Wallis test. Unfortunately, none of these tests was found
adequate in providing correct inferences in the log-normal situation. Zhou and Tu then proposed
a likelihood ratio test for comparing the mean equality by introducing a parametric structure
modelling the log-normal distribution and the occurrences of zero observations. This likelihood
ratio test is shown to be superior to the existing methods in terms of type I error rate and power.
However, the implementation of the likelihood ratio test requires an iterative optimization
algorithm, which could potentially restrict the use of the procedure among medical researchers.
In this paper, we propose a Wald type test that is easily implemented on various computing
platforms. In Section 2, we describe an example that motivated the analysis. The example is
a randomized clinical trial on the effects of a computerized prospective drug utilization intervention
on health care charges. In Section 3, we focus on the formulation of the problem and the
development of the new test. In Section 4, we compare the performance of the new test with
those of the various existing tests through an extensive Monte Carlo simulation. In Section 5, we
analyse the previously introduced example as an illustration of the proposed analysis. Finally,
we conclude the paper with a brief discussion in Section 6.
Since log-normal data are abundant in medical research, the proposed procedure has the
potential of being used in a wide variety of applications other than cost analysis. For example,
the age of onset for most genetic diseases, the blood pressure measures of patients in a specific
age group, and the numbers of premature ventricular contractions (PVCs) during a one-minute
period from EKG readings of patients with heart diseases are all known to be log-normally
distributed.
2. AN EXAMPLE
Each year, a great portion of the U.S. health care resources is consumed by improperly prescribed
medications. One programme aimed at reducing such waste is called the prospective drug
utilization review (DUR). It is a process of reviewing physicians' prescriptions and comparing
them to the established clinical guidelines before dispensing. Tierney et al. have conducted a large
randomized clinical trial to study the effects of the prospective DUR programme on various
health care costs based on a computerized medical record system intervention. One of the goals of
the study was to test the effect of the intervention on the average inpatient charges among
different groups of congestive heart failure patients. The study includes three patient groups:
a control group and two intervention groups. The first intervention group includes patients
whose physicians receive the computer-generated protocol-based treatment recommendations.
The second intervention group includes patients whose physicians and pharmacists both receive
such recommendations.
One characteristic of these charge data is that they contain large proportions of zeros. For
example, the first intervention group consisted of 142 patients, and 108 of them were not
hospitalized during the study period. The second intervention group consisted of 113 patients,
and 85 of them were not hospitalized. Among the 119 patients in the control group, 98 patients
were not hospitalized. Since patients without hospitalizations have zero inpatient charges, which
are desirable outcomes, they should not be discarded.
Figure 1. Q-plots for the in-patient charge data in the example
Further examinations reveal the strong positive skewness of the inpatient charges. This is best
illustrated with the Q-plots (three top panels) given in Figure 1. Applying the log-transformation
on the positive charges, we see that the data are normalized (three bottom panels of Figure 1). The
normalization is also confirmed by the Shapiro-Wilk test with p-values of 0.6726, 0.1118 and
0.7196, which further justifies the use of the log-normal distributions.
As observed by Zhou and Tu, the standard methods were no longer appropriate in the
analysis of such data, due to the strong skewness and high percentages of zero values.
3. METHODOLOGY
3.1. Formulation of the problem
Assume that we have $K$ independent populations and let $Y_{1j}, \ldots, Y_{n_j j}$ be a random sample from
the $j$th population, where $j = 1, 2, \ldots, K$. Let $M_j = E(Y_{ij})$ be the $j$th population mean, where
$i = 1, 2, \ldots, n_j$. We are interested in testing

$$H_0: M_1 = M_2 = \cdots = M_K \tag{1}$$

against the alternative $H_1: M_j \neq M_{j'}$ for some $j \neq j'$. We assume that the probability of having
a zero response from the $j$th population is $\delta_j$, which may differ from population to population,
and that the non-zero observations follow a log-normal distribution. Denoting the number of
zero observations in the $j$th sample as $n_{j0}$ and that of the non-zero observations as $n_{j1}$, we
therefore have $n_j = n_{j0} + n_{j1}$. Without loss of generality, we assume that in the $j$th sample the
non-zero observations come first, that is, $\log(Y_{ij}) \sim N(\mu_j, \sigma_j^2)$ for $i = 1, \ldots, n_{j1}$ and
$Y_{ij} = 0$ for $i = n_{j1} + 1, \ldots, n_j$; and $n_{j0} \sim \mathrm{Bin}(n_j, \delta_j)$. This modelling structure is an
extension of the one-sample delta distribution proposed by Aitchison in a one-sample estimation problem.
Owen and DeRouen also studied the model for one-sample interval estimates. Similar log-normal
models have also been used by several authors for the purpose of estimating the
proportions of the cured cases in survival analysis. Though appearing to be based on
similar modelling ideas, the log-normal survival models are quite different from our model.
The major differences include: (i) the case of a cured patient is rarely observable in the survival
setting, but the patients with zero costs are readily observable; (ii) the likelihood functions
in survival analysis depend on the censoring mechanism, which is different from the case
with zero costs; (iii) the focus of the survival analysis is also different from ours. In survival
analysis, it is often of interest to estimate the proportions of the cured cases. Here we use the
model as a basis for the inference on the overall mean equality of independent populations on the
original scale.
Under this formulation, the likelihood function is

$$L(\theta; Y) \propto \prod_{j=1}^{K} (1 - \exp(\nu_j))^{n_{j0}} (\exp(\nu_j))^{n_{j1}} \sigma_j^{-n_{j1}} \exp\left\{ -\frac{1}{2\sigma_j^2} \sum_{i=1}^{n_{j1}} (\log(y_{ij}) - \mu_j)^2 \right\} \tag{2}$$

where $\nu_j = \log(1 - \delta_j)$ and $\theta = (\nu_1, \mu_1, \sigma_1^2, \ldots, \nu_K, \mu_K, \sigma_K^2)$. To test the null hypothesis defined by
(1), we first note that the logarithm of the expected value of $Y_{ij}$ can be written as a linear function
of the parameters $\mu_j$, $\sigma_j^2$ and $\nu_j$:

$$\log M_j = \nu_j + \mu_j + \sigma_j^2/2. \tag{3}$$

Therefore, instead of testing (1), one can test $H_0: \nu_j + \mu_j + \sigma_j^2/2 = \eta$ for $j = 1, \ldots, K$, which is
equivalent to (1).
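
As a small numerical illustration of equation (3), the sketch below evaluates the log-mean for the three populations of Design 7 in Table I, a design the paper lists as satisfying the null hypothesis. The function name `log_mean` is an illustrative choice, and the third parameter of each design is read here as the log-scale variance $\sigma_j^2$, which is an assumption of this sketch.

```python
import math

def log_mean(delta, mu, sigma2):
    """Log of the overall mean, log(1 - delta) + mu + sigma2 / 2, as in equation (3)."""
    return math.log(1.0 - delta) + mu + sigma2 / 2.0

# Design 7 of Table I: (delta_j, mu_j, sigma_j^2) for the three populations.
# Reading the third entry as a variance is an assumption of this sketch.
design7 = [(0.1, 2.0, 0.4), (0.2, 1.8178, 1.0), (0.3, 1.4513, 2.0)]

log_means = [log_mean(delta, mu, sigma2) for delta, mu, sigma2 in design7]
for j, value in enumerate(log_means, start=1):
    print(f"population {j}: log M = {value:.4f}")
```

The three values agree to about four decimal places, which is what one would expect from a design constructed to have equal means despite different zero proportions and variances.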
3.2. A Wald test

Under the model formulation specified in Section 3.1, Zhou and Tu proposed a likelihood
ratio test and showed that the test has excellent type I error control and power as compared
to the other methods. Their testing procedure, however, does require rather complicated
numerical computation. This difficulty in computation arises from the fact that the restricted
maximum likelihood estimators do not have closed form solutions. Thus an iterative numerical
algorithm has to be used in order to solve the restricted maximum likelihood estimating
equations.
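In this paper, we consider an alternative approach by rewriting the null hypothesis of interest
(1) as the equivalent hypothesis

$$H_0: C^{T}\theta = 0 \tag{4}$$

where

$$C^{T} = \begin{pmatrix}
1 & 1 & \tfrac{1}{2} & -1 & -1 & -\tfrac{1}{2} & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
1 & 1 & \tfrac{1}{2} & 0 & 0 & 0 & \cdots & -1 & -1 & -\tfrac{1}{2}
\end{pmatrix}.$$

To construct a test statistic for (4), we then find the maximum likelihood estimators for the
vector of the unknown parameters $\theta$:

$$\hat\nu_j = \log\frac{n_{j1}}{n_j}, \qquad \hat\mu_j = \frac{1}{n_{j1}} \sum_{i=1}^{n_{j1}} \log y_{ij}, \qquad \hat\sigma_j^2 = \frac{1}{n_{j1}} \sum_{i=1}^{n_{j1}} (\log y_{ij} - \hat\mu_j)^2. \tag{5}$$

Let $I(\theta)$ be the information matrix associated with the likelihood function (2). We obtain the
following likelihood-based Wald statistic:

$$W = (\hat\theta^{T} C)\,(C^{T} I(\hat\theta)^{-1} C)^{-1}\,(\hat\theta^{T} C)^{T}.$$

In the Appendix we show that $W$ has the following explicit form:

$$\begin{aligned}
W ={}& \sum_{j=2}^{K} \frac{n_j \exp(\hat\nu_j)\,\{\hat\nu_1 + \hat\mu_1 + \hat\sigma_1^2/2 - (\hat\nu_j + \hat\mu_j + \hat\sigma_j^2/2)\}^2}{1 - \exp(\hat\nu_j) + \hat\sigma_j^2 + \hat\sigma_j^4/2} \\
& - \left[ \sum_{j=1}^{K} (n_j \exp(\hat\nu_j))\,(1 - \exp(\hat\nu_j) + \hat\sigma_j^2 + \hat\sigma_j^4/2)^{-1} \right]^{-1} \\
& \times \left[ \sum_{j=2}^{K} \frac{(n_j \exp(\hat\nu_j))\,\{\hat\nu_1 + \hat\mu_1 + \hat\sigma_1^2/2 - (\hat\nu_j + \hat\mu_j + \hat\sigma_j^2/2)\}}{1 - \exp(\hat\nu_j) + \hat\sigma_j^2 + \hat\sigma_j^4/2} \right]^2.
\end{aligned} \tag{6}$$

Under the null hypothesis $H_0: C^{T}\theta = 0$, the statistic $W$ converges in distribution to $\chi^2_r$ with
$r = K - 1$.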
The expression of $W$ given by (6) may be further simplified in some special cases. When the
samples do not contain zero observations ($\delta_j = 0$), for instance, $W$ can be reduced to

$$W = \sum_{j=2}^{K} \frac{n_j\,\{\hat\mu_1 + \hat\sigma_1^2/2 - (\hat\mu_j + \hat\sigma_j^2/2)\}^2}{\hat\sigma_j^2 + \hat\sigma_j^4/2} - \frac{1}{\sum_{j=1}^{K} n_j (\hat\sigma_j^2 + \hat\sigma_j^4/2)^{-1}} \left[ \sum_{j=2}^{K} \frac{n_j\,\{\hat\mu_1 + \hat\sigma_1^2/2 - (\hat\mu_j + \hat\sigma_j^2/2)\}}{\hat\sigma_j^2 + \hat\sigma_j^4/2} \right]^2. \tag{7}$$

In other words, if data follow log-normal distributions strictly, $W$ can be used to test the
equality of means of several log-normal populations. Furthermore, when $K = 2$, $W$ is reduced to
a test statistic similar to the one given by Zhou et al.
From (6) and (7) it is clear that the computation of the test statistic $W$ only involves the sufficient
statistics $\hat\delta_j$, $\hat\mu_j$ and $\hat\sigma_j^2$, which can often be readily obtained from any statistical package. When
data sets are small, it is often possible to compute the value of $W$ with a calculator.
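
To make the computation in (5) and (6) concrete, the following is a minimal sketch in plain Python, not the authors' program. The function names `delta_lognormal_mle` and `wald_statistic` and the toy data are hypothetical. It assumes each group has at least two positive observations and some variation on the log scale, so that the denominators in (6) are positive.

```python
import math

def delta_lognormal_mle(sample):
    """Return (n_j, nu_hat, mu_hat, sigma2_hat) from equation (5) for one group of costs."""
    logs = [math.log(y) for y in sample if y > 0]
    n, n1 = len(sample), len(logs)
    if n1 < 2:
        raise ValueError("each group needs at least two positive observations")
    nu = math.log(n1 / n)                              # log of the non-zero proportion
    mu = sum(logs) / n1                                # mean of the log positive costs
    sigma2 = sum((x - mu) ** 2 for x in logs) / n1     # MLE variance (divides by n1)
    return n, nu, mu, sigma2

def wald_statistic(samples):
    """Return (W, degrees of freedom) following the explicit form (6)."""
    est = [delta_lognormal_mle(s) for s in samples]
    # Estimated log-means nu + mu + sigma^2 / 2, one per group.
    eta = [nu + mu + s2 / 2 for _, nu, mu, s2 in est]
    # Weights n_j exp(nu_j) / (1 - exp(nu_j) + sigma_j^2 + sigma_j^4 / 2).
    w = [n * math.exp(nu) / (1 - math.exp(nu) + s2 + s2 ** 2 / 2)
         for n, nu, _, s2 in est]
    diffs = [eta[0] - e for e in eta[1:]]              # group 1 minus group j, j >= 2
    first = sum(wj * d * d for wj, d in zip(w[1:], diffs))
    second = sum(wj * d for wj, d in zip(w[1:], diffs)) ** 2 / sum(w)
    return first - second, len(samples) - 1

# Toy data, invented for illustration only: three groups with zero and positive costs.
toy = [[0, 0, 1.0, 2.0, 4.0], [0, 3.0, 1.5, 6.0, 2.0], [0, 0, 0, 5.0, 10.0]]
W, df = wald_statistic(toy)
print(f"W = {W:.4f} on {df} degrees of freedom")
```

The statistic would then be referred to a chi-square distribution with $K - 1$ degrees of freedom, for example with `scipy.stats.chi2.sf(W, df)` if SciPy is available; for $K = 3$ the upper tail probability is simply $\exp(-W/2)$.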
Table I. Designs for parameter configurations in the simulation study

Design  δ1      μ1            σ1²     δ2      μ2            σ2²     δ3      μ3      σ3²
1       0.0     1.1           0.4     0.0     1.2           0.2     0.0     0.8     1.0
2       0.0     2.5           1.5     0.0     3.0           0.5     0.0     1.0     4.5
3       0.0     2.5           2.0     0.0     3.0           1.0     0.0     1.0     5.0
4       0.0     1.2           1.0     0.0     1.2           1.0     0.0     1.2     1.0
5       0.0     1.0+1.0/√n    1.0     0.0     1.0+2.0/√n    1.5     0.0     1.0     0.8
6       0.0     1.0+2.5/√n    1.0     0.0     1.0+3.5/√n    1.5     0.0     1.0     0.8
7       0.1     2             0.4     0.2     1.8178        1.0     0.3     1.4513  2.0
8       0.2     2             0.4     0.3     1.8335        1.0     0.1     1.0822  2.0
9       0.3     3.0           2.0     0.2     3.3665        1.0     0.1     1.2487  5.0
10      0.1     3.0           4.0     0.2     3.6178        3.0     0.3     4.2513  2.0
11      0.8319  8.844         1.4975  0.8380  8.973         1.3134  0.7699  9.041   0.4756
12      0.1     1.0+1.0/√n    1.0     0.2     1.0+2.0/√n    1.0     0.3     1.0000  0.8
13      0.3     1.0+1.0/√n    0.6     0.2     1.0+2.0/√n    1.0     0.1     1.0000  1.0

The formulation we use to construct the Wald test for mean equality represents a general class
of tests. It is very flexible in forming new tests. Depending on the hypotheses of interest, different
$C$ matrices can be used to form new tests. For instance, if a test for the equality of variance on the
log scale ($H_0: \sigma_1^2 = \cdots = \sigma_K^2$) is desired, a Wald type test may be formed using the following
$C$ matrix:

$$C^{T} = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 & -1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 1 & 0 & 0 & 0 & \cdots & 0 & 0 & -1
\end{pmatrix}.$$

This test provides a way to check the homoscedasticity assumption. When such an assumption
holds, the standard tests will be able to provide correct inference, given the proportions of zeros in
the samples are the same.
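
The variance-equality test described in Section 3.2 can be sketched in the same closed form as (6). The paper gives only the $C$ matrix for this test; the explicit expression used below is an assumption of this sketch, obtained by applying the Appendix lemma with the asymptotic variance $2\hat\sigma_j^4/n_{j1}$ of $\hat\sigma_j^2$ in place of $d_j$. The function name `variance_wald` and the numbers in the usage lines are hypothetical.

```python
def variance_wald(n_pos, sigma2_hat):
    """Wald-type statistic for H0: equal log-scale variances across groups.

    n_pos      -- number of positive observations n_{j1} in each group
    sigma2_hat -- MLE of the log-scale variance in each group
    Returns (statistic, degrees of freedom). The closed form is derived for
    this sketch from the Appendix lemma; it is not stated in the paper.
    """
    # Inverse of the estimated variance of sigma2_hat_j, i.e. n_{j1} / (2 sigma_j^4).
    w = [n1 / (2.0 * s2 ** 2) for n1, s2 in zip(n_pos, sigma2_hat)]
    diffs = [sigma2_hat[0] - s2 for s2 in sigma2_hat[1:]]   # group 1 minus group j
    first = sum(wj * d * d for wj, d in zip(w[1:], diffs))
    second = sum(wj * d for wj, d in zip(w[1:], diffs)) ** 2 / sum(w)
    return first - second, len(n_pos) - 1

# Hypothetical summaries for three groups.
stat, df = variance_wald([40, 55, 30], [1.2, 0.9, 1.6])
print(f"variance Wald statistic = {stat:.4f} on {df} degrees of freedom")
```

For two groups the expression collapses to the squared difference of the variance estimates divided by the sum of their estimated variances, which is a convenient sanity check.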
4. A SIMULATION STUDY
In this section we report the results of a simulation study performed to evaluate the type I error
rates and the powers of the Wald test under the model formulation.
Throughout the simulation study we consider the case of testing the mean equality of three
independent populations, that is, $K = 3$. We use a nominal significance level of $\alpha = 0.05$ and
$N = 10{,}000$ simulated samples for each parameter setting to ensure the margin of error is less
than 0.005 with 95 per cent confidence.
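
The margin-of-error claim can be checked with a short calculation. The sketch below assumes the usual normal approximation to a binomial proportion, evaluated at the nominal level; the variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha = 0.05        # nominal type I error rate being estimated
n_sim = 10_000      # simulated samples per parameter setting

# Half-width of a 95 per cent normal-approximation interval for a proportion.
z = NormalDist().inv_cdf(0.975)
margin = z * math.sqrt(alpha * (1 - alpha) / n_sim)
print(f"margin of error: {margin:.4f}")   # about 0.0043, below the stated 0.005
```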
The parameter configurations we use in the simulation are shown in Table I. These settings are
selected to represent various situations with different levels of skewness. Designs 1-4 and 7-11
represent the cases where the null hypothesis is satisfied, and designs 5, 6, 12 and 13 represent the
situations where the null hypothesis is not true. The first six settings are the situations where no
zero values are involved.
The pseudo-random samples used in the simulation are generated in the following way: for
each sample, we first generate the number of zero observations with a binomial generator
(subroutine RNBIN from the IMSL library), then fill in the non-zero observations with log-
normal variates; to generate log-normal variates, we use subroutine DRNNOR to obtain the
normal random numbers and then exponentiate them to get the log-normal variates.
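
The sampling scheme just described can be sketched in Python as follows. This is not the IMSL code used in the study: the standard-library `random` module stands in for RNBIN and DRNNOR, the function name `simulate_sample` is hypothetical, and the third argument is taken to be the log-scale variance.

```python
import math
import random

def simulate_sample(n, delta, mu, sigma2, rng):
    """Draw one sample of size n from the zero-inflated log-normal model.

    The number of zeros is Binomial(n, delta); the remaining observations are
    exp(Normal(mu, sigma2)). Non-zero values are placed first, as in Section 3.1.
    """
    n_zero = sum(rng.random() < delta for _ in range(n))   # binomial count of zeros
    sd = math.sqrt(sigma2)                                 # gauss() expects a standard deviation
    positives = [math.exp(rng.gauss(mu, sd)) for _ in range(n - n_zero)]
    return positives + [0.0] * n_zero

rng = random.Random(1999)   # fixed seed so the sketch is reproducible
# One sample of size 50 from the first population of Design 7 in Table I.
sample = simulate_sample(50, 0.1, 2.0, 0.4, rng)
print(len(sample), sum(1 for y in sample if y == 0.0))
```

Repeating this for each of the three populations and feeding the samples to a test routine, 10,000 times per design, would reproduce the structure of the simulation, though not necessarily the exact figures reported in the tables.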
Table II reports the type I error rates of the Wald test, likelihood ratio test and those of the
various standard tests in strictly log-normal samples (no zero values involved). The standard tests
under consideration include ANOVA F-tests (on both original scale and log scale), two modified
versions of the F-test (Welch test and Brown-Forsythe test F*), and the non-parametric
Kruskal-Wallis (K-W) test.

Table II. Type I error rates for the various tests (designs 1-4)

Sample size  Design  F-test       F-test            Welch   F*      K-W     Wald    Likelihood
                     (log scale)  (original scale)  test    test    test    test    ratio test
25           1       0.3853       0.0833            0.0876  0.0811  0.3370  0.0696  0.0566
             2       0.9856       0.2361            0.3211  0.2299  0.9753  0.0801  0.0557
             3       0.9656       0.1803            0.2749  0.1733  0.9464  0.0834  0.0555
             4       0.0478       0.0378            0.0429  0.0331  0.0501  0.0603  0.0559
50           1       0.6622       0.0819            0.0811  0.0805  0.6125  0.0672  0.0553
             2       1.0000       0.2466            0.2973  0.2445  1.0000  0.0735  0.0537
             3       0.9999       0.2115            0.2719  0.2086  0.9994  0.0712  0.0542
             4       0.0514       0.0437            0.0516  0.0423  0.0513  0.0592  0.0517
100          1       0.9300       0.0721            0.0710  0.0715  0.9003  0.0565  0.0513
             2       1.0000       0.2343            0.2582  0.2337  1.0000  0.0599  0.0514
             3       1.0000       0.2148            0.2504  0.2135  1.0000  0.0556  0.0496
             4       0.0514       0.0440            0.0526  0.0434  0.0488  0.0522  0.0512
200          1       0.9987       0.0692            0.0603  0.0686  0.9966  0.0531  0.0535
             2       1.0000       0.2132            0.2195  0.2126  1.0000  0.0552  0.0512
             3       1.0000       0.2042            0.2202  0.2041  1.0000  0.0531  0.0515
             4       0.0483       0.0440            0.0501  0.0438  0.0486  0.0494  0.0538
400          1       1.0000       0.0689            0.0610  0.0689  1.0000  0.0554  0.0532
             2       1.0000       0.1943            0.1905  0.1941  1.0000  0.0557  0.0491
             3       1.0000       0.1935            0.1979  0.1935  1.0000  0.0553  0.0515
             4       0.0500       0.0494            0.0529  0.0494  0.0516  0.0506  0.0509
1000         1       1.0000       0.0629            0.0527  0.0628  1.0000  0.0525  0.0488
             2       1.0000       0.1943            0.1454  0.1651  1.0000  0.0555  0.0493
             3       1.0000       0.1935            0.1595  0.1689  1.0000  0.0548  0.0497
             4       0.0478       0.0496            0.0514  0.0496  0.0566  0.0525  0.0502
10,20,30     1       0.2862       0.0579            0.0829  0.0558  0.3118  0.0818  0.0627
             2       0.9761       0.2503            0.2993  0.1878  0.9713  0.0709  0.0673
             3       0.9464       0.2183            0.2163  0.1426  0.9418  0.0712  0.0666
             4       0.0486       0.0396            0.0662  0.0385  0.0479  0.0685  0.0621

Table III. The empirical powers of the various tests (designs 5 and 6)

Design  Sample size  F-test       F-test            Welch   F*      K-W     Wald    Likelihood
                     (log scale)  (original scale)  test    test    test    test    ratio test
5       25           0.1957       0.2549            0.2439  0.2372  0.1848  0.3922  0.4276
        50           0.2079       0.4118            0.3899  0.4018  0.1950  0.5361  0.5722
        100          0.2113       0.6336            0.5928  0.6301  0.1992  0.7350  0.7530
        200          0.2036       0.8547            0.8269  0.8538  0.1945  0.9144  0.9192
        400          0.2066       0.9814            0.9748  0.9813  0.1993  0.9916  0.9930
        1000         0.2054       1.0000            0.9998  1.0000  0.1977  1.0000  1.0000
6       25           0.5575       0.4808            0.5980  0.4542  0.5547  0.7334  0.7189
        50           0.5678       0.6682            0.7289  0.6586  0.5651  0.8270  0.8286
        100          0.5764       0.8410            0.8617  0.8389  0.5726  0.9273  0.9310
        200          0.5759       0.9593            0.9612  0.9589  0.5635  0.9876  0.9880
        400          0.5809       0.9967            0.9964  0.9966  0.5755  0.9992  1.0000
        1000         0.5795       1.0000            1.0000  1.0000  0.5743  1.0000  1.0000

From Table II, it is clear that the likelihood ratio test has the best type I error rate (closest to
the nominal level 0)05), and is closely followed by the Wald test. When the sample sizes are large,
the type I error rates of the two are quite close and both converge to the nominal level. Simulation
also indicates that all the standard tests have rather inflated type I error rates. In general, the
ANOVA F-test on the original scale, its modified versions, and the non-parametric test fail to
control for type I error rates because of the strong skewness of the data. The F-test on the
log-transformed data fails because it is actually testing a different hypothesis $H_0^*: \mu_1 = \mu_2 = \mu_3$.
This hypothesis is in general not equivalent to $H_0: M_1 = M_2 = M_3$, unless $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$. This is
supported by our observation based on additional simulation (not reported here): as the sample
sizes increase to a very large number (10,000, for instance), the type I error rates for the F-test on
the log scale and the Kruskal}Wallis test actually become worse; the type I error rates for the
other standard tests generally move in the direction of the nominal level.
Similarly we tabulated the empirical powers of the tests in Table III. The powers reported in
Table III clearly indicate the advantages of the likelihood ratio test and Wald test. Notice that the
empirical powers reported here are not adjusted for the inflated type I error rates of the tests.
Since the type I error rates for the standard tests are generally larger than those of the Wald test
and likelihood ratio test, the adjustment would only decrease the powers of these standard tests.
Hence, we can still conclude that the Wald test and the likelihood ratio test have the best power
among the tests considered.
For the general situation with both zero and non-zero log-normal observations, we simply
concentrate on the comparison of the two newly proposed procedures, Wald test and the
likelihood ratio test, with the standard ANOVA F-test. Table IV gives the type I error rates of
these tests when samples contain zeros.
Simulation results in Table IV suggest that both the Wald test and the likelihood ratio test
perform well in the large sample situations, even when data contain large proportions of zeros.

Table IV. Type I error rates for the likelihood ratio test and Wald test (designs 7-11)

Design  Sample size  Wald test  Likelihood ratio test  ANOVA F-test
7       50           0.0679     0.0573                 0.1514
        100          0.0574     0.0506                 0.4429
        200          0.0498     0.0475                 0.6892
        400          0.0546     0.0542                 0.9019
        1000         0.0494     0.0477                 0.9911
        50,75,100    0.0628     0.0558                 0.1011
8       50           0.0621     0.0565                 0.4117
        100          0.0603     0.0529                 0.6475
        200          0.0562     0.0509                 0.8999
        400          0.0526     0.0467                 0.9832
        1000         0.0474     0.0502                 0.9992
        50,75,100    0.0530     0.0503                 0.4143
9       50           0.0704     0.0574                 0.4091
        100          0.0595     0.0509                 0.4215
        200          0.0584     0.0511                 0.6406
        400          0.0524     0.0506                 0.7797
        1000         0.0481     0.0502                 0.8855
        50,75,100    0.0613     0.0552                 0.3613
10      50           0.0562     0.0539                 0.0589
        100          0.0486     0.0534                 0.1071
        200          0.0440     0.0497                 0.2182
        400          0.0484     0.0516                 0.4886
        1000         0.0487     0.0491                 0.8916
        50,75,100    0.0803     0.0552                 0.0568
11      119,142,113  0.0691     0.0504                 0.9993

Although the likelihood ratio test uniformly out-performs the Wald test, the type I error rates of
the two are reasonably close in most of the cases. For either test, the proportions of zero
observations do not appear to have affected the type I error rates in any obvious pattern, but the
likelihood ratio test does appear to be more robust against the unequal sample sizes and the
severity of the skewness. As expected, the ANOVA F-test on the original scale has even worse
type I error control than it has in the cases where samples only contain log-normal observations.
Besides the skewness of the data, a new problem arises because the zeros accumulate a probability
mass which violates the unimodality of the underlying normal data assumption required by the
F-test. Thus the ANOVA F-test should not be trusted in providing valid inferences in this
situation.
Turning to the powers of the tests, we report the simulation results in Table V. From Table V,
we conclude that both the Wald test and the likelihood ratio test are similar and both have
reasonable power to detect the true differences in the means. Since the type I error rates of the
ANOVA F-test are not correctly controlled, its empirical power does not reflect the true power of
the test, even though it appears to be greater numerically than those of the Wald and the
likelihood ratio tests.

Table V. The empirical powers of the likelihood ratio test and the Wald test (designs 12 and 13)

Design  Sample size  Wald test  Likelihood ratio test  ANOVA F-test
12      50           0.4402     0.4008                 0.5720
        100          0.6349     0.6209                 0.9615
        200          0.8467     0.8410                 0.9987
        400          0.9799     0.9775                 1.0000
        1000         1.0000     1.0000                 1.0000
        50,75,100    0.4301     0.5986                 0.7389
13      50           0.3546     0.3511                 0.4342
        100          0.5937     0.5993                 0.6410
        200          0.8889     0.8833                 0.9808
        400          0.9962     0.9962                 0.9997
        1000         1.0000     1.0000                 1.0000
        50,75,100    0.3747     0.3340                 0.3842

In summary, simulation results indicate a slight advantage of the likelihood ratio test over the
Wald test. The performance of the proposed Wald test, however, is also satisfactory. Its type I
error rate is usually close to that of the likelihood ratio test, especially when the sample sizes
are reasonably large. It provides an easy alternative when the likelihood ratio testing procedure
is not computationally available. Another advantage of the Wald test revealed by simulation is
its low consumption of computing power. As we have observed in our simulation study,
it often takes 50-100 times more computing time to carry out a likelihood ratio test than a
Wald test.
5. ANALYSIS OF THE IN-PATIENT CHARGE DATA

In the in-patient charge example, we are interested in testing the equality of the average in-patient
charges of three different intervention groups. Let $M_j$ be the expected charges in the $j$th group.
Our null and alternative hypotheses are $H_0: M_1 = M_2 = M_3$ versus $H_1$: at least one of the $M_j$'s is
different.
As we pointed out in Section 2, the data can be viewed as from populations containing both
log-normal observations and zero values. The sample means and sample standard deviations are
$2674.97 and $12905.09 for the control group, $2470.22 and $6434.60 for the first intervention
group, and $2324.57 and $6561.51 for the second intervention, respectively. Using only the
positive charge observations, we obtain the sample means and standard deviations of $15158.19
and $27998.40 for group 1, $10316.79 and $9669.75 for group 2, and $9381.29 and $10485.18 for
group 3, respectively.
The Wald test and the likelihood ratio test yield p-values of 0.908 and 0.907, respectively. Both
tests lead to the same conclusion that the interventions have no effects on the average in-patient
charges.
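
The summaries reported in Sections 2 and 5 can be cross-checked against each other: the overall mean of a group should equal its positive-charge mean multiplied by the share of patients with non-zero charges. The sketch below performs that arithmetic. Reading "group 1" as the control group and groups 2 and 3 as the first and second intervention groups is an assumption of this sketch, and the names used are illustrative.

```python
# (n, number with zero charges, overall mean, mean of positive charges),
# taken from Sections 2 and 5. The group-number mapping is an assumption.
groups = {
    "control":        (119, 98, 2674.97, 15158.19),
    "intervention 1": (142, 108, 2470.22, 10316.79),
    "intervention 2": (113, 85, 2324.57, 9381.29),
}

def implied_overall_mean(n, n_zero, mean_positive):
    """Overall mean implied by the positive-charge mean and the non-zero share."""
    return (n - n_zero) / n * mean_positive

gaps = {}
for name, (n, n_zero, mean_all, mean_pos) in groups.items():
    gaps[name] = implied_overall_mean(n, n_zero, mean_pos) - mean_all
    print(f"{name}: implied minus reported overall mean = {gaps[name]:+.3f}")
```

Under this mapping the implied and reported overall means agree to within a cent or so in each group, which supports the reading of the group labels.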
6. CONCLUDING REMARKS
In this paper, we proposed a test for comparing the overall mean equality of log-normal
populations with additional zero valued observations. An alternative way of handling the zeros
is to use a two-step procedure which first tests whether the proportions of zero costs in the
independent populations are the same, then uses a separate analysis on the non-zero costs. As we
have commented at the end of Section 3.2, different Wald type tests may be constructed for these
separate hypotheses by changing the form of matrix C. The focus of this paper, however,
remains in testing the overall mean equality. The problem of comparing the mean equality of the
log-normal populations with additional zero observations on the original scale is related
to the general problem of retransformation bias. Focusing on the estimation problems,
Duan proposed a non-parametric method based on the resampling principle. In
estimating the parameters of interest, Duan's approach is fairly general in terms of trans-
formation function and data distribution and can be shown to have the optimal property of
consistency. In this paper we emphasize the testing situation in a parametric framework.
We have proposed a Wald test based on a fully parametric structure for populations containing
both zero and log-normal observations. Compared to the previously proposed likelihood ratio
test, the strength of this new procedure lies in its simplicity and good large sample performance.
From a practical point of view, it has the potential of being used in a variety of applications in
medical research. As an alternative to the likelihood ratio test, it can be carried out quite easily
without losing much type I error control and power. Finally, as in the case of analysis of
variances, the comparison of several population means inevitably leads to the question
of pairwise comparison when the null hypothesis is rejected. A simple approach to the question is
to develop simultaneous confidence intervals for the pairwise differences of the population means,
and then adjust the multiplicity using Bonferroni's method. This can be easily done in our
parametric model formulation. However, the resulting confidence intervals from this approach
are conservative. We are developing more efficient and appropriate approaches for the problem
of multiple comparisons. Another extension of the model currently under investigation is the
incorporation of other covariates into the modelling structure. This can generally be handled
in a regression setting.
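APPENDIX: THE DERIVATION OF THE PROPOSED TEST STATISTIC $W$

Because the vectors $(Y_{11}, \ldots, Y_{n_1 1}), \ldots, (Y_{1K}, \ldots, Y_{n_K K})$ are independent,

$$I(\theta) = \begin{pmatrix} I_1(\theta) & & \\ & \ddots & \\ & & I_K(\theta) \end{pmatrix}$$

where, with $l$ denoting the log-likelihood,

$$I_j(\theta) = -E \begin{pmatrix}
\partial^2 l/\partial\nu_j^2 & \partial^2 l/\partial\nu_j\,\partial\mu_j & \partial^2 l/\partial\nu_j\,\partial\sigma_j^2 \\
\partial^2 l/\partial\mu_j\,\partial\nu_j & \partial^2 l/\partial\mu_j^2 & \partial^2 l/\partial\mu_j\,\partial\sigma_j^2 \\
\partial^2 l/\partial\sigma_j^2\,\partial\nu_j & \partial^2 l/\partial\sigma_j^2\,\partial\mu_j & \partial^2 l/\partial(\sigma_j^2)^2
\end{pmatrix}.$$

Since

$$W = (\hat\theta^{T} C)\,(C^{T} I(\hat\theta)^{-1} C)^{-1}\,(\hat\theta^{T} C)^{T},$$

to simplify the expression for $W$, we need to calculate the inverse of $C^{T} I(\hat\theta)^{-1} C$.
After some algebraic manipulation, we obtain

$$I_j(\theta) = \begin{pmatrix}
n_j e^{\nu_j}/(1 - e^{\nu_j}) & 0 & 0 \\
0 & n_j e^{\nu_j}/\sigma_j^2 & 0 \\
0 & 0 & n_j e^{\nu_j}/(2\sigma_j^4)
\end{pmatrix}.$$

To simplify the notation, we let

$$d_j = \frac{1 - \exp(\nu_j) + \sigma_j^2 + \sigma_j^4/2}{n_j \exp(\nu_j)}$$

for $j = 1, \ldots, K$. Simple algebra leads to

$$C^{T} I(\hat\theta)^{-1} C = \mathrm{diag}(d_2, \ldots, d_K) + d_1 (1, \ldots, 1)^{T} (1, \ldots, 1).$$

To calculate the inverse of $C^{T} I(\hat\theta)^{-1} C$, we use the following lemma.

Lemma

$$(\mathrm{diag}(a_1, \ldots, a_n) + b\,(1, \ldots, 1)^{T} (1, \ldots, 1))^{-1} = \mathrm{diag}(a_1^{-1}, \ldots, a_n^{-1}) - \frac{1}{b^{-1} + \sum_{i=1}^{n} a_i^{-1}}\,(a_1^{-1}, \ldots, a_n^{-1})^{T} (a_1^{-1}, \ldots, a_n^{-1}).$$

Using the lemma, we obtain

$$(C^{T} I(\hat\theta)^{-1} C)^{-1} = \mathrm{diag}(d_2^{-1}, \ldots, d_K^{-1}) - \left( \sum_{j=1}^{K} d_j^{-1} \right)^{-1} (d_2^{-1}, \ldots, d_K^{-1})^{T} (d_2^{-1}, \ldots, d_K^{-1}).$$

Pre- and post-multiplying by the vector

$$\hat\theta^{T} C = \left( \hat\nu_1 + \hat\mu_1 + \hat\sigma_1^2/2 - (\hat\nu_2 + \hat\mu_2 + \hat\sigma_2^2/2), \ldots, \hat\nu_1 + \hat\mu_1 + \hat\sigma_1^2/2 - (\hat\nu_K + \hat\mu_K + \hat\sigma_K^2/2) \right)$$

we have

$$\begin{aligned}
W ={}& \sum_{j=2}^{K} \frac{n_j \exp(\hat\nu_j)\,\{\hat\nu_1 + \hat\mu_1 + \hat\sigma_1^2/2 - (\hat\nu_j + \hat\mu_j + \hat\sigma_j^2/2)\}^2}{1 - \exp(\hat\nu_j) + \hat\sigma_j^2 + \hat\sigma_j^4/2} \\
& - \left[ \sum_{j=1}^{K} (n_j \exp(\hat\nu_j))\,(1 - \exp(\hat\nu_j) + \hat\sigma_j^2 + \hat\sigma_j^4/2)^{-1} \right]^{-1} \\
& \times \left[ \sum_{j=2}^{K} \frac{(n_j \exp(\hat\nu_j))\,\{\hat\nu_1 + \hat\mu_1 + \hat\sigma_1^2/2 - (\hat\nu_j + \hat\mu_j + \hat\sigma_j^2/2)\}}{1 - \exp(\hat\nu_j) + \hat\sigma_j^2 + \hat\sigma_j^4/2} \right]^2.
\end{aligned}$$

This completes the derivation of (6).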
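
The matrix lemma used in the derivation can be checked numerically. The sketch below builds a small matrix of the form diag(a) + b 11^T, forms the inverse claimed by the lemma, and multiplies the two. The helper names `rank_one_inverse` and `matmul` and the test values are illustrative, not taken from the paper.

```python
def rank_one_inverse(a, b):
    """Inverse of diag(a) + b * ones * ones^T according to the Appendix lemma."""
    inv_a = [1.0 / x for x in a]
    scale = 1.0 / (1.0 / b + sum(inv_a))
    n = len(a)
    return [[(inv_a[i] if i == j else 0.0) - scale * inv_a[i] * inv_a[j]
             for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a, b = [0.5, 2.0, 4.0], 0.25          # arbitrary positive values for the check
matrix = [[(a[i] if i == j else 0.0) + b for j in range(3)] for i in range(3)]
product = matmul(matrix, rank_one_inverse(a, b))
for row in product:
    print([round(v, 12) for v in row])
```

The product should be the identity matrix up to rounding error, as expected from the Sherman-Morrison formula, of which the lemma is a special case.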
ACKNOWLEDGEMENTS
This work was supported in part by AHCPR grants R29HS08559 and R03HS09543 and NIH
R01MH58875. We would like to thank the two reviewers for many helpful suggestions that
resulted in an improved version of this manuscript. We are also grateful to Dr. Susan M. Perkins
of Indiana University for her careful proof-reading of the manuscript.
REFERENCES
1. Zhou, X. H. and Tu, W. 'Comparison of the means several independent populations when their samples
contain log-normal and possibly zero observations', Biometrics, 55, 641-646 (1999).
2. Brown, M. B. and Forsythe, A. B. 'The small sample behavior of some statistics with heterogeneous
variances', Technometrics, 30, 129-132 (1974).
3. Hollander, M. and Wolfe, D. Nonparametric Statistical Methods, Wiley, New York, 1973.
4. Welch, B. L. 'On the comparison of several mean values', Biometrika, 38, 330-336 (1951).
5. Parkin, T. B. 'Evaluation of statistical methods for determining differences between samples from
lognormal populations', Agronomy Journal, 85, 747-453 (1993).
6. Armenian, H. and Khoury, M. 'Age at onset of genetic diseases', American Journal of Epidemiology, 113,
596-605 (1981).
7. Makuch, R. W., Freeman, D. H. and Johnson, M. F. 'Justification for the log-normal distribution as
a model for blood pressure', Journal of Chronic Diseases, 32, 245-250 (1979).
8. Berry, D. A. 'Logarithmic transformations in ANOVA', Biometrics, 43, 439-456 (1987).
9. Tierney, W. M., Overhage, M., Murray, M., Zhou, X. H., Harris, L. and Wolinsky, F. The Final Report of
the Computer-based Prospective Drug Utilization Review Report (1993-1997), U.S. Agency for Health
Care Policy and Research, Bethesda, MD, 1998.
10. Aitchison, J. 'On the distribution of a positive random variable having discrete probability mass at the
origin', Journal of the American Statistical Association, 50, 901-908 (1955).
11. Owen, W. J. and DeRouen, T. A. 'Estimation of the mean for log-normal data containing zeros and
left-censored values, with applications in the measurement of worker exposure to air contamination',
Biometrics, 36, 707-719 (1980).
12. Boag, J. W. 'Maximum likelihood estimates of the proportion of patients cured by cancer therapy',
Journal of the Royal Statistical Society, Series B, 11, 15-44 (1949).
13. Maller, R. and Zhou, X. Survival Analysis with Long-term Survivors, Wiley, New York, 1996.
14. Serfling, R. J. Approximation Theorems of Mathematical Statistics, Wiley, New York, 1980.
15. Zhou, X. H., Gao, S. J. and Hui, S. L. 'Methods for comparing the means of two independent log-normal
samples', Biometrics, 53, 1129-1135 (1997).
16. Duan, N., Manning, W. G., Morris, C. N. and Newhouse, J. P. 'A comparison of alternative models for
the demand for medical care', Journal of Business and Economic Statistics, 1, 115-126 (1983).
17. Duan, N. 'Smearing estimate: a nonparametric retransformation method', Journal of the American
Statistical Association, 78, 605-610 (1983).
