
LOGLINEAR MODEL FOR TWO DIMENSIONS

Loglinear models treat all variables the same, making no response/explanatory distinction. When there is one response and
one or more explanatory variables, the logit model is more natural.

Loglinear models model the cell counts, the n_ij. They assume a Poisson distribution and use a log link in the generalized
linear model framework. Even though a Poisson distribution is assumed, loglinear models can be used for all types of sampling
( multinomial and independent multinomial as well ).

Setting μ_ij = E( n_ij ) we have

ln( μ_ij ) = λ + λ_i^X + λ_j^Y + λ_ij^XY

and this is called the loglinear model.

It is the interaction term λ_ij^XY that describes the relationship between X and Y. The main effect terms ( in this case the
λ_i^X and λ_j^Y ) will always be in the models along with the intercept term.

In order to estimate the parameters uniquely we need to set constraints. The constraints are not unique, but the fit of
any given model is the same regardless of the constraints chosen. R again uses indicator variables and sets the first level
( numerically or alphabetically ) of each factor equal to 0, i.e., any parameter with a 1 in its subscript is set to 0.
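This coding can be seen directly in the design matrix R builds. A minimal sketch, using the vitamin C data that appears below (the first level of each factor is absorbed into the intercept):

```r
# Two factors; R's default treatment contrasts drop the first level of each
treat <- factor(c("placebo", "placebo", "vitC", "vitC"))
cold  <- factor(c("no", "yes", "no", "yes"))

# Columns: (Intercept), treatvitC, coldyes -- "placebo" and "no" are set to 0
model.matrix(~ treat + cold)
```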

Term        # of Parameters    # of Independent Parameters

λ                 1                        1
λ_i^X             I                      I - 1
λ_j^Y             J                      J - 1
λ_ij^XY          IJ                 (I - 1)(J - 1)

Total                                     IJ

This model is called the saturated model because we have IJ independent parameters to estimate and IJ cell counts n_ij to use
to estimate them. We can estimate everything in the model but then we have no degrees of freedom left over to test
any hypotheses. We have no measure of error since μ̂_ij = n_ij .

Independence Model:

ln( μ_ij ) = λ + λ_i^X + λ_j^Y    with fitted values    μ̂_ij = ( n_i+ n_+j ) / n
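Counting parameters shows where the degrees of freedom for testing independence come from, and maximizing the Poisson likelihood under this model gives the familiar fitted values:

```latex
\hat\mu_{ij} = \frac{n_{i+}\, n_{+j}}{n}, \qquad
\mathrm{df} = IJ - \bigl[\,1 + (I-1) + (J-1)\,\bigr] = (I-1)(J-1).
```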
Example - Loglinear Model for 2 x 2 Tables
In a study in France, 139 people were given regular doses of Vitamin C over a period of time and another 140
people were given a placebo. At the end of the time period, each person was classified as to whether or not they had a
cold during the period. The results are given below with the expected values for the independence model given in
parentheses.

                            Cold
                No             Yes
Placebo         109 (115.91)   31 (24.09)    140
Vitamin C       122 (115.09)   17 (23.91)    139
                231            48            279
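The expected counts in parentheses come straight from the margins; for example the placebo/no-cold cell is 140 × 231 / 279 = 115.91. A quick check in R:

```r
# Observed 2 x 2 table: treatment (rows) by cold status (columns)
obs <- matrix(c(109, 31, 122, 17), nrow = 2, byrow = TRUE,
              dimnames = list(c("placebo", "vitC"), c("no", "yes")))

# Expected counts under independence: (row total * column total) / grand total
exp.ind <- outer(rowSums(obs), colSums(obs)) / sum(obs)
round(exp.ind, 2)   # placebo: 115.91, 24.09 ; vitC: 115.09, 23.91
```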

> treat <- c("placebo","placebo","vitC","vitC")


> cold <- c("no","yes","no","yes")
> count <- c(109,31,122,17)
> data <- data.frame(treat,cold,count)
> data
treat cold count
1 placebo no 109
2 placebo yes 31
3 vitC no 122
4 vitC yes 17

> data.ind <- glm(count~factor(treat)+factor(cold),family=poisson(link=log))


> summary(data.ind)
Call: glm(formula = count ~ factor(treat) + factor(cold), family = poisson(link = log))
Deviance Residuals:
1 2 3 4
-0.6487 1.3484 0.6382 -1.4918
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.752848 0.088812 53.516 <2e-16 ***
factor(treat)vitC -0.007168 0.119738 -0.060 0.952
factor(cold)yes -1.571217 0.158626 -9.905 <2e-16 ***
Null deviance: 135.4675 on 3 degrees of freedom
Residual deviance: 4.8717 on 1 degrees of freedom AIC: 34.004

> cbind(treat,cold,fitted(data.ind))
treat cold
1 "placebo" "no" "115.913978494624"
2 "placebo" "yes" "24.0860215053783"
3 "vitC" "no" "115.086021505376"
4 "vitC" "yes" "23.9139784946256"
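The residual deviance 4.8717 on 1 df is the likelihood-ratio statistic G² for testing independence. Comparing it to a chi-squared distribution with 1 df:

```r
# G^2 = residual deviance of the independence model, df = (I-1)(J-1) = 1
G2 <- 4.8717
p.value <- 1 - pchisq(G2, df = 1)
round(p.value, 4)   # about 0.0273, so independence is rejected at the .05 level
```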

> data.sat <- glm(count~factor(treat)+factor(cold)+treat*cold,family=poisson)


> summary(data.sat)
Call: glm(formula = count ~ factor(treat) + factor(cold) + treat * cold, family = poisson)
Deviance Residuals:
[1] 0 0 0 0
Coefficients: (2 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.69135 0.09578 48.979 < 2e-16 ***
factor(treat)vitC 0.11267 0.13180 0.855 0.3926
factor(cold)yes -1.25736 0.20355 -6.177 6.53e-10 ***
treatvitC NA NA NA NA
coldyes NA NA NA NA
treatvitC:coldyes -0.71345 0.32932 -2.166 0.0303 *
Null deviance: 1.3547e+02 on 3 degrees of freedom
Residual deviance: -5.7732e-15 on 0 degrees of freedom AIC: 31.132
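In a 2 × 2 table the saturated model's interaction estimate is the sample log odds ratio, which can be verified from the raw counts:

```r
# Odds of a cold on vitamin C relative to the odds of a cold on placebo
or <- (17 / 122) / (31 / 109)   # equivalently (109 * 17) / (31 * 122)
log(or)                         # -0.7134, matching treatvitC:coldyes
```

Since exp(−0.7134) ≈ 0.49, the estimated odds of catching a cold on vitamin C are about half the odds on placebo.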

Loglinear Logit Connection for 2 x 2 Tables


> treat <- c("placebo","vitC")
> no <- c(109,122)
> yes <- c(31,17)
> datalogit <- data.frame(treat,no,yes)
> datalogit
treat no yes
1 placebo 109 31
2 vitC 122 17
> n <- no+yes
> logit.ind <- glm(yes/n~1,weights=n,family=binomial)
> summary(logit.ind)
Call: glm(formula = yes/n ~ 1, family = binomial, weights = n)
Deviance Residuals:
1 2
1.496 -1.623
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.5712 0.1586 -9.905 <2e-16 ***
Null deviance: 4.8717 on 1 degrees of freedom
Residual deviance: 4.8717 on 1 degrees of freedom AIC: 16.45
> cbind(treat,no,yes,fitted(logit.ind))
treat no yes
1 "placebo" "109" "31" "0.172043010752688"
2 "vitC" "122" "17" "0.172043010752688"

> logit.sat <- glm(yes/n~factor(treat),weights=n,family=binomial)


> summary(logit.sat)
Call: glm(formula = yes/n ~ factor(treat), family = binomial, weights = n)
Deviance Residuals:
[1] 0 0
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.2574 0.2035 -6.177 6.53e-10 ***
factor(treat)vitC -0.7134 0.3293 -2.166 0.0303 *
Null deviance: 4.8717e+00 on 1 degrees of freedom
Residual deviance: -1.4433e-14 on 0 degrees of freedom AIC: 13.578
> cbind(treat,no,yes,fitted(logit.sat))
treat no yes
1 "placebo" "109" "31" "0.221428571428571"
2 "vitC" "122" "17" "0.122302158273381"
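The matching estimates (−1.2574 and −0.7134 appear in both saturated fits) are no accident. Forming the within-row logit in the loglinear model cancels every term that does not involve the column variable:

```latex
\log\frac{\mu_{i2}}{\mu_{i1}}
  = (\lambda_2^Y - \lambda_1^Y) + (\lambda_{i2}^{XY} - \lambda_{i1}^{XY})
  = \alpha + \beta_i ,
```

so the loglinear interaction parameters become the logit model's treatment effects, and dropping the interaction term corresponds to the intercept-only logit model.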

General Two-Way Table Example


One thousand recent purchasers of American-made cars were randomly sampled and each purchase was classified
according to manufacturer and size of car. Use the results below to test for independence between these two variables
at α = .01 .
                               Y = Size of Car
                       Small          Intermediate   Large
                   A   157 (140.83)   126 (135.04)   58 (65.13)    341
X = Manufacturer   B    65 (79.30)     82 (76.03)    45 (36.67)    192
                   C   181 (158.18)   142 (151.67)   60 (73.15)    383
                   D    10 (34.69)     46 (33.26)    28 (16.04)     84
                       413            396            191          1000

> manfact <- c(rep("A",3),rep("B",3),rep("C",3),rep("D",3))


> size <- c("small","intermed","large")
> size <- c(rep(size,4))
> count <- c(157,126,58,65,82,45,181,142,60,10,46,28)
> cars <- data.frame(manfact,size,count)
> cars.ind <- glm(count~factor(manfact)+factor(size),family=poisson)
> summary(cars.ind)
Call: glm(formula = count ~ factor(manfact) + factor(size), family = poisson)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.9503 -1.0723 -0.0554 1.4464 2.6968
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.90554 0.06677 73.473 < 2e-16 ***
factor(manfact)B -0.57439 0.09023 -6.366 1.94e-10 ***
factor(manfact)C 0.11615 0.07445 1.560 0.119
factor(manfact)D -1.40107 0.12181 -11.502 < 2e-16 ***
factor(size)large -0.72914 0.08810 -8.277 < 2e-16 ***
factor(size)small 0.04203 0.07033 0.598 0.550
Null deviance: 405.21 on 11 degrees of freedom
Residual deviance: 50.61 on 6 degrees of freedom AIC: 134.76
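The test of independence at α = .01 compares the residual deviance G² = 50.61 on 6 df to the chi-squared critical value:

```r
# Likelihood-ratio test of independence: df = (I-1)(J-1) = 3 * 2 = 6
G2 <- 50.61
crit <- qchisq(0.99, df = 6)   # about 16.81
G2 > crit                      # TRUE: reject independence at the .01 level
1 - pchisq(G2, df = 6)         # p-value about 3.5e-09
```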

> cbind(manfact,size,fitted(cars.ind))
manfact size
1 "A" "small" "140.833"
2 "A" "intermed" "135.036"
3 "A" "large" "65.1309999999999"
4 "B" "small" "79.2959999999999"
5 "B" "intermed" "76.032"
6 "B" "large" "36.672"
7 "C" "small" "158.179"
8 "C" "intermed" "151.668"
9 "C" "large" "73.1529999999999"
10 "D" "small" "34.692"
11 "D" "intermed" "33.264"
12 "D" "large" "16.044"

> cars.sat <- glm(count~factor(manfact)+factor(size)+manfact*size,family=poisson)


> summary(cars.sat)
Call: glm(formula = count ~ factor(manfact) + factor(size) + manfact * size, family = poisson)
Coefficients: (5 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.83628 0.08909 54.287 < 2e-16 ***
factor(manfact)B -0.42956 0.14189 -3.028 0.00247 **
factor(manfact)C 0.11955 0.12239 0.977 0.32868
factor(manfact)D -1.00764 0.17227 -5.849 4.94e-09 ***
factor(size)large -0.77584 0.15868 -4.889 1.01e-06 ***
factor(size)small 0.21996 0.11961 1.839 0.06591 .
manfactB NA NA NA NA
manfactC NA NA NA NA
manfactD NA NA NA NA
sizelarge NA NA NA NA
sizesmall NA NA NA NA
manfactB:sizelarge 0.17578 0.24412 0.720 0.47149
manfactC:sizelarge -0.08564 0.22110 -0.387 0.69850
manfactD:sizelarge 0.27940 0.28746 0.972 0.33106
manfactB:sizesmall -0.45230 0.20466 -2.210 0.02711 *
manfactC:sizesmall 0.02271 0.16393 0.139 0.88984
manfactD:sizesmall -1.74602 0.36884 -4.734 2.20e-06 ***
Null deviance: 4.0521e+02 on 11 degrees of freedom
Residual deviance: 3.7303e-14 on 0 degrees of freedom AIC: 96.152
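Each interaction estimate in the saturated model is again a log odds ratio, formed against the reference levels (manufacturer A and size "intermed", the alphabetically first levels). For example, the large manfactD:sizesmall effect can be checked from the raw counts:

```r
# Log odds ratio for the (D, small) cell against the (A, intermed) reference
log((10 * 126) / (157 * 46))   # -1.746, matching manfactD:sizesmall
```

Manufacturer D sold far fewer small cars than independence predicts (10 observed versus 34.69 expected), which is what this large negative interaction reflects.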
