
LOGLINEAR MODEL FOR TWO DIMENSIONS

Loglinear models treat all variables the same, making no response/explanatory distinction. When there is one response and
one or more explanatory variables, the logit model is more natural.

Loglinear models model the cell counts, the n_ij. They assume a Poisson distribution and use a log link in the generalized
linear model framework. Even though a Poisson distribution is assumed, loglinear models can be used for all types of sampling
( multinomial and independent multinomial as well ).

Setting μ_ij = E( n_ij ) we have

ln( μ_ij ) = λ + λ_i^X + λ_j^Y + λ_ij^XY

and this is called the loglinear model.

It is the interaction term λ_ij^XY that describes the relationship between X and Y. The main effect terms ( in this case the
λ_i^X and λ_j^Y ) will always be in the models along with the intercept term.

In order to estimate the parameters uniquely we need to set constraints. The constraints are not unique, but the fit of
any given model is the same regardless of the constraints chosen. R again uses indicator variables and sets the first level
( numerically or alphabetically ) of each factor equal to 0, i.e., any parameter with a 1 in its subscript is set to 0.
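This coding can be seen directly in the design matrix R builds. A minimal sketch, using the vitamin C data that appears below (the first level of each factor is absorbed into the intercept):

```r
# Two factors; R's default treatment contrasts drop the first level of each
treat <- factor(c("placebo", "placebo", "vitC", "vitC"))
cold  <- factor(c("no", "yes", "no", "yes"))

# Columns: (Intercept), treatvitC, coldyes -- "placebo" and "no" are set to 0
model.matrix(~ treat + cold)
```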

Term        # of Parameters    # of Independent Parameters

λ                 1                        1
λ_i^X             I                      I - 1
λ_j^Y             J                      J - 1
λ_ij^XY          IJ                 (I - 1)(J - 1)

Total                                     IJ

This model is called the saturated model because we have IJ independent parameters to estimate and IJ cell counts n_ij to use
to estimate them. We can estimate everything in the model but then we have no degrees of freedom left over to test
any hypotheses. We have no measure of error since μ̂_ij = n_ij .

Independence Model:

ln( μ_ij ) = λ + λ_i^X + λ_j^Y    with fitted values    μ̂_ij = ( n_i+ n_+j ) / n
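Counting parameters shows where the degrees of freedom for testing independence come from, and maximizing the Poisson likelihood under this model gives the familiar fitted values:

```latex
\hat\mu_{ij} = \frac{n_{i+}\, n_{+j}}{n}, \qquad
\mathrm{df} = IJ - \bigl[\,1 + (I-1) + (J-1)\,\bigr] = (I-1)(J-1).
```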
Example - Loglinear Model for 2 x 2 Tables
In a study in France, 139 people were given regular doses of Vitamin C over a period of time and another 140
people were given a placebo. At the end of the time period, each person was classified as to whether or not they had a
cold during the period. The results are given below with the expected values for the independence model given in
parentheses.

                            Cold
                No             Yes
Placebo         109 (115.91)   31 (24.09)    140
Vitamin C       122 (115.09)   17 (23.91)    139
                231            48            279
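The expected counts in parentheses come straight from the margins; for example the placebo/no-cold cell is 140 × 231 / 279 = 115.91. A quick check in R:

```r
# Observed 2 x 2 table: treatment (rows) by cold status (columns)
obs <- matrix(c(109, 31, 122, 17), nrow = 2, byrow = TRUE,
              dimnames = list(c("placebo", "vitC"), c("no", "yes")))

# Expected counts under independence: (row total * column total) / grand total
exp.ind <- outer(rowSums(obs), colSums(obs)) / sum(obs)
round(exp.ind, 2)   # placebo: 115.91, 24.09 ; vitC: 115.09, 23.91
```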

> treat <- c("placebo","placebo","vitC","vitC")


> cold <- c("no","yes","no","yes")
> count <- c(109,31,122,17)
> data <- data.frame(treat,cold,count)
> data
treat cold count
1 placebo no 109
2 placebo yes 31
3 vitC no 122
4 vitC yes 17

> data.ind <- glm(count~factor(treat)+factor(cold),family=poisson(link=log))


> summary(data.ind)
Call: glm(formula = count ~ factor(treat) + factor(cold), family = poisson(link = log))
Deviance Residuals:
1 2 3 4
-0.6487 1.3484 0.6382 -1.4918
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.752848 0.088812 53.516 <2e-16 ***
factor(treat)vitC -0.007168 0.119738 -0.060 0.952
factor(cold)yes -1.571217 0.158626 -9.905 <2e-16 ***
Null deviance: 135.4675 on 3 degrees of freedom
Residual deviance: 4.8717 on 1 degrees of freedom AIC: 34.004

> cbind(treat,cold,fitted(data.ind))
treat cold
1 "placebo" "no" "115.913978494624"
2 "placebo" "yes" "24.0860215053783"
3 "vitC" "no" "115.086021505376"
4 "vitC" "yes" "23.9139784946256"
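The residual deviance 4.8717 on 1 df is the likelihood-ratio statistic G² for testing independence. Comparing it to a chi-squared distribution with 1 df:

```r
# G^2 = residual deviance of the independence model, df = (I-1)(J-1) = 1
G2 <- 4.8717
p.value <- 1 - pchisq(G2, df = 1)
round(p.value, 4)   # about 0.0273, so independence is rejected at the .05 level
```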

> data.sat <- glm(count~factor(treat)+factor(cold)+treat*cold,family=poisson)


> summary(data.sat)
Call: glm(formula = count ~ factor(treat) + factor(cold) + treat * cold, family = poisson)
Deviance Residuals:
[1] 0 0 0 0
Coefficients: (2 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.69135 0.09578 48.979 < 2e-16 ***
factor(treat)vitC 0.11267 0.13180 0.855 0.3926
factor(cold)yes -1.25736 0.20355 -6.177 6.53e-10 ***
treatvitC NA NA NA NA
coldyes NA NA NA NA
treatvitC:coldyes -0.71345 0.32932 -2.166 0.0303 *
Null deviance: 1.3547e+02 on 3 degrees of freedom
Residual deviance: -5.7732e-15 on 0 degrees of freedom AIC: 31.132
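In a 2 × 2 table the saturated model's interaction estimate is the sample log odds ratio, which can be verified from the raw counts:

```r
# Odds of a cold on vitamin C relative to the odds of a cold on placebo
or <- (17 / 122) / (31 / 109)   # equivalently (109 * 17) / (31 * 122)
log(or)                         # -0.7134, matching treatvitC:coldyes
```

Since exp(−0.7134) ≈ 0.49, the estimated odds of catching a cold on vitamin C are about half the odds on placebo.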

Loglinear Logit Connection for 2 x 2 Tables


> treat <- c("placebo","vitC")
> no <- c(109,122)
> yes <- c(31,17)
> datalogit <- data.frame(treat,no,yes)
> datalogit
treat no yes
1 placebo 109 31
2 vitC 122 17
> n <- no+yes
> logit.ind <- glm(yes/n~1,weights=n,family=binomial)
> summary(logit.ind)
Call: glm(formula = yes/n ~ 1, family = binomial, weights = n)
Deviance Residuals:
1 2
1.496 -1.623
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.5712 0.1586 -9.905 <2e-16 ***
Null deviance: 4.8717 on 1 degrees of freedom
Residual deviance: 4.8717 on 1 degrees of freedom AIC: 16.45
> cbind(treat,no,yes,fitted(logit.ind))
treat no yes
1 "placebo" "109" "31" "0.172043010752688"
2 "vitC" "122" "17" "0.172043010752688"

> logit.sat <- glm(yes/n~factor(treat),weights=n,family=binomial)


> summary(logit.sat)
Call: glm(formula = yes/n ~ factor(treat), family = binomial, weights = n)
Deviance Residuals:
[1] 0 0
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.2574 0.2035 -6.177 6.53e-10 ***
factor(treat)vitC -0.7134 0.3293 -2.166 0.0303 *
Null deviance: 4.8717e+00 on 1 degrees of freedom
Residual deviance: -1.4433e-14 on 0 degrees of freedom AIC: 13.578
> cbind(treat,no,yes,fitted(logit.sat))
treat no yes
1 "placebo" "109" "31" "0.221428571428571"
2 "vitC" "122" "17" "0.122302158273381"
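The matching estimates (−1.2574 and −0.7134 appear in both saturated fits) are no accident. Forming the within-row logit in the loglinear model cancels every term that does not involve the column variable:

```latex
\log\frac{\mu_{i2}}{\mu_{i1}}
  = (\lambda_2^Y - \lambda_1^Y) + (\lambda_{i2}^{XY} - \lambda_{i1}^{XY})
  = \alpha + \beta_i ,
```

so the loglinear interaction parameters become the logit model's treatment effects, and dropping the interaction term corresponds to the intercept-only logit model.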

General Two-Way Table Example


One thousand recent purchasers of American-made cars were randomly sampled and each purchase was classified
according to manufacturer and size of car. Use the results below to test for independence between these two variables
at α = .01 .
                               Y = Size of Car
                       Small          Intermediate   Large
                   A   157 (140.83)   126 (135.04)   58 (65.13)    341
X = Manufacturer   B    65 (79.30)     82 (76.03)    45 (36.67)    192
                   C   181 (158.18)   142 (151.67)   60 (73.15)    383
                   D    10 (34.69)     46 (33.26)    28 (16.04)     84
                       413            396            191          1000

> manfact <- c(rep("A",3),rep("B",3),rep("C",3),rep("D",3))


> size <- c("small","intermed","large")
> size <- c(rep(size,4))
> count <- c(157,126,58,65,82,45,181,142,60,10,46,28)
> cars <- data.frame(manfact,size,count)
> cars.ind <- glm(count~factor(manfact)+factor(size),family=poisson)
> summary(cars.ind)
Call: glm(formula = count ~ factor(manfact) + factor(size), family = poisson)
Deviance Residuals:
Min 1Q Median 3Q Max
-4.9503 -1.0723 -0.0554 1.4464 2.6968
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.90554 0.06677 73.473 < 2e-16 ***
factor(manfact)B -0.57439 0.09023 -6.366 1.94e-10 ***
factor(manfact)C 0.11615 0.07445 1.560 0.119
factor(manfact)D -1.40107 0.12181 -11.502 < 2e-16 ***
factor(size)large -0.72914 0.08810 -8.277 < 2e-16 ***
factor(size)small 0.04203 0.07033 0.598 0.550
Null deviance: 405.21 on 11 degrees of freedom
Residual deviance: 50.61 on 6 degrees of freedom AIC: 134.76
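The test of independence at α = .01 compares the residual deviance G² = 50.61 on 6 df to the chi-squared critical value:

```r
# Likelihood-ratio test of independence: df = (I-1)(J-1) = 3 * 2 = 6
G2 <- 50.61
crit <- qchisq(0.99, df = 6)   # about 16.81
G2 > crit                      # TRUE: reject independence at the .01 level
1 - pchisq(G2, df = 6)         # p-value about 3.5e-09
```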

> cbind(manfact,size,fitted(cars.ind))
manfact size
1 "A" "small" "140.833"
2 "A" "intermed" "135.036"
3 "A" "large" "65.1309999999999"
4 "B" "small" "79.2959999999999"
5 "B" "intermed" "76.032"
6 "B" "large" "36.672"
7 "C" "small" "158.179"
8 "C" "intermed" "151.668"
9 "C" "large" "73.1529999999999"
10 "D" "small" "34.692"
11 "D" "intermed" "33.264"
12 "D" "large" "16.044"

> cars.sat <- glm(count~factor(manfact)+factor(size)+manfact*size,family=poisson)


> summary(cars.sat)
Call: glm(formula = count ~ factor(manfact) + factor(size) + manfact * size, family = poisson)
Coefficients: (5 not defined because of singularities)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.83628 0.08909 54.287 < 2e-16 ***
factor(manfact)B -0.42956 0.14189 -3.028 0.00247 **
factor(manfact)C 0.11955 0.12239 0.977 0.32868
factor(manfact)D -1.00764 0.17227 -5.849 4.94e-09 ***
factor(size)large -0.77584 0.15868 -4.889 1.01e-06 ***
factor(size)small 0.21996 0.11961 1.839 0.06591 .
manfactB NA NA NA NA
manfactC NA NA NA NA
manfactD NA NA NA NA
sizelarge NA NA NA NA
sizesmall NA NA NA NA
manfactB:sizelarge 0.17578 0.24412 0.720 0.47149
manfactC:sizelarge -0.08564 0.22110 -0.387 0.69850
manfactD:sizelarge 0.27940 0.28746 0.972 0.33106
manfactB:sizesmall -0.45230 0.20466 -2.210 0.02711 *
manfactC:sizesmall 0.02271 0.16393 0.139 0.88984
manfactD:sizesmall -1.74602 0.36884 -4.734 2.20e-06 ***
Null deviance: 4.0521e+02 on 11 degrees of freedom
Residual deviance: 3.7303e-14 on 0 degrees of freedom AIC: 96.152
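Each interaction estimate in the saturated model is again a log odds ratio, formed against the reference levels (manufacturer A and size "intermed", the alphabetically first levels). For example, the large manfactD:sizesmall effect can be checked from the raw counts:

```r
# Log odds ratio for the (D, small) cell against the (A, intermed) reference
log((10 * 126) / (157 * 46))   # -1.746, matching manfactD:sizesmall
```

Manufacturer D sold far fewer small cars than independence predicts (10 observed versus 34.69 expected), which is what this large negative interaction reflects.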
