
1. Solution:
a. Gender is a nominal variable with three categories (Male, Female and Unknown), so it
will be brought into the model through dummy variables. With 3 values for Gender,
2 dummy variables are required, which can be set up as follows:
Gender    Dummy 1   Dummy 2
Unknown   0         0
Male      1         0
Female    0         1
The equation will be: Y = Intercept + Coeff1*Dummy1 + Coeff2*Dummy2 + Error
The coefficient of Dummy 1 corresponds to males and the coefficient of Dummy 2
corresponds to females, each measured relative to the Unknown category. Wherever the
gender is unknown, both dummy variables are zero, so the prediction is the intercept alone.
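The coding above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original solution: the function names and the coefficient values in the usage note are hypothetical.

```python
# Dummy coding for Gender with "Unknown" as the reference level.
# Each category maps to (Dummy 1, Dummy 2), as in the table above.
GENDER_CODING = {
    "Unknown": (0, 0),
    "Male": (1, 0),
    "Female": (0, 1),
}

def gender_dummies(gender):
    """Return the (Dummy 1, Dummy 2) pair for a gender label."""
    return GENDER_CODING[gender]

def predict(intercept, coeff1, coeff2, gender):
    """Fitted value Y = Intercept + Coeff1*Dummy1 + Coeff2*Dummy2 (error term omitted)."""
    dummy1, dummy2 = gender_dummies(gender)
    return intercept + coeff1 * dummy1 + coeff2 * dummy2
```

With made-up values such as intercept = 10.0, coeff1 = 2.0 and coeff2 = -1.0, the prediction for "Unknown" is the intercept itself, while "Male" and "Female" shift it by their respective coefficients.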
b. Testing whether the categorical predictor (Gender, in this case), as a whole, is significant
is equivalent to testing whether there is any heterogeneity in the means of the levels of
the predictor. When there are no other predictors in the model, this is a classical ANOVA
problem.
When there are other predictors in the model, you have two options to test for the
significance of a categorical predictor:
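1. The likelihood ratio test: Suppose you have an outcome $Y_i$, quantitative predictors
$X_{i1}, \dots, X_{ip}$ and the categorical predictor $C_i$ with $k$ levels. The model without the
categorical predictor is:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip} + \varepsilon_i$$
Next, you can fit the model with the categorical predictor:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip} + \sum_{j=1}^{k-1} \alpha_j B_j + \varepsilon_i$$
where $B_j$ is a dummy variable which is 1 if $C_i = j$ and 0 otherwise.
The $k$'th level is the reference level, which is why there are only $k-1$ terms in the
sum. Then, under the null hypothesis that $C_i$ has no effect, twice the difference in maximized log-likelihoods between the two models is approximately $\chi^2$-distributed with $k-1$ degrees of freedom, and you can calculate the p-value from it to test for significance.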
2. F-test: If you fit the "full" model (i.e. the model with all of the predictors, including
the categorical predictor), and call this g1, and the model without the categorical
predictor, and call this g0, then the ANOVA(g1,g0) will test this hypothesis for you as
well.
Both approaches require normality of the errors. Also, the likelihood ratio test is
a very general tool used for nested comparisons, while the F-test is more familiar
for comparing linear regression models.
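As a rough sketch of how the two test statistics relate, the snippet below computes both from the residual sums of squares (RSS) of the model without the categorical predictor and the full model. It assumes a linear model with normal errors, for which the likelihood ratio statistic reduces to n*ln(RSS0/RSS1). The function names are hypothetical, and the closed-form p-value applies only to the 2-degrees-of-freedom case (a 3-level predictor such as Gender).

```python
import math

def f_statistic(rss0, rss1, n, p, k):
    """F statistic for dropping a k-level categorical predictor.

    rss0: residual sum of squares of the model without the predictor
    rss1: residual sum of squares of the full model
    n: number of observations, p: number of quantitative predictors
    """
    df_num = k - 1        # dummy variables removed under the null hypothesis
    df_den = n - (p + k)  # full model has 1 + p + (k - 1) parameters
    return ((rss0 - rss1) / df_num) / (rss1 / df_den)

def lrt_statistic(rss0, rss1, n):
    """Likelihood ratio statistic for a linear model with normal errors."""
    return n * math.log(rss0 / rss1)

def chi2_pvalue_2df(x):
    """Upper-tail chi-square probability, valid for 2 degrees of freedom only."""
    return math.exp(-x / 2.0)

# Hypothetical numbers, for illustration only.
f_value = f_statistic(rss0=120.0, rss1=100.0, n=50, p=3, k=3)
lrt_value = lrt_statistic(rss0=120.0, rss1=100.0, n=50)
lrt_pvalue = chi2_pvalue_2df(lrt_value)
```

The F statistic would be compared against an F distribution with (k-1, n-p-k) degrees of freedom; a statistics library is the more practical route for that p-value and for chi-square p-values with other degrees of freedom.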

2. Solution:
a. Since blkdef is an indicator variable (1 if the defendant is black, 0 otherwise), its
coefficient (-0.5308) represents the difference in the predicted log-odds of the death
penalty between black and non-black defendants, if all other factors remain constant.
Equivalently, the odds of the death penalty for a black defendant are exp(-0.5308) = 0.59
times those for a non-black defendant. The negative sign signifies that black defendants
are less likely to get the death penalty on average. However, the p-value is 0.3291, which
is greater than the 5% significance level. So, this coefficient is not statistically
significant, and we cannot conclude that whether the defendant is black or not affects the
outcome.
b. The coefficient of whitevict is 1.5563. The odds ratio is given by exp(B) = exp(1.5563) = 4.74.
Since the odds ratio is greater than 1, whitevict is associated with higher odds of the
outcome, the death penalty. Since the p-value associated with whitevict is 0.0115, which
is less than the alpha level of 0.05, this variable is statistically significant.
The interpretation of the odds ratio is that if the victim is white, the odds (probability of
death penalty / probability of no death penalty) are 4.74 times the odds when the victim is
not white, with all other factors held constant. This is a large effect.
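The exponentiation step can be checked directly. The sketch below uses only the two coefficients quoted in this solution; the function name is hypothetical.

```python
import math

def odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)

# Coefficients quoted in parts (a) and (b).
or_blkdef = odds_ratio(-0.5308)    # below 1: lower odds of the death penalty
or_whitevict = odds_ratio(1.5563)  # above 1: higher odds of the death penalty
```

Rounded to two decimals, these reproduce the odds ratios 0.59 and 4.74 used in this solution.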
c. The table below gives the odds ratios and probabilities of death penalty for the various
variables:

S. No.   Parameter   Odds Ratio   Probability
1        blkdef      0.59         0.37
2        whtvict     4.74         0.83
3        aggcirc     1.45         0.59
4        fevict      1.45         0.59
5        stranger    6.00         0.86
6        multvic     1.22         0.55
7        multstab    4.23         0.81
8        yngvict     1.13         0.53
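
The Probability column is numerically consistent with the conversion p = OR / (1 + OR) applied to each odds ratio. The solution does not state this formula, so it is an inference from the numbers, and it treats each odds ratio as if it were the odds themselves. A minimal sketch with hypothetical names:

```python
# Odds ratios as listed in the table.
ODDS_RATIOS = {
    "blkdef": 0.59,
    "whtvict": 4.74,
    "aggcirc": 1.45,
    "fevict": 1.45,
    "stranger": 6.00,
    "multvic": 1.22,
    "multstab": 4.23,
    "yngvict": 1.13,
}

def odds_to_probability(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)

# Assumption: the table applies this conversion to each odds ratio directly.
probabilities = {
    name: round(odds_to_probability(value), 2)
    for name, value in ODDS_RATIOS.items()
}
```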

So, the required probability = p1*p2*p4*2*p3*p5*p6*p7*p8 = 4.34%


If the defendant was not black, probability = 11.71%
d. No, this model as specified does not permit the analysis of interaction effects, because
it contains only main effects. To include interaction effects, we would need to add
product terms of the kind XY, YZ, ZX in addition to X, Y, Z, for example.
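A product term of the kind XY is simply a new predictor formed by multiplying two existing ones. The sketch below is illustrative only: the helper name is hypothetical, and the blkdef-by-whitevict pairing is an example choice, not an interaction the solution fits.

```python
def add_interaction(row, first, second):
    """Return a copy of row with an added product term first*second."""
    extended = dict(row)
    extended[f"{first}_x_{second}"] = row[first] * row[second]
    return extended

# Hypothetical case: black defendant, white victim.
case = {"blkdef": 1, "whitevict": 1}
case_with_interaction = add_interaction(case, "blkdef", "whitevict")
```

For 0/1 indicators the product equals 1 only when both conditions hold, so its coefficient would capture any effect specific to that combination beyond the two main effects.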
e. One reason to include them would be that the actual model might include higher-order
terms, being a multiple regression model. In some cases, there might also be a strict legal
requirement whereby, although the data might not make these variables seem significant,
having a young victim or multiple victims makes the defendant more likely to get the death
penalty.
