In answering the questions below, paste the Stata output only when it is asked for. When
pasting output, use the copy as picture option. When testing a hypothesis, be sure
to mention the distribution of the test statistic, its degrees of freedom, the level of
significance and the associated critical value. DO NOT USE THE STATA test
COMMAND.
It would be easiest if you inserted your answer between the questions below and
returned this document. Rename the document as "your name.docx" and upload it
on LMS.
You have to do this exam by yourself. You are allowed to consult the textbook and
your notes. You are NOT allowed to consult anybody whether by speaking, by text
messages or email or any other means. Violations will attract penalties as per
Ashoka policy.
1. (a) Regress log of wages on a constant and the female dummy. Paste output
here.
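The coefficient on the female dummy is -0.69. Since the dependent variable is the
log of wages and the regression has no other controls, this is the difference in
average log daily wages between females and males. A gap of -0.69 log points is
too large for the usual "coefficient times 100" approximation. The exact
proportional difference is exp(-0.69) - 1, which is about -0.50. So basically, a
woman earns roughly 50% less than a man on a daily basis, not 69% less.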
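The conversion from log points to a percentage difference can be checked with a few lines of arithmetic. This is only a sketch in Python (not Stata); the variable names are illustrative, and the only input is the coefficient of -0.69 reported above.

```python
import math

# Coefficient on the female dummy, as reported in the answer above.
beta_female = -0.69

# Exact proportional wage difference implied by a gap measured in log points.
exact_gap = math.exp(beta_female) - 1

# The "multiply by 100" shortcut, which is accurate only for small coefficients.
approx_gap = beta_female

print(f"exact: {exact_gap:.1%}, shortcut: {approx_gap:.0%}")
```

The two figures differ by nearly twenty percentage points, which is why the shortcut should not be used for a coefficient this large.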
(c) Test the null hypothesis that the coefficient on female dummy is -0.5 against
the alternative that the coefficient on female dummy is less than -0.5. Show your
workings. [5+5+10]
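From the regression we see that the coefficient on the female dummy is -0.69. The
test statistic is t = (-0.69 - (-0.5)) / se, where se is the standard error of the
female coefficient in the output for (a). Under the null it follows a t
distribution with n - 2 degrees of freedom. The alternative is one-sided, so at
the 5% level the rejection region is t < -1.645 (the large-sample critical value).
As the test statistic lies in the rejection region, we reject the null.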
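The mechanics of this one-sided test can be sketched as follows. The function name is hypothetical, the critical value uses the standard normal as a large-sample stand-in for the t distribution, and the standard error of 0.05 is a placeholder assumption, because the actual standard error appears only in the pasted Stata output.

```python
from statistics import NormalDist


def left_tailed_t_test(estimate, null_value, std_error, alpha=0.05):
    """Return (t statistic, critical value, reject?) for H1: coefficient < null_value."""
    t_stat = (estimate - null_value) / std_error
    # Large-sample critical value: with large degrees of freedom the
    # t distribution is very close to the standard normal.
    critical = NormalDist().inv_cdf(alpha)
    return t_stat, critical, t_stat < critical


# ASSUMPTION: 0.05 is a placeholder standard error, not a value from the output.
t_stat, critical, reject = left_tailed_t_test(-0.69, -0.5, std_error=0.05)
print(round(t_stat, 2), round(critical, 3), reject)
```

Replacing the placeholder with the standard error from the regression in (a) gives the statistic to compare against -1.645.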
2. (a) Regress log of wages on a constant, the female dummy, age of the
individual and the square of age. Paste your output here.
(b) Controlling for age and the square of age does not seem to substantially
change the coefficient of the female dummy. Why is that so? [5+5]
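The coefficient of the female dummy is still -0.69, which means that the
coefficient of the female dummy has not changed substantially. Adding a control
changes the coefficient on an included variable only if the control is correlated
with that variable. The unchanged coefficient suggests that age is essentially
unrelated to the gender of the wage earner in this sample. Age may affect wages
through factors like efficiency, health, etc., but because it is not correlated
with the female dummy, adding age and the square of age leaves the female
coefficient almost unchanged.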
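This reasoning corresponds to the standard omitted-variable relationship for OLS. In the sketch below, the tilde on the left marks the female coefficient from the regression in question 1 (without age), hats mark coefficients from the regression in 2(a), and each delta is the slope on the female dummy from an auxiliary regression of age (or age squared) on a constant and the female dummy. This notation is introduced here for illustration and does not appear in the exam.

```latex
\[
\tilde{\beta}_{\text{female}}
  = \hat{\beta}_{\text{female}}
  + \hat{\beta}_{\text{age}}\,\tilde{\delta}_{\text{age}}
  + \hat{\beta}_{\text{age2}}\,\tilde{\delta}_{\text{age2}}
\]
```

If both deltas are close to zero, meaning that average age and average age squared are nearly the same for men and women, the two female coefficients nearly coincide even when age itself matters for wages.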
3. (a) Regress log of wages on a constant, the female dummy, age of the
individual, the square of age and the social group dummies for scheduled caste,
for scheduled tribe and for other backward caste. Note the omitted category is
the general castes (or forward castes). Paste your output here.
. regress lwage female age age2 scd std obc
(b) Test the null hypothesis that none of the social group dummies matter, i.e.,
controlling for sex, age and square of age, the average of log wages is the same
for all categories: scheduled castes, scheduled tribes, other backward castes and
the general (forward) castes. Do NOT use the Stata test command.
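Df = 993 (unrestricted regression in (a)); number of restrictions = 3
H0: scd = std = obc = 0
H1: at least one of scd, std, obc is not equal to 0
The test statistic is F = [(RSS_r - RSS_u)/3] / [RSS_u/993], where RSS_u is the
residual sum of squares from the regression in (a) and RSS_r is that from the
regression in 2(a), which omits the three dummies. Under H0 it follows an F
distribution with (3, 993) degrees of freedom.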
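A minimal sketch of the F computation from the residual sums of squares of the restricted and unrestricted regressions is given below. The function name is hypothetical, and both RSS values are placeholder assumptions, since the Stata output is not reproduced here. The 5% critical value of roughly 2.61 for F(3, 993) is an approximate table value.

```python
def f_statistic(rss_restricted, rss_unrestricted, num_restrictions, df_unrestricted):
    """F statistic comparing a restricted regression with an unrestricted one."""
    numerator = (rss_restricted - rss_unrestricted) / num_restrictions
    denominator = rss_unrestricted / df_unrestricted
    return numerator / denominator


# ASSUMPTION: both RSS values are placeholders, not numbers from the Stata output.
rss_r, rss_u = 520.0, 500.0
f_stat = f_statistic(rss_r, rss_u, num_restrictions=3, df_unrestricted=993)

# Approximate 5% critical value of F(3, 993) from standard tables.
critical_5pct = 2.61
print(round(f_stat, 2), f_stat > critical_5pct)
```

With the actual RSS values from the regressions in 2(a) and 3(a), the null is rejected at the 5% level when the statistic exceeds the critical value.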
(c) Test the null hypothesis that relative to the general (forward) castes,
scheduled castes and other backward castes suffer the same extent of
discrimination. If this requires new regressions, paste the output in your
answer. [5+15+15]
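Df = 993 (unrestricted regression in 3(a)); number of restrictions = 1
H0: scd = obc
H1: scd is not equal to obc
We need to run a restricted regression that imposes H0 and then compare its
residual sum of squares with that of the unrestricted regression.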
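One way to impose the restriction is to replace scd and obc with their sum, so that both groups share a single coefficient. The sketch below assumes this approach; gamma and the numbering of the other coefficients are notation introduced here, not taken from the exam.

```latex
\[
\begin{aligned}
\text{Restricted model:}\quad
  & \text{lwage} = \beta_0 + \beta_1\,\text{female} + \beta_2\,\text{age}
    + \beta_3\,\text{age2} + \gamma\,(\text{scd} + \text{obc})
    + \beta_4\,\text{std} + u \\
\text{Test statistic:}\quad
  & F = \frac{(RSS_r - RSS_u)/1}{RSS_u/993} \sim F(1,\,993)
    \quad \text{under } H_0
\end{aligned}
\]
```

With a single restriction, the 5% critical value is roughly 3.85, the square of the two-sided t critical value of about 1.96.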
4. (a) Regress log of wages on a constant, the female dummy, age of the
individual, the square of age, the social group dummies for scheduled caste, for
scheduled tribe and for other backward caste, and the education dummies for
illiterate, literate, primary, secondary, and higher secondary. Paste the output
here.
. regress lwage female age age2 scd std obc illiterate literate primary secondary higher_secondary
(b) Compare the above regression with the regression in question 3 (without the
education dummies). Does the inclusion of education dummies alter the
discrimination against women, scheduled castes, scheduled tribes and other
backward castes? Why? [5+15]
5. (a) To the explanatory variables in the regression in Qn 4(a), add land owned
(LandO) and land possessed (LandP) and re-run the regression. DO NOT paste
the output.
(b) Is either of the land variables individually significant at the 5 or 10% level?
For each land variable, the null hypothesis is that its coefficient is zero,
against a two-sided alternative. With the large degrees of freedom in this
regression, the t distribution is approximated by the standard normal.
At the 5% level, the critical region is |t| > 1.96. Here, the variable
landP has a t-stat of 0.97. Since this value does not lie in the critical region,
landP is not significant at the 5% level. At the 10% level, the critical region is
|t| > 1.645. Again, as the t-stat is 0.97, it does not lie in the critical region.
So, we fail to reject the null that the coefficient on landP is zero at both
levels.
For landO, we have a t-stat of -0.22. This does not lie in either of the critical
regions mentioned above (at the 5% and 10% levels).
So, we can conclude that neither variable is individually significant at the 5 or
10% level.
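The comparisons above can be reproduced with a short check. This is a sketch in Python rather than Stata. The helper name is hypothetical, the t-stats are the ones quoted in the answer, and the critical values come from the standard normal approximation.

```python
from statistics import NormalDist


def is_significant(t_stat, level):
    """Two-sided test of a zero coefficient using the normal approximation."""
    critical = NormalDist().inv_cdf(1 - level / 2)
    return abs(t_stat) > critical


# t statistics reported in the answer above.
t_stats = {"landP": 0.97, "landO": -0.22}

for name, t in t_stats.items():
    for level in (0.05, 0.10):
        verdict = "significant" if is_significant(t, level) else "not significant"
        print(f"{name} at the {level:.0%} level: {verdict}")
```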
(c) Now drop land owned (LandO) and re-run the regression. Is the included
land variable significant at the 5 or 10% level?
At the 5% level, the critical region is |t| > 1.96. Here, the variable
landP has a t-stat of 1.91. Since this value does not lie in the critical region,
landP is not significant at the 5% level. At the 10% level, the critical region is
|t| > 1.645. As the t-stat is 1.91, it lies in the critical region, so landP is
significant at the 10% level.
In (b) we observe that when both land owned and land possessed are included,
neither variable is individually significant. This is because of
multicollinearity: the two variables are closely related, so the regression
cannot separate their individual effects and their standard errors are inflated.
When landO is removed in part (c), landP no longer shares its explanatory power
with landO, and its t-stat rises from 0.97 to 1.91, so it becomes significant at
the 10% level.
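The effect of multicollinearity on precision can be seen in the standard expression for the sampling variance of an OLS slope under homoskedasticity. The symbols below are introduced here for illustration: SST_j is the total sample variation in regressor j, and R_j^2 is the R-squared from regressing regressor j on all the other regressors.

```latex
\[
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j\,\bigl(1 - R_j^2\bigr)}
\]
```

When landO and landP are both included and closely related, R_j^2 for each is near one, so the denominator is small and the variance is large. Dropping landO lowers R_j^2 for landP, which is consistent with the larger t-stat in part (c).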