
# Some feedback on assignment 1

Just what is relevant for assignment 2; the rest will be discussed in the question hour for later assignments and the exam.


Interpretation: state significance (and level) and ceteris paribus; in the case of dummy/categorical variables, mention the reference category (and interpret all categories one by one, don't just say "worse sphealth increases expenditures"); don't just mention the coefficient but interpret it.

Teresa Bago d'Uva

Erasmus School of Economics
Department of Applied Economics

The predicted/fitted equation does not include the error term.

Question 3: include the interaction and understand what the interaction means (difference in age effect between males and females).

Conclusion question: don't just repeat the detailed results.
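As an illustration of the two points on the fitted equation and the interaction, here is a sketch in assumed notation: $y$ for the outcome and $\text{female}$ as a 0/1 dummy are placeholders, not the assignment's actual variable names. The fitted equation carries hats and no error term, and the interaction coefficient is the difference in the age effect between females and males.

```latex
\[
\hat{y} = \hat\beta_0 + \hat\beta_1\,\text{age} + \hat\beta_2\,\text{female}
        + \hat\beta_3\,(\text{age}\times\text{female})
\]
\[
\frac{\partial \hat{y}}{\partial\,\text{age}}
  = \hat\beta_1 + \hat\beta_3\,\text{female}
  = \begin{cases}
      \hat\beta_1 & \text{males } (\text{female}=0)\\[2pt]
      \hat\beta_1 + \hat\beta_3 & \text{females } (\text{female}=1)
    \end{cases}
\]
```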

14 September 2016

assignment 2

Do-file: make sure to open the data and the log file correctly (and close the log file).
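A minimal do-file skeleton for this point, as a sketch only: the file names `assignment2.log` and `assignment2_data.dta` are hypothetical placeholders, not the course's actual files.

```stata
* Hypothetical file names; replace them with your own.
capture log close                          // close any log left open by an earlier run
log using "assignment2.log", replace text  // start a fresh plain-text log
use "assignment2_data.dta", clear          // load the data, discarding data in memory

* ... analysis commands go here ...

log close                                  // always close the log at the end
```

Starting with `capture log close` means the do-file can be rerun after a crash without Stata complaining that a log is already open.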

2. Scatterplot

## Difference between black female and black male for two ages

Simply indicate it in the scatterplot (you can even write by hand on the printed graph).
It should be clear that you understand what the effect is.

## Example probit PT use: estimated equation for probability

Probit regression

- Number of obs = 3000
- LR chi2(2) = 54.65
- Prob > chi2 = 0.0000
- Pseudo R2 = 0.0411
- Log likelihood = -636.96328

| ptuse | Coef. | Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban | -.1178419 | .0782359 | -1.51 | 0.132 | -.2711814 | .0354976 |
| age | .013089 | .0018674 | 7.01 | 0.000 | .009429 | .016749 |
| _cons | 1.201824 | .0834919 | 14.39 | 0.000 | 1.038183 | 1.365465 |

$$\widehat{\Pr}(y = 1 \mid \text{age}, \text{urban}) = \Phi(1.201824 - 0.1178419 \cdot \text{urban} + 0.013089 \cdot \text{age})$$

where $\Phi(\cdot)$ is the cumulative distribution function (CDF) of the standard normal distribution.
- How to compute? In the exercise lecture do-file.
- Or with a statistical table of the standard normal distribution.
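One way to evaluate the estimated probit equation for PT use without a statistical table is Stata's `normal()` function, which returns the standard normal CDF. A minimal sketch, using the coefficients from the probit output (_cons 1.201824, urban -.1178419, age .013089); the chosen profiles (age 50, rural versus urban) are only an illustration and not part of the assignment.

```stata
* Predicted probability of PT use for a rural resident (urban = 0) aged 50,
* plugging the estimated probit coefficients into the standard normal CDF.
display normal(1.201824 - 0.1178419*0 + 0.013089*50)

* The same for an urban resident (urban = 1) aged 50.
display normal(1.201824 - 0.1178419*1 + 0.013089*50)
```

These lines need no data in memory, so they can be run as a quick check of a hand calculation.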

Exercise lecture

## 3. Linear regression model vs probit

- Goal is to talk about the effects of variables, based on what you got in questions 2 and 3
- No need for graphs
- Effects of all variables
- Can only compare what is comparable (think of interpretation)
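One possible reading of "compare what is comparable": coefficients of a linear regression on a binary outcome are changes in the probability, so they line up with probit marginal effects rather than with raw probit coefficients. A sketch under assumptions: the PT-use variables (`ptuse`, `urban`, `age`) stand in for the assignment's own variables, the data are already in memory, and `vce(robust)` is my addition, not something the slides prescribe.

```stata
* Linear probability model: coefficients are changes in Pr(ptuse = 1).
* Robust standard errors, since the error of a binary outcome is heteroskedastic.
regress ptuse i.urban age, vce(robust)

* Probit: raw coefficients are on the index scale, not the probability scale.
probit ptuse i.urban age

* Average marginal effects are on the probability scale, so these are the
* numbers to set next to the linear regression coefficients.
margins, dydx(*)
```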

4. Marginal effects

Stata commands:

margins, dydx(*)

Average marginal effects

- Number of obs = 3000
- Model VCE: OIM
- Expression: Pr(ptuse), predict()
- dy/dx w.r.t.: 1.urban age

| | dy/dx | Delta-method Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban: Urban | -.0117787 | .0084613 | -1.39 | 0.164 | -.0283625 | .0048051 |
| age | .001381 | .0002193 | 6.30 | 0.000 | .0009513 | .0018108 |

Note: dy/dx for factor levels is the discrete change from the base level.

On average, living in an urban area decreases the probability of (...) by 1.18 pp, compared to living in a rural area (...), ceteris paribus. Effect insignificant at (...).

## . margins, dydx(*) at(urban=0 age=50)

Conditional marginal effects

- Number of obs = 3000
- Model VCE: OIM
- Expression: Pr(ptuse), predict()
- dy/dx w.r.t.: 1.urban age
- at: urban = 0, age = 50

| | dy/dx | Delta-method Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban: Urban | -.0081151 | .0058582 | -1.39 | 0.166 | -.0195971 | .0033668 |
| age | .0008388 | .0001187 | 7.07 | 0.000 | .0006062 | .0010714 |

Note: dy/dx for factor levels is the discrete change from the base level.

In the case of dummy variables, Stata always calculates:

$$\Pr(\text{ptuse} = 1 \mid \text{urban} = 1, \text{age} = 50) - \Pr(\text{ptuse} = 1 \mid \text{urban} = 0, \text{age} = 50)$$

## . margins, dydx(*) at(urban=1 age=50)

Conditional marginal effects

- Number of obs = 3000
- Model VCE: OIM
- Expression: Pr(ptuse), predict()
- dy/dx w.r.t.: 1.urban age
- at: urban = 1, age = 50

| | dy/dx | Delta-method Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban: Urban | -.0081151 | .0058582 | -1.39 | 0.166 | -.0195971 | .0033668 |
| age | .0010318 | .0001194 | 8.64 | 0.000 | .0007978 | .0012658 |

Note: dy/dx for factor levels is the discrete change from the base level.

For individuals living in an urban area and aged 50, (...) increases the probability (...) by 0.1 pp (...). Effect significant at (...).
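The discrete change that Stata reports for a dummy can be checked against the two predicted probabilities it is built from. A sketch, assuming the data are in memory and the model was fitted with factor-variable notation (the `1.urban` in the margins output suggests `i.urban`, but the exact estimation command is not shown on the slides):

```stata
* Assumed estimation command; only the margins calls appear on the slides.
probit ptuse i.urban age

* Predicted probabilities for rural (urban = 0) and urban (urban = 1)
* residents aged 50. Their difference is the discrete change for urban.
margins, at(urban=(0 1) age=50)

* The same discrete change reported directly, together with the
* derivative of the probability with respect to age at this profile.
margins, dydx(*) at(urban=0 age=50)
```

Because the discrete change already compares urban = 1 with urban = 0, its value does not depend on whether `at()` sets urban to 0 or 1; only the age effect changes between the two conditional outputs.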

## . logit ptuse i.urban i.age_categ

Logistic regression

- Number of obs = 3,000
- LR chi2(4) = 92.80
- Prob > chi2 = 0.0000
- Pseudo R2 = 0.0698
- Log likelihood = -617.88621

| ptuse | Coef. | Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban: Urban | -.1917626 | .1653453 | -1.16 | 0.246 | -.5158335 | .1323083 |
| age_categ: 2 | -.4105884 | .180612 | -2.27 | 0.023 | -.7645815 | -.0565953 |
| age_categ: 3 | 1.02875 | .2284758 | 4.50 | 0.000 | .5809455 | 1.476554 |
| age_categ: 4 | 2.114608 | .4316005 | 4.90 | 0.000 | 1.268687 | 2.96053 |
| _cons | 2.559948 | .1677366 | 15.26 | 0.000 | 2.23119 | 2.888706 |

## . logit ptuse urban age20_39 age40_59 age60_plus

Logistic regression

- Number of obs = 3,000
- LR chi2(4) = 92.80
- Prob > chi2 = 0.0000
- Pseudo R2 = 0.0698
- Log likelihood = -617.88621

| ptuse | Coef. | Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban | -.1917626 | .1653453 | -1.16 | 0.246 | -.5158335 | .1323083 |
| age20_39 | -.4105884 | .180612 | -2.27 | 0.023 | -.7645815 | -.0565953 |
| age40_59 | 1.02875 | .2284758 | 4.50 | 0.000 | .5809455 | 1.476554 |
| age60_plus | 2.114608 | .4316005 | 4.90 | 0.000 | 1.268687 | 2.96053 |
| _cons | 2.559948 | .1677366 | 15.26 | 0.000 | 2.23119 | 2.888706 |

Model output (i.e., estimated coefficients) is exactly the same, if the reference category is the same.

Marginal effects are different: those obtained after the model with i.age_categ are correct; those obtained after the model with separate dummies are wrong.

## Different base category?

. logit ptuse i.urban ib2.age_categ

Logistic regression

- Number of obs = 3,000
- LR chi2(4) = 92.80
- Prob > chi2 = 0.0000
- Pseudo R2 = 0.0698
- Log likelihood = -617.88621

| ptuse | Coef. | Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban: Urban | -.1917626 | .1653453 | -1.16 | 0.246 | -.5158335 | .1323083 |
| age_categ: 1 | .4105884 | .180612 | 2.27 | 0.023 | .0565953 | .7645815 |
| age_categ: 3 | 1.439338 | .2217098 | 6.49 | 0.000 | 1.004795 | 1.873882 |
| age_categ: 4 | 2.525197 | .4279787 | 5.90 | 0.000 | 1.686374 | 3.36402 |
| _cons | 2.14936 | .164105 | 13.10 | 0.000 | 1.82772 | 2.471 |

. margins, dydx(*)

Average marginal effects

- Number of obs = 3,000
- Model VCE: OIM
- Expression: Pr(ptuse), predict()
- dy/dx w.r.t.: 1.urban 1.age_categ 3.age_categ 4.age_categ

| | dy/dx | Delta-method Std. Err. | z | P>\|z\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| urban: Urban | -.0099922 | .0084661 | -1.18 | 0.238 | -.0265855 | .0066012 |
| age_categ: 1 | .0358715 | .0158533 | 2.26 | 0.024 | .0047996 | .0669434 |
| age_categ: 3 | .0857047 | .0135678 | 6.32 | 0.000 | .0591122 | .1122971 |
| age_categ: 4 | .1054683 | .0131294 | 8.03 | 0.000 | .0797352 | .1312014 |

Note: dy/dx for factor levels is the discrete change from the base level.

Implications for interpretation? Interpretation is always relative to the reference category.

Can choose the base category:
- With separate dummies, change the one left out
- With i.categ, for example: ib2.age_categ
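The contrast between factor-variable notation and separate dummies can be sketched as follows, assuming the data with `ptuse`, `urban`, `age_categ` and the dummies `age20_39`, `age40_59`, `age60_plus` are in memory. The explanation in the comments is my own gloss on why the slides call the second set of marginal effects wrong: without the `i.` prefix, `margins` has no way of knowing that the dummies are 0/1 indicators of one categorical variable.

```stata
* Factor-variable notation: margins reports discrete changes relative to
* the base category, moving an observation from the base to each level.
logit ptuse i.urban i.age_categ
margins, dydx(*)

* Separate dummies: same coefficients, but margins treats each dummy as a
* continuous regressor and holds the other dummies at their observed values.
logit ptuse urban age20_39 age40_59 age60_plus
margins, dydx(*)

* Changing the reference category to level 2 with the ib#. prefix.
logit ptuse i.urban ib2.age_categ
margins, dydx(*)
```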

5. Question 6

Probit models
Same x's as question 5, except that age will enter the model differently:
- One in categories (up to you which and how many categories)
- Other one not in categories but also not (just) age

In both cases, you need to create new variables.

If different goodness-of-fit measures give contradictory results: it is not black or white. If one measure can be considered dominant (think of measures which relate to each other), you can also mention this.
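Creating the new age variables for Question 6 might look like the sketch below. Everything specific here is an assumption for illustration: the cut-offs 20/40/60 (borrowed from the dummy names in the logit example), the quadratic as the "not just age" specification, `urban` standing in for the x's of question 5, and `estat ic` as one of several possible goodness-of-fit summaries.

```stata
* Assumed cut-offs 20/40/60; which and how many categories is up to you.
generate age_categ = 1 + (age >= 20) + (age >= 40) + (age >= 60) if !missing(age)

* Assumed "not just age" specification: a quadratic in age.
generate age_sq = age^2

* Model with age in categories, followed by AIC and BIC.
probit ptuse i.urban i.age_categ
estat ic

* Model with age and age squared, followed by AIC and BIC.
probit ptuse i.urban age age_sq
estat ic
```

The `if !missing(age)` guard matters because Stata treats missing values as larger than any number, so without it observations with missing age would end up in the oldest category.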

6. Question 8

Cannot see from the data how long a person has smoked; that is why you need to assume something: they all started at the same time (it does not matter when)
=> what do you know about the duration of smoking?

Include other explanatory variables in Q8a? Up to you.

8b: Smoking explains results for educated above? In principle, this refers to the results of Question 5, but it can also be the results of Question 8a if different from Q5.
Need to do Stata analysis.

7. Other question(s)
for each explanatory variable (question 5). Show Stata output with
marginal effects for all the variables included in the model. Interpret just
for the two variables mentioned.

Good luck!
