Submitted by:
Sharvari Parikh
PRN-17060242036
Assignment
The dataset contains the incomes of 16 persons, along with information on their age, gender
and political party affiliation. You are supposed to build an econometric model that explains a
person's income with the help of the remaining variables. For this purpose, you are required to
answer the following questions:
Answer:
In regression, a dummy variable is a numerical variable used to represent subgroups of the
sample in a study. It is often used to distinguish different treatment groups. Each dummy
variable takes the value 0 or 1.
R code:
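# Read the dataset from a CSV file chosen interactively
sharvari=read.csv(file.choose())
sharvari
install.packages("dplyr")
library(dplyr)
# Make the columns (Gender, Party, ...) accessible by name
attach(sharvari)
# Female.Dummy is 1 for female respondents and 0 otherwise
Female.Dummy=ifelse(Gender=="Female",1,0)
Female.Dummy
# Append the dummy to the data
sharvari1=data.frame(sharvari,Female.Dummy)
sharvari1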
Answer:
R code :
Dem.dummy=ifelse(Party=="Dem",1,0)
Dem.dummy
Ind.dummy=ifelse(Party=="Ind",1,0)
Ind.dummy
sharvari2=data.frame(sharvari1,Ind.dummy,Dem.dummy)
sharvari2
Here Ind.dummy is the dummy variable for Independents and Dem.dummy is the dummy variable for
Democrats. Republicans take the value 0 on both dummies and serve as the base category.
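The same coding can also be obtained by letting R build the dummies from a factor. The following is a minimal sketch under assumptions: the data frame `toy` is made up for illustration and is not the assignment's dataset, and the label "Rep" for Republicans is assumed (only "Dem" and "Ind" appear in the code above).

```r
# Hypothetical toy data for illustration only, NOT the assignment's dataset
toy <- data.frame(
  Income = c(52000, 48000, 61000, 39000, 45000, 58000),
  Age    = c(34, 29, 51, 23, 40, 47),
  Gender = c("Male", "Female", "Male", "Female", "Female", "Male"),
  Party  = c("Rep", "Dem", "Rep", "Ind", "Dem", "Ind")  # "Rep" is an assumed label
)

# Choose the base (omitted) category of each factor explicitly
toy$Party  <- relevel(factor(toy$Party),  ref = "Rep")
toy$Gender <- relevel(factor(toy$Gender), ref = "Male")

# model.matrix() shows the 0/1 columns that lm() would construct
X <- model.matrix(~ Age + Gender + Party, data = toy)
colnames(X)
```

With this setup, `lm(Income ~ Age + Gender + Party, data = toy)` would estimate the same kind of model as the hand-made dummies, with Republican and male as the omitted categories.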
3) Set up a multiple regression model to explain the income with the help of age, gender
and their political affiliation. Interpret the regression coefficients.
Answer:
R code :
attach(sharvari2)
y=Income
x=Age
x1=Female.Dummy
x2=Dem.dummy
x3=Ind.dummy
sh=lm(y~x+x1+x2+x3)
summary(sh)
Multiple regression model to explain income on the basis of age, gender and political affiliation:
\[ y = \alpha + \beta_0 x_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \mu \]
where
x_0 = age (x in the R code),
x_1 = Female.Dummy, i.e. the dummy variable for female,
x_2 = Dem.dummy, i.e. the dummy variable for Democrat,
x_3 = Ind.dummy, i.e. the dummy variable for Independent,
\mu = error term.
Call:
lm(formula = y ~ x + x1 + x2 + x3)
Residuals:
Coefficients:
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Age has a positive effect on income, since its coefficient β0 is positive. Holding the other
variables constant, the mean income of Democrats is lower than that of Republicans by 15594.5
(Republican is the benchmark category), and the mean income of Independents is lower than that
of Republicans by 10453.1. The mean income of females is lower than that of males by 13677.5.
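As a worked illustration of how the dummy coefficients combine, consider a female Democrat and a male Republican of the same age. The age term cancels, so the difference in expected income is the sum of the two dummy coefficients. The signs below are inferred from the interpretation above (each group earns less than its benchmark).

```latex
\[
\begin{aligned}
E[y \mid \text{female, Democrat}, x_0] - E[y \mid \text{male, Republican}, x_0]
  &= (\alpha + \beta_0 x_0 + \beta_1 + \beta_2) - (\alpha + \beta_0 x_0) \\
  &= \beta_1 + \beta_2 \\
  &= -13677.5 - 15594.5 = -29272.0
\end{aligned}
\]
```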
Answer:
install.packages("lmtest")
install.packages("car")
library(lmtest)
library(car)
vif(sh)
bptest(sh)
To check for multicollinearity, we use the VIF, i.e. the variance inflation factor. A VIF greater
than 4 calls for further investigation, and a VIF greater than 10 indicates high
multicollinearity.
x x1 x2 x3
1.140509 1.035011 1.529495 1.384677
Here all values are below 4, so multicollinearity is not a concern.
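To make the VIF less of a black box, it can be computed by hand as 1 / (1 - R²_j), where R²_j comes from regressing regressor j on all the other regressors. The sketch below uses base R only. The data frame `toy` and the helper `manual_vif` are hypothetical names introduced for illustration, and the numbers are made up rather than taken from the assignment's dataset.

```r
# Hypothetical regressors for 16 persons, NOT the assignment's dataset
toy <- data.frame(
  Age    = c(25, 32, 47, 51, 38, 29, 44, 60, 23, 35, 41, 56, 30, 48, 27, 53),
  Female = c(0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0),
  Dem    = c(1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1),
  Ind    = c(0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0)
)

# manual_vif is a hypothetical helper: for each column j, regress it on
# the remaining columns and return 1 / (1 - R^2_j)
manual_vif <- function(X) {
  sapply(colnames(X), function(j) {
    others <- setdiff(colnames(X), j)
    fit <- lm(reformulate(others, response = j), data = X)
    1 / (1 - summary(fit)$r.squared)
  })
}

vifs <- manual_vif(toy)
vifs
```

For a model whose regressors are all single columns, as here, these values should agree with what `vif()` from the car package reports for the corresponding `lm` fit.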
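To test for heteroscedasticity, we use the BP test, i.e. the Breusch-Pagan test. The null
hypothesis of this test is that the errors are homoscedastic.
data: sh
BP = 3.2371, df = 4, p-value = 0.519
Since the p-value of 0.519 is greater than 0.05, we fail to reject the null hypothesis. There is
no evidence of heteroscedasticity at the 5% level.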
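The logic of the test can be sketched by hand: regress the squared residuals on the regressors and use n times the R² of that auxiliary regression as a chi-squared statistic. This is the studentized (Koenker) form, which is the default of `bptest()` in lmtest. The example below is a minimal base-R sketch on simulated data; the data frame `toy` and its columns `x`, `d`, `y` and `u2` are assumptions made for illustration, not the assignment's dataset.

```r
# Simulated toy data, NOT the assignment's dataset
set.seed(42)
n <- 16
toy <- data.frame(
  x = runif(n, 20, 60),          # a continuous regressor
  d = rep(c(0, 1), each = 8)     # a 0/1 dummy
)
toy$y <- 1000 + 500 * toy$x - 3000 * toy$d + rnorm(n, sd = 2000)

# Step 1: fit the main regression and store the squared residuals
fit <- lm(y ~ x + d, data = toy)
toy$u2 <- resid(fit)^2

# Step 2: auxiliary regression of squared residuals on the regressors
aux <- lm(u2 ~ x + d, data = toy)

# Step 3: BP statistic = n * R^2, chi-squared with one df per regressor
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(BP = bp_stat, df = bp_df, p.value = bp_p)
```

A large p-value means the regressors do not explain the squared residuals, which is consistent with homoscedastic errors.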
Answer: