ECONOMETRICS PROJECT
FACTORS INFLUENCING
HANU STUDENTS’ HOUSE RENT
Tutor: Nguyen Dinh Du
Class: Tutorial 1
Group members:
Nguyen Thi My Hanh – 1604040039
Nguyen Duy Phuong – 1604040089
Nguyen Thi Hoai Nam – 1604040077
Duong Thi Van – 1604040123
TABLE OF CONTENTS
I. Introduction
II. Literature review
III. Methodology
IV. Data analysis and results
  1) Descriptive statistics
  2) Model analysis and results
    a) Testing the drop variable in the model
    b) Testing the functional form of the model
    c) Testing for significance
      i) Individual partial coefficient test
      ii) Testing the overall significance
    d) Error-checking test
      i) Multicollinearity
      ii) Heteroscedasticity
      iii) Autocorrelation
    e) Discussing estimated parameters
    f) Discussing overall fitness of the model
    g) Discussing the practical implications of the model
This first page is dedicated to everyone who contributed to the completion of the
project. Firstly, we would like to express our special thanks to Mr Nguyen Dinh Du,
our tutor and lecturer, for the knowledge he provided in the Econometrics subject as
well as for his dedication and support. His advice and suggestions were of great
importance in helping us deal with the problems we met while doing this project.
Secondly, we are grateful to the scholars whose findings served as references for our
work. Last but not least, we thank our group members and the other friends who gave
us a hand and contributed to our paper.
I. Introduction
Accommodation plays an indispensable role in human life. Many people now live and work far
from home, especially those who move from the countryside to urban areas, and they therefore
need to rent a house in order to settle down.
For students who study far from home, rented housing is essential. However, finding a good
place can take a long time, because misleading advertisements and false information on some
websites can lead them to a poor choice. In addition, while looking for suitable
accommodation they need to consider elements such as the facilities in the room, the maximum
number of people per room and the distance to public places. Numerous factors therefore
affect the rent that students pay. To limit the scope of this study, we focus on students of
Hanoi University (Hanu) and their opinions on the criteria related to housing rent. In this
project, we carried out research to draw conclusions about the main factors that affect Hanu
students’ choice of a house for rent.
As the research outcome, we expect to examine how some factors influence the rental price of
housing for Hanu students. House rent is an indicator of the economic development of an area
because it reflects the economic situation as well as the living standards of that place. It
is commonly supposed that the higher the house rent is, the more conveniences the area
provides. People are willing to pay more for a house with great conveniences, and Hanu
students are no exception.
Largely in line with our expectations, the results show that two of the three factors
included in our linear regression model influence the rental price of housing for Hanu
students.
In this project, we present the process by which we reached our final conclusion about the
main factors that affect Hanu students’ choice of a house for rent.
II. Literature review
The level of economic development is a macro factor that determines the level, scale and
completeness of the real estate market. When the economy grows strongly and land is used for
a variety of purposes, real estate products become more diverse, which leads to the need to
transfer real estate among agents in the economy. Therefore, along with the industrialization
and modernization of the economy, the real estate market becomes vibrant and abundant.
However, economic development faces many challenges, requiring the government to have
appropriate policies and plans to balance supply and demand, ensure a clean and healthy
environment for citizens, meet social housing needs and reduce the risk of market bubbles.
Housing rent is a part of the real estate market, which can in turn significantly impact the
economy. Therefore, we decided to investigate this topic to gain a background understanding
of the factors affecting housing prices.
We chose Hanu students as the subject in order to narrow the scope of this research and
obtain more precise results. The objective of this project is to analyze the factors that
Hanu students consider when renting housing. The analysis uses data that we collected from a
survey of Hanu students. We considered three variables: square (the area of the room), time
to go to school, and the number of people in a room. Firstly, square is the initial factor we
chose because, when considering a house for rent, students choose a larger or smaller
accommodation to meet their needs. Secondly, we take time to go to school into account
because students tend to accept a higher cost for a room or an apartment from which it takes
only a short time to reach their university. Last but not least, we investigated the number
of people in a room because most students want to economize on house rent, so sharing this
cost with others is a common choice among them.
Descriptive statistics and linear regression analysis are used in this project to illustrate
the influence of the factors above on Hanu students’ housing rent, although the model is
simplified for the sake of brevity.
III. Methodology
In this section, we present the procedure for collecting data and the process by which we
arrived at the estimated regression function for the factors affecting Hanu students’ house
rent.
1. Research question
What are the factors that can affect the price of house for rent?
In this part, we consider the effects of 3 variables, namely square, time to go to school,
and the number of people in a room.
Based on the purposes mentioned above, this project uses the tools of Econometrics to check
the connection between square, time to go to school, the number of people in a room and the
rental price of the house. To that end, we constructed a linear regression model with 3
explanatory variables:
PRICE = β1 + β2*TIME + β3*NUMBER + β4*SQUARE + u
Where:
PRICE – Price that Hanu students are willing to pay for rent (VND)
TIME – Time to go to school from their room that Hanu students expect
NUMBER – Number of people in a room
SQUARE – Area of the room (m2)
We suppose all variables in this project are normally distributed. Initially, the data was
gathered and arranged in Microsoft Excel and then imported into Eviews to run the model and
check whether there is a linear relationship between our chosen variables.
Firstly, to choose the best-fit model, R2 (coefficient of determination) and C.V (coefficient
of variation) are compared across 4 main functional forms: lin-lin-lin, lin-lin-log,
log-lin-log and log-lin-lin. The C.V of a model measures the average error of the sample
regression function relative to the mean of the dependent variable. We gave priority to the
highest R2, i.e. the value closest to 1, when deciding on the model, since R2 measures how
much of the variation in the endogenous variable the regression model can explain. Therefore,
the model with the highest R2 (“Goodness of Fit”) is the preferred equation.
Secondly, we used the t-test to see whether each slope coefficient differs from zero, in
other words, to test whether each individual coefficient has a significant impact on our
dependent variable.
Thirdly, the F-test, in which H0: β2 = 0, β3 = 0 (no variable has an effect) and H1: β2 ≠ 0
or β3 ≠ 0 (at least one variable has an effect), is used to determine the overall
significance of the independent variables in our model.
Finally, to check for errors in the model, we ran the 3 error-checking tests below:
multicollinearity using auxiliary regressions, heteroscedasticity using White’s
heteroscedasticity test, and autocorrelation using the Durbin-Watson test.
IV. Data analysis and results
1) Descriptive statistics
First, we import the dataset into Eviews and regress the rental price on the factors
affecting it to obtain the multiple regression function:
PRICE = β1 + β2*TIME + β3*NUMBER + β4*SQUARE + u
Eviews gives the results in the table below:
                     Price      Time       Number     Square
Mean                 622159.1   8.079545   2.681818   16.18233
Median               600000.0   5.000000   3.000000   15.00000
Standard deviation   209385.8   4.819075   1.077945   6.930036
Covariance
Price                4.33E+10   68123.71   170118.8   747675.7
Time                 68123.71   22.95958   0.536674   3.464076
Number               170118.8   0.536674   1.148760   3.483241
Square               747675.7   3.464076   3.483241   47.47965
Correlation
Price                1.000000   0.068289   0.762381   0.521187
Time                 0.068289   1.000000   0.104499   0.104918
Number               0.762381   0.104499   1.000000   0.471645
Square               0.521187   0.104918   0.471645   1.000000
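The correlation block can be cross-checked against the covariance block, since corr(X, Y) = cov(X, Y) / sqrt(var(X) * var(Y)). The following is a minimal Python sketch using the figures printed in the table; the function name `correlation` is only illustrative, and small differences from the table are expected because the variance of Price is printed rounded to 4.33E+10.

```python
import math

# Variances (diagonal of the covariance block) and covariances with Price,
# copied from the descriptive statistics table.
var = {"Price": 4.33e10, "Time": 22.95958, "Number": 1.148760, "Square": 47.47965}
cov_with_price = {"Time": 68123.71, "Number": 170118.8, "Square": 747675.7}


def correlation(cov_xy: float, var_x: float, var_y: float) -> float:
    """Pearson correlation computed from a covariance and the two variances."""
    return cov_xy / math.sqrt(var_x * var_y)


# Correlation of each explanatory variable with Price.
corr_with_price = {
    name: correlation(c, var["Price"], var[name])
    for name, c in cov_with_price.items()
}

for name, r in corr_with_price.items():
    print(f"corr(Price, {name}) = {r:.3f}")
```

Under these assumptions the recomputed values agree with the correlation block to about three decimal places.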
Next, we graph the relationship between PRICE and the other variables using a scatter plot
in order to provide a visual analysis.
[Figure: scatter plot of TIME, NUMBER and SQUARE (vertical axis, 0 to 60) against PRICE (horizontal axis, 200,000 to 1,800,000)]
In the graph, SQUARE appears to have the strongest positive relationship with PRICE,
suggesting that the bigger the square, the higher the price, while TIME has a much weaker
positive influence on PRICE. NUMBER (skewness = 0.824467, standard deviation = 1.077945) is
also positively related to the housing price; note that in the correlation table its
correlation with PRICE (0.762381) is in fact higher than that of SQUARE (0.521187).
2) Model analysis and results
a. Testing the drop variable in the model
Suppose we drop TIME from the model, i.e. we restrict its coefficient β2 to zero. The
hypotheses are:
H0: β2 = 0 (no effect of TIME)
H1: β2 ≠ 0
The result is F-statistic = 0.116 < Fc(0.05, 1, 84) = 3.96
=> We do not reject H0.
=> Therefore, TIME is not a relevant variable and we decide to drop it. In other words, there
is no statistical evidence that TIME affects PRICE, so the housing price is modelled without
the TIME factor.
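The test above compares an unrestricted model (with TIME) against a restricted one (without TIME). A small Python sketch of that calculation follows; it assumes the R-squared values reported in Appendix A (0.615351 and 0.614817), n = 88 and k = 4 parameters in the unrestricted model. The function name `restriction_f` is hypothetical, and the critical value is simply the one quoted in the text rather than being computed.

```python
def restriction_f(r2_ur: float, r2_r: float, m: int, n: int, k: int) -> float:
    """F statistic for m zero restrictions, computed from R-squared values.

    r2_ur, r2_r: R-squared of the unrestricted / restricted model
    m: number of restrictions (coefficients set to zero)
    n, k: observations and parameters in the unrestricted model
    """
    return ((r2_ur - r2_r) / m) / ((1.0 - r2_ur) / (n - k))


# Figures reported in this project for dropping TIME.
f_stat = restriction_f(r2_ur=0.615351, r2_r=0.614817, m=1, n=88, k=4)

F_CRITICAL = 3.96  # Fc(0.05, 1, 84) as quoted in the text, not computed here

reject_h0 = f_stat > F_CRITICAL
print(f"F = {f_stat:.3f}, reject H0: {reject_h0}")
```

Because the statistic is far below the critical value, the sketch reaches the same decision as the text: H0 is not rejected.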
b. Testing the functional form of the model
Next, we need to identify the best functional form by testing the four candidates:
lin-lin-lin, lin-lin-log, log-lin-log and log-lin-lin. We use R-squared and CV (coefficient
of variation) to select the most suitable form: a high R-squared and a low CV are preferred.
From Eviews, we have the following results:
Model          R-squared   CV
Lin-Lin-Lin    0.614817    0.211314
Lin-Lin-Log    0.606098    0.213692
Log-Lin-Log    0.551849    0.016779
Log-Lin-Lin    0.543619    0.016881
Based on the table, we consider the lin-lin-lin and lin-lin-log models because of their high
R-squared. The much lower CV of the two log-dependent models is not directly comparable,
since it is measured relative to the mean of log(PRICE) rather than PRICE. We do not choose
the lin-lin-log model either, because changes in square are not naturally measured in
percentage terms. Therefore, the lin-lin-lin model is the one we select.
=> Final model: $\widehat{PRICE} = 174453.2 + 129046.8 \cdot NUMBER + 6280.046 \cdot SQUARE$
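The CV column in the comparison can be reproduced as the standard error of the regression divided by the mean of the dependent variable. The sketch below assumes the "S.E. of regression" and "Mean dependent var" figures printed in the Eviews outputs of the appendices; the function name is illustrative only.

```python
def coefficient_of_variation(se_regression: float, mean_dependent: float) -> float:
    """CV of a regression: standard error of regression over mean of y."""
    return se_regression / mean_dependent


MEAN_PRICE = 622159.1  # "Mean dependent var" of PRICE in the Eviews output

# "S.E. of regression" for the two models with PRICE as dependent variable.
cv_lin_lin_lin = coefficient_of_variation(131471.4, MEAN_PRICE)
cv_lin_lin_log = coefficient_of_variation(132951.0, MEAN_PRICE)

print(f"CV lin-lin-lin = {cv_lin_lin_lin:.6f}")
print(f"CV lin-lin-log = {cv_lin_lin_log:.6f}")
```

Both values match the table to the printed precision, which also shows that the two linear-dependent models differ only marginally on this criterion.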
c. Testing for significance
First of all, we conduct two t-tests to check whether NUMBER and SQUARE each have an effect
on PRICE.
i. T-test of individual partial coefficients
              β2                          β3
Conclusion    Statistically significant   Statistically significant
=> CONCLUSION: There is enough evidence to conclude that Number and Square have an impact on
the housing price.
ii. Testing the overall significance of all coefficients (F-test)
With the hypotheses:
H0: β2 = β3 = 0
H1: At least one coefficient is not zero (β2 ≠ 0 or β3 ≠ 0)
We have F-statistic = 67.83, which is higher than Fc(5%, 2, 85) = 3.1, so we reject H0.
=> Therefore, the coefficients are jointly statistically significant, and at least one
variable has an effect on the rental housing price.
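As a check, the F-statistic can be reproduced from the R-squared of the final model, assuming R2 = 0.614817, n = 88 observations and k = 3 estimated parameters (intercept, NUMBER and SQUARE):

```latex
F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}
  = \frac{0.614817/2}{(1-0.614817)/85}
  \approx 67.84
```

This agrees with the value printed by Eviews up to rounding.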
d. Error-checking test
i. Multicollinearity
In order to detect multicollinearity, we use the F-test on the auxiliary regression:
NUMBER = β1 + β2*SQUARE + u
H0: Multicollinearity does not exist
H1: Multicollinearity does exist
Value of the test statistic:
$$F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} = \frac{0.2224/(2-1)}{(1-0.2224)/(88-2)} \approx 24.60$$
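With a single regressor, the R-squared of the auxiliary regression equals the squared correlation between NUMBER and SQUARE, so the test statistic can be cross-checked from the correlation table. The sketch below assumes the correlation 0.471645 reported there, n = 88 and k = 2; the function name `auxiliary_f` is hypothetical.

```python
def auxiliary_f(r2: float, n: int, k: int) -> float:
    """F statistic of a regression with k parameters, from its R-squared."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


CORR_NUMBER_SQUARE = 0.471645  # from the correlation table

# In a simple regression, R-squared is the squared correlation coefficient.
r2_aux = CORR_NUMBER_SQUARE ** 2
f_aux = auxiliary_f(r2_aux, n=88, k=2)

print(f"auxiliary R2 = {r2_aux:.4f}, F = {f_aux:.2f}")
```

Under these assumptions the auxiliary R-squared is about 0.2224 and the statistic evaluates to roughly 24.6.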
e. Discussing estimated parameters
- β2 = 129046.8: the NUMBER of people has a strong impact on PRICE. NUMBER and PRICE have a
positive relationship: holding other factors constant, adding one person increases PRICE by
nearly 130000 VND. This result matches our expectation that the more people there are in the
room, the more expensive it is.
- β3 = 6280.046: there is a positive relationship between PRICE and the SQUARE of the house.
Holding other factors constant, if the area increases by 1 m2, PRICE is expected to increase
by about 6280 VND. This is consistent with the result for NUMBER: a larger house (which can
also accommodate more people) is more expensive.
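The interpretation of the two slopes can be illustrated by evaluating the final model directly. The following is a minimal sketch using the estimated coefficients; the example room (3 people, 15 m2, the sample medians) and the function name `predict_price` are chosen only for illustration.

```python
# Estimated coefficients of the final model (PRICE in VND).
INTERCEPT = 174453.2
B_NUMBER = 129046.8   # effect of one additional person
B_SQUARE = 6280.046   # effect of one additional square metre


def predict_price(number: int, square: float) -> float:
    """Fitted PRICE for a room with `number` people and `square` m2."""
    return INTERCEPT + B_NUMBER * number + B_SQUARE * square


base = predict_price(3, 15.0)
one_more_person = predict_price(4, 15.0) - base   # holding SQUARE constant
one_more_m2 = predict_price(3, 16.0) - base       # holding NUMBER constant

print(f"fitted price: {base:,.2f} VND")
print(f"one more person: +{one_more_person:,.1f} VND")
print(f"one more m2: +{one_more_m2:,.3f} VND")
```

Each difference recovers the corresponding slope, which is exactly what "holding other factors constant" means in a linear model.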
f. Discussing overall fitness of the model
R2 (“Goodness of Fit”) measures how much of the variation in the endogenous variable the
regression model can explain; a value closer to 1 implies a better-fitting model. Here
R2 = 0.614817, which means that ≈ 61.5% of the variation in the Price of a rented house is
explained by the variation in Number and Square.
In discussing this number, we have to acknowledge the drawbacks of our chosen model. The
model ignores external factors such as supply and demand (at the beginning of the school
year, both the demand for and the supply of rented houses increase) and the prices of other
goods (under inflation, rising prices elsewhere lead to higher house rent), because these
external factors change constantly over time. Moreover, the Number variable, which is the
most influential factor on Price and has a positive sign, may in fact have the opposite sign.
In our estimation, a house that can contain more people is larger, so its price is higher and
Number has a positive sign. However, for the same square, the more people live in a room, the
lower the average price per person, and in that sense Number has a negative sign.
Nevertheless, given the scale of the data, the value of 61.5% shows a fairly strong
relationship among Price, Number and Square over the 88 observations of Hanu students.
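Because R2 never decreases when a regressor is added, it is often read together with the adjusted R2, which penalises extra parameters. The sketch below recomputes the adjusted value from R2 = 0.614817, n = 88 and k = 3, for comparison with the "Adjusted R-squared" line of the Eviews output in Appendix C; the function name is illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k estimated parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


adj = adjusted_r2(0.614817, n=88, k=3)
print(f"adjusted R2 = {adj:.6f}")
```

The small gap between the two measures suggests that the fit is not inflated by the number of regressors.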
g. Discussing practical implications of the model
Number of people and Square of the house show a strong positive relationship with the Price
of a rented house. In practice, many homeowners rent rooms only to a certain number of
people, so increasing the number of people in the house also increases the cost. Nowadays the
housing price usually includes other fees such as cleaning, parking and internet, so adding
one person to the house raises the total cost. In our survey, Number and Square are
positively related: the larger the room, the more people can live in it and the higher the
room price. Although housing prices are higher for larger rooms in general, the more people
share a room, the lower the price for each individual. Overall, the results of this model
remain valid because we consider housing prices in general.
Finally, this model can be used as a quick way to estimate the Price of house rent in a
particular area (around Hanoi University) based on two basic elements: number and square.
However, for other areas and other subjects, house prices are likely to be affected more by
other factors.
APPENDICES
A. DROPPING “TIME”
Variable Coefficient Std. Error t-Statistic Prob.
OLD MODEL: PRICE = 180955.8 − 1011.911 * TIME + 129366.9 * NUMBER + 6330.392 * SQUARE + u
NEW MODEL: PRICE = 174453.2 + 129046.8 * NUMBER + 6280.046 * SQUARE + u
* H0: β2 = 0 (no effect of TIME)
H1: β2 ≠ 0
* Unrestricted model: RSS_UR = 1.47E+12; R2_UR = 0.615351
Restricted model: RSS_R = 1.47E+12; R2_R = 0.614817
=> F-statistic:
$$F = \frac{(R^2_{UR} - R^2_{R})/m}{(1 - R^2_{UR})/(n-k)} = \frac{(0.615351 - 0.614817)/1}{(1 - 0.615351)/(88-4)} \approx 0.116$$
* Fc(0.05, 1, 88−4) = Fc(0.05, 1, 84) = 3.96
* F-statistic = 0.116 < Fc(0.05, 1, 84) = 3.96
=> We do not reject H0.
=> Therefore, TIME is not a relevant variable and we decide to drop it. In other words, there
is no statistical evidence that TIME affects PRICE, so the housing price is modelled without
the TIME factor.
B. FUNCTIONAL FORMS
Dependent Variable: PRICE
Method: Least Squares
Date: 05/16/19 Time: 23:41
Sample: 1 88
Included observations: 88
R-squared 0.606098 Mean dependent var 622159.1
Adjusted R-squared 0.596830 S.D. dependent var 209385.8
S.E. of regression 132951.0 Akaike info criterion 26.46684
Sum squared resid 1.50E+12 Schwarz criterion 26.55130
Log likelihood -1161.541 Hannan-Quinn criter. 26.50087
F-statistic 65.39483 Durbin-Watson stat 1.477330
Prob(F-statistic) 0.000000
Date: 05/17/19 Time: 01:38
Sample: 1 88
Included observations: 88
C. HYPOTHESIS TESTING
Adjusted R-squared 0.605753 S.D. dependent var 209385.8
S.E. of regression 131471.4 Akaike info criterion 26.44446
Sum squared resid 1.47E+12 Schwarz criterion 26.52892
Log likelihood -1160.556 Hannan-Quinn criter. 26.47849
F-statistic 67.83702 Durbin-Watson stat 1.472328
Prob(F-statistic) 0.000000
* H0: βi = 0 (i = 2, 3)
H1: βi ≠ 0
* Significance level (α) = 5%
* T-critical (α/2 = 2.5%, n−k) = t(2.5%, 85) = 1.99
* Taking the t-statistics from Eviews, we have:
              β2                          β3
Conclusion    Statistically significant   Statistically significant
=> CONCLUSION: There is enough evidence to conclude that Number and Square have an impact
on the housing price.
2) Testing the overall significance of all coefficients (F-test)
* H0: β2 = β3 = 0
H1: At least one coefficient is not zero (β2 ≠ 0 or β3 ≠ 0)
* F-statistic = 67.83
* Level of significance (α) = 5% = 0.05
* F-critical (α = 5%, k−1, n−k) = Fc(5%, 2, 85) = 3.1
* We have: F-statistic = 67.83 > Fc(5%, 2, 85) = 3.1
=> Reject H0
=> CONCLUSION: The coefficients are jointly statistically significant, and at least one
variable has an effect on the rental housing price.
D. ERROR
1. Multicollinearity
Dependent Variable: NUMBER
Method: Least Squares
Date: 05/12/19 Time: 02:03
Sample: 1 88
Included observations: 88
CONCLUSION: There is enough evidence to conclude that multicollinearity does exist, because
the F-test on the auxiliary regression, which we take as the formal test for this error,
rejects H0.
To correct multicollinearity, several remedial measures are available, such as using a priori
information, combining cross-sectional and time-series data, dropping or adding variables and
re-specifying the regression, or collecting additional data. With our limited model and data,
our group chose the "do nothing" measure, because this error is difficult to correct and our
main purpose is only to determine whether the model has multicollinearity or not.
2. Heteroscedasticity
Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 05/12/19 Time: 02:06
Sample: 1 88
Included observations: 88
S.E. of regression 2.64E+10 Akaike info criterion 50.86610
Sum squared resid 5.93E+22 Schwarz criterion 50.95055
Log likelihood -2235.108 Hannan-Quinn criter. 50.90012
F-statistic 1.215553 Durbin-Watson stat 1.944223
Prob(F-statistic) 0.301648
Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 05/12/19 Time: 02:20
Sample: 1 88
Included observations: 88
Variable         Coefficient   Std. Error   t-Statistic   Prob.
NUMBER^2         -88084538     2.86E+09     -0.030780     0.9755
NUMBER*SQUARE    -3.75E+08     5.71E+08     -0.655984     0.5137
NUMBER           1.02E+10      1.45E+10     0.701961      0.4847
SQUARE^2         65570221      73612778     0.890745      0.3757
SQUARE           -1.34E+09     1.97E+09     -0.682602     0.4968
CONCLUSION: There is enough evidence to state that higher-order autocorrelation exists.