
Abstract

The main idea of this project is to find the correlation between population and property price
in Ireland. The main hypothesis is that property prices in the main cities increase as the
population grows. We analyze data for 5 main cities in Ireland: Cork, Dublin, Galway,
Waterford and Limerick. These cities have been the destination of most immigrants in recent
years.
More facilities and industrial development, better education and universities, more job
opportunities, better communications, etc. in urban areas lead people (mostly young people)
to migrate from rural areas to the main cities. Irish cities also attract immigrants from other
countries because of their better living conditions.
These circumstances have increased the need for residential properties in urban areas in Ireland.
As demand increases, property prices in the main cities grow several times more than in rural
areas. Here we analyze the property price and population data for the main cities and show that
the correlation between property price and population in recent years is strong.
We first work on the datasets, explain and visualize their main properties, and then use a
predictive model in the R language to see how property price and population in the main cities
are related. We chose a linear regression model for this analysis.

Data Description
For this problem we have two main datasets. One covers property prices in different cities in
Ireland and the other covers population in different counties in Ireland. Let's look at them step
by step.

Property Price Dataset


The property price dataset contains prices from 1969 to 2015 for Cork, Dublin, Galway,
Waterford and Limerick. It also has a column for other areas. We considered the data from 1975
to 2015 and filled the NaN value for other areas in 2015 with the 2014 property price for other
areas.
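The fill step described above amounts to a forward fill: a missing value takes the most recent known value. The report's analysis was done in R; the following is only a minimal plain-Python sketch of the idea. The dictionary `other_areas` and its numbers are hypothetical and are not taken from the dataset.

```python
# Hypothetical yearly prices for the "other areas" column.
# None marks a missing value. The numbers are illustrative only.
other_areas = {2012: 200.0, 2013: 205.0, 2014: 212.0, 2015: None}


def forward_fill(series):
    """Replace each missing value with the most recent known value."""
    filled = {}
    last_known = None
    for year in sorted(series):
        value = series[year]
        if value is None:
            value = last_known  # stays None if no earlier value exists
        else:
            last_known = value
        filled[year] = value
    return filled


def restrict_years(series, first, last):
    """Keep only the entries whose year lies in [first, last]."""
    return {year: v for year, v in series.items() if first <= year <= last}


filled = forward_fill(other_areas)
selected = restrict_years(filled, 1975, 2015)
```

With this input, the 2015 entry takes the 2014 value and every other year is left unchanged.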
Figure 1 shows the bar plot of property prices over these years. As the figure shows, the most
expensive accommodation is in Dublin. We can also see a positive slope in property prices
for all cities. Waterford has the lowest prices in all years.
We can also see that prices peak in 2006 and 2007; after these years there is a slight negative
slope before it turns positive again.
Figure 1. Property Prices
Figure 2. Population
Population Dataset
The population dataset covers only 2006 and 2011 and splits Ireland into 29 counties. It contains total
population, population by sex and population by age; in this analysis we used only the total
population. We assigned each of the 29 counties to one of the 6 groups in the property price dataset
(Cork, Dublin, Waterford, Galway, Limerick and other areas).
As we can't do much analysis with only 2 years of information, we filled the unknown values with a
simple strategy. From the 2006 and 2011 figures we calculate the mean yearly population growth, and
with this parameter we estimate the population for all other years. Of course this is not a good
approach, because in reality population growth is not constant, but we had to do this to make the
analysis meaningful.
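The estimation strategy described above can be sketched as a straight line through the two known points. This is a minimal plain-Python illustration, not the report's R code. The function name `estimate_population` and the example populations are hypothetical, and it assumes that "mean yearly growth" means a constant absolute increase per year.

```python
def estimate_population(pop_2006, pop_2011, years):
    """Estimate population for each year from the two known census values.

    Assumes a constant absolute yearly increase, taken as the mean
    yearly growth between 2006 and 2011.
    """
    yearly_growth = (pop_2011 - pop_2006) / (2011 - 2006)
    return {year: pop_2006 + yearly_growth * (year - 2006) for year in years}


# Hypothetical census values for one group, not taken from the dataset.
estimates = estimate_population(100_000, 110_000, range(1975, 2016))
```

Because the yearly increase is the same in every year, the estimated series is a straight line. The two census years are reproduced exactly, and every other year is extrapolated or interpolated.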
After estimating the unknown values, we plot the population in each group over these years. Figure
2 shows the bar plot of the population data. We can see that the slope of the population is constant, which
is not realistic, because real population growth varies from year to year. We can also see that the largest
population is in other areas, which suggests that other areas contain not only rural areas but also some
urban areas. Besides other areas, the largest population is in Dublin and the smallest is in Waterford. As
we have seen before, Waterford has cheaper accommodation and Dublin has more expensive
properties, which is consistent with our main hypothesis.

Hypothesis Testing
After merging the 2 datasets we can set up our hypothesis test. The null and alternative hypotheses can be
defined as follows:

• H0: There is no correlation between property price and population.

• H1: Population and property price are correlated.

To test our hypothesis we used the Pearson correlation test, which measures the linear dependence
between two variables (x and y). It is also known as a parametric correlation test because it depends
on the distribution of the data. The plot of y = f(x) is called the linear regression curve. The Pearson
correlation formula is as follows:
\[ r = \frac{\sum (x - m_x)(y - m_y)}{\sqrt{\sum (x - m_x)^2 \sum (y - m_y)^2}} \]

where m_x and m_y are the means of the x and y variables.


The p-value (significance) of the correlation can be determined:
1. by using the correlation coefficient table for the degrees of freedom df = n - 2, where n is
the number of observations in the x and y variables,
2. or by calculating the t value as follows:

\[ t = \frac{r}{\sqrt{1 - r^2}} \sqrt{n - 2} \]
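The two formulas above can be worked through numerically. The report itself relies on R's cor.test; the following is only a plain-Python sketch of the same arithmetic, using a small made-up sample. The function names `pearson_r` and `t_statistic` and the data in `x` and `y` are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    m_x = sum(x) / n
    m_y = sum(y) / n
    numerator = sum((a - m_x) * (b - m_y) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - m_x) ** 2 for a in x) * sum((b - m_y) ** 2 for b in y)
    )
    return numerator / denominator


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r / math.sqrt(1 - r ** 2) * math.sqrt(n - 2)


# Made-up sample, not taken from the datasets.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
t = t_statistic(r, len(x))
df = len(x) - 2
```

For this sample r is about 0.775 and t is about 2.12 with 3 degrees of freedom. The t value is then compared with a t table, or converted to a p-value, to decide whether to reject H0.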
Now let's look at the x-y plots for each city in Figure 3. These plots show that property
price increases as the population grows.

Figure 3. Population vs. property price plots for the 5 cities

Results
To test our hypothesis, we used the cor.test function in R with the Pearson method. The results for
the different cities were as follows:
In all of the cor.test results for the different cities, the null hypothesis is rejected in favor of the
alternative hypothesis, which confirms that the correlation between population and property price is not 0.

Conclusion
We have tested the hypothesis that population and property price in the cities of Ireland are
correlated. We used two datasets, one for population in 2006 and 2011 and the other for property
price. We could get more realistic results if we had population information for all years from 1975
to 2015. As population figures were missing for most years, we estimated the unknown populations
with a simple strategy. The tests supported the alternative hypothesis, and we observed that in all
cities property price rises as the population grows.
