
Case Study: Analysis of the Athens, GA Real Estate Market

The purpose of this analysis is to examine how the selling prices of single-family houses in
Athens, GA vary with the neighborhood in which the house is located, the number of offers on the
house, the number of bathrooms, the number of bedrooms, and the square footage. The hope is
that this analysis will give a better understanding of the effects of different property attributes on
the selling price of a specific house and will generate an appropriate price at which a home should
be sold.

First, scatterplots of the response variable against each of the quantitative explanatory
variables were created to gain a visual understanding of how the selling price behaves with the
number of offers on the house, the square footage of the house, and the number of bedrooms and
bathrooms. The data plotted come from n=128 recent sales in the Athens, GA area. Figure 1
shows a negative linear relationship between selling price and number of offers on the house,
indicating that as the number of offers increases, the selling price decreases. Figure 2 shows a
clear positive linear relationship between selling price and square feet: the selling price
increases as square footage increases. Figures 3 and 4 also show a positive linear relationship
between selling price and both the number of bathrooms and the number of bedrooms, indicating
that the selling price increases as the number of bedrooms and bathrooms increases.
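
The direction of each scatterplot trend can also be summarized numerically with a correlation coefficient. The following is a minimal sketch only: the data are hypothetical toy values, not the 128 Athens sales, and `pearson_r` is an illustrative helper name rather than anything used in the original analysis.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data (assumption): NOT the actual Athens, GA sales.
offers = [1, 2, 3, 4, 5]
price_by_offers = [150000, 141000, 133000, 126000, 118000]
sqft = [1500, 1700, 1900, 2100, 2300]
price_by_sqft = [110000, 121000, 130000, 142000, 151000]

r_offers = pearson_r(offers, price_by_offers)  # negative trend, as in Figure 1
r_sqft = pearson_r(sqft, price_by_sqft)        # positive trend, as in Figure 2
print(r_offers < 0, r_sqft > 0)
```

A negative coefficient corresponds to a downward-sloping scatterplot and a positive one to an upward-sloping scatterplot.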

The analysis continues by fitting a full linear regression model of the response variable on both
the quantitative explanatory variables and the qualitative (indicator) variables. The explanatory
variables were brick, Boulevard, Cobbham, number of bedrooms, number of bathrooms, square
footage, and number of offers, while the response variable was selling price. Brick, Boulevard,
and Cobbham are indicator variables, with Woodlawn serving as the baseline neighborhood. This
full regression produced an equation for predicted selling price (see Equation 1), along with
analysis of variance (ANOVA) and coefficients tables (see Tables 1 & 2). The full model explains
86.86% of the variation in selling price, with a standard error of the estimate (the typical error
in a predicted value) of 10,018.94 units.
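
The two fit statistics quoted above follow from the residuals: R-square is 1 - SSE/SST, and the standard error is the square root of SSE divided by the residual degrees of freedom (n minus the number of predictors minus 1). The sketch below uses hypothetical observed and predicted values, not the Athens data, and `fit_summary` is an illustrative name.

```python
import math

def fit_summary(observed, predicted, n_predictors):
    """Return (R-squared, standard error of the estimate) for a fitted model."""
    n = len(observed)
    mean_y = sum(observed) / n
    # SSE: variation left unexplained by the model.
    sse = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    # SST: total variation of the response around its mean.
    sst = sum((y - mean_y) ** 2 for y in observed)
    r_squared = 1 - sse / sst
    std_error = math.sqrt(sse / (n - n_predictors - 1))
    return r_squared, std_error

# Hypothetical values (assumption), chosen only to keep the arithmetic visible.
observed = [100.0, 120.0, 140.0, 160.0, 180.0]
predicted = [102.0, 118.0, 141.0, 159.0, 180.0]
r2, s = fit_summary(observed, predicted, n_predictors=1)
print(r2, s)
```

For the full model in this report, the same formulas would use n = 128 and 7 predictors, giving 120 residual degrees of freedom.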

After assessing the overall fit of the model, the quantitative explanatory variables need to be
examined individually to determine whether they are important predictors in the model. Based on
the coefficients table, all of the quantitative explanatory variables are important in predicting
the selling price of a single-family home in Athens, GA. A statistically significant relationship
between selling price and each of the quantitative explanatory variables was found at the
α = .05 level, since all of the p-values were less than .0001. The Variance Inflation Factors
(VIFs) for the variables were generated in Smart Reg to determine whether multicollinearity was a
problem in the model. Because all of the variables have VIFs of less than 10, multicollinearity
does not appear to be a problem in this model, and no variable is redundant.
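
For reference, the VIF of a predictor is 1 / (1 - R_j^2), where R_j^2 is the R-square from regressing that predictor on all of the other predictors. The sketch below shows why 10 is a natural cutoff: it corresponds to R_j^2 = 0.90. The R_j^2 inputs are illustrative values, not figures from this report, and `vif` is a hypothetical helper name.

```python
def vif(r_squared_j):
    """Variance Inflation Factor for a predictor, given the R-squared
    obtained by regressing it on the remaining predictors."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("R-squared must be in the interval [0, 1)")
    return 1.0 / (1.0 - r_squared_j)

# Illustrative R_j^2 values (assumption), not taken from the report.
for r2_j in (0.0, 0.5, 0.9):
    print(r2_j, vif(r2_j))
```

A predictor that is unrelated to the others has a VIF of 1, and the VIF grows without bound as R_j^2 approaches 1.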

The analysis continues with a hypothesis test for the neighborhood variable to determine whether
selling prices differ among the neighborhoods after accounting for all of the other property
attributes in the model. The hypotheses are H0: β1 = β2 = 0 versus Ha: at least one of the
coefficients is not equal to 0, where β1 and β2 are the coefficients of Boulevard and Cobbham.
Both the Boulevard and Cobbham variables were removed from the full model, and a reduced model
was fitted. A partial F test was then conducted to compare the full and reduced models. A
statistically significant result was found at the α = .05 level, F(2, 120) = 39.0747, p < .0001.
The reduced model explains 78.31% of the variation in selling price, with a standard error of the
estimate of 12,768.46 units. The full model, which includes the neighborhood variables, is
therefore the better fit: it has a higher R-square and a smaller standard error than the reduced
model without the neighborhood variables. Since the partial F test is statistically significant,
it can be concluded that selling prices do differ among the neighborhoods.
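
The partial F statistic can be reproduced from the two standard errors reported above. The sketch below assumes 7 predictors in the full model and 5 in the reduced model (inferred from the variable list, so the residual degrees of freedom are 120 and 122), and recovers each model's SSE as the squared standard error times its residual degrees of freedom. `partial_f` is an illustrative name.

```python
def partial_f(s_full, df_full, s_reduced, df_reduced):
    """Partial F statistic comparing a reduced model with a full model.

    s_* are standard errors of the estimate; df_* are residual degrees of freedom.
    """
    sse_full = s_full ** 2 * df_full
    sse_reduced = s_reduced ** 2 * df_reduced
    q = df_reduced - df_full  # number of coefficients being tested
    return ((sse_reduced - sse_full) / q) / (sse_full / df_full)

n = 128        # sales in the data set
k_full = 7     # assumed predictor count in the full model
k_reduced = 5  # full model without Boulevard and Cobbham

f_stat = partial_f(10018.94, n - k_full - 1, 12768.46, n - k_reduced - 1)
print(round(f_stat, 2))  # about 39.07
```

The result agrees with the reported value of 39.0747 up to rounding of the inputs.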

Next, a residual plot of the standardized residuals against the predicted values of the response
variable was generated to check whether the relationship between the response and explanatory
variables is linear. The plot shows a random scatter of the residuals, indicating that the
assumption of linearity has been met. All of the standardized residuals fall within 3 standard
deviations of zero, so no observation appears to be an extreme outlier.
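
The 3-standard-deviation check can be sketched as follows. This is a simplified version that divides each residual by the standard error of the estimate and ignores the leverage adjustment that statistical software typically applies. The residuals are hypothetical, and the function names are illustrative.

```python
import math

def standardized_residuals(residuals, n_predictors):
    """Scale residuals by the standard error of the estimate (simplified:
    no per-observation leverage correction)."""
    n = len(residuals)
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - n_predictors - 1))
    return [e / s for e in residuals]

def flag_outliers(residuals, n_predictors, cutoff=3.0):
    """Indices of observations whose standardized residual exceeds the cutoff."""
    scaled = standardized_residuals(residuals, n_predictors)
    return [i for i, r in enumerate(scaled) if abs(r) > cutoff]

# Hypothetical residuals (assumption): 24 small ones and a single large one.
with_outlier = [-0.5] * 24 + [12.0]
well_behaved = [1.0, -1.0] * 10

print(flag_outliers(with_outlier, n_predictors=1))  # index of the large residual
print(flag_outliers(well_behaved, n_predictors=1))  # nothing flagged
```

An empty list corresponds to the situation described in this report, where every residual stays inside the band.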

The analysis continues with an analysis of influence on the full model. Leverage and Cook's D
plots were created to identify the observation with the highest leverage and the two observations
with the greatest influence on the model. In the leverage graph (see Figure 5), the observation
with the highest leverage appears to be observation 117. In the Cook's D graph (see Figure 6),
the two most influential points appear to be observations 68 and 104.
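
To illustrate what the leverage and Cook's D plots measure, the sketch below computes both for a simple one-predictor regression. It is not the seven-predictor model in this report, and the data are hypothetical. Leverage depends only on how far an x value lies from the mean, while Cook's D combines leverage with the size of the residual.

```python
def influence_simple(xs, ys):
    """Leverage and Cook's D for a simple linear regression y = a + b*x."""
    n = len(xs)
    p = 2  # estimated coefficients: intercept and slope
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage: diagonal of the hat matrix for one predictor plus intercept.
    leverage = [1 / n + (x - mean_x) ** 2 / sxx for x in xs]
    # Cook's D: scaled squared residual weighted by leverage.
    cooks_d = [(e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
               for e, h in zip(residuals, leverage)]
    return leverage, cooks_d

# Hypothetical data (assumption): the last point sits far from the others in x
# and off the trend of the first nine points.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 16.2, 17.9, 25.0]

leverage, cooks_d = influence_simple(xs, ys)
most_leverage = max(range(len(xs)), key=leverage.__getitem__)
most_influential = max(range(len(xs)), key=cooks_d.__getitem__)
print(most_leverage, most_influential)
```

In this toy example the same observation has both the highest leverage and the largest Cook's D. In the report the two differ (observation 117 versus observations 68 and 104), which can happen when a high-leverage point lies close to the fitted surface.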

To conclude the analysis, the full model is used to predict the selling prices of two houses. The
first house, located in the Cobbham neighborhood, has had 2 offers so far, is made of brick, is
2000 square feet in size, and has 2 bedrooms and 2 baths; it is predicted to sell for
$133,169.50. The second house, located in the Woodlawn neighborhood but otherwise identical in
profile to the first, is predicted to sell for $153,850.53.
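
Because the two houses differ only in neighborhood, the gap between the two predictions should equal the Cobbham coefficient from Equation 1 (20681.04 in magnitude). The short check below assumes that coefficient enters with a negative sign, which is consistent with Woodlawn having the higher predicted price.

```python
# Predicted prices quoted in the report.
cobbham_price = 133169.50
woodlawn_price = 153850.53

# Cobbham coefficient from Equation 1; the negative sign is an assumption
# inferred from Woodlawn (the baseline) having the higher prediction.
cobbham_coefficient = -20681.04

gap = woodlawn_price - cobbham_price
print(round(gap, 2))                        # difference between the predictions
print(round(gap + cobbham_coefficient, 2))  # close to zero, up to rounding
```

The one-cent discrepancy is what would be expected from rounding the coefficients to two decimal places.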

To sum up, there does appear to be a difference in selling price among the neighborhoods. The
selling price also increases as the square footage, number of bedrooms, and number of bathrooms
increase and as the number of offers decreases. Houses in the Woodlawn neighborhood are predicted
to be more expensive than comparable houses in Boulevard and Cobbham. The hope is that this
analysis provides a reasonable selling price for a home in Athens, GA given its property
attributes.
APPENDIX

Supporting Equations, Tables, and Graphs

Figure 1: Scatterplot for Estimating Selling Price for Number of Offers


Figure 2: Scatterplot for Estimating the Selling Price for Square Feet
Figure 3: Scatterplot for Estimating Selling Price for Number of Bathrooms
Figure 4: Scatterplot for Estimating Selling Price for Number of Bedrooms
Figure 5: Leverage Graph
Figure 6: Cook's D Graph
Table 1 & 2: Regression Analysis: ANOVA and Coefficients

Equation 1: Predicted Selling Price = 22840.54 - 2241.62 Boulevard - 20681.04 Cobbham - 8267.49 Offers + 52.99 SqFt + 17297.35 Brick + 4246.79 Be
