
SMMD Assignment-1 | Name: Shubham Chauhan | PGID: 61910037

1. Summary report for House Prices:

Figure panels: 1. Price Statistics; 2. Price Histogram with Normal Plot; 3. Price Versus Living Area

Observations/Comments –
• The mean house price is $163,862 and the median price is $151,917.
• The price distribution is positively (right) skewed, consistent with the mean exceeding the median.
• The standard deviation of prices is $67,652.
• Price tends to increase with the living area available in the house. This can be clearly
observed in Graph 3 (apart from a few exceptions).

2. Although the plot is not perfectly normal, it still gives good statistical insight when compared against a
normal model. The outliers visible in the plots below correspond to very high prices, and the departure of
the points from the straight line indicates a deviation from normal behaviour. The deviation is nonetheless
small enough to be acceptable for applying statistical analysis.

Graphical Insight

• As can be seen from the graph, there is a deviation from normality, since the normal quantile plot
does not follow a straight line.
• The outliers are clearly visible in the box plot. These values lie more than 1.5 × IQR beyond the
quartiles.
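
The 1.5 × IQR rule can be checked by hand. The sketch below is an illustration only: it assumes the quartiles, minimum and maximum reported in the JMP quantile table of part 3, and the variable names are hypothetical.

```python
# Quartiles and extremes as reported in the JMP quantile table for Price.
q1, q3 = 111_875, 205_397
minimum, maximum = 16_858, 446_436

# Interquartile range and the usual box-plot fences.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR = {iqr}")
print(f"fences = ({lower_fence:.0f}, {upper_fence:.0f})")
# The maximum lies above the upper fence, so there are high-price outliers.
print("high outliers present:", maximum > upper_fence)
# The minimum lies above the lower fence, so there are no low-price outliers.
print("low outliers present:", minimum < lower_fence)
```

Under these values only the upper fence is exceeded, which matches the observation that the outliers are very high prices.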
3. Assuming a normal model for Price ~ N(164K, (68K)²)
A. Probabilities for two scenarios, P(Price > 92,800) and P(Price < 255,000), have been calculated and
are reported below.
• The probabilities below were calculated using the Excel function NORM.DIST.

              Scenario 1                  Scenario 2
Mean          164,000                     164,000
SD            68,000                      68,000
xi            92,800                      255,000
Probability   P(Price>92,800) = 85.25%    P(Price<255,000) = 90.96%
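
The same normal-model probabilities can be reproduced outside Excel. The sketch below is an illustration, not the method used in the assignment: it computes the normal CDF from the error function, and the helper name `normal_cdf` is hypothetical.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ N(mean, sd^2), computed via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

# Parameters of the assumed normal model for Price.
MEAN, SD = 164_000, 68_000

# Upper-tail probability is the complement of the CDF.
p_above_92800 = 1.0 - normal_cdf(92_800, MEAN, SD)
p_below_255000 = normal_cdf(255_000, MEAN, SD)

print(f"P(Price > 92,800)  = {p_above_92800:.2%}")
print(f"P(Price < 255,000) = {p_below_255000:.2%}")
```

In Excel the cumulative form corresponds to NORM.DIST(x, mean, sd, TRUE), with the upper tail obtained as 1 minus that value.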

• The table below was generated using the JMP software.

Quantiles of Price ($)

Quantile   Label      Price
100.00%    maximum    446436
99.50%                385738
97.50%                329453
90.00%                255526
75.00%     quartile   205397
50.00%     median     151917
25.00%     quartile   111875
10.00%                92785
2.50%                 62971
0.50%                 41830
0.00%      minimum    16858

Tabular Insights

1. P(Price > 92,800)
• Perfectly normal model: 85.25%
• JMP (actual data): 90%
The two methods differ by about 5 percentage points. This is because the Excel formula assumes a
perfectly normal distribution, which is not the case for the real data, whereas JMP reflects the actual
data. The model and the data do not agree here.

2. P(Price < 255,000)
• Perfectly normal model: 90.96%
• JMP (actual data): 90%
The Excel calculation and the JMP result match almost exactly. The model and the data agree.
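
The empirical figures of about 90% can be read off the quantile table directly. The sketch below is an illustration under one stated assumption: it interpolates linearly between the reported quantiles, which is not how JMP computes its estimates. The function name `empirical_cdf` is hypothetical.

```python
# (cumulative probability, price) pairs taken from the JMP quantile table.
QUANTILES = [
    (0.000, 16_858), (0.005, 41_830), (0.025, 62_971), (0.100, 92_785),
    (0.250, 111_875), (0.500, 151_917), (0.750, 205_397), (0.900, 255_526),
    (0.975, 329_453), (0.995, 385_738), (1.000, 446_436),
]

def empirical_cdf(x):
    """Approximate P(Price <= x) by linear interpolation between quantiles."""
    if x <= QUANTILES[0][1]:
        return 0.0
    if x >= QUANTILES[-1][1]:
        return 1.0
    # Find the pair of neighbouring quantiles that brackets x.
    for (p_lo, v_lo), (p_hi, v_hi) in zip(QUANTILES, QUANTILES[1:]):
        if v_lo <= x <= v_hi:
            return p_lo + (p_hi - p_lo) * (x - v_lo) / (v_hi - v_lo)
    raise ValueError("x is not bracketed by the quantile table")

p_above_92800_emp = 1.0 - empirical_cdf(92_800)
p_below_255000_emp = empirical_cdf(255_000)

print(f"Empirical P(Price > 92,800)  ~ {p_above_92800_emp:.1%}")
print(f"Empirical P(Price < 255,000) ~ {p_below_255000_emp:.1%}")
```

Both thresholds sit almost exactly on the 10% and 90% quantiles, which is why the empirical probabilities are both close to 90%.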

B. Probability P(Price < 232,000)

The probability from the Excel sheet comes out to be 84.13%.

Mean                164,000
SD                  68,000
xi                  232,000
P(Price<232,000)    84.13%

The probability from JMP comes out to be almost 84%.

Smoothed Empirical Likelihood Quantiles
Quantile   Estimate   Lower 95%   Upper 95%
84%        232141     224528      238139

Yes, the two models are similar and agree with each other.

C. Based on the theoretical model, the price of the 75th-percentile house should be $209,865, while the
actual price is $205,397. The two values are close to each other (within about 2%), so the model and the
data agree.
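
The theoretical 75th percentile comes from inverting the normal CDF. The sketch below is an illustration that uses Python's standard `statistics.NormalDist`; the variable names are hypothetical, and the actual quartile is the value from the JMP quantile table.

```python
from statistics import NormalDist

# Assumed normal model for Price: N(164K, (68K)^2).
model = NormalDist(mu=164_000, sigma=68_000)

# Inverse CDF at 0.75 gives the model's 75th percentile.
q75_model = model.inv_cdf(0.75)

# Upper quartile of the actual data, from the JMP quantile table.
q75_actual = 205_397

relative_gap = (q75_model - q75_actual) / q75_actual

print(f"Model 75th percentile:  {q75_model:,.0f}")
print(f"Actual 75th percentile: {q75_actual:,}")
print(f"Relative difference:    {relative_gap:.1%}")
```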
4. Below is the histogram for the living area data. As can be clearly seen in the histogram, the data appear
right-skewed, which is consistent with the summary statistics (skewness = 0.808).

5. Below is the graph for the log of the living area data. Yes, the graph looks almost perfectly normal when
compared with the plot of the untransformed living area data.

Graphical Insights

One of the major reasons for this improvement in normality is that the log transform compresses the
large values, so the spread of the data is reduced: the standard deviation has fallen from 641 to just
0.350 (on the log scale). As a result, the skewness has dropped from 0.808 to 0.005.

This is why the log(Living Area) graph is almost perfectly normal while the graph for Living Area is
right-skewed.
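
The effect of the log transform on skewness can be illustrated with a small experiment. The sketch below uses synthetic right-skewed data, not the assignment's living area data. The moment-based skewness formula and the lognormal parameters are assumptions chosen for illustration, and JMP may use a slightly different sample-adjusted formula.

```python
import math
import random

def skewness(values):
    """Moment skewness: third central moment divided by sd cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

random.seed(0)
# Synthetic, right-skewed "living area" values (hypothetical, NOT the real data).
areas = [random.lognormvariate(7.4, 0.35) for _ in range(5000)]
log_areas = [math.log(a) for a in areas]

skew_raw = skewness(areas)
skew_log = skewness(log_areas)

print(f"Skewness of raw values: {skew_raw:.3f}")
print(f"Skewness of log values: {skew_log:.3f}")
```

The raw values show clear positive skewness while the logged values are close to symmetric, which is the same pattern reported for Living Area and log(Living Area).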
