Submitted to:
Submitted by:
Utsav Gahtori
80011314014
PGDM 2014-16
NMIMS Hyderabad
Descriptive Statistics

Variable                                  N    Minimum   Maximum   Mean       Std. Deviation   Skewness (Std. Error)   Kurtosis (Std. Error)
Gross domestic product, constant prices   40   -4.311    5.934     2.15145    1.929015         -1.176 (.374)           2.737 (.733)
Gross national savings                    40   12.154    21.872    17.58990   3.066949         -.270 (.374)            -1.154 (.733)
Inflation, average consumer prices        40   .125      16.849    3.51630    3.185368         2.584 (.374)            7.998 (.733)
Unemployment rate                         40   4.750     11.750    7.46792    2.217023         .531 (.374)             -1.047 (.733)
Employment                                36   24        31        27.26      2.190            .159 (.393)             -1.142 (.768)
(row label missing in source)             40   -8.227    12.357    3.80470    3.705395         -.462 (.374)            1.942 (.733)
Valid N (listwise)                        36
In this case, I have taken six key economic variables to gain insight into a country's performance. These variables are GDP growth rate,
Total Investment, Gross National Savings, Unemployment Rate, Inflation and Employment.
Skewness measures the asymmetry of the distribution of a real-valued random variable. As a rule of thumb, data are treated as
approximately normal when skewness lies between -1 and +1. Here, Inflation (2.584) and GDP growth rate (-1.176) fall outside that
range, so these data are not symmetrical. For GDP growth rate, the negative skewness indicates an asymmetrical distribution with a
long tail to the left (lower values), whereas for Inflation the positive skewness indicates a long tail to the right (higher values).
In both cases the skewness is substantial and the distribution is far from symmetrical.
Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to
have a distinct peak near the mean, decline rather rapidly, and have heavy tails. From the kurtosis values given above, it can be seen that
every statistic is either greater than +1 or less than -1.
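The Std. Error columns reported next to skewness and kurtosis depend only on the sample size. The sketch below assumes the usual large-sample formulas for these standard errors (the function names are illustrative and not part of the original analysis):

```python
import math

def se_skewness(n: int) -> float:
    """Standard error of sample skewness for a sample of size n (n > 2)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n: int) -> float:
    """Standard error of sample excess kurtosis for a sample of size n (n > 3)."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

# Sample sizes that appear in the descriptive statistics above.
for n in (40, 36):
    print(n, round(se_skewness(n), 3), round(se_kurtosis(n), 3))
```

For N = 40 this gives .374 and .733, and for N = 36 it gives .393 and .768, which agrees with the standard errors shown in the table.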
So, in order to use the data set for further interpretation, we apply a transformation to bring the data closer to a normal distribution.
When that is done, we run the descriptive statistics again to see whether the skewness and kurtosis are within the range. If yes, further
interpretations can be made from the data set. If not, we create a boxplot to identify the outliers and then remove them, so that the
data can be used for further study and interpretation.
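The transform-and-recheck step can be sketched as follows. This is a minimal illustration, assuming a natural-log transformation and the bias-adjusted sample skewness and excess kurtosis. The function names and the example values are hypothetical and are not the report's data.

```python
import math

def sample_skewness(xs):
    """Bias-adjusted sample skewness."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)

def sample_excess_kurtosis(xs):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    g2 = m4 / m2 ** 2 - 3.0
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)

def log_transform(xs):
    """Natural log of each strictly positive value.

    Zero and negative values have no logarithm, so they are dropped here.
    """
    return [math.log(x) for x in xs if x > 0]

def within_rule_of_thumb(xs, limit=1.0):
    """True when both skewness and kurtosis lie between -limit and +limit."""
    return (abs(sample_skewness(xs)) <= limit
            and abs(sample_excess_kurtosis(xs)) <= limit)

# Hypothetical right-skewed values, for illustration only.
raw = [1.0, 1.2, 1.5, 1.9, 2.4, 3.1, 4.2, 6.0, 9.5, 18.0]
logged = log_transform(raw)
print(round(sample_skewness(raw), 3), round(sample_skewness(logged), 3))
print(within_rule_of_thumb(raw), within_rule_of_thumb(logged))
```

Because a log transformation discards non-positive values, a series containing negative growth rates loses observations when it is transformed.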
2) Test whether the GDP growth rate between 1980 and 2000 is the same as between 2001 and 2020, using an independent-samples t-test.
As seen in the last question, both skewness and kurtosis for GDP growth rate were outside the range -1 to +1, so we first
normalize the data by transformation. After normalizing the data set, we again run the descriptive statistics to measure the skewness
and kurtosis.
Statistics: Ln_gdp

N Valid                   37
N Missing                 6
Mean                      1.0636
Median                    .9373
Mode                      .94
Std. Deviation            1.22916
Variance                  1.511
Skewness                  4.205
Std. Error of Skewness    .388
Kurtosis                  23.476
Std. Error of Kurtosis    .759
Minimum                   -.81
Maximum                   7.61
Sum                       39.35
Percentiles   25          .7749
              50          .9373
              75          1.2032
Here, we find that the skewness (4.205) and kurtosis (23.476) are still well above +1. So, in this case, we draw a boxplot and find the
outliers. Outliers are observation points that are distant from the other observations and therefore distort the statistics that are
used for making any decision. So, we identify the outliers, eliminate them, and run the descriptive statistics again.
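The boxplot rule for flagging outliers can be sketched as below. This assumes the common 1.5 x IQR fences and the "inclusive" quartile method of Python's statistics module. Statistical packages may compute quartiles slightly differently, so borderline points can differ. The values are hypothetical.

```python
import statistics

def tukey_fences(xs, k=1.5):
    """Lower and upper boxplot fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(xs, k=1.5):
    """Keep only the observations that lie inside the fences."""
    low, high = tukey_fences(xs, k)
    return [x for x in xs if low <= x <= high]

# Hypothetical values for illustration; 7.61 plays the role of an extreme point.
values = [0.50, 0.77, 0.90, 0.94, 0.94, 1.05, 1.20, 1.71, 7.61]
print(tukey_fences(values))
print(drop_outliers(values))
```

Removing points this way reduces the sample size, so the descriptive statistics need to be run again afterwards.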
Descriptive Statistics

Variable             N    Minimum   Maximum   Mean    Std. Deviation   Skewness (Std. Error)   Kurtosis (Std. Error)
Ln_gdp               32   .50       1.71      .9940   .27642           .573 (.414)             .362 (.809)
Valid N (listwise)   32
Here, we find that skewness and kurtosis now both lie within the desired range, indicating that the data set is now
approximately normal. So, now we run the independent-samples t-test, which compares the means of two
unrelated groups on the same continuous dependent variable.
From the t-test performed, we see that the significance value is .006, which is less than .05, so we reject the null hypothesis and
accept the alternative hypothesis. This implies that the mean GDP growth rate differs between the two time periods, that is, it has
not stayed constant. The direction of the difference (higher or lower in the later period) is read from the two group means.
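The test can be sketched in a few lines. This is a minimal illustration, assuming the pooled-variance (equal variances assumed) form of the independent-samples t-test. The p-value is obtained by numerically integrating the t density, and all names and data are hypothetical.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t via Simpson integration of the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

def independent_t_test(a, b):
    """Return (t, df, two-sided p) for two unrelated groups, pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (ma - mb) / se
    return t, df, t_two_sided_p(t, df)

# Hypothetical log growth rates for two periods, for illustration only.
period_1 = [0.9, 1.1, 1.0, 1.3, 1.2, 0.8]
period_2 = [0.7, 0.6, 0.9, 0.8, 0.5, 0.7]
t, df, p = independent_t_test(period_1, period_2)
print(round(t, 3), df, round(p, 4))
```

A p-value below .05 leads to rejecting the hypothesis of equal means, and the sign of t shows which group has the higher mean.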
3) Test for change in Total Investment, Gross National Savings, Employment %, Unemployment rate and Inflation by
dividing the data into three groups (1980 to 1991, 1992 to 2003 and 2004 to 2015), using ANOVA.
Creating the hypotheses:
H0: There is no change in the variables across the three given periods of time (Groups)
Ha: There is a significant change in the variables across the three given periods of time (Groups)
Here again, we first run the descriptive statistics to see whether the skewness or kurtosis values fall outside the range -1 to +1.
The test here will compare the various parameters among themselves and provide a significance value.
Statistics

                         Inflation, average   Gross national   Total        Unemployment   Employment
                         consumer prices      savings          investment   rate
N Valid                  32                   32               32           32             28
Mean                     96.48012             17.54431         19.6458      7.29247        27.34
Median                   94.79150             17.04200         19.5035      6.22500        27.33
Mode                     48.701               12.154           16.29        5.400          24
Std. Deviation           27.620588            2.867211         2.00870      2.363304       2.230
Variance                 762.897              8.221            4.035        5.585          4.974
Skewness                 -.106                -.134            1.355        .733           .108
Std. Error of Skewness   .414                 .414             .414         .414           .441
Kurtosis                 -.918                -.907            4.173        -.972          -1.021
Std. Error of Kurtosis   .809                 .809             .809         .809           .858
Minimum                  48.701               12.154           16.29        4.750          24
Maximum                  141.047              22.415           26.75        11.750         31
Sum                      3087.364             561.418          628.67       233.359        766
Percentiles   25         82.53925             15.93300         18.4700      5.40000        25.35
              50         94.79150             17.04200         19.5035      6.22500        27.33
              75         124.49575            19.85600         20.3290      9.28125        29.21
So, here we observe that for Total Investment the skewness (1.355) and kurtosis (4.173) are greater than 1. So, we normalize the data
set by applying a transformation, and then run the descriptive statistics again to see whether the correction has worked. If
not, we identify the outliers and eliminate them.
Descriptive Statistics

Variable             N    Minimum   Maximum   Mean     Std. Deviation   Skewness (Std. Error)   Kurtosis (Std. Error)
Ln_TI                40   2.71      3.24      2.9702   .10891           .185 (.374)             .886 (.733)
Valid N (listwise)   40
Now that the data set is normalized, we run the ANOVA. The test results are below:
ANOVA

Variable                             Source           Sum of Squares   df   Mean Square   F         Sig.
Ln_TI                                Between Groups   .308             2    .154          32.962    .000
                                     Within Groups    .149             32   .005
                                     Total            .457             34
Employment                           Between Groups   123.152          2    61.576        71.543    .000
                                     Within Groups    27.542           32   .861
                                     Total            150.695          34
Unemployment rate                    Between Groups   73.918           2    36.959        12.667    .000
                                     Within Groups    93.367           32   2.918
                                     Total            167.285          34
Gross national savings               Between Groups   279.936          2    139.968       79.545    .000
(label inferred)                     Within Groups    56.307           32   1.760
                                     Total            336.243          34
Inflation, average consumer prices   Between Groups   18568.206        2    9284.103      102.443   .000
                                     Within Groups    2900.076         32   90.627
                                     Total            21468.282        34
Multiple Comparisons
Tukey HSD

Dependent Variable                   (I) Year_Group_ANOVA   (J) Year_Group_ANOVA   Mean Difference (I-J)   Std. Error   Sig.   Lower Bound   Upper Bound
Ln_TI                                1.00                   2.00                   .09798*                 .02852       .005   .0279         .1681
                                     1.00                   3.00                   .23014*                 .02852       .000   .1600         .3002
                                     2.00                   1.00                   -.09798*                .02852       .005   -.1681        -.0279
                                     2.00                   3.00                   .13216*                 .02790       .000   .0636         .2007
                                     3.00                   1.00                   -.23014*                .02852       .000   -.3002        -.1600
                                     3.00                   2.00                   -.13216*                .02790       .000   -.2007        -.0636
Employment                           1.00                   2.00                   -1.654*                 .387         .000   -2.61         -.70
                                     1.00                   3.00                   -4.554*                 .387         .000   -5.51         -3.60
                                     2.00                   1.00                   1.654*                  .387         .000   .70           2.61
                                     2.00                   3.00                   -2.900*                 .379         .000   -3.83         -1.97
                                     3.00                   1.00                   4.554*                  .387         .000   3.60          5.51
                                     3.00                   2.00                   2.900*                  .379         .000   1.97          3.83
Unemployment rate                    1.00                   2.00                   2.646750*               .713015      .002   .89461        4.39889
                                     1.00                   3.00                   3.448583*               .713015      .000   1.69644       5.20073
                                     2.00                   1.00                   -2.646750*              .713015      .002   -4.39889      -.89461
                                     2.00                   3.00                   .801833                 .697343      .491   -.91180       2.51546
                                     3.00                   1.00                   -3.448583*              .713015      .000   -5.20073      -1.69644
                                     3.00                   2.00                   -.801833                .697343      .491   -2.51546      .91180
Gross national savings               1.00                   2.00                   2.674591*               .553713      .000   1.31391       4.03527
(label inferred)                     1.00                   3.00                   6.899341*               .553713      .000   5.53866       8.26002
                                     2.00                   1.00                   -2.674591*              .553713      .000   -4.03527      -1.31391
                                     2.00                   3.00                   4.224750*               .541542      .000   2.89398       5.55552
                                     3.00                   1.00                   -6.899341*              .553713      .000   -8.26002      -5.53866
                                     3.00                   2.00                   -4.224750*              .541542      .000   -5.55552      -2.89398
Inflation, average consumer prices   1.00                   2.00                   -32.524182*             3.973806     .000   -42.28930     -22.75906
                                     1.00                   3.00                   -56.778598*             3.973806     .000   -66.54372     -47.01348
                                     2.00                   1.00                   32.524182*              3.973806     .000   22.75906      42.28930
                                     2.00                   3.00                   -24.254417*             3.886459     .000   -33.80489     -14.70394
                                     3.00                   1.00                   56.778598*              3.973806     .000   47.01348      66.54372
                                     3.00                   2.00                   24.254417*              3.886459     .000   14.70394      33.80489
*. The mean difference is significant at the 0.05 level.
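Since the significance value in the ANOVA table is less than .05 for every variable, we can infer that there is a significant change
in the variables across the three given periods of time (Groups). In the Tukey HSD comparisons, every pair of periods differs
significantly except periods 2 and 3 for the Unemployment rate (Sig. = .491).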
We now look at the mean differences in Employment between the three periods. The difference between periods 1 and 2 is -1.654
and between periods 1 and 3 is -4.554, which means that period 1 has the lowest Employment. The difference
between periods 2 and 3 is -2.900, implying that period 3 has higher Employment than period 2.
Collating all of the above analysis, we can say that Employment increases in the following order: Period 1 < Period 2 < Period 3.
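The quantities in the ANOVA output above follow directly from the group data. The sketch below is a minimal one-way ANOVA decomposition with illustrative names and hypothetical data. It stops at the F statistic, because the significance value additionally needs the F distribution.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical values for three periods, for illustration only.
result = one_way_anova([[24, 25, 26], [26, 27, 28], [29, 30, 31]])
print(result["df_between"], result["df_within"], round(result["F"], 3))
```

The same arithmetic can be checked against the reported output. For the block with a between-groups sum of squares of 123.152, dividing by 2 gives the mean square 61.576, and 61.576 / .861 is approximately 71.5, in line with the reported F of 71.543.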
Correlations
[Pearson correlation matrix for: Gross domestic product, constant prices; Implied PPP; Gross national savings; Inflation, average consumer prices; Volume of imports of goods and services; Unemployment rate; Employment. Each cell reports the Pearson correlation, Sig. (2-tailed) and N (40 or 36); ** marks significant correlations.]
Coefficients legible in the extract, with Sig. (2-tailed) in brackets; their row and column positions are not recoverable: -.017 (.916), .241 (.133), -.458 (.003), -.759 (.000), .796 (.000), -.741 (.000), .502 (.001), -.126 (.464), .819 (.000), -.910 (.000), .428 (.006), -.516 (.001), -.015 (.925), -.707 (.000), .060 (.712), -.196 (.251), -.758 (.000), -.385 (.014), .587 (.000), .272 (.089), .033 (.841).
For the correlation part, we have taken GDP, Implied PPP, Gross National Savings, Inflation, Volume of Imports of goods and services,
unemployment rate and employment as the key economic variables.
From the descriptive statistics, we observe that all the variables are approximately normally distributed, as the skewness and kurtosis
lie within the range -1 to +1. So, we don't need to apply a transformation to normalize the data set.
Here, we will compare the given economic variables with each other and infer our results.
We can observe from the correlations that GDP is positively correlated with Gross National Savings, Volume of imports of goods and
services, and employment, whereas it is negatively related to the rest of the variables. GDP is highly correlated with the volume
of imports of goods and services, which is also what we would expect in practice.
Similarly implied PPP is negatively correlated to all the variables except the employment rate.
Also, Gross National Savings is positively correlated with Inflation, Volume of imports of goods and services and unemployment rate, and
negatively correlated with all the remaining factors. This suggests that National Savings moves closely with the inflation rate and the
unemployment rate, which is plausible, although a correlation alone does not show which variable affects the other.
Similarly, relation between the remaining variables can be found out and commented upon.
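Each entry in a correlation table combines a Pearson coefficient with a two-tailed significance value. The sketch below shows one way to compute both, assuming the standard t-transformation of r with n - 2 degrees of freedom. The names are illustrative, and the p-value comes from numerical integration of the t density.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t via Simpson integration of the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * (total * h / 3))

def correlation_p_value(r, n):
    """Two-tailed significance of a Pearson r computed from n pairs."""
    if abs(r) >= 1.0:
        return 0.0
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t_two_sided_p(t, df)

# Two coefficients reported in the correlation output, both with N = 40.
print(round(correlation_p_value(-0.458, 40), 3))
print(round(correlation_p_value(0.241, 40), 3))
```

The two printed values should come out close to the .003 and .133 shown in the output, which is a quick way to check that a coefficient and its significance value belong together.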