Abstract
This study examines how per capita state and local public expenditure relates to three economic
and demographic variables: the economic ability index, the percentage of the population living in
metropolitan areas, and the percentage change in population. Identifying the factors on which per
capita expenditure depends indicates what could be done to improve it. The study was carried out
for this purpose by the interior ministry in the USA.
2. Problem statement
First, we check whether each variable is approximately normally distributed by examining the
shape of its distribution: its skewness, and whether it is high peaked or low peaked (kurtosis).
Next, the descriptive statistics give the mean and standard deviation of each variable. We then
estimate a multiple regression line with per capita expenditure as the dependent variable and the
economic ability index, the percentage of population in metropolitan areas, and the percentage
change in population as the independent variables. This tells us whether the dependent variable is
related to the selected independent variables.
Finally, we use an ANOVA test to check whether the population means of the independent variables
are equal.
3. Description
The data give per capita state and local public expenditures and the associated state demographic
and economic characteristics for the year 1960. The data set is described as having 48 cases; the
SPSS output analysed in this report is based on N = 40 observations. The characteristics describe
the demography and the economic situation of the population. The aim is to see whether per capita
expenditure really does depend on the selected variables.
Number of cases: 40
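Variable Names: per capita state expenditure, economic ability index, percentage of population in
metropolitan areas, percentage change in population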
[Histogram of per capita state expenditure: Mean = 272.3, Std. Dev = 48.16, N = 40]
The histogram of per capita expenditure is positively skewed and low peaked. Most observations lie
around $300, very few are as low as $180, and the mean is about $272 (Mean = 272.3 in the SPSS
output).
The histogram of the percentage of population living in metropolitan areas shows that in most
observations about 50% of the population lives in metropolitan areas, and only a single observation
has a value of 0%. The distribution is slightly negatively skewed and low peaked.
[Histogram of percentage change in population: Mean = 16.2, Std. Dev = 17.39, N = 40]
Here the distribution is positively skewed and highly peaked. The percentage change in population
varies widely across observations. Most observations are around 15%, and only one observation has
a value of 30%.
4.2 Descriptive statistics:
Descriptive Statistics
The above table was produced by running the data through SPSS. There are 40 observations in
total.
The distribution of per capita expenditure is positively skewed and low peaked. Most observations
lie around $300, very few are as low as $180, and the mean is about $272.
The distribution of the economic ability index is negatively skewed and low peaked. The mean is
92.9. Most observations have a value near 100 or 85, while only a few have values near 55, 65, or
115.
The percentage of population living in metropolitan areas is about 50% in most observations, and
only a single observation has a value of 0%. The distribution is slightly negatively skewed and
low peaked.
The distribution of the percentage change in population is positively skewed and highly peaked.
The values vary widely across observations. Most observations are around 15%, and only one
observation has a value of 30%.
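The shape descriptions above (positive or negative skewness, high or low peak) can be checked numerically. The following is a minimal sketch using plain moment-based formulas; the function name `shape_statistics` and the sample values are illustrative assumptions, not part of the study, and statistical packages often apply small-sample corrections, so their reported skewness and kurtosis may differ slightly.

```python
import math

def shape_statistics(values):
    """Return (mean, sample std dev, skewness, excess kurtosis) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments about the mean (divisor n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std_dev = math.sqrt(m2 * n / (n - 1))   # sample standard deviation (divisor n - 1)
    skewness = m3 / m2 ** 1.5               # > 0: longer right tail, < 0: longer left tail
    excess_kurtosis = m4 / m2 ** 2 - 3.0    # > 0: high peaked, < 0: low peaked
    return mean, std_dev, skewness, excess_kurtosis

# Hypothetical values for illustration only, NOT the study data.
sample = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 10.0]
mean, std_dev, skewness, excess_kurtosis = shape_statistics(sample)
print(f"mean={mean:.3f} sd={std_dev:.3f} skew={skewness:.3f} kurt={excess_kurtosis:.3f}")
```

A positive skewness together with a positive excess kurtosis corresponds to what the text calls "positively skewed and highly peaked".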
Assumptions
To test the suggested hypothesis, the following assumption is made.
1. The population standard deviations ($\sigma$) of all sectors are equal. For sectors 1 through n
this assumption is:
$\sigma_1 = \sigma_2 = \sigma_3 = \dots = \sigma_n$
Null hypothesis H0: $\mu_e = \mu_m = \mu_p$
Alternative hypothesis H1: at least one of $\mu_e$, $\mu_m$, $\mu_p$ differs from the others
ANOVA
data
                  Sum of Squares    df   Mean Square         F    Sig.
Between Groups          119333.4     2     59666.713   149.337    .000
Within Groups          46746.656   117       399.544
Total                   166080.1   119
This test checks whether the means of the three independent variables (the economic ability index,
the percentage of people living in metropolitan areas, and the percentage change in population)
are equal.
The SPSS output reports a p-value of .000 (that is, below 0.001), which is less than the
pre-assigned significance level of 0.05. We therefore reject the null hypothesis: at least one of
the three means differs significantly from the others.
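The F value in the ANOVA table follows directly from the sums of squares and degrees of freedom. The sketch below recomputes it from the figures printed in the table; the function name `f_ratio` is an illustrative choice, and the p-value itself is not recomputed here because that would need an F-distribution routine.

```python
# Values copied from the ANOVA table in the text.
SS_BETWEEN, DF_BETWEEN = 119333.4, 2
SS_WITHIN, DF_WITHIN = 46746.656, 117

def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F statistic)."""
    ms_between = ss_between / df_between   # variation between group means
    ms_within = ss_within / df_within      # pooled variation inside the groups
    return ms_between, ms_within, ms_between / ms_within

ms_between, ms_within, f_stat = f_ratio(SS_BETWEEN, DF_BETWEEN, SS_WITHIN, DF_WITHIN)
print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}, F = {f_stat:.3f}")
```

A large F means the variation between the group means is large relative to the variation within the groups, which is why the null hypothesis of equal means is rejected.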
Multiple Comparisons
                        Mean Difference                          95% Confidence Interval
(I) group   (J) group   (I-J)             Std. Error   Sig.    Lower Bound   Upper Bound
1.00        2.00         46.2600*         4.4696       .000      37.4082       55.1118
            3.00         76.7025*         4.4696       .000      67.8507       85.5543
2.00        1.00        -46.2600*         4.4696       .000     -55.1118      -37.4082
            3.00         30.4425*         4.4696       .000      21.5907       39.2943
3.00        1.00        -76.7025*         4.4696       .000     -85.5543      -67.8507
            2.00        -30.4425*         4.4696       .000     -39.2943      -21.5907
*. The mean difference is significant at the .05 level.
The table shows a significant difference between every pair of the three selected independent
variables. In every comparison the p-value is less than the pre-assigned significance level of
0.05. The table also gives the lower and upper bounds of the 95% confidence interval for each mean
difference, and none of these intervals contains zero.
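The standard error and interval bounds in the multiple-comparisons table can be related back to the within-groups mean square of the ANOVA. The sketch below assumes three equal groups of 40 observations (consistent with a total of 119 degrees of freedom); the function names are illustrative, and the source does not say which post hoc method SPSS was asked to use.

```python
import math

# Values copied from the ANOVA and multiple-comparisons tables in the text.
MS_WITHIN = 399.544    # within-groups mean square
N_PER_GROUP = 40       # assumption: three equal groups of 40 (total df = 119)

def pairwise_standard_error(ms_within, n_i, n_j):
    """Standard error of the difference between two group means."""
    return math.sqrt(ms_within * (1.0 / n_i + 1.0 / n_j))

def interval_summary(lower, upper):
    """Return (centre, half-width) of a confidence interval."""
    return (lower + upper) / 2.0, (upper - lower) / 2.0

std_error = pairwise_standard_error(MS_WITHIN, N_PER_GROUP, N_PER_GROUP)
centre, half_width = interval_summary(37.4082, 55.1118)   # group 1 vs group 2
multiplier = half_width / std_error                        # implied critical value

print(f"SE = {std_error:.4f}, centre = {centre:.4f}, multiplier = {multiplier:.3f}")
```

The implied multiplier is close to 1.98, roughly the two-sided 5% critical value of a t distribution with 117 degrees of freedom, which would be consistent with unadjusted pairwise comparisons; this is an inference from the numbers, not something the source states.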
Model   Variables Entered                       Variables Removed   Method
1       %age change in population,              .                   Enter
        economic ability index,
        %age of pop in metropolitan area(a)
With the Enter method all three independent variables are entered into the model together, so none
of them is removed. The coefficients table confirms that each coefficient is significantly
different from zero.
Model Summary
The results show that none of the variables has been dropped. The fitted regression line explains
about 50% of the variation in the dependent variable.
ANOVA(b)
Model            Sum of Squares    df   Mean Square        F    Sig.
1  Regression         45233.558     3     15077.853   12.002    .000(a)
   Residual           45225.217    36      1256.256
   Total              90458.775    39
a. Predictors: (Constant), %age change in population, economic ability index, %age of pop in
   metropolitan area
b. Dependent Variable: per capita state expenditure
The estimated regression is significant overall: the p-value of the F test is less than the
pre-assigned significance level of 0.05, so the three independent variables jointly explain a
significant part of the variation in per capita expenditure.
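The share of variation explained and the F statistic can both be recovered from the sums of squares in the regression ANOVA table. The sketch below uses the printed figures; the function name `regression_fit_summary` is an illustrative choice, and the adjusted R-squared is computed with the usual degrees-of-freedom correction rather than read from the source.

```python
# Values copied from the regression ANOVA table in the text.
SS_REGRESSION, DF_REGRESSION = 45233.558, 3
SS_RESIDUAL, DF_RESIDUAL = 45225.217, 36

def regression_fit_summary(ss_reg, df_reg, ss_res, df_res):
    """Return (R squared, adjusted R squared, F statistic) from sums of squares."""
    ss_total = ss_reg + ss_res
    df_total = df_reg + df_res
    r_squared = ss_reg / ss_total                       # share of variation explained
    adj_r_squared = 1.0 - (ss_res / df_res) / (ss_total / df_total)
    f_stat = (ss_reg / df_reg) / (ss_res / df_res)      # overall significance test
    return r_squared, adj_r_squared, f_stat

r_squared, adj_r_squared, f_stat = regression_fit_summary(
    SS_REGRESSION, DF_REGRESSION, SS_RESIDUAL, DF_RESIDUAL
)
print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}, F = {f_stat:.3f}")
```

The R-squared of about 0.50 matches the statement that the regression explains about half of the variation in per capita expenditure.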
Coefficients(a)
                                       Unstandardized            Standardized
                                       Coefficients              Coefficients
Model                                  B          Std. Error     Beta             t        Sig.
1  (Constant)                          75.485     35.615                          2.119    .041
   economic ability index               2.439       .442          .805            5.524    .000
   %age of pop in metropolitan area     -.948       .295         -.500           -3.220    .003
   %age change in population             .886       .354          .320            2.500    .017
a. Dependent Variable: per capita state expenditure
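Each t value in the coefficients table is the unstandardized coefficient divided by its standard error. The sketch below recomputes the ratios from the rounded figures printed in the table, so small differences in the third decimal place are expected; the dictionary keys are illustrative labels.

```python
# (B, Std. Error, reported t) copied from the coefficients table in the text.
COEFFICIENT_ROWS = {
    "constant": (75.485, 35.615, 2.119),
    "economic ability index": (2.439, 0.442, 5.524),
    "% of pop in metropolitan area": (-0.948, 0.295, -3.220),
    "% change in population": (0.886, 0.354, 2.500),
}

def t_ratios(rows):
    """Return {name: B / Std. Error} for every row."""
    return {name: b / se for name, (b, se, _reported) in rows.items()}

computed = t_ratios(COEFFICIENT_ROWS)
for name, (_b, _se, reported) in COEFFICIENT_ROWS.items():
    print(f"{name}: computed t = {computed[name]:.3f}, reported t = {reported:.3f}")
```

The larger the absolute t ratio, the smaller the p-value in the Sig. column, which is why the economic ability index has the smallest p-value of the three predictors.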
All three coefficients are significant at the 0.05 level (every p-value in the table is below
0.05), so no variable is dropped and the output gives no sign that multicollinearity is masking
the individual effects. The model also explains a substantial share of the variation in the
dependent variable, so it is retained. From the coefficients above, the fitted multiple
regression line is:
$\hat{y} = 75.485 + 2.439\,x_1 - 0.948\,x_2 + 0.886\,x_3$
where $\hat{y}$ is the predicted per capita state expenditure, $x_1$ the economic ability index,
$x_2$ the percentage of population in metropolitan areas, and $x_3$ the percentage change in
population.
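To show how the fitted line is used, the sketch below plugs the B column of the coefficients table into a small prediction function. The function and argument names are illustrative, and the input values in the example are hypothetical, not taken from the data.

```python
# Unstandardized coefficients (B column) from the coefficients table in the text.
INTERCEPT = 75.485
B_ECONOMIC_ABILITY = 2.439
B_METRO_PCT = -0.948
B_POP_CHANGE_PCT = 0.886

def predict_expenditure(economic_ability, metro_pct, pop_change_pct):
    """Predicted per capita state expenditure from the fitted regression line."""
    return (
        INTERCEPT
        + B_ECONOMIC_ABILITY * economic_ability
        + B_METRO_PCT * metro_pct
        + B_POP_CHANGE_PCT * pop_change_pct
    )

# Hypothetical state: ability index 100, 50% metropolitan, 15% population growth.
example = predict_expenditure(100.0, 50.0, 15.0)
print(f"Predicted per capita expenditure: {example:.3f}")
```

Holding the other two variables fixed, a one-point rise in the economic ability index raises the prediction by 2.439, while a one-point rise in the metropolitan percentage lowers it by 0.948.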
Correlations
                                                    economic        %age of pop in       %age change
                                                    ability index   metropolitan area    in population
economic ability index       Pearson Correlation    1.000           .586**               .185
                             Sig. (2-tailed)        .               .000                 .253
                             N                      40              40                   40
%age of pop in               Pearson Correlation    .586**          1.000                .386*
metropolitan area            Sig. (2-tailed)        .000            .                    .014
                             N                      40              40                   40
%age change in population    Pearson Correlation    .185            .386*                1.000
                             Sig. (2-tailed)        .253            .014                 .
                             N                      40              40                   40
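The correlation table shows that the independent variables are not entirely uncorrelated. The
economic ability index and the percentage of population in metropolitan areas are moderately
correlated (r = .586, p = .000), the percentage of population in metropolitan areas and the
percentage change in population are weakly correlated (r = .386, p = .014), and the economic
ability index and the percentage change in population are not significantly correlated (r = .185,
p = .253). None of these correlations is close to 1, and each coefficient in the regression
remains significant, so none of the variables needs to be dropped when estimating the regression
line.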
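One common way to judge multicollinearity from pairwise correlations is the variance inflation factor (VIF), which for each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The sketch below applies this to the three correlations printed in the table; it is an additional check not performed in the source, and the function name and the rule of thumb mentioned afterwards are assumptions for illustration.

```python
# Pairwise Pearson correlations copied from the correlations table in the text.
R_ABILITY_METRO = 0.586
R_ABILITY_CHANGE = 0.185
R_METRO_CHANGE = 0.386

def vif_three_predictors(r12, r13, r23):
    """VIFs for three predictors, i.e. the diagonal of the inverse correlation matrix."""
    # Determinant of the 3x3 correlation matrix with unit diagonal.
    det = 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
    vif_1 = (1.0 - r23 ** 2) / det   # cofactor for predictor 1 divided by determinant
    vif_2 = (1.0 - r13 ** 2) / det
    vif_3 = (1.0 - r12 ** 2) / det
    return vif_1, vif_2, vif_3

vif_ability, vif_metro, vif_change = vif_three_predictors(
    R_ABILITY_METRO, R_ABILITY_CHANGE, R_METRO_CHANGE
)
print(f"VIF ability = {vif_ability:.3f}, metro = {vif_metro:.3f}, change = {vif_change:.3f}")
```

All three values come out below 2, well under the thresholds of 5 or 10 that are often quoted as warning signs, which supports keeping all three predictors in the model.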
5. Conclusion
From the analysis above, we can conclude the following.
• The distribution of per capita expenditure is positively skewed and low peaked.
• The distribution of the economic ability index is negatively skewed and low peaked.
• The distribution of the percentage of population in metropolitan areas is slightly negatively
skewed and low peaked.
• The distribution of the percentage change in population is positively skewed and highly
peaked.
• The fitted multiple regression line, taken from the coefficients table, is
$\hat{y} = 75.485 + 2.439\,x_1 - 0.948\,x_2 + 0.886\,x_3$, where $x_1$ is the economic ability
index, $x_2$ the percentage of population in metropolitan areas, and $x_3$ the percentage change
in population.