
Clementine Amoss

DATA VIEW OF SPSS
Row: one respondent
Column: one variable

VARIABLE VIEW OF SPSS
MEASURE: the levels of measurement are nominal, ordinal, interval and ratio. In SPSS there are only 3 measures: nominal, ordinal and scale. Interval and ratio are both treated as scale.
TYPE: Variable 1 is the current brand. Although it is a brand name, nominal variables that can be coded are marked as Numeric, and the brands given here can be coded. String is used for open-ended questions such as "others" or "give suggestions".
LABEL: the expansion of the abbreviation used as the variable name.
VALUES: coding is used when the variable is nominal or ordinal. Here current brand is qualitative, so for coding purposes we have coded Closeup = 1, Colgate = 2, Pepsodent = 3, others = 4. Type 1, 2, 3, 4 in the Value column and the corresponding labels in the Label column.
MISSING: used when a respondent did not answer a particular question, in which case a different analysis is needed. Here we assume that there are no missing values.

For the 2nd and 3rd questions, Values and Labels are not needed, as the answer is already in numeric form.
For the 4th question (dental health), coding 1 = extremely healthy, 2 = healthy, 3 = ... would be contradictory, because a higher code would then mean poorer health. So we code it as 5 = extremely healthy, 4 = healthy, 3 = unhealthy, 2 = extremely unhealthy.

DATA ENTRY is done in Data View. Enter the data as coded.

NOTES: Normally it is advisable to enter the variables in the order of the questions themselves.

WORKING MOTHERS
"Breast milk is a complete food": if the respondent answers True, it is scored as correct; if the respondent answers False, it is scored 0. The correctness of each statement is checked because the survey is about awareness of breastfeeding.
"Can't say" = 0, as the study is on awareness and this answer shows ignorance.
When working with an existing database, first look at the variables and the codes of each sector.
Go to Transform → Compute Variable and add up the item variables to get the total awareness score.
Go to Analyze.

Taking the mean: awareness ranges from 0 to 18, so the midpoint is 9. If the mean is > 9, awareness is high. If we want low, medium and high awareness, divide the range into 3 classes of equal width: 0-6, 7-12 and 13-18. Hence the awareness programme was successful.
Taking the median: 50% are above 12 and 50% are below 12. Hence the programme is successful.

3 steps:
Checking the relationship
Testing
Strength of relationship
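The scoring and the three equal-width classes can be sketched in Python. This is only an illustration: the function names are made up, and the assumption that the total is the sum of 18 items scored 0 or 1 is inferred from the 0-18 range, not stated in the notes.

```python
def total_awareness(item_scores):
    """Add up the item scores, like Transform -> Compute Variable.

    Assumption: each item is scored 1 (correct) or 0 (wrong / can't say).
    """
    return sum(item_scores)


def awareness_class(total):
    """Map a 0-18 total to the three classes used in the notes."""
    if not 0 <= total <= 18:
        raise ValueError("total must lie between 0 and 18")
    if total <= 6:
        return "low"
    if total <= 12:
        return "medium"
    return "high"


# Hypothetical respondent: 10 correct answers out of 18.
example_total = total_awareness([1] * 10 + [0] * 8)
example_class = awareness_class(example_total)
```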

TO CHECK WHETHER THERE IS A RELATIONSHIP BETWEEN VARIABLES
Sector: independent variable, nominal measurement
Awareness level: dependent variable, scale measurement
There are different methods for investigating a relationship, depending on the measurement types, e.g. correlation, regression etc.

If the independent variable is nominal and the dependent variable is scale, compare the arithmetic means (AM).
If the independent variable is nominal and the dependent variable is ordinal, do a cross tabulation to find out whether there is a relationship (nominal x ordinal).
If the independent variable is ordinal and the dependent variable is nominal, do a cross tabulation (ordinal x nominal).
If the independent variable is ordinal and the dependent variable is ordinal, do a cross tabulation (ordinal x ordinal). The sign of the rank correlation can also be used if the data can be ranked.
(We always state the independent variable first and the dependent variable second.)
If nominal x nominal, we can use a cross tab as well (strictly this is wrong; the proper tool is correspondence analysis).
If ordinal x scale, compare the AM to find out the relationship.
If scale x scale, use the sign of the correlation coefficient and the sign of the regression coefficient to find out the relationship between the variables.

TOOLS FOR MEASURING THE STRENGTH OF RELATIONSHIP
Nominal: contingency coefficient, Cramér's V, phi coefficient, lambda, uncertainty coefficient
Ordinal: gamma, Somers' d, Kendall's tau-b, Kendall's tau-c, rank-order correlation
Nominal x scale: eta
Scale: correlation coefficient, regression coefficient, R² (coefficient of determination)
The remaining combinations are marked "not applicable for beginners".

To find the relationship: Analyze → Compare Means → Means (comparison of AM).

Descriptive Statistics → Frequencies; now checking whether the frequency has a relation with the sector.

Report: sum_awareness

sector              Mean      N     Std. Deviation
bank                12.5581   43    2.60313
IT                  12.0000   32    2.48868
nurse               11.5750   40    7.00508
contract labours     9.2553   47    2.90029
teacher             12.5915   71    2.89126
Total               11.6567   233   4.01212

INTERPRETATION
Only if there is a significant difference between the AM of the sectors can we say that there is a relationship between the variables. Here contract labourers are the only sector showing a low awareness level. For bank, IT and teachers we can say that these people have a high level of awareness (keeping in mind that the midpoint of the 0-18 range is 9).
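As a quick check on the Report table, the Total mean is just the N-weighted average of the sector means. A minimal sketch (the variable and function names are illustrative):

```python
# (mean, N) per sector, copied from the Report table.
sector_means = {
    "bank": (12.5581, 43),
    "IT": (12.0000, 32),
    "nurse": (11.5750, 40),
    "contract labours": (9.2553, 47),
    "teacher": (12.5915, 71),
}


def overall_mean(groups):
    """Weighted mean of the group means, weighted by group size."""
    total_n = sum(n for _, n in groups.values())
    weighted_sum = sum(mean * n for mean, n in groups.values())
    return weighted_sum / total_n


total_n = sum(n for _, n in sector_means.values())
grand_mean = overall_mean(sector_means)  # close to the 11.6567 in the Total row
```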

Go to Descriptive Statistics → Crosstabs. Move the required variables to Rows and Columns (usually the independent variable goes in the rows and the dependent variable in the columns). Then open Cells and tick Row under Percentages, because the independent variable here (sector) is in the rows.

sector * Do/did you BF baby on demand Crosstabulation

                                        Do/did you BF baby on demand
sector                                  No       Yes      Total
bank               Count                18       25       43
                   % within sector      41.9%    58.1%    100.0%
IT                 Count                16       16       32
                   % within sector      50.0%    50.0%    100.0%
nurse              Count                9        31       40
                   % within sector      22.5%    77.5%    100.0%
contract labours   Count                25       22       47
                   % within sector      53.2%    46.8%    100.0%
teacher            Count                24       47       71
                   % within sector      33.8%    66.2%    100.0%
Total              Count                92       141      233
                   % within sector      39.5%    60.5%    100.0%

In interpreting cross tab data, use only the percentages, not the frequencies.
The table above shows prediction accuracy close to 50% in most sectors; the highest is 77.5% (nurses). Since none of the percentages is very high, the prediction accuracy is low. Only if the percentages had been above 80% could we say that the prediction accuracy is high and that there are sector-wise differences in awareness.
This is how the strength of the relationship is read when either the dependent or the independent variable is NOMINAL.

Analysing/interpreting a cross tab
For example:

          Fail   Pass
Male      50     50
Female    50     50

Fail/pass cannot be predicted using gender. Hence this is called no relationship in a cross tab.

          Fail   Pass
Male      100    0
Female    0      100

Here 100% prediction is possible, provided one variable is nominal. So whatever statistical analysis we do on a cross tab, almost everything will lie between these two situations.
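The row percentages that SPSS prints as "% within sector" can be reproduced from the counts. A small sketch using the counts of the sector crosstab (the function name is illustrative):

```python
# Counts from the sector * BF-on-demand crosstab.
counts = {
    "bank": {"No": 18, "Yes": 25},
    "IT": {"No": 16, "Yes": 16},
    "nurse": {"No": 9, "Yes": 31},
    "contract labours": {"No": 25, "Yes": 22},
    "teacher": {"No": 24, "Yes": 47},
}


def row_percentages(table):
    """Return each cell as a percentage of its row total, to 1 decimal."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {
            column: round(100 * count / row_total, 1)
            for column, count in cells.items()
        }
    return result


percentages = row_percentages(counts)
# The largest percentage in a row is the "prediction accuracy" for that row.
best_guess = {row: max(cells.values()) for row, cells in percentages.items()}
```

In the 50/50 example every row gives 50%, and in the 100/0 example every row gives 100%; real tables fall between the two.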

Report: ATTITUDE TOWARDS EDUCATION

Recoded SES   Mean    N     Std. Deviation
Low           77.47   73    12.544
Middle        69.68   151   12.264
High          61.41   76    8.950
Total         69.48   300   12.868

INTERPRETATION: as socioeconomic status increases, the attitude towards education decreases.

HOW TO CATEGORISE FROM A SCORE
Transform → Recode into Different Variables → give the new variable a name and label → Old and New Values.

Change the variable name and enter the ranges in the Old and New Values dialog (in this case we have decided that 30-60 gets code 1 and 61-99 gets code 2). The new column will then contain only the values 1 or 2 (1 = low attitude, 2 = high attitude).
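The same recode rule can be written as a small Python function. This is a sketch, not SPSS syntax; the function name is made up, and returning None for scores outside both ranges mirrors the way unlisted values become system-missing in a recode into a different variable.

```python
def recode_attitude(score):
    """Recode an attitude score: 30-60 -> 1 (low), 61-99 -> 2 (high)."""
    if 30 <= score <= 60:
        return 1
    if 61 <= score <= 99:
        return 2
    return None  # outside both ranges: treated as missing


# Hypothetical scores, for illustration only.
recoded = [recode_attitude(s) for s in (30, 60, 61, 99, 25)]
```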

TOOLS FOR TESTING THE SIGNIFICANCE

           Nominal          Ordinal          Scale
Nominal    Chi-square
Ordinal    Chi-square       Chi-square
Scale      t-test, ANOVA    t-test, ANOVA    Test of correlation, test of regression, test of R²

Monday, October 08, 2012
SPSS

ORDINAL x ORDINAL
The strength of the relationship can be explained using its sign. This is possible because an ordinal variable can increase or decrease. The sign depends on the selection of the variables being tested.

          Low    Mod    High
Low       100    0      0
Mod       0      100    0
High      0      0      100
As SES increases, the performance increases: +ve relationship.

          Low    Mod    High
Low       0      0      100
Mod       0      100    0
High      100    0      0
As SES increases, the performance decreases: -ve relationship.

          Low    Mod    High
Low       33     33     34
Mod       33     33     34
High      33     33     34
There is no prediction accuracy in any row or column. Hence the relationship is zero here, and the strength of the relationship is also zero.

How to recode? Transform → Recode into Different Variables


Recoded SES * recoded attitude towards education Crosstabulation

                                        recoded attitude towards education
Recoded SES                             low attitude   high attitude   Total
Low       Count                         5              68              73
          % within Recoded SES          6.8%           93.2%           100.0%
Middle    Count                         26             125             151
          % within Recoded SES          17.2%          82.8%           100.0%
High      Count                         32             44              76
          % within Recoded SES          42.1%          57.9%           100.0%
Total     Count                         63             237             300
          % within Recoded SES          21.0%          79.0%           100.0%

Interpretation
SES low: 93.2% have a high attitude
SES middle: 82.8% have a high attitude
SES high: 57.9% have a high attitude (79.0% is the figure for the total sample)
i.e. a -ve relationship: as SES increases, the attitude is decreasing.

Analyze → Descriptive Statistics → Crosstabs → Statistics → Gamma (since it is ordinal x ordinal)


Symmetric Measures

                              Value    Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Ordinal by Ordinal   Gamma    -.600    .087                   -5.350         .000
N of Valid Cases              300

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
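The gamma value can be checked by hand from the counts of the Recoded SES x recoded attitude crosstab. Gamma is (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs. A minimal sketch (function names are illustrative):

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered r x c table."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows below row i
                for l in range(n_cols):
                    if l > j:                   # below and to the right
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                 # below and to the left
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant


def goodman_kruskal_gamma(table):
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)


# Rows: SES low, middle, high. Columns: low attitude, high attitude.
ses_by_attitude = [[5, 68], [26, 125], [32, 44]]
pairs = concordant_discordant(ses_by_attitude)
gamma = goodman_kruskal_gamma(ses_by_attitude)  # about -0.5995, i.e. -.600
```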

Interpretation (as when interpreting correlation data)
-1: perfect negative correlation
+1: perfect positive correlation
0: no linear relationship

0 to 1/3: low; 1/3 to 2/3: moderate; 2/3 to 1: high linear correlation

When you click on Somers' d (Analyze → Descriptive Statistics → Crosstabs → Statistics → Somers' d):

Directional Measures

Ordinal by Ordinal: Somers' d                      Value    Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Symmetric                                          -.277    .047                   -5.350         .000
Recoded SES Dependent                              -.399    .067                   -5.350         .000
recoded attitude towards education Dependent       -.212    .039                   -5.350         .000

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
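The three Somers' d values differ only in how tied pairs enter the denominator. A sketch that reproduces them from the Recoded SES x recoded attitude counts, using the usual definition d = (C - D) / (C + D + T), where T counts the pairs tied on the dependent variable only (function names are illustrative):

```python
def pair_counts(table):
    """Return (concordant, discordant, tied on row var only, tied on column var only)."""
    cells = [(i, j, n) for i, row in enumerate(table) for j, n in enumerate(row)]
    concordant = discordant = tied_row = tied_col = 0
    for a in range(len(cells)):
        i, j, n = cells[a]
        for b in range(a + 1, len(cells)):
            k, l, m = cells[b]
            pairs = n * m
            if i == k:                      # same row, different column
                tied_row += pairs
            elif j == l:                    # same column, different row
                tied_col += pairs
            elif (k - i) * (l - j) > 0:     # both variables move the same way
                concordant += pairs
            else:
                discordant += pairs
    return concordant, discordant, tied_row, tied_col


def somers_d(table):
    c, d, tied_row, tied_col = pair_counts(table)
    return {
        # Column variable dependent: add pairs tied on the column variable only.
        "column_dependent": (c - d) / (c + d + tied_col),
        # Row variable dependent: add pairs tied on the row variable only.
        "row_dependent": (c - d) / (c + d + tied_row),
        "symmetric": (c - d) / (c + d + (tied_row + tied_col) / 2),
    }


# Rows: SES low, middle, high. Columns: low attitude, high attitude.
d_values = somers_d([[5, 68], [26, 125], [32, 44]])
```

With SES in the rows and attitude in the columns, "column_dependent" is the attitude-dependent value.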

Interpretation
Somers' d is directional: it carries an assumption about which variable is independent and which is dependent. In this study attitude is the dependent variable and recoded SES is the independent variable, hence we need only take the 3rd result in the Somers' d table (-.212).

TESTING OF HYPOTHESIS
To check whether a characteristic of the sample is also a characteristic of the population.
H0: no relationship; H1: there is a relationship between the variables.
WHICH CHARACTERISTIC?
It can be the relationship between two variables.
It can be the strength of the relationship between two variables.
Having found this in the sample, we need to check whether it can be projected to the population. Checking the above characteristic in the population is called testing of hypothesis.
The method is characterised by the p-value or significance value.
If p-value < 0.05, H1 is accepted.
If p-value > 0.05, H1 is rejected, i.e. H0 is accepted.

T-TEST
Used when there is a comparison between 2 groups, i.e. AM of male vs AM of female.

ANOVA
Used when there is a comparison between more than 2 groups.

Example: 1. Female students have a better attitude towards education.
We compare the means, as there are only 2 groups, and test the hypothesis: Analyze → Compare Means → Independent-Samples T Test. The variable to be tested (attitude towards education) is the Test Variable and the independent variable (sex) is the Grouping Variable.
Group Statistics: ATTITUDE TOWARDS EDUCATION

SEX      N     Mean    Std. Deviation   Std. Error Mean
MALE     168   68.52   13.054           1.007
FEMALE   132   70.70   12.572           1.094

This shows that males have a lower attitude towards education and females a higher one. But to check whether this sample result is representative of the population, we check the significance level using the t-test: 0.146 > 0.05, hence H1 is rejected, i.e. there is no significant difference in attitude between males and females.

One-sample t-test: used in industry.
Independent-samples t-test: used when the 2 samples are independently chosen.
Paired-samples t-test: used when one group is not chosen independently, i.e. it is not a random choice, e.g. mother-child (a dependent sample).

ANOVA
When the independent variable has more than 2 groups, we use ANOVA. For ANOVA we use the F-distribution, whereas for the t-test we use the t-distribution. There is no concept of sign in the F-distribution. For example:
H0: mean(male) = mean(female)
H1: mean(male) not equal to mean(female), or mean(male) > mean(female), or mean(male) < mean(female)
Using ANOVA we can only test "not equal to"; > or < cannot be tested using ANOVA. To test directional hypotheses we use only the z-distribution or the t-distribution.
We can change the sign of the t-value by changing the group definition accordingly. For example, if the hypothesis is mean(male) > mean(female), define male as group 1 and female as group 2; for the opposite hypothesis, define them the other way round. So the t-value can be made +ve or -ve by changing the group definition. If there are only 2 groups, the best method is always the t-test.

Example:
Independent Samples Test: ATTITUDE TOWARDS EDUCATION

Levene's Test for Equality of Variances: F = .479, Sig. = .489

t-test for Equality of Means
                                                                                                  95% Confidence Interval of the Difference
                              t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower     Upper
Equal variances assumed       -1.459   298       .146              -2.179            1.494                   -5.119    .761
Equal variances not assumed   -1.465   285.973   .144              -2.179            1.487                   -5.106    .748
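Both t-values can be recomputed from the Group Statistics alone (mean, standard deviation and N per group). A minimal sketch with the standard library; the function names are illustrative, and because the means are rounded to two decimals the results agree with SPSS only to about three decimals.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t with equal variances assumed."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t with equal variances not assumed (Welch)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    std_error = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / std_error, df


male = (68.52, 13.054, 168)     # mean, SD, N from Group Statistics
female = (70.70, 12.572, 132)

t_equal, df_equal = pooled_t(*male, *female)   # male is group 1, so t is negative
t_welch, df_welch = welch_t(*male, *female)
t_swapped, _ = pooled_t(*female, *male)        # swapping the groups flips the sign
```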

Here we can see that the t-value is -1.459, so we can state the direction of the hypothesis.

ANOVA
Analyze → Compare Means → One-Way ANOVA. The Factor represents the independent variable. E.g.:
ANOVA: ATTITUDE TOWARDS EDUCATION

                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   351.011          1     351.011       2.128   .146
Within Groups    49159.825        298   164.966
Total            49510.837        299

Here we can see that the Sig. value (p-value) is .146. With ANOVA we cannot state a direction; we can only say that there is no significant difference between the means.

If p-value < 0.05, accept H1.
If p-value > 0.05, accept H0.
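The F value in an ANOVA table is the ratio of the two mean squares, and each mean square is a sum of squares divided by its df. A short sketch using the figures of the sex ANOVA table (the function name is illustrative); with only two groups, F is also the square of the t-value, which is why the sign is lost.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Figures from the ANOVA table for sex (2 groups, so df between = 1).
ms_b, ms_w, f_value = f_ratio(351.011, 1, 49159.825, 298)

# Square of the t-value from the independent-samples t-test.
t_squared = (-1.459) ** 2
```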

PA 765: type this in Google to get notes on interpreting SPSS output.
If the Sig. value for the one-way ANOVA is > 0.05, there is no need for subgroup analysis. Otherwise the following post-hoc analysis is used for subgroup analysis.
Post-hoc analysis: Tukey and LSD are commonly used; Tukey is much better. This is done to check the variation between the individual groups (e.g. to know which group differs from which).

Multiple Comparisons: ATTITUDE TOWARDS EDUCATION (Tukey HSD)

                                                                                            95% Confidence Interval
(I) STANDARD        (J) STANDARD        Mean Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
VII STD - EARLY     IX STD - MIDDLE     4.270*                  1.804        .049    .02           8.52
                    XI STD - LATE       4.230                   1.804        .051    -.02          8.48
IX STD - MIDDLE     VII STD - EARLY     -4.270*                 1.804        .049    -8.52         -.02
                    XI STD - LATE       -.040                   1.804        1.000   -4.29         4.21
XI STD - LATE       VII STD - EARLY     -4.230                  1.804        .051    -8.48         .02
                    IX STD - MIDDLE     .040                    1.804        1.000   -4.21         4.29

*. The mean difference is significant at the 0.05 level.

From the above table we can see the differences between the 3 groups: 7th, 9th and 11th. For 7th vs 9th the Sig. value is .049, which is < 0.05, hence we accept H1. For the other pairs the Sig. value is > 0.05, hence we accept H0.

ANOVA: ATTITUDE TOWARDS EDUCATION

                 Sum of Squares   df    Mean Square   F      Sig.
Between Groups   367.106          4     91.776        .551   .698
Within Groups    49143.731        295   166.589
Total            49510.837        299

Here we need not do subgroup analysis, since the Sig. value is > 0.05.

CHI-SQUARE TEST
Use Crosstabs, take row percentages, then under Statistics choose Chi-square, and choose the strength measures according to whether the variables are nominal or ordinal (e.g. nominal: contingency coefficient etc.; ordinal: gamma etc.).

Chi-Square Tests

                               Value     df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             .603(a)   1    .437
Continuity Correction(b)       .402      1    .526
Likelihood Ratio               .607      1    .436
Fisher's Exact Test                                                   .477                   .264
Linear-by-Linear Association   .601      1    .438
N of Valid Cases               300

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 27.72.
b. Computed only for a 2x2 table
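For df = 1, the Asymp. Sig. column can be recomputed from the chi-square value with the standard library alone, because a chi-square variable with 1 df is the square of a standard normal variable. A sketch (the function name is illustrative; it is only valid for df = 1):

```python
import math


def chi_square_p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


p_pearson = chi_square_p_value_df1(0.603)      # Pearson Chi-Square row
p_continuity = chi_square_p_value_df1(0.402)   # Continuity Correction row
p_likelihood = chi_square_p_value_df1(0.607)   # Likelihood Ratio row

# Decision rule from the notes: H1 is accepted only if p < 0.05.
accept_h1 = p_pearson < 0.05
```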

Here we can see that the Pearson chi-square Sig. value = .437, which is > 0.05; hence we accept the null hypothesis. We can say there is no significant relationship between gender and attitude.

CORRELATION
Analyze → Correlate → Bivariate
Correlation is symmetric. Hence keep both variables in the variable list.
For large samples, correlation will almost always be significant (so do not prove a hypothesis using correlation).
For a pilot study, using correlation to interpret the data gives a good idea of the relationship between the variables.

PARTIAL CORRELATION
Measures the relationship between 2 variables while controlling for another variable. We can use any number of control variables.

REGRESSION
Used when both the dependent and independent variables are scale. Regression measures an asymmetric relationship, whereas correlation measures a symmetric relationship.

Correlation tells the strength and the direction of the relationship. This applies to regression as well. A further characteristic of regression is its predictive power (which correlation does not have). Hence regression is a highly used method in research.

SIMPLE REGRESSION
y = a + bx
where y = dependent variable and x = independent variable
a = y-intercept (the value of the dependent variable when the independent variable is absent, i.e. zero; also called the CONSTANT OF REGRESSION)
b = regression coefficient or slope
x → y denotes that x influences y.
The change produced in the dependent variable when the independent variable changes by one unit is called the coefficient of regression.

Example: attitude towards education and parental encouragement

Coefficients(a)

                                 Unstandardized Coefficients   Standardized Coefficients
Model 1                          B        Std. Error           Beta                        t        Sig.
(Constant)                       35.314   3.134                                            11.266   .000
PARENTAL ENCOURAGEMENT SCALE     .840     .075                 .542                        11.123   .000

a. Dependent Variable: ATTITUDE TOWARDS EDUCATION

INTERPRETATION
y = a + bx
Attitude towards education = 35.314 + 0.840 (parental encouragement)
If parental encouragement = 0, attitude towards education = 35.314.
When parental encouragement is increased by 1 unit, the attitude towards education increases by 0.84 units.

TESTING
In general the rule for the regression coefficient is:
H0: the model is not valid for the population
H1: the model is valid for the entire population
OR
H0: R² = 0
H1: R² not equal to 0
The question is whether the equation developed from the sample can be claimed to apply to the entire population. This is answered by interpreting the ANOVA table. Here p = .000, which is < 0.05, which means that the model is valid for the population.
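The fitted equation can be used directly for prediction. A minimal sketch with the coefficients from the Coefficients table (the function name and the input value 10 are made up for illustration):

```python
INTERCEPT = 35.314   # a: the (Constant) row, column B
SLOPE = 0.840        # b: PARENTAL ENCOURAGEMENT SCALE, column B


def predict_attitude(parental_encouragement):
    """Predicted attitude towards education from y = a + bx."""
    return INTERCEPT + SLOPE * parental_encouragement


at_zero = predict_attitude(0)                               # the y-intercept
one_unit_change = predict_attitude(11) - predict_attitude(10)  # equals the slope
```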

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .542(a)   .293       .291                10.835

a. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE

In simple regression, R is the correlation coefficient itself.
R² = coefficient of determination = the percentage of variance of the dependent variable explained by the independent variable.
Interpretation of the above table: R² is always expressed as a percentage = 29.3%, i.e. parental encouragement explains only 29.3% of the variance in attitude towards education.
If the coefficient is negative: an example of simple regression.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .509(a)   .260       .257                11.091

a. Predictors: (Constant), SOCIO-ECONOMIC-STATUS

26% of the variance can be explained.

But:

Coefficients(a)

                          Unstandardized Coefficients   Standardized Coefficients
Model 1                   B        Std. Error           Beta                        t         Sig.
(Constant)                87.648   1.890                                            46.384    .000
SOCIO-ECONOMIC-STATUS     -.239    .023                 -.509                       -10.221   .000

a. Dependent Variable: ATTITUDE TOWARDS EDUCATION

Hence, when SES increases by one unit, attitude towards education decreases by 0.239 units.

MULTIPLE REGRESSION
y = a + b1x1 + b2x2
b1 = change in y when x1 increases by 1 unit, provided x2 is constant
b2 = change in y when x2 increases by 1 unit, provided x1 is constant

x1 → y ← x2 (x1 and x2 both influence y)

For example, Marks = 5 + 3 (hours of preparation) + 4 (number of books referred)
Preparation = 0, books = 0, then M = 5
Preparation = 1, books = 1, then M = 12
Preparation = 1, books = 2, then M = 16
Preparation = 1, books = 3, then M = 20
The coefficient is the change produced in the dependent variable (y) corresponding to a one-unit change in one independent variable (number of books) when the other independent variable (preparation) remains constant.
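The marks example can be run as code to see the "other variable held constant" reading of a coefficient. A small sketch (the function name is illustrative):

```python
def predicted_marks(hours_of_preparation, books_referred):
    """Marks = 5 + 3 * hours of preparation + 4 * number of books referred."""
    return 5 + 3 * hours_of_preparation + 4 * books_referred


# The four cases listed in the notes, as (preparation, books) pairs.
cases = [(0, 0), (1, 1), (1, 2), (1, 3)]
marks = [predicted_marks(hours, books) for hours, books in cases]

# Holding preparation at 1, each extra book adds b2 = 4 marks.
step_per_book = predicted_marks(1, 3) - predicted_marks(1, 2)
```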

INTERPRETATION

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .612(a)   .375       .370                10.210

a. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE, HOME CLIMATE SCALE SCORE

Coefficient of multiple determination (when all independent variables are used): 37.5% of the variance can be explained.
When we add more independent variables to the model, the R² value tends to increase. To nullify that effect, the adjusted R² value is introduced.
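The usual formula for adjusted R² is 1 - (1 - R²)(n - 1)/(n - k - 1), with n cases and k predictors. A sketch that checks it against the Model Summary tables in these notes; n = 300 is an assumption taken from the N of the attitude data, and the inputs are the rounded R² values, so agreement is only to about three decimals.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Adjusted R-squared: penalises R-squared for the number of predictors."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)


N_CASES = 300  # assumption: same sample size as in the attitude tables

adj_one_predictor = adjusted_r_squared(0.293, N_CASES, 1)     # table shows .291
adj_two_predictors = adjusted_r_squared(0.375, N_CASES, 2)    # table shows .370
adj_eight_predictors = adjusted_r_squared(0.557, N_CASES, 8)  # table shows .545
```

The gap between R² and adjusted R² grows with the number of predictors, which is the correction the notes describe.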

Coefficients(a)

                                 Unstandardized Coefficients   Standardized Coefficients
Model 1                          B        Std. Error           Beta                        t       Sig.
(Constant)                       19.893   3.858                                            5.156   .000
HOME CLIMATE SCALE SCORE         .321     .052                 .298                        6.212   .000
PARENTAL ENCOURAGEMENT SCALE     .707     .074                 .456                        9.524   .000

a. Dependent Variable: ATTITUDE TOWARDS EDUCATION

B coefficients are not directly comparable, as the 2 independent variables may be measured on different scales. Hence we rely on the standardised regression coefficients, or Beta. These are used for comparison, to find which independent variable has more effect on the dependent variable.

Now increase the number of independent variables in SPSS; here we took 8 variables.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .746(a)   .557       .545                8.680


a. Predictors: (Constant), Non Verbal Intelligence Score, PARENTAL ENCOURAGEMENT SCALE, SOCIO-ECONOMIC-STATUS, PERSONAL DEVELOPMENT, HOME CLIMATE SCALE SCORE, SOCIO-CULTURAL INFLUENCE SCORE, INFLUENCE OF MASS MEDIA, RELATIONSHIP DIMENSION

Coefficients(a)

                                   Unstandardized Coefficients   Standardized Coefficients
Model 1                            B        Std. Error           Beta                        t        Sig.
(Constant)                         23.206   6.232                                            3.724    .000
HOME CLIMATE SCALE SCORE           .201     .046                 .186                        4.326    .000
PARENTAL ENCOURAGEMENT SCALE       .525     .071                 .339                        7.428    .000
INFLUENCE OF MASS MEDIA            .016     .052                 .014                        .309     .757
SOCIO-CULTURAL INFLUENCE SCORE     .039     .053                 .033                        .748     .455
RELATIONSHIP DIMENSION             .253     .067                 .175                        3.765    .000
PERSONAL DEVELOPMENT               .121     .095                 .055                        1.282    .201
SOCIO-ECONOMIC-STATUS              -.158    .020                 -.337                       -7.894   .000
Non Verbal Intelligence Score      .153     .053                 .121                        2.873    .004

a. Dependent Variable: ATTITUDE TOWARDS EDUCATION

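Here, for three of the variables (INFLUENCE OF MASS MEDIA, SOCIO-CULTURAL INFLUENCE SCORE and PERSONAL DEVELOPMENT) the Sig. value is > 0.05, and hence these variables cannot be generalised to the entire population.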
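Reading the Sig. column against the 0.05 cut-off can be sketched as a simple filter. The Sig. values are those of the 8-variable Coefficients table; the function name is illustrative, and this is only the decision rule from the notes, not the stepwise procedure SPSS runs.

```python
ALPHA = 0.05

# Sig. value of each predictor in the 8-variable model.
sig_values = {
    "HOME CLIMATE SCALE SCORE": 0.000,
    "PARENTAL ENCOURAGEMENT SCALE": 0.000,
    "INFLUENCE OF MASS MEDIA": 0.757,
    "SOCIO-CULTURAL INFLUENCE SCORE": 0.455,
    "RELATIONSHIP DIMENSION": 0.000,
    "PERSONAL DEVELOPMENT": 0.201,
    "SOCIO-ECONOMIC-STATUS": 0.000,
    "Non Verbal Intelligence Score": 0.004,
}


def split_by_significance(p_values, alpha=ALPHA):
    """Return (significant, not significant) predictor names."""
    significant = [name for name, p in p_values.items() if p < alpha]
    rejected = [name for name, p in p_values.items() if p >= alpha]
    return significant, rejected


significant, rejected = split_by_significance(sig_values)
```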

Since 3 of these independent variables got rejected, it is better to go forward to SEM.
Changing the Method to Stepwise and running the analysis again:

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .542(a)   .293       .291                10.835
2       .688(b)   .473       .469                9.376
3       .720(c)   .518       .513                8.978
4       .734(d)   .539       .533                8.793
5       .744(e)   .554       .546                8.669

a. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE
b. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE, SOCIO-ECONOMIC-STATUS
c. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE, SOCIO-ECONOMIC-STATUS, HOME CLIMATE SCALE SCORE
d. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE, SOCIO-ECONOMIC-STATUS, HOME CLIMATE SCALE SCORE, RELATIONSHIP DIMENSION
e. Predictors: (Constant), PARENTAL ENCOURAGEMENT SCALE, SOCIO-ECONOMIC-STATUS, HOME CLIMATE SCALE SCORE, RELATIONSHIP DIMENSION, Non Verbal Intelligence Score

Coefficients(a)

                                          Unstandardized Coefficients   Standardized Coefficients
Model                                     B        Std. Error           Beta                        t         Sig.
1   (Constant)                            35.314   3.134                                            11.266    .000
    PARENTAL ENCOURAGEMENT SCALE          .840     .075                 .542                        11.123    .000
2   (Constant)                            55.255   3.360                                            16.443    .000
    PARENTAL ENCOURAGEMENT SCALE          .726     .066                 .468                        10.956    .000
    SOCIO-ECONOMIC-STATUS                 -.202    .020                 -.430                       -10.050   .000
3   (Constant)                            41.723   4.113                                            10.145    .000
    PARENTAL ENCOURAGEMENT SCALE          .636     .066                 .410                        9.675     .000
    SOCIO-ECONOMIC-STATUS                 -.183    .020                 -.391                       -9.389    .000
    HOME CLIMATE SCALE SCORE              .244     .046                 .226                        5.283     .000
4   (Constant)                            35.682   4.349                                            8.205     .000
    PARENTAL ENCOURAGEMENT SCALE          .539     .070                 .347                        7.737     .000
    SOCIO-ECONOMIC-STATUS                 -.169    .020                 -.361                       -8.683    .000
    HOME CLIMATE SCALE SCORE              .207     .046                 .192                        4.481     .000
    RELATIONSHIP DIMENSION                .247     .067                 .172                        3.686     .000
5   (Constant)                            27.940   4.970                                            5.621     .000
    PARENTAL ENCOURAGEMENT SCALE          .533     .069                 .344                        7.771     .000
    SOCIO-ECONOMIC-STATUS                 -.166    .019                 -.353                       -8.601    .000
    HOME CLIMATE SCALE SCORE              .210     .046                 .194                        4.595     .000
    RELATIONSHIP DIMENSION                .264     .066                 .183                        3.971     .000
    Non Verbal Intelligence Score         .153     .050                 .121                        3.079     .002

a. Dependent Variable: ATTITUDE TOWARDS EDUCATION

Here everything is significant. Effectively, the stepwise method selects only those variables which are significant, and the insignificant ones are removed.
