Chapter
SPSS MANUAL
Crosstabulations
If the dependent variable is nominal or ordinal, you cannot analyze
mean differences; instead, you run crosstabs to test hypotheses.
Thus far, I have discussed statistical tests that you perform when your
dependent variable is either interval or ratio. But what if your dependent
variable is nominal or ordinal? Suppose the managers at Visa wanted to test
whether there is a relationship between the region of the United States where
a respondent resides and whether he or she has a regular Visa card. In this
example, we cannot compare means because the data are nominal, so we turn to
crosstabs and the chi-square test to examine this hypothesis.
This example is from the NFO file in the SPSS Student Assistant. The basic
elements of a crosstabulation design are the counts of the number of cases in each cell
of a crosstab table (which will be explained in a moment). For example, in a hypothetical,
ideal state, there would be an equal percentage of regular Visa cardholders in each
specified region of America. There would be no differences based upon a person's
region of residence; this becomes your null hypothesis. See Figure 9.1.
Figure 9.2 represents a crosstabulation table. You will note that 65.2% of
the respondents who reside in the West North Central part of the U.S. have a
regular Visa card, while 33.3% of the respondents who reside in the East South
Central part of the U.S. have a regular Visa card. Can you explain the difference?
The question we need to begin asking ourselves is whether there is a relationship
between a consumer's region of residence and regular Visa card ownership. To
test hypotheses about data that are counts (nominal and ordinal), you need to
compute a chi-square statistic.
                                      Geographic region
                                      New      Middle    East North  West North  South     East South  West South
VISA Card Regular                     England  Atlantic  Central     Central     Atlantic  Central     Central     Mountain
No     Count                          11       18        25          8           23        14          23          10
       % within Geographic region     61.1%    40.0%     40.3%       34.8%       46.9%     66.7%       63.9%       52.6%
Yes    Count                          7        27        37          15          26        7           13          9
       % within Geographic region     38.9%    60.0%     59.7%       65.2%       53.1%     33.3%       36.1%       47.4%
Total  Count                          18       45        62          23          49        21          36          19
       % within Geographic region     100.0%   100.0%    100.0%      100.0%      100.0%    100.0%      100.0%      100.0%
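To see where the percentages in the crosstab come from, the short Python sketch below recomputes the "% within Geographic region" rows from the counts. It is only an illustration and not part of the SPSS procedure; the variable and function names are made up for this example, and only the eight regions visible in the table above are included.

```python
# Counts copied from the crosstab above (only the eight regions shown there).
regions = [
    "New England", "Middle Atlantic", "East North Central", "West North Central",
    "South Atlantic", "East South Central", "West South Central", "Mountain",
]
no_counts = [11, 18, 25, 8, 23, 14, 23, 10]
yes_counts = [7, 27, 37, 15, 26, 7, 13, 9]


def column_percentages(no, yes):
    """Return (% No, % Yes) within each column, rounded to one decimal place."""
    result = []
    for n, y in zip(no, yes):
        total = n + y
        result.append((round(100 * n / total, 1), round(100 * y / total, 1)))
    return result


percentages = column_percentages(no_counts, yes_counts)
for region, (pct_no, pct_yes) in zip(regions, percentages):
    print(f"{region:<20} No {pct_no:5.1f}%   Yes {pct_yes:5.1f}%")
```

Each percentage is simply the cell count divided by its column total, which is why every column adds up to 100%.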
• Select your dependent variable and select the right arrow next to the
Row(s) box. Your variable will appear in the box.
• Select your independent variable and select the right arrow next to the
Column(s) box. Your variable will appear in the box.
Figure 9.5
Cells
After you press Continue, you are back at the Chi-Square box, as seen in Figure 9.4.
Left-click the button marked Cells. The Cells box appears; see Figure 9.6.
Chi-Square Tests
                                Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square              19.024(a)    8   .015
Likelihood Ratio                19.544       8   .012
Linear-by-Linear Association      .011       1   .916
N of Valid Cases                300
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 8.28.
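The "Asymp. Sig." column is the upper-tail probability of the chi-square statistic for the given degrees of freedom, where df = (rows - 1) x (columns - 1). As a rough sketch of how that number relates to the statistic, the Python snippet below evaluates the chi-square tail probability with a closed form that holds only when df is even. The function name and the .05 cut-off constant are assumptions made for this illustration; SPSS does this calculation for you.

```python
import math


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail p-value of a chi-square statistic; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive, even df")
    half = statistic / 2.0
    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


ALPHA = 0.05  # conventional significance level assumed in this chapter

# Pearson chi-square and df taken from the output table above.
p_value = chi_square_p_value_even_df(19.024, 8)
decision = "reject the null" if p_value < ALPHA else "fail to reject the null"
print(f"p = {p_value:.3f}: {decision}")
```

With the values from the table, the result should agree with the significance that SPSS reports for the Pearson chi-square, and since it falls below .05 the null hypothesis is rejected.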
Additional output is provided when you perform a chi-square test, such as the
Likelihood Ratio, Linear-by-Linear Association and Fisher's Exact test. For the
purpose of this manual, you should focus on the Pearson chi-square result. However,
keep one important note in mind: all of the other tests should give you the same
result, either reject the null or fail to reject the null. If any of the other results
conflict with the Pearson, then you should fail to reject the null hypothesis. It is
better to be safe than sorry.
CHI-SQUARE TEST WARNING
Suppose I wanted to perform the same type of test for American Express
management. Namely, I want to test whether there is a relationship between a
consumer’s region of residence and green American Express card ownership. Again, I
perform a chi-square test; the SPSS output is provided to you in Figure 9.8.
Figure 9.8
Chi-Square Tests
                                Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square              10.862(a)    8   .210
Likelihood Ratio                12.374       8   .135
Linear-by-Linear Association     1.212       1   .271
N of Valid Cases                300
a. 9 cells (50.0%) have expected count less than 5. The minimum expected count is 1.14.
The output seems all right, doesn't it? In fact, you may be inclined to fail to reject the
null hypothesis based upon the Pearson chi-square significance of .210. But wait: look at
the note under the table. 50% of the cells in the crosstabulation table have expected
counts of less than 5. As a result, this test has to be voided. In order for a
chi-square test to be valid, no more than 20% of the cells can have
expected counts below 5.
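The footnote that SPSS prints under the table can be reproduced by hand: the expected count of a cell is its row total times its column total divided by the number of valid cases. The sketch below applies the 20% rule in Python to the male layer of the Circle K crosstab that appears later in this chapter (Figure 9.12). The function names are hypothetical and the code is only an illustration of the rule.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def share_of_small_cells(table, threshold=5.0):
    """Return (number of cells, percent of cells, minimum) for expected counts."""
    expected = [e for row in expected_counts(table) for e in row]
    small = sum(1 for e in expected if e < threshold)
    return small, 100.0 * small / len(expected), min(expected)


# Male layer of Figure 9.12: rows = Do Not Use / Use Regularly,
# columns = Full-Time / Part-Time.
male_table = [[7, 3], [32, 19]]

small, percent, minimum = share_of_small_cells(male_table)
test_is_valid = percent <= 20.0  # the 20% rule stated above
print(f"{small} cells ({percent:.1f}%) below 5; minimum expected count {minimum:.2f}")
```

Note that in a 2x2 table a single cell with a low expected count already amounts to 25% of the cells, so the rule is strict for small tables.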
Correcting the problem. You basically have two choices if you have too many cells
with low expected counts. First, recode some of your data into fewer categories. For
instance, I could create 4 regions such as East, South, North and West. However, I would
lose a great deal of precision. The second choice is to increase the sample size. You
simply need to find more respondents who fall into the cells with low expected counts.
Both budget and time constraints come into play in this decision. If you need an
answer right away, then condense variables by recoding and indexing. If you can wait a
bit, then increase your sample size.
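Recoding into fewer categories amounts to adding up the columns that belong to the same broader group. The Python sketch below shows the idea. Because the American Express counts are not printed in this chapter, it borrows the Visa counts from Figure 9.2 purely for illustration, and the assignment of regions to East, North, South and West is a hypothetical grouping that the manual does not define.

```python
# Counts copied from the Visa crosstab in this chapter: region -> (No, Yes).
counts = {
    "New England": (11, 7),
    "Middle Atlantic": (18, 27),
    "East North Central": (25, 37),
    "West North Central": (8, 15),
    "South Atlantic": (23, 26),
    "East South Central": (14, 7),
    "West South Central": (23, 13),
    "Mountain": (10, 9),
}

# HYPOTHETICAL grouping, for illustration only; the manual does not define it.
GROUPS = {
    "East": ["New England", "Middle Atlantic"],
    "North": ["East North Central", "West North Central"],
    "South": ["South Atlantic", "East South Central", "West South Central"],
    "West": ["Mountain"],
}


def collapse(counts_by_region, groups):
    """Sum the (No, Yes) counts of every region assigned to each broader group."""
    collapsed = {}
    for group, members in groups.items():
        no_total = sum(counts_by_region[m][0] for m in members)
        yes_total = sum(counts_by_region[m][1] for m in members)
        collapsed[group] = (no_total, yes_total)
    return collapsed


collapsed = collapse(counts, GROUPS)
for group, (no_total, yes_total) in collapsed.items():
    print(f"{group:<6} No {no_total:3d}   Yes {yes_total:3d}")
```

Fewer, larger categories raise the expected count in each cell, at the cost of the regional detail mentioned above.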
The following example is based on the “Friendly” data file that is located on your SPSS
Student Assistant. Let’s say that we want to investigate the example above. When you
add a control variable to a test, you essentially have three hypotheses to test. These
include the following:
3. Ho: Gender does not affect the relationship between Circle K patronage
and work status.
Ha: Gender does affect the relationship between Circle K patronage and
work status.
Significant Factors:
Respondent's Sex
                                 Frequency   Percent
Valid    Male                           76      46.9
         Female                         86      53.1
         Total                         162     100.0

Use Circle K
                                 Frequency   Percent
Valid    Do Not Use Regularly           39      24.1
         Use Regularly                 123      75.9
         Total                         162     100.0

Work Status
                                 Frequency   Percent
Valid    Full-Time                      88      54.3
         Part-Time                      45      27.8
         Total                         133      82.1
Missing  Retired/Do Not Work            29      17.9
Total                                  162     100.0
Control variable
Figure 9.9 represents the Crosstabs dialog box with the control variable.
Working with a control variable is a 3-step process.
First hypothesis: we fail to reject the null hypothesis; there is not a significant
relationship between work status and Circle-K patronage, because the Pearson
significance is above .05 (see Figure 9.10).
Chi-Square Tests
                                Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square               .278(b)     1   .598
Continuity Correction(a)         .093        1   .760
Likelihood Ratio                 .274        1   .600
Linear-by-Linear Association     .276        1   .599
N of Valid Cases                 133
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.81.
Figure 9.10
Second hypothesis: we again fail to reject the null hypothesis; there is not a significant
relationship between gender and Circle-K patronage. See Figure 9.11.
                                                              Respondent's Sex
Use Circle K                                                  Male     Female   Total
Do Not Use Regularly   Count                                  15       24       39
                       % within Respondent's Sex              19.7%    27.9%    24.1%
Use Regularly          Count                                  61       62       123
                       % within Respondent's Sex              80.3%    72.1%    75.9%
Total                  Count                                  76       86       162
                       % within Respondent's Sex              100.0%   100.0%   100.0%
Chi-Square Tests
                                Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square              1.473(b)     1   .225
Continuity Correction(a)        1.060        1   .303
Likelihood Ratio                1.486        1   .223
Linear-by-Linear Association    1.464        1   .226
N of Valid Cases                162
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 18.30.
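For a 2x2 table, the Pearson chi-square and its continuity-corrected version can be checked by hand with the shortcut formula N(ad - bc)^2 divided by the product of the four marginal totals. The Python sketch below applies it to the gender by Circle K counts shown above. The function name is made up for this example, and the p-value shortcut through the complementary error function holds only for 1 degree of freedom.

```python
import math


def chi_square_2x2(table, continuity=False):
    """Pearson chi-square (optionally continuity-corrected) and p-value, df = 1."""
    (a, b), (c, d) = table
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        # Continuity correction: shrink |ad - bc| by N/2, never below zero.
        diff = max(0.0, diff - n / 2.0)
    statistic = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Rows: Do Not Use Regularly / Use Regularly; columns: Male / Female.
gender_table = [[15, 24], [61, 62]]

pearson, pearson_p = chi_square_2x2(gender_table)
corrected, corrected_p = chi_square_2x2(gender_table, continuity=True)
print(f"Pearson chi-square {pearson:.3f}, sig. {pearson_p:.3f}")
print(f"Continuity correction {corrected:.3f}, sig. {corrected_p:.3f}")
```

Both significance values are above .05, which is why the second null hypothesis is not rejected.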
If the Pearson significance for either control variable group (male or female) had been
below .05, then we would have rejected the null hypothesis; gender would have influenced
the relationship between Circle-K patronage and work status.
                                                                       Work Status
Respondent's Sex  Use Circle K                                         Full-Time  Part-Time  Total
Male              Do Not Use Regularly   Count                         7          3          10
                                         % within Work Status          17.9%      13.6%      16.4%
                  Use Regularly          Count                         32         19         51
                                         % within Work Status          82.1%      86.4%      83.6%
                  Total                  Count                         39         22         61
                                         % within Work Status          100.0%     100.0%     100.0%
Female            Do Not Use Regularly   Count                         11         8          19
                                         % within Work Status          22.4%      34.8%      26.4%
                  Use Regularly          Count                         38         15         53
                                         % within Work Status          77.6%      65.2%      73.6%
                  Total                  Count                         49         23         72
                                         % within Work Status          100.0%     100.0%     100.0%
Chi-Square Tests
Respondent's Sex                                Value       df   Asymp. Sig. (2-sided)
Male      Pearson Chi-Square                     .191(b)     1   .662
          Continuity Correction(a)               .006        1   .939
          Likelihood Ratio                       .195        1   .659
          Linear-by-Linear Association           .188        1   .665
          N of Valid Cases                       61
Female    Pearson Chi-Square                    1.226(c)     1   .268
          Continuity Correction(a)               .673        1   .412
          Likelihood Ratio                      1.192        1   .275
          Linear-by-Linear Association          1.209        1   .272
          N of Valid Cases                       72
a. Computed only for a 2x2 table
b. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 3.61.
c. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.07.
Figure 9.12 Chi-Square Test with control
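Adding a control variable means running the same chi-square test once per group of that variable. As an illustration, the Python sketch below repeats the 2x2 Pearson test for the male and female layers and then applies the chapter's rule that the third null hypothesis is rejected if any layer is significant. The names and the .05 constant are assumptions of this example; the p-value shortcut is valid only for 1 degree of freedom.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square and p-value (df = 1) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


ALPHA = 0.05  # significance level assumed throughout this chapter

# One 2x2 table per level of the control variable (Figure 9.12).
# Rows: Do Not Use Regularly / Use Regularly; columns: Full-Time / Part-Time.
layers = {
    "Male": [[7, 3], [32, 19]],
    "Female": [[11, 8], [38, 15]],
}

results = {sex: pearson_chi_square_2x2(table) for sex, table in layers.items()}
for sex, (statistic, p_value) in results.items():
    print(f"{sex:<6} Pearson chi-square {statistic:.3f}, sig. {p_value:.3f}")

# Reject the third null hypothesis if ANY layer is significant.
reject_third_hypothesis = any(p < ALPHA for _, p in results.values())
print("Reject third null hypothesis:", reject_third_hypothesis)
```

Neither layer falls below .05 here, so gender is not found to affect the relationship between Circle-K patronage and work status.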
Note
If you reject the null hypothesis for any group within your control
variable, then you reject the entire null hypothesis. For example, if we
had found that the Pearson chi-square significance was .03 for women
and .07 for men, we would have rejected the third hypothesis.