
Chapter 9
S P S S M A N U A L

Crosstabulations
If the dependent variable is nominal or ordinal, you cannot analyze
mean differences; instead, we run crosstabs to test hypotheses.

Thus far, I have discussed statistical tests that you perform when your
dependent variable is either interval or ratio. But what if your dependent
variable is nominal or ordinal? Suppose the managers at Visa wanted to test
whether there is a relationship between the region of the United States where
a respondent resides and whether he or she has a regular Visa card. In this
example, we cannot compare means because the data are nominal, so we turn to
crosstabs and the chi-square test to examine this hypothesis.

This example is from the NFO file in the SPSS Student Assistant. The basic
elements of a crosstabulation design are the counts of the number of cases in each cell
of a crosstab table (which will be explained in a moment). For example, if region of
residence made no difference, there would be an equal percentage of regular Visa
cardholders in each specified region of America. There should be no differences based
upon a person’s region of residence; this will become your null hypothesis. See Figure 9.1.

VISA Card Regular

                  Frequency   Percent
Valid   No              138      46.0
        Yes             162      54.0
        Total           300     100.0

Geographic region

                               Frequency   Percent
Valid   New England                   18       6.0
        Middle Atlantic               45      15.0
        East North Central            62      20.7
        West North Central            23       7.7
        South Atlantic                49      16.3
        East South Central            21       7.0
        West South Central            36      12.0
        Mountain                      19       6.3
        Pacific                       27       9.0
        Total                        300     100.0

Figure 9.1 Frequency Distributions


Figure 9.2 represents a crosstabulation table. You will note that 65.2% of
the respondents who reside in the West North Central part of the U.S. have a
regular Visa card, while 33.3% of the respondents who reside in the East South
Central part of the U.S. have a regular Visa card. Can you explain the difference?
The question we need to begin asking ourselves is whether there is a relationship
between a consumer’s region of residence and regular Visa card ownership. To
test hypotheses about data that are counts (nominal and ordinal), you need to
compute a chi-square statistic.

VISA Card Regular * Geographic region Crosstabulation

                                                                    Geographic region
                                         New     Middle East North West North      South East South West South
VISA Card Regular                    England   Atlantic    Central    Central   Atlantic    Central    Central   Mountain
No     Count                              11         18         25          8         23         14         23         10
       % within Geographic region      61.1%      40.0%      40.3%      34.8%      46.9%      66.7%      63.9%      52.6%
Yes    Count                               7         27         37         15         26          7         13          9
       % within Geographic region      38.9%      60.0%      59.7%      65.2%      53.1%      33.3%      36.1%      47.4%
Total  Count                              18         45         62         23         49         21         36         19
       % within Geographic region     100.0%     100.0%     100.0%     100.0%     100.0%     100.0%     100.0%     100.0%

Figure 9.2 Crosstabulation Table
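The column percentages in Figure 9.2 can be reproduced from the cell counts. The following is a minimal Python sketch rather than SPSS output; the variable names are my own, and only the eight regions visible in the figure are included.

```python
# Counts copied from Figure 9.2 (only the regions shown in the figure).
regions = ["New England", "Middle Atlantic", "East North Central",
           "West North Central", "South Atlantic", "East South Central",
           "West South Central", "Mountain"]
no_counts = [11, 18, 25, 8, 23, 14, 23, 10]
yes_counts = [7, 27, 37, 15, 26, 7, 13, 9]

# Column percentage = cell count / column total * 100.
yes_pct = {}
for region, no, yes in zip(regions, no_counts, yes_counts):
    total = no + yes
    yes_pct[region] = round(100 * yes / total, 1)
    print(f"{region:<20} n={total:>3}  Yes={yes_pct[region]:.1f}%")
```

Each percentage is computed within a column (region), which is why every column in the figure adds up to 100%.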

Performing a Chi-Square Test with a Crosstab


In this test, your dependent variable must be nominal or ordinal. Although you could
perform the chi-square test on higher-level data, you should always perform a t-test or
ANOVA when you have interval or ratio data.

Beginning the test


First, select Analyze on the menu bar, then Descriptive Statistics, and then
Crosstabs. See Figure 9.3.


Figure 9.3 Crosstab

The Crosstab dialog box


After you select Crosstabs, the dialog box, as shown in Figure 9.4, appears.

Figure 9.4 Crosstab Dialog Box

• Select your dependent variable and select the right arrow next to the
Row(s) box. Your variable will appear in the box.


• Select your independent variable and select the right arrow next to the
Column(s) box. Your variable will appear in the box.

• How do you know which variable is which? Think of your dependent
variable as the variable of interest. In other words, ask yourself why
you are doing the test. Do you want to investigate whether people possess
regular Visa cards, or do you want to investigate regions of the U.S.?
Visa management is concerned about its card, so consider card ownership
your dependent variable.

• Variables such as gender, marital status, income categories, religion, and
other demographics are usually independent variables in marketing
research. After all, marketers do not usually expend resources just to gauge
consumers’ gender or marital status. These demographics typically
appear at the end of a survey.

The next steps


The next step is to left-click on the button marked Statistics; see
Figure 9.4. The dialog box in Figure 9.5 appears, and you need to click in the box
next to Chi-Square. You can then click on Continue.

Figure 9.5

Cells

After you press Continue, you are back at the Crosstabs dialog box shown in Figure 9.4.
Left-click on the button marked Cells. The Cells dialog box appears; see Figure 9.6.


In this box, you need to check the Percentages box. SPSS needs to know
where your independent variable is located. If you follow my directions, your
independent variable should always be in the column.

All you need to do here is left-click the box next to Column in the Percentages area.
You can then select Continue.

Figure 9.6 Cell Dialog Box

The Write-Up Procedure for a Chi-Square Test


Ho: There is not a significant relationship between a consumer’s region of residence and
regular Visa card ownership.
Ha: There is a significant relationship between a consumer’s region of residence and
regular Visa card ownership.
Test: Chi-Square
Confidence Level: 95%
Significant Factor: .015 (Pearson Chi-Square)
Conclusion: Reject the null hypothesis because .015 is below .05; there is a significant
relationship between a consumer’s region of residence and regular Visa card ownership.
See Figure 9.7 for the Chi-Square test output.

Chi-Square Tests

                                  Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square               19.024(a)     8                    .015
Likelihood Ratio                 19.544        8                    .012
Linear-by-Linear Association       .011        1                    .916
N of Valid Cases                    300

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 8.28.

Figure 9.7 Chi-Square Test Output
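To see where the Pearson value comes from, the statistic can be computed by hand from the crosstab counts. The following is a minimal Python sketch, not the SPSS procedure. Note one assumption: Figure 9.2 does not show the Pacific column, so the counts 6 (No) and 21 (Yes) used here are derived from the totals in Figure 9.1 (138 No, 162 Yes, 27 Pacific respondents). The function names are my own.

```python
import math

# Observed counts: rows are No / Yes, columns are the nine regions.
# The first eight columns come from Figure 9.2. The Pacific column (6, 21)
# is an ASSUMPTION derived from the Figure 9.1 totals.
observed = [
    [11, 18, 25, 8, 23, 14, 23, 10, 6],
    [7, 27, 37, 15, 26, 7, 13, 9, 21],
]

def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

def chi_square_p_value_even_df(chi2, df):
    """Upper-tail chi-square probability; this closed form holds for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form only holds for even df")
    half = chi2 / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

chi2, df = pearson_chi_square(observed)
p_value = chi_square_p_value_even_df(chi2, df)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
print("Reject Ho" if p_value < 0.05 else "Fail to reject Ho")
```

If the assumed Pacific counts are right, the statistic and significance should agree with the Pearson row of Figure 9.7 after rounding.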


Additional output is provided when you perform a chi-square test, such as the
Likelihood Ratio, Linear-by-Linear Association, and Fisher’s Exact test. For the
purpose of this manual, you should focus on the Pearson chi-square result. However,
keep one important note in mind: all of the other tests should give you the same
result, either reject the null or fail to reject the null. If any of the other results conflict
with the Pearson, then you should fail to reject the null hypothesis. It is better to be
safe than sorry.

CHI-SQUARE TEST WARNING

Suppose I wanted to perform the same type of test for American Express
management. Namely, I want to test whether there is a relationship between a
consumer’s region of residence and green American Express card ownership. Again, I
perform a chi-square test; the SPSS output is provided to you in Figure 9.8.

Figure 9.8

Chi-Square Tests

                                  Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square               10.862(a)     8                    .210
Likelihood Ratio                 12.374        8                    .135
Linear-by-Linear Association      1.212        1                    .271
N of Valid Cases                    300

a. 9 cells (50.0%) have expected count less than 5. The minimum expected count is 1.14.

The output seems all right, doesn’t it? In fact, you may be inclined to fail to reject the
null hypothesis based upon the Pearson Chi-Square significance of .210. But wait: look
at the note under the table. 50% of the cells in the crosstabulation table have an
expected count of less than 5. As a result, this test has to be voided. In order for a
chi-square test to be valid, no more than 20% of the cells can have an expected
count below 5.
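The same check that SPSS prints under the table can be sketched in a few lines of Python. Because the American Express crosstab itself is not reproduced in this chapter, the sketch below uses the Visa counts instead; as before, the Pacific column (6 and 21) is an assumption derived from the totals in Figure 9.1, and the names are my own.

```python
# Rows are No / Yes, columns are the nine regions. The Pacific column (6, 21)
# is an ASSUMPTION derived from the Figure 9.1 totals.
observed = [
    [11, 18, 25, 8, 23, 14, 23, 10, 6],
    [7, 27, 37, 15, 26, 7, 13, 9, 21],
]

def expected_counts(table):
    """Expected count for each cell = row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

cells = [e for row in expected_counts(observed) for e in row]
share_low = sum(1 for e in cells if e < 5) / len(cells)
min_expected = min(cells)

# Rule used in this manual: at most 20% of cells may have an expected count below 5.
test_is_usable = share_low <= 0.20

print(f"{share_low:.1%} of cells have expected count below 5")
print(f"minimum expected count = {min_expected:.2f}")
print("Test is usable" if test_is_usable else "Test has to be voided")
```

For the Visa table, the result should line up with note a in Figure 9.7. A table like the American Express one, with 50% of cells below 5, would fail the same check.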
Correcting the problem. You basically have two choices if you have too many cells
with low expected counts. First, recode some of your data into fewer categories. For
instance, I could create four regions such as East, South, North, and West. However, I
will lose a great deal of precision. The second choice is to increase the sample size. You
simply need to find more respondents who fall into the cells with low expected counts.
Both budget and time constraints come into play in this decision. If you need an
answer right away, then condense variables by recoding and indexing. If you can wait a
bit, then increase your sample size.
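Recoding into fewer categories can be sketched as a simple lookup. The Python below uses the region frequencies from Figure 9.1; the assignment of the nine regions to East, North, South, and West is a hypothetical grouping chosen for illustration, not one prescribed by the data file.

```python
# Region frequencies from Figure 9.1.
region_counts = {
    "New England": 18, "Middle Atlantic": 45,
    "East North Central": 62, "West North Central": 23,
    "South Atlantic": 49, "East South Central": 21, "West South Central": 36,
    "Mountain": 19, "Pacific": 27,
}

# HYPOTHETICAL recode: which region goes into which broad group is an assumption.
region_to_group = {
    "New England": "East", "Middle Atlantic": "East",
    "East North Central": "North", "West North Central": "North",
    "South Atlantic": "South", "East South Central": "South",
    "West South Central": "South",
    "Mountain": "West", "Pacific": "West",
}

grouped = {}
for region, count in region_counts.items():
    group = region_to_group[region]
    grouped[group] = grouped.get(group, 0) + count

for group, count in grouped.items():
    print(f"{group:<6} {count}")
```

Each broad group now holds far more respondents than the smallest original regions, which raises the expected counts in the crosstab at the cost of regional detail.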


Using Control Variables in Crosstabulations


Control variables are often used in market research to increase the precision of a test.
For instance, suppose that Friendly management, a competitor of Circle K, believes
that there is a relationship between a person’s work status and whether they shop at
Circle K. The rumor on the street is that Circle K serves better coffee than Friendly
and commuters may be stopping at Circle K on their way to work in the morning.
Let’s take this test a step further. Does gender influence the relationship at all?
Perhaps it’s only male full-time employees who are stopping at Circle K for coffee;
maybe it’s cigarette smokers. In any case, by making a separate table for Circle K
patronage and work status for men and women, we are treating gender as a control
variable. Similarly, if we created a separate table for Circle K patronage and work
status for cigarette smokers, we would be treating smoking as a control variable.

ADDING CONTROL VARIABLES

The following example is based on the “Friendly” data file that is located on your SPSS
Student Assistant. Let’s say that we want to investigate the example above. When you
add a control variable to a test, you essentially have three hypotheses to test. These
include the following:

1. Ho: There is not a significant relationship between Circle K patronage and
   work status.

   Ha: There is a significant relationship between Circle K patronage and
   work status.

2. Ho: There is not a significant relationship between Circle K patronage and
   gender.

   Ha: There is a significant relationship between Circle K patronage and gender.

3. Ho: Gender does not affect the relationship between Circle K patronage
   and work status.

   Ha: Gender does affect the relationship between Circle K patronage and
   work status.

Test: Chi-square test

Confidence Level: 95%

Significant Factors:


Creating a control variable in a crosstabulation


In order to test hypotheses 1 and 2, you follow the usual chi-square test procedure
discussed above. However, when you begin your test for hypothesis 3, you make a
slight addition to the Crosstabulation Dialog Box: the control variable goes in the
Layer box (see Figure 9.10). Also, I limited work status to full-time and part-time by
declaring retired respondents as missing values. Figure 9.9 represents the frequency
tables for all three variables.

Respondent's Sex

                    Frequency   Percent
Valid   Male               76      46.9
        Female             86      53.1
        Total             162     100.0

Use Circle K

                                  Frequency   Percent
Valid   Do Not Use Regularly             39      24.1
        Use Regularly                   123      75.9
        Total                           162     100.0

Work Status

                                   Frequency   Percent
Valid     Full-Time                       88      54.3
          Part-Time                       45      27.8
          Total                          133      82.1
Missing   Retired/Do Not Work             29      17.9
Total                                    162     100.0

Figure 9.9 represents the Crosstabulation Dialog box with the control variable.


Working with a control variable is a three-step process.

First hypothesis: we fail to reject the null hypothesis; there is not a significant
relationship between work status and Circle K patronage because the Pearson
significance (.598) is above .05. See Figure 9.10.

Chi-Square Tests

                                  Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square                 .278(b)     1                    .598
Continuity Correction(a)           .093        1                    .760
Likelihood Ratio                   .274        1                    .600
Linear-by-Linear Association       .276        1                    .599
N of Valid Cases                    133

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.81.

Figure 9.10

Second hypothesis: we again fail to reject the null hypothesis; there is not a significant
relationship between gender and Circle K patronage (Pearson significance of .225). See
Figure 9.11.

Use Circle K * Respondent's Sex Crosstabulation

                                                       Respondent's Sex
Use Circle K                                           Male    Female     Total
Do Not Use Regularly   Count                             15        24        39
                       % within Respondent's Sex      19.7%     27.9%     24.1%
Use Regularly          Count                             61        62       123
                       % within Respondent's Sex      80.3%     72.1%     75.9%
Total                  Count                             76        86       162
                       % within Respondent's Sex     100.0%    100.0%    100.0%

Chi-Square Tests

                                  Value       df   Asymp. Sig. (2-sided)
Pearson Chi-Square                1.473(b)     1                    .225
Continuity Correction(a)          1.060        1                    .303
Likelihood Ratio                  1.486        1                    .223
Linear-by-Linear Association      1.464        1                    .226
N of Valid Cases                    162

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 18.30.

Figure 9.11 SPSS Output for Hypothesis 2
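For a 2x2 table such as the one in Figure 9.11, both the Pearson value and the Continuity Correction can be worked out with a shortcut formula. The following is a minimal Python sketch using the counts from the figure; the function and variable names are my own, and the correction shown is the standard Yates formula, which I am assuming is what the Continuity Correction row reports.

```python
import math

# Counts from Figure 9.11:
#                         Male  Female
# Do Not Use Regularly     a=15   b=24
# Use Regularly            c=61   d=62
a, b = 15, 24
c, d = 61, 62
n = a + b + c + d

# Product of the two row totals and the two column totals.
denominator = (a + b) * (c + d) * (a + c) * (b + d)

# Pearson chi-square for a 2x2 table.
pearson = n * (a * d - b * c) ** 2 / denominator

# Yates continuity correction: shrink |ad - bc| by n/2 before squaring.
yates = n * (abs(a * d - b * c) - n / 2) ** 2 / denominator

def p_value_df1(chi2):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

print(f"Pearson    = {pearson:.3f}, p = {p_value_df1(pearson):.3f}")
print(f"Continuity = {yates:.3f}, p = {p_value_df1(yates):.3f}")
```

The corrected statistic is always the smaller of the two, so its significance value is larger; both lead to the same decision here.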


Performing the Control Variable Test


Figure 9.12 illustrates the output for the control variable. Please note how the table
treats gender as a control variable. Look at the Pearson Chi-Square row for each group;
you fail to reject the null hypothesis for both men (.662) and women (.268) because
both values are greater than .05. Gender does not affect the relationship between work
status and Circle K patronage.

If the value for either control variable group (male or female) had been below
.05, then we would have rejected the null hypothesis; gender would have influenced
the relationship between Circle K patronage and work status.

Use Circle K * Work Status * Respondent's Sex Crosstabulation

                                                                       Work Status
Respondent's Sex   Use Circle K                                   Full-Time  Part-Time     Total
Male               Do Not Use Regularly   Count                           7          3        10
                                          % within Work Status        17.9%      13.6%     16.4%
                   Use Regularly          Count                          32         19        51
                                          % within Work Status        82.1%      86.4%     83.6%
                   Total                  Count                          39         22        61
                                          % within Work Status       100.0%     100.0%    100.0%
Female             Do Not Use Regularly   Count                          11          8        19
                                          % within Work Status        22.4%      34.8%     26.4%
                   Use Regularly          Count                          38         15        53
                                          % within Work Status        77.6%      65.2%     73.6%
                   Total                  Count                          49         23        72
                                          % within Work Status       100.0%     100.0%    100.0%

Chi-Square Tests

Respondent's Sex                                  Value       df   Asymp. Sig. (2-sided)
Male               Pearson Chi-Square              .191(b)     1                    .662
                   Continuity Correction(a)        .006        1                    .939
                   Likelihood Ratio                .195        1                    .659
                   Linear-by-Linear Association    .188        1                    .665
                   N of Valid Cases                  61
Female             Pearson Chi-Square             1.226(c)     1                    .268
                   Continuity Correction(a)        .673        1                    .412
                   Likelihood Ratio               1.192        1                    .275
                   Linear-by-Linear Association   1.209        1                    .272
                   N of Valid Cases                  72

a. Computed only for a 2x2 table
b. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 3.61.
c. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.07.

Figure 9.12 Chi-Square Test with control
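Treating gender as a control variable amounts to running the same 2x2 chi-square test once per gender group. The following is a minimal Python sketch using the counts from Figure 9.12; the function names and the dictionary layout are my own.

```python
import math

# One 2x2 table per level of the control variable (counts from Figure 9.12).
# Rows: Do Not Use Regularly / Use Regularly. Columns: Full-Time / Part-Time.
strata = {
    "Male": [[7, 3], [32, 19]],
    "Female": [[11, 8], [38, 15]],
}

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

results = {}
for group, table in strata.items():
    chi2 = chi_square_2x2(table)
    results[group] = (chi2, p_value_df1(chi2))
    print(f"{group:<7} chi-square = {chi2:.3f}, p = {results[group][1]:.3f}")
```

Note b in Figure 9.12 reports that one of the four cells in the male table (25%) has an expected count below 5, which is above the 20% limit described earlier in this chapter, so the male result should be read with some caution.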


Note

If you reject the null hypothesis for any group within your control
variable, then you reject the entire null hypothesis. For example, if we
had found that the significant factor for the Pearson chi-square was .03
for women and .07 for men, we would have rejected the third null hypothesis.
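The decision rule in this note can be written as a one-line check. The following Python sketch encodes the rule of thumb used in this manual; the function name is hypothetical, and the second set of values is the made-up example from the note, not real output.

```python
ALPHA = 0.05  # corresponds to the 95% confidence level used in this chapter

def reject_control_null(p_values, alpha=ALPHA):
    """Reject the third null hypothesis if ANY group falls below alpha."""
    return any(p < alpha for p in p_values.values())

# Pearson significance values from Figure 9.12.
figure_9_12 = {"Male": 0.662, "Female": 0.268}

# Hypothetical values from the note above (not real output).
note_example = {"Male": 0.07, "Female": 0.03}

print(reject_control_null(figure_9_12))   # no group is below .05
print(reject_control_null(note_example))  # the women's group is below .05
```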
