
Data Entry in SPSS

Introduction
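SPSS has two windows for working with your data: the Data View and the Variable View.

The Data View looks similar to an Excel spreadsheet.

The example data used here has six variables: gender, education, age, qu1, qu2 and qu3.
A single participant's row might record, for example, gender (male), education
(sixth form), an age of 19, and a questionnaire response (Strongly Agree).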
You define these variables, and their properties, in the Variable View window, which
is accessed via the Variable View tab, as shown below:


Each row in the Variable View window (see above) represents a new variable and
each column a property of that variable. Each of these properties is explained below:
Name
Click in this cell to name or rename your variable. The names in this
column cannot start with a number or contain special characters (e.g., /, *, $, space,
etc.). If you enter an illegal name, SPSS will give you a warning and you will have
to rename it.
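
The naming rule above can be expressed as a quick check. The following is a minimal Python sketch, not SPSS functionality: the function name `is_legal_name` is hypothetical, and the pattern encodes only the restrictions stated in this guide (no leading digit, no spaces or special characters). The underscore is assumed to be allowed because names such as bw_pre are used later in the guide. SPSS's own rules differ in some details, so treat this as an illustration only.

```python
import re

# Hypothetical helper: encodes only the rules stated in this guide.
# Assumption: letters, digits and underscores are allowed after the first letter.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_legal_name(name: str) -> bool:
    """Return True if `name` starts with a letter and has no special characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


for candidate in ["gender", "qu1", "bw_pre", "1st_score", "body weight", "cost$"]:
    print(candidate, is_legal_name(candidate))
```

The first three candidates pass the check; the last three fail because of a leading digit, a space and a "$" respectively.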
Type
This is where you define the variable's data type (e.g., numeric, currency,
etc.). Click in the cell, which will highlight the button (as shown below), and then
click on this button:

This will launch the Variable Type dialogue box, as shown below:

You can use the options in this dialogue box to specify the data type of your variable.
The most common (and default type) is Numeric, which is used for numbers and,
usually, for nominal and ordinal data types as well (this is explained later in
the Values section). The most common other data type is String, which is computerspeak for "text".
Width
You can alter the maximum number of characters that can be entered for a
variable in the Data View window by specifying a value here. The default is 8. This
does not affect "numeric" variable types.


Decimals
You can specify how many decimal places you want to show in the Data
View by selecting a value here.
Label
Variable names tend to be short. This option allows you to write a text
label for the variable that is much more explanatory. This is helpful if your variable
names are not very informative. Depending on your setup of SPSS, these labels are
sometimes printed in the output instead of the variable names.
Values
This is where you can provide a text label for any categorical variables:
either ordinal or nominal. This is because it is generally preferred in SPSS that
categorical groups are given a numeric code (e.g., for the variable gender, you might
code "males" as "1" and "females" as "2"). The "Values" property allows you to instruct
SPSS to make the connection between the numeric code and a more meaningful text
label. To do this, click on the cell and then click the button. You will be
presented with the Value Labels dialogue box, as shown below:

As an example, imagine you have a variable called gender and you wish to instruct
SPSS that this particular variable has two categories called "males" and "females". It
is entirely up to you what code you use. Here, a code of "0" for "males" and "1" for
"females" will be used (this coding makes particular sense if you are
using gender as a dummy variable in regression), although you could use another code
(e.g., "1" for "females" and "2" for "males"). To code this example, add the numeric code
into the Value: box and the category name into the Label: box. This is shown for
"males" below:

Click the Add button. You will be presented with the following screen showing that
this category has now been added:

Now add the numeric code and label for "females" and click the Add button. You
will end up with the following screen:

Click the OK button to return to the Variable View window. If you wished to enter
ordinal data instead (e.g., a Likert item with 5 levels: strongly agree, agree, neutral,
disagree, strongly disagree), it might look like the following:

Entering value labels will be discussed in more detail in the Entering categorical data
(including grouping variables) section of this guide.
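
The idea behind value labels (a numeric code is stored, a text label is displayed) can be illustrated with a small lookup table. This is a Python sketch of the concept, not of SPSS itself. The gender codes come from the example above; the numbers attached to the Likert levels are an assumption, since the guide does not state them, and `label_for` is a hypothetical helper.

```python
# Codes for gender taken from the example in this guide.
GENDER_LABELS = {0: "males", 1: "females"}

# Assumed coding for the 5-level Likert item; the guide does not fix the numbers.
LIKERT_LABELS = {
    1: "strongly agree",
    2: "agree",
    3: "neutral",
    4: "disagree",
    5: "strongly disagree",
}


def label_for(code, labels):
    """Look up the text label for a numeric code; fall back to the code itself."""
    return labels.get(code, str(code))


stored_codes = [0, 1, 1, 0]  # what is actually held in the data file
displayed = [label_for(code, GENDER_LABELS) for code in stored_codes]
print(displayed)  # ['males', 'females', 'females', 'males']
```

A code without a label (for example, a data entry error such as 7) is shown as the bare number, which makes it easy to spot.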
Missing
You can indicate how missing values are to be identified and described. Click on
the cell and then click the button. You will be presented with the Missing
Values dialogue box, as shown below:


The default is No missing values. Note that SPSS allows you to have
missing values in your data without selecting any options here: by default, a
missing value is represented by a "." (full stop) in the data cell, and this type of
missing value is called a "system missing value". However, the Missing
Values dialogue box offers more options. You can specify up to three discrete
values using the Discrete missing values option, or specify a range plus one optional
discrete value with the third and final option. One use of the discrete option is
to distinguish values that are missing for different reasons. For example,
questions that participants did not answer because they were unsure of the answer could
be coded as "9", whilst answers that are missing because the participant refused to
answer could be coded as "99", as follows:

You will now be able to classify missing values with a more meaningful code.
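
To see why declaring these codes matters, consider what happens when a statistic is computed. The following Python sketch (an illustration of the logic, not SPSS code) drops both system-missing entries and the user-defined codes "9" and "99" before averaging. The responses are made up, `None` stands in for a system-missing ".", and `valid_values` is a hypothetical helper.

```python
# User-defined missing codes from the example above.
USER_MISSING = {9: "unsure of the answer", 99: "refused to answer"}


def valid_values(values):
    """Drop system-missing entries (None here) and user-missing codes."""
    return [v for v in values if v is not None and v not in USER_MISSING]


# Made-up responses on a 1-5 item; None stands in for a system-missing ".".
responses = [4, 9, 5, None, 99, 3]

kept = valid_values(responses)
print(kept)                   # [4, 5, 3]
print(sum(kept) / len(kept))  # 4.0
```

If the codes 9 and 99 were not declared as missing, they would be treated as real responses and would inflate the mean.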


Columns
You can specify how many columns (how long) each variable name can be
in both the Data View and Variable View window here. The default is 8. Any variable
names longer than this will be truncated (i.e., will not be shown past this column width).
Align
You can specify whether the data in each cell is aligned to the left, right
or centre in the Data View window. The default is to align to the right.

Measure
The measure column is where you indicate the variable's
measurement type (i.e., scale, ordinal or nominal), as shown below:

The default in IBM SPSS Statistics version 20 is "Unknown", until you indicate the
actual measurement type. For previous SPSS versions, the default is "Scale".
"Scale" is the name SPSS uses for continuous (interval or ratio) data. See the Types
of Variable guide for more help, if needed. Even if you have set up a coding system
within the Values property, you will still need to select either "Ordinal" or
"Nominal" in this section.

Role
You can specify the role of the variable in this column. There are six
types of role: Input, Target, Both, None, Partition and Split, as shown below:

You can get this list of variable role types by clicking on the cell, which will highlight
the drop-down symbol for you to press. For almost all your needs you can use the
"None" role type, as shown above. The first three role types are for certain
procedures in SPSS which can use these designations to automatically populate their
variable fields. This can be useful in business, where the same analysis may
be repeated often, but is of less use when you run your analyses less frequently, as
is normally the case as a student. The last two are advanced options that you are
unlikely ever to need to use.

If you wish to enter a variable that has a continuous data type, you need to set up your
variable to have a name, a numeric data type, a descriptive label, the correct number
of decimal places, the correct measurement type, and the correct role. As an example,
consider setting up a variable that records the GPA achieved by a participant in a
study. The setup should be as follows:
Variable Property    SPSS Property
Name                 GPA
Type                 Numeric
Decimals             0 (zero) decimal places
Label                GPA score of participant
Measure              Scale
Role                 None

This is shown in the following screen:


Entering categorical data (including grouping variables)


Categorical data (i.e., both ordinal and nominal) can be entered in a similar manner to
each other. The only major difference is to declare the different measurement type in
the Measure variable property. Categorical data can be either a dependent
variable or an independent variable, the latter often called a "grouping variable" in
SPSS. In this guide, you will concentrate on creating a categorical "grouping" variable
called education. This is an ordinal variable with three categories: "School", "College"
and "University", in that order. In SPSS, you will assign a number to each group using
the Values variable property. This is also why the variable Type property
is Numeric rather than String (text), as numbers are being entered rather than the
names themselves. You want the following setup:
Variable Property    SPSS Property
Name                 education
Type                 Numeric
Decimals             0 (zero) decimal places
Label                Highest level of education
Measure              Ordinal
Values               1: School
                     2: College
                     3: University
Role                 None

This is shown in the following screen:

With the Values options completed as follows (see above for how to do this, if
needed):

In the Data View window you now have a drop-down list of these options when you
want to enter data into this variable, as shown below:


However, to see the underlying code (which is numeric), click the "Value Labels"
button on the main toolbar, as shown below:

You will be presented with the underlying codes for the education variable:


Multiple categorical groups


In many circumstances, your participants might belong to more than one group. For
example, you might have classified each participant according to their maximum level
of education, education, and their gender, gender. In SPSS, each participant
still occupies a single row, even with multiple
categorical groups: you simply add both variables to the data file and select the
specific option for each participant. This is demonstrated below:

If you inspect the first two columns, which represent the education and gender
variables, you will see that there are six different
combinations of the levels of the two variables. These combinations are highlighted
below:
Combination
"School" and "Male"
"College" and "Male"
"University" and "Male"
"School" and "Female"
"College" and "Female"
"University" and "Female"
These represent all six possible combinations of all levels of the two independent
variables. You can see, therefore, that if you are subdividing your group of participants
into multiple subgroups, this is possible by simply selecting the categories of multiple
variables.
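
The six combinations listed above are simply every pairing of the three education levels with the two gender levels. As a hedged illustration in Python (not SPSS), they can be enumerated with `itertools.product`; the variable names are chosen here for readability only.

```python
from itertools import product

education_levels = ["School", "College", "University"]
gender_levels = ["Male", "Female"]

# Iterate gender in the outer position so the order matches the list above.
combinations = [(edu, gen) for gen, edu in product(gender_levels, education_levels)]

for edu, gen in combinations:
    print(f'"{edu}" and "{gen}"')

print(len(combinations))  # 6
```

In general, the number of subgroups is the product of the number of levels of each grouping variable (here 3 x 2 = 6).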
Entering repeated measures
Repeated measures are when you have the same dependent variable being
measured on more than one occasion on the same participant. For example, you might have
measured body weight before and after a dietary intervention. Repeated
measures can easily lead to confusion in SPSS: each variable is
represented by a single column, but there is no way to enter multiple responses
into a single variable. SPSS overcomes this problem by having each level of the
repeated measures variable act as a separate variable. For example, with the body
weight example above, you have a dependent variable, body weight, and an
independent variable, time, with two related groups: "pre-intervention" and
"post-intervention". In SPSS, the pre-intervention body weight scores are
entered as a variable in one column and the post-intervention scores are entered as
another variable in a separate column. This can be achieved as below:

Notice here that the two variables have been called bw_pre and bw_post to
represent body weight pre- and post-intervention, respectively. Notice also that the
variable names include information on both the dependent and independent variable.
The bw part reflects the dependent variable and is present in both variables, and the
independent variable is represented by the _pre and _post parts. You can choose
your own naming conventions, but this is one that works well. It is important to
remember that SPSS does not understand what these two variables represent (e.g.,
Is bw_pre a separate variable or part of a repeated measures variable?). This is why
it is important to name the variables consistently and why, when you run the repeated
measures procedures (e.g., a paired-samples t-test or repeated measures ANOVA),
you need to instruct SPSS which variables represent the repeated measurements (as
SPSS does not know). The above data setup ensures that each participant's data
remains on a single row. The naming convention helps when you need to add another
repeated measure. For example, consider the case below where CRP (a protein) has
also been measured:


The naming convention makes it easy to distinguish which time point (pre- or post-)
and particular measure has been taken (body weight or CRP).
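
The one-row-per-participant layout and the suffix convention can be sketched outside SPSS as follows. This Python example is an illustration only: the numbers are made up, the names `crp_pre` and `crp_post` are assumed by analogy with bw_pre and bw_post, and `paired_changes` is a hypothetical helper.

```python
# One dict per participant (one row each); the values are made up for illustration.
rows = [
    {"bw_pre": 82.0, "bw_post": 79.5, "crp_pre": 4.0, "crp_post": 3.0},
    {"bw_pre": 95.0, "bw_post": 91.0, "crp_pre": 6.5, "crp_post": 5.0},
    {"bw_pre": 70.5, "bw_post": 70.0, "crp_pre": 2.0, "crp_post": 2.5},
]


def paired_changes(rows, measure):
    """Return post minus pre for each participant, using the _pre/_post suffixes."""
    return [row[f"{measure}_post"] - row[f"{measure}_pre"] for row in rows]


print(paired_changes(rows, "bw"))   # [-2.5, -4.0, -0.5]
print(paired_changes(rows, "crp"))  # [-1.0, -1.5, 0.5]
```

Because both time points sit on the same row, each participant's change can be computed directly. A consistent prefix and suffix also means the same helper works for any measure added later.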
Both grouping variables and repeated measures
You may have made repeated measurements on participants but also separated
them into groups. For example, if you measured body weight and CRP before
and after a dietary intervention, but also want to compare the results based on
gender, you would set this up as follows:

As you can see from the above setup, you are essentially just combining the methods
of entering categorical groups and entering repeated measures.


There are many reasons why you might need to reverse code a variable in SPSS
Statistics, but the most common is when you are trying to generate a scale or subscale
from a series of Likert items (that you might have asked as part of a questionnaire or
survey), and one or more of your questions is coded in the opposite direction to what
you would like. For example, you might have a Likert item similar to the one below:

You might have coded the question as follows:

However, the (sub)scale requires that "strongly disagree" is equal to "5", "disagree" is
equal to "4", and so on (i.e., the numerical coding is reversed). What you actually need
is for the coding to be as follows:

Luckily, in SPSS Statistics reverse coding is relatively easy to do and you do not need
to go through your responses changing them manually. You can reverse code your
variables by following the instructions below:
Data setup in SPSS Statistics
The data for this example has been set up as follows (where Qu1 is the variable that
needs to be reverse coded):


On the left you have the responses to the questions and on the right the underlying
original coding. You can toggle between these by clicking the "Value Labels"
icon on the main toolbar. The setup of categorical data (ordinal data, in this instance)
in SPSS Statistics is explained in our data setup guide.
Reverse coding in SPSS Statistics
There is no special menu dedicated to reverse coding, so you need to go through the
more generic recoding procedure. The first decision you need to make is whether to
overwrite the values of the existing variable, Qu1 , with the reverse coded responses,
or recode into a new variable. You should not "destroy" your original data
unless it is absolutely necessary. Therefore, you
should create a new variable to store the recoded values, which shall be
called Qu1R in this example.

Click Transform > Recode into Different Variables... on the top menu, as shown
below:


You will be presented with the Recode into Different Variables dialogue box, as
shown below:

Transfer the variable to be reverse coded, Qu1, into the Numeric Variable -> Output
Variable: box by using the arrow button. You will end up with the following screen:



You will notice that the Output Variable area is no longer "greyed out", and neither
is the

button.

The Output Variable area is where you enter the new variable name and label (i.e.,
into the Name: and Label: boxes, respectively). Therefore, create a new variable
called Qu1R to hold the recoded values and give it a label of "Qu1 reverse coded",
as shown below:


Click the Change button to create this new variable. You will be presented with an
updated Numeric Variable -> Output Variable: box, as shown below:


Click the Old and New Values... button and you will be presented with the Recode into
Different Variables: Old and New Values dialogue box, as shown below:

In this example, you want the old values to be converted into new values as follows:
1 (old) becomes 5 (new)
2 (old) becomes 4 (new)
3 (old) becomes 3 (new)
4 (old) becomes 2 (new)
5 (old) becomes 1 (new)
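
The mapping above follows a simple rule: on a scale running from 1 to 5, the new code is 1 + 5 minus the old code. The Python sketch below (an illustration of the arithmetic, not SPSS code) checks that rule against the explicit mapping and applies it to some made-up responses, writing the result to a new list so that the original stays intact. The names `RECODE_MAP` and `reverse_code` are hypothetical.

```python
# Explicit old -> new mapping, as listed above.
RECODE_MAP = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def reverse_code(value, low=1, high=5):
    """Reverse a code on a low..high scale: new = low + high - old."""
    return low + high - value


# The formula reproduces every entry of the explicit mapping.
assert all(reverse_code(old) == new for old, new in RECODE_MAP.items())

qu1 = [1, 4, 5, 3, 2]                # made-up original responses
qu1r = [RECODE_MAP[v] for v in qu1]  # new variable; qu1 is left untouched
print(qu1r)                          # [5, 2, 1, 3, 4]
```

The same formula works for other scale lengths, for example `reverse_code(2, low=1, high=7)` for a 7-point item.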

Enter "1" into the Value: box in the Old Value area and enter "5" into the Value: box
in the New Value area, as shown below:


Click the Add button. This will commit the changes and you will see a new entry
in the Old --> New: box reflecting this particular recoding, as shown below:

This is telling SPSS Statistics to replace any response of "1" in the Qu1 variable with
a "5" in the new variable, Qu1R .


Repeat the process above for the other values that need recoding so that you end up
with the following screen:

You can see all the recoding instructions in the Old --> New: box above.

Click the Continue button and you will be returned to the Recode into Different
Variables dialogue box.

Click the OK button to generate the new variable containing the reverse coded
responses. You will be presented with the following screen:


You will see that a new variable has been added, called Qu1R. To see the reverse
coding of the numbers, click the "Value Labels" icon on the main toolbar and you will
get a clear view of the recoding, as shown below:


Note: For extra clarity, the number of decimal places has been reduced to zero.
To know how to do this, see the Decimals section in our data setup guide.
You can leave the data in the numeric format if you wish. Alternatively, you can add a
new set of "Value Labels" for the new variable, Qu1R, by going to the Variable
View window and adding some "Value Labels", as shown below (a full explanation of
setting up variables is provided in our data setup guide):


Notice that the new "Value Labels" reflect the new coding system. This is
because the answers stay the same (i.e., it is irrelevant how you internally coded the
system; if someone answered "Agree", for example, this will remain the case
regardless of the coding system). Once you click the OK button, and then return
to the Data View window, you will see the same answer response despite the
underlying coding having been reversed, as shown below (remember to click
the "Value Labels" button):


You have now successfully reverse coded your values. You can now repeat this
process for any other variables that need reverse coding.
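
As a sanity check on the whole procedure, the displayed answers should be identical before and after reverse coding once the new value labels are attached. The following Python sketch illustrates this; it is not SPSS code. The original coding ("Strongly Disagree" = 1 up to "Strongly Agree" = 5) is inferred from the description earlier in this guide, the middle label "Neutral" is an assumption, and the responses are made up.

```python
# Original coding assumed from the text: "Strongly Disagree" = 1 ... "Strongly Agree" = 5.
# The middle label "Neutral" is an assumption.
ORIGINAL_LABELS = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly Agree",
}
RECODE_MAP = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}

# Labels for the new variable: each answer keeps its text but moves to the new code.
REVERSED_LABELS = {RECODE_MAP[code]: text for code, text in ORIGINAL_LABELS.items()}

qu1 = [4, 1, 5]                      # made-up original responses
qu1r = [RECODE_MAP[v] for v in qu1]  # reverse coded copy

answers_before = [ORIGINAL_LABELS[v] for v in qu1]
answers_after = [REVERSED_LABELS[v] for v in qu1r]
print(answers_before == answers_after)  # True
```

If the comparison were False, either the recoding or the new value labels would have been entered incorrectly.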
Syntax
:

