SPSS Introduction Course at PSB, UUM

SPSS INTRODUCTION
COURSE
30TH OCTOBER 2018
Instructor:
Mr. Azim Azuan Osman
Mobilise Coaches:
Dr. Rusnifaezah Musa
Dr. Maliani Mohamad
Ms. Nor Hafida Hamzah
In Collaboration with:
1
SPSS INTERFACE
• Opening the SPSS software
• Steps:
Start > All Programs > SPSS for Windows > SPSS
2
SPSS INTERFACE
Start-up Dialogue Box
3
SPSS INTERFACE: VARIABLE VIEW
4
SPSS INTERFACE: DATA VIEW
5
SPSS INTERFACE: OUTPUT DIALOGUE BOX
6
DATA ENTRY & DECLARING VARIABLES
Variable View:
1. Insert/declare variables
2. Adjust decimal points
3. Set column width
4. Text alignment
Data View:
• Key in responses
7
TYPES OF DATA/VARIABLES
1. Categorical Data/Variable
a) Nominal
b) Dichotomous
c) Ordinal
2. Continuous Data/Variable
a) Ratio
b) Interval
3. String (letters/words)
8
CATEGORICAL vs CONTINUOUS
DATA/VARIABLES
Categorical variables are also known as discrete or qualitative

variables. Categorical variables can be further categorized as
either nominal, ordinal or dichotomous.
Continuous variables are also known as quantitative variables.

Continuous variables can be further categorized as
either interval or ratio variables.
9
Nominal variables are variables that have two or more categories, but which
do not have an intrinsic order.
It splits data into mutually exclusive and collectively exhaustive categories.
A nominal scale is usually used for obtaining personal data such as gender,
department in which one is working, and so on, where grouping of individuals
or objects is useful, as in the cases below.
1. Gender
• Male
• Female
2. Education Level
• Bachelor
• Master
• PhD
10
Dichotomous variables are nominal variables which have only two
categories or levels. For example, if we were looking at gender, we
would most probably categorize somebody as either "male" or
"female".
This is an example of a dichotomous variable (and also a nominal

variable).
11
Ordinal variables are variables that have two or more categories, just like
nominal variables, except that the categories can also be ordered or ranked.
An ordinal scale is usually used to rate preferences for, or usage of, various
brands of a product by individuals, and to rank-order individuals,
objects, or events, as in the example below.
Rank the following personal computers with respect to usage in your
office, assigning the number 1 to the most used system, 2 to the next
most used system, and so on. If a particular system is not used at all in
your office, put a 0 against it.
__ IBM PS2/30    __ Compaq
__ IBM/AT        __ AT&T
__ IBM/XT        __ Tandy 2000
__ Apple         __ Other (specify)
12
Ordinal Scale (example)
13
Ratio Scales
Ratio scales are usually used in organizational research when
exact figures on objective (as opposed to subjective) factors
are called for, as in the following questions.
1. How many other organizations did you work for before joining this
system? ______
2. Please indicate the number of children you have in each of the
following categories:
____ below 3 years of age
____ between 3 and 6
____ over 6 years but under 12
____ 12 years and over
3. How many retail outlets do you operate?
14
Interval Scale
• Interval variables are variables whose central characteristic is that
they can be measured along a continuum and have a numerical value.
• An interval scale is used when responses to the various items that measure a
variable can be tapped on a five-point (or seven-point, or any other
number of points) scale, which can thereafter be summated across the
items.
1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree
1. My job offers me a chance to test myself and my abilities.   1 2 3 4 5
2. Mastering these jobs meant a lot to me.   1 2 3 4 5
15
HANDS-ON data entry using provided
example questionnaires
16
DATA TRANSFORM
• Data stored in SPSS can be modified as needed.
• Data transformation allows us to carry out more complex
data analysis operations.
• Various transformations are available, such as the
Recode, Compute and Count commands.
17
DATA TRANSFORM: “RECODE” COMMAND
Recode allows us to modify the data in a variable
so that it suits the analysis to be carried out.
It is usually used to convert the response scale of negative items
(negative/reverse questions) into positive items,
or
to convert responses in string (word) form into
numeric (number/code) form, and vice versa.
Steps:
1. Open the SPSS data file (i.e. recode exercise)
2. Click Transform > Recode into Same Variables… and the
Recode into Same Variables window will be displayed
18
Example of negative/reverse questions
If your scale contains some items that are negatively worded

(common in psychological measures), these need to be
‘reversed’ before checking reliability.
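Reverse scoring can be illustrated outside SPSS with a minimal Python sketch. It assumes a 1 to 5 Likert scale; the function name `reverse_score` and the sample responses are hypothetical and not part of the course material.

```python
def reverse_score(value, scale_min=1, scale_max=5):
    """Reverse a Likert response: on a 1-5 scale, 1<->5, 2<->4 and 3 stays 3."""
    if not scale_min <= value <= scale_max:
        raise ValueError(f"{value} is outside the {scale_min}-{scale_max} scale")
    # Mirror the value around the midpoint of the scale.
    return scale_min + scale_max - value


responses = [1, 2, 3, 4, 5]  # made-up answers to a negatively worded item
reversed_responses = [reverse_score(r) for r in responses]
```

The same rule works for a seven-point scale by passing `scale_max=7`.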
19
“RECODE” into same variables
20
“RECODE” into same variables (cont’)
3. Move the variable/item Jantina (Gender) into the Numeric Variables box.
Then click Old and New Values.
21
4. Type 1. Lelaki (Male) in the Value box under Old Value and type 1
in the Value box under New Value. Click Add.
22
5. Type 2. Perempuan (Female) in the Value box under Old Value
and type 2 in the Value box under New Value. Click Add.
23
6. Repeat the same steps if the data has more than
2 categories
7. Click Continue > OK
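The string-to-numeric recode described in these steps can be sketched in Python as a simple lookup. This is an illustration only: the mapping `RECODE_MAP`, the function `recode` and the sample values are assumptions. Unlike the SPSS dialog, this sketch turns unmapped or blank entries into `None` (missing) rather than leaving them unchanged.

```python
# Hypothetical mapping that mirrors the example values used for Jantina.
RECODE_MAP = {"1. Lelaki": 1, "2. Perempuan": 2}


def recode(values, mapping):
    """Return recoded values; unmapped or blank entries become None (missing)."""
    return [mapping.get(v) for v in values]


jantina = ["1. Lelaki", "2. Perempuan", "2. Perempuan", ""]  # made-up data
jantina_recoded = recode(jantina, RECODE_MAP)
```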
24
DATA TRANSFORM: “COMPUTE” COMMAND
Compute allows us to create a new variable by
manipulating several existing variables in the SPSS data
file.
Compute is used to obtain the total (or mean) value
of questions (items) x1 to x6 in order to measure the level of
respondents' perception of a variable.
In a study, researchers usually use the mean value
to measure this level.
Steps:
1. Open the SPSS data file (i.e. Data Mediation and Moderation)
2. Click Transform > Compute Variable… and the Compute Variable window
will be displayed
25
COMPUTE VARIABLES (cont’)
3. In the Target Variable box, type a suitable name for the new
variable (e.g. mean_EI).
26
4. Function group: Select Statistical
5. Functions and Special Variables: Select Mean
6. Numeric Expression: Enter items EI1 to EI16 as shown in the figure
7. Click OK
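As a rough equivalent of the Compute step, the sketch below averages the non-missing item scores of each respondent. It assumes each case is stored as a dictionary and that missing answers are `None`; the helper `row_mean`, the four-item subset and the data are hypothetical.

```python
from statistics import mean


def row_mean(row, items):
    """Mean of the non-missing item scores for one respondent."""
    valid = [row[i] for i in items if row.get(i) is not None]
    return mean(valid) if valid else None


items = ["EI1", "EI2", "EI3", "EI4"]  # the slides use EI1 to EI16
data = [
    {"EI1": 4, "EI2": 5, "EI3": 3, "EI4": 4},
    {"EI1": 2, "EI2": None, "EI3": 3, "EI4": 4},  # one missing answer
]
for row in data:
    row["mean_EI"] = row_mean(row, items)
```

Averaging only the valid answers keeps a respondent with one blank item in the analysis; whether that is appropriate depends on how much data is missing.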
27
28
DESCRIPTIVE ANALYSES
• SPSS lets users produce various forms of data
distribution through the Frequencies, Descriptives
and Crosstabs commands.
• Frequencies is used to count the number of
responses. It can also produce descriptive
statistics such as the mean, mode, median and variance.
• Various types of graphs can also be produced
through this command.
29
DESCRIPTIVE ANALYSES (cont’)
Gender:
1 = Male, 2 = Female
Education:
1 = Bachelor, 2 = Master, 3 = PhD
Employment:
1 = Contractual, 2 = Permanent, 3 = Others
What is the percentage of males and females (gender) in the sample?
What is the frequency of each education level category (education)?
What percentage falls in each employment category (employment)?
30
DESCRIPTIVES: FREQUENCIES
1. Click Analyze > Descriptive Statistics > Frequencies
31
DESCRIPTIVES: FREQUENCIES (cont’)
2. Click Gender. Move it into the Variable(s) box.
3. Click Charts. Select a Chart Type, then click Continue
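What the Frequencies command reports (counts and valid percentages) can be sketched in a few lines of Python. The function `frequency_table` and the responses are made up for illustration; the codes 1 = Male and 2 = Female follow the coding scheme given earlier.

```python
from collections import Counter


def frequency_table(values, labels):
    """Return {label: (count, valid percent)}, ignoring missing (None) values."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    return {
        labels[code]: (counts[code], round(100 * counts[code] / len(valid), 1))
        for code in sorted(labels)
    }


gender = [1, 2, 2, 1, 2, None, 2, 1]  # made-up responses, None = missing
table = frequency_table(gender, {1: "Male", 2: "Female"})
```

Percentages are computed over valid responses only, which corresponds to the Valid Percent column rather than the Percent column.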
32
FREQUENCIES OUTPUT
33
DESCRIPTIVES: CROSSTABS
Displays the joint distribution of two
or more variables.
Cross-tabulations are usually presented as
contingency tables in matrix format
34
DESCRIPTIVES: CROSSTABS (cont’)
1. Click Analyze > Descriptive Statistics > Crosstabs.
2. Move Nature of Employment into the Row(s) box and
Education into the Column(s) box
3. Tick the Display clustered bar charts box and click OK
35
The cross-tabulation table above compares two variables,
Nature of Employment and Education. Among respondents in contractual
positions (Contractual), bachelor's degree holders (Bachelor’s) form the largest
group, with 121 respondents. Among respondents in permanent positions
(Permanent), master's degree holders (Master’s) form the largest group, with 12
respondents.
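A contingency table like the one interpreted above can be built by counting pairs of codes. The sketch below is illustrative only: the data are made up, and `crosstab` is a hypothetical helper, not an SPSS function. The value labels follow the coding scheme given earlier.

```python
from collections import Counter

EMPLOYMENT = {1: "Contractual", 2: "Permanent", 3: "Others"}
EDUCATION = {1: "Bachelor", 2: "Master", 3: "PhD"}


def crosstab(rows, cols):
    """Count how often each (row code, column code) pair occurs."""
    return Counter(zip(rows, cols))


employment = [1, 1, 2, 2, 3, 1]  # made-up cases
education = [1, 1, 2, 3, 1, 2]
cells = crosstab(employment, education)

# Print the contingency table, one row per employment category.
print("".ljust(12) + "".join(name.rjust(10) for name in EDUCATION.values()))
for r_code, r_name in EMPLOYMENT.items():
    print(r_name.ljust(12) + "".join(str(cells[(r_code, c)]).rjust(10) for c in EDUCATION))
```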
36
37
DESCRIPTIVES: Descriptive
Descriptive analysis is usually used to
answer research questions such as:
• What is the level of customer satisfaction with Service A?
• What is the level of 5S implementation in Organization X?
Steps:
1. Click Analyze > Descriptive Statistics > Descriptives and the
Descriptives window will be displayed
38
DESCRIPTIVES: Descriptive (cont’)
2. Click EI, EP, C, OC, JA, POS and EB. Move them into the Variable(s) box.
3. Click Options. Select Mean and Std. deviation, then click Continue.
39
DESCRIPTIVES: Descriptive (cont’)
Note: Interpret the mean score based on the items’ scale (e.g. a score of 3
may represent moderate, 4 high, and 5 very high).
40
Missing Data/Value @ Blank Responses
How do we take care of missing responses?

• If too many responses are missing (more than 25%), discard the
questionnaire.
• If less than 10% are missing, the missing data is considered minimal (Hair
et al. 2010) and the missing responses can be ignored (i.e.
system missing).
Other ways of handling
1. Use the midpoint of the scale
2. Random number replacement
3. Four systematic options (see next slide)
41
Missing Data/Value @ Blank Responses
If there is a pattern to the missing data, choose one of the
following options:
• Option 1: Replace values with numbers that are known from prior
knowledge or from an educated guess. Easily done but can lead to
researcher bias if you are not careful.
• Option 2: Replace missing values with variable mean (or median).
The simplest option but it does lower variability and in turn can bias
results.
• Option 3: Replace missing values with a group mean (i.e. the mean for
prejudice grouped by ethnicity). The missing value is replaced with the
mean of the group that the subject belongs to. A little more
complicated but there is not as much of a reduction in the variability.
• Option 4: Using regression to predict the missing values. Other
variables act as IVs predicting the variable with the missing values
(which acts as the DV).
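Options 2 and 3 above can be sketched in plain Python as follows. This is a minimal illustration with made-up scores and group labels; the function names are hypothetical, missing values are represented as `None`, and every group is assumed to contain at least one observed value.

```python
from statistics import mean


def impute_mean(values):
    """Option 2: replace None with the mean of the observed values."""
    m = mean(v for v in values if v is not None)
    return [m if v is None else v for v in values]


def impute_group_mean(values, groups):
    """Option 3: replace None with the mean of the case's own group."""
    group_means = {}
    for g in set(groups):
        observed = [v for v, gg in zip(values, groups) if gg == g and v is not None]
        group_means[g] = mean(observed)
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


scores = [2, 4, None, 8, 10, None]  # made-up scores
groups = ["A", "A", "A", "B", "B", "B"]
overall = impute_mean(scores)
by_group = impute_group_mean(scores, groups)
```

With these numbers the overall mean pulls both imputed values to the centre of the distribution, while the group means keep them closer to their own groups, which is why Option 3 reduces variability less.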
42
How to DETECT the missing values
1. Click Analyze > Descriptive Statistics >
Frequencies and the Frequencies window
will be displayed
2. Click EI1 to EI16. Move them into the
Variable(s) box.
3. Make sure the Display frequency tables box
is ticked, then click OK.
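The same check, counting how many responses are missing per item, can be sketched in Python. The helper `missing_counts` and the three cases are hypothetical; missing answers are assumed to be stored as `None`.

```python
def missing_counts(data, variables):
    """Number of missing (None) responses per variable."""
    return {v: sum(1 for row in data if row.get(v) is None) for v in variables}


data = [  # made-up cases
    {"EI1": 4, "EI4": None},
    {"EI1": None, "EI4": 3},
    {"EI1": 5, "EI4": None},
]
counts = missing_counts(data, ["EI1", "EI4"])
```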
43
How to DETECT the missing values (cont’)
44
How to REPLACE missing values
1. Click Transform > Replace Missing Values and the
Replace Missing Values window will be displayed
45
How to REPLACE missing values (con’t)
2. Click EI1 and EI4. Move them into the New Variable(s) box.
3. Click (highlight) EI1_1=SMEAN(EI1). Click Method. Select Median of
nearby points, then click Change
4. Repeat the same steps for EI4_1=SMEAN(EI4)
5. Click OK
46
47
48
Outlier Samples/Cases
• Among continuous variables, whether searching for univariate or
multivariate outliers, the method depends on whether the data is
grouped or not. If you are performing analyses with ungrouped data
(i.e. regression, canonical correlation, factor analysis, or structural
equation modeling), univariate and multivariate outliers are sought
among all cases at once.
• If you are going to perform one of the analyses with grouped data
(ANOVA, ANCOVA, MANOVA, MANCOVA, profile analysis, discriminant
function analysis, or logistic regression), both univariate and
multivariate outliers are sought within each group separately.
49
Outlier Samples/Cases (con’t)
50
Outlier Samples/Cases (con’t)
51
Univariate vs Multivariate
• Univariate statistics
• includes all statistical techniques for
analyzing a single variable of interest
• Multivariate statistics
• includes all statistical techniques for
analyzing two or more variables of interest
• The focus is on relationships among
variables rather than on isolated
individual factors
52
Univariate vs Multivariate Outliers
Univariate outliers are those with very large standardized scores (z scores
greater than 3.3) and that are disconnected from the distribution.
• SPSS DESCRIPTIVES will give you the z scores for every case if you select
save standardized values as variables and SPSS FREQUENCIES will give you
histograms (use SPLIT FILE/ Compare Groups under DATA for grouped data).
Multivariate Outliers are found by first computing a Mahalanobis Distance for

each case and once that is done the Mahalanobis scores are screened in the
same manner that univariate outliers are screened.
• To compute Mahalanobis Distance in SPSS you must use Analyze >
Regression > Linear. For grouped data the Mahalanobis distances must
be computed separately for each group
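The univariate screening rule (standardized scores beyond 3.3) can be illustrated with a short Python sketch. It assumes z scores are computed with the sample standard deviation (n - 1); the function names and the data are made up.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def univariate_outliers(values, cutoff=3.3):
    """Indices of cases whose absolute z score exceeds the cutoff."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > cutoff]


values = [3] * 10 + [4] * 10 + [20]  # made-up scores with one extreme case
flagged = univariate_outliers(values)
```

Note that in very small samples no case can reach |z| > 3.3, so the rule is only informative with a reasonable number of cases.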
53
Detecting Univariate Outliers using
“Boxplot Diagram”
Steps:
1. Click Graphs > Legacy Dialogs > Boxplot and the Boxplot window will be
displayed
2. Select Simple and tick Summaries of separate variables in the Data
in Chart Are box. Click Define.
54
3. Move items EI1
to EI16 into the
Boxes Represent box.
4. Move the respondent ID
(if any) into the
Label Cases by box. Click OK.
55
The boxplot shows that cases (respondents) 2, 15, 135, 136, 137, 139 and
179 are outliers for item EI7. However,
no case (respondent) is identified as an extreme outlier.
If there were an extreme outlier, an asterisk (*) would appear on the
boxplot.
56
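The distinction between outliers and extreme outliers in a boxplot can be sketched as below. This assumes the common convention that outliers lie more than 1.5 box lengths from the edge of the box and extreme outliers more than 3 box lengths. The quartiles come from Python's `statistics.quantiles`, which may differ slightly from the hinges SPSS uses, so treat the result as an approximation; the function name and data are hypothetical.

```python
from statistics import quantiles


def boxplot_flags(values):
    """Classify values as 'outlier' or 'extreme' by distance from the box."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # box length
    flags = {}
    for i, v in enumerate(values):
        distance = max(q1 - v, v - q3)  # how far outside the box, if at all
        if distance > 3 * iqr:
            flags[i] = "extreme"
        elif distance > 1.5 * iqr:
            flags[i] = "outlier"
    return flags


values = [4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 9, 13]  # made-up item scores
flags = boxplot_flags(values)
```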
Detecting Multivariate Outliers using
“Mahalanobis Distance”
1. Click Analyze > Regression > Linear
57
“Mahalanobis Distance” (cont’)
2. Click EB (dependent variable). Move it into the Dependent box.
3. Click EI1, JA and POS (independent variables). Move them into the
Independent(s) box.
4. Click Save. Tick Mahalanobis and click Continue > OK.
58
59
To determine which Mahalanobis values are categorized as
outliers, the p_MAH value needs to be computed.
Steps:
1. Click Transform > Compute Variable and the Compute Variable window will be
displayed
2. In the Target Variable box, type a suitable name for the new
variable (e.g. p_MAH).
3. Function group: Select CDF & Noncentral CDF
4. Functions and Special Variables: Double-click Cdf.Chisq
5. Type & Label: Insert the Mahalanobis Distance item (MAH_1)
6. In the Numeric Expression box, make sure the following equation
appears: 1-CDF.CHISQ(MAH_1,3)
7. Click OK
NOTE: The value “3” is the degrees of freedom, which equals the number of
predictors. Predictors include IVs, mediators and moderators.
60
61
Next, to identify which cases are truly outliers based on the
computed p_MAH values, carry out the following
steps:
Steps:
1. Click Transform > Compute Variable and the Compute Variable window will be
displayed
2. In the Target Variable box, type a suitable name for the new
variable (e.g. Outliers).
3. In the Numeric Expression box, change the equation
to: p_MAH<.001
4. Click OK
NOTE: “p_MAH” can also be inserted into the Numeric Expression box
by clicking the “p_MAH” variable listed in the Type & Label box
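The two Compute steps, 1-CDF.CHISQ(MAH_1,3) followed by p_MAH<.001, can be mirrored in Python. The sketch below uses a closed-form chi-square distribution function that is valid only for 3 degrees of freedom, matching the three predictors in the example; a different number of predictors would need a general chi-square routine. The function name and the distances are made up.

```python
import math


def p_mah_df3(d2):
    """Upper-tail chi-square probability for a Mahalanobis value d2, 3 df only."""
    if d2 <= 0:
        return 1.0
    # Chi-square CDF with 3 degrees of freedom in closed form.
    cdf = math.erf(math.sqrt(d2 / 2)) - math.sqrt(2 * d2 / math.pi) * math.exp(-d2 / 2)
    return 1 - cdf


mah_1 = [0.8, 2.4, 5.1, 18.9]  # made-up Mahalanobis distances
p_mah = [p_mah_df3(d) for d in mah_1]
outliers = [int(p < 0.001) for p in p_mah]  # 1 = flagged as multivariate outlier
```

Cases coded 1 in `outliers` correspond to the cases that the second Compute step marks with p_MAH below .001.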
62
63
Normality of Data Distribution (cont’)
Statistical tests have the advantage of making an objective judgement

of normality, but are disadvantaged by sometimes not being sensitive
enough at low sample sizes or overly sensitive to large sample sizes.
As such, some statisticians prefer to use their experience to make a

subjective judgement about the data from plots/graphs. Graphical
interpretation has the advantage of allowing good judgement to
assess normality in situations when numerical tests might be over or
under sensitive, but graphical methods do lack objectivity.
If you do not have a great deal of experience interpreting normality

graphically, it is probably best to rely on the numerical methods.
64
NORMALITY TEST
1. Click Analyze > Descriptive Statistics > Explore
65
NORMALITY TEST
2. Move the variable EB into the Dependent List box
3. Click Statistics. Tick the Descriptives and
Outliers boxes.
4. Click Continue.
66
NORMALITY TEST
5. Click Plots. Tick Factor levels together,
Stem-and-leaf, Histogram, and Normality plots with tests.
6. Click Continue. In the Display box, make sure Both
is selected.
67
NORMALITY TEST
7. Click Options. In the Missing Values box, select Exclude
cases pairwise.
8. Click Continue > OK
68
NORMALITY TEST
Kolmogorov-Smirnov & Shapiro-Wilk
For the Kolmogorov-Smirnov and Shapiro-Wilk tests:
• p > 0.05 means the data are normal
• p < 0.05 means the data are NOT normal
Citations:
1. Discovering Statistics using SPSS (Field, 2009)
2. SPSS Survival Manual: A Step by Step Guide to Data Analysis using
SPSS Program (Pallant, 2010)
The Sig. value (p = .006) for the Kolmogorov-Smirnov test and the Sig. value (p = .000)
for the Shapiro-Wilk test show that the data are not normally distributed.
69
NORMALITY TEST
Skewness & Kurtosis
According to Hair Jr et al. (2010)
and Byrne (2016), data are
considered to be normally
distributed if the z-scores fall
between -2 and +2 for
Skewness and between -7 and +7 for
Kurtosis.
The z-score is calculated by dividing
the Statistic value by the Std. Error
value.
Skewness z-score = -3.629
Kurtosis z-score = 3.655
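The z-score calculation (statistic divided by its standard error) can be reproduced from raw data with the sketch below. It assumes the usual bias-adjusted sample skewness and excess kurtosis with their standard-error formulas, which is an assumption about how the statistics are defined rather than something stated in the slides; the function name is hypothetical and at least four cases are required.

```python
import math
from statistics import mean, stdev


def skew_kurtosis_z(values):
    """Return (z_skewness, z_kurtosis), each a statistic divided by its std. error."""
    n = len(values)
    m, s = mean(values), stdev(values)
    z = [(v - m) / s for v in values]
    # Bias-adjusted sample skewness and excess kurtosis.
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    # Standard errors depend only on the sample size.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt


z_skew_sym, _ = skew_kurtosis_z([1, 2, 3, 4, 5, 6, 7])      # symmetric data
z_skew_right, _ = skew_kurtosis_z([1, 1, 1, 2, 2, 3, 10])   # long right tail
```

Symmetric data give a skewness z-score of about zero, while a long right tail gives a positive value.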
70
NORMALITY TEST
Histogram
The histogram does not
form a perfect
bell shape.
This means the data
are not normally distributed.
71
NORMALITY TEST
Q-Q Plot
The Normal Q-Q Plot
shows that the data
points do not lie
along the straight
(linear) line.
This means the data
are not normally distributed.

72
NORMALITY TEST
Mardia’s Multivariate Skewness & Kurtosis
1. Log on to: https://webpower.psychstat.org/models/kurtosis/
73
For Mardia’s
multivariate skewness and
kurtosis tests:
• p > 0.05 means the data
are normal
• p < 0.05 means the data
are NOT normal
The p-value = 0 for the
multivariate skewness test and
p = 0 for the kurtosis test
show that the data
are not normally distributed.
Citation:
Univariate and multivariate skewness and kurtosis for measuring non-
normality: Prevalence, influence and estimation (Cain, Zhang & Yuan, 2016)
74
VALIDITY & RELIABILITY
There are two primary types of psychometric analysis:
i) Reliability and ii) Validity.
75
76
77
EXPLORATORY FACTOR ANALYSIS
Steps:
1. Click Analyze > Dimension Reduction > Factor.
78
Steps:
2. Move items EI1 to EI16 into the Variables box. Click Descriptives.
Tick Initial solution, Anti-image and KMO and Bartlett’s test of
sphericity. Click Continue.
79
Steps:
3. Click Extraction. Tick Scree plot. Make sure the Method is Principal
components.
4. Make sure Correlation matrix and Eigenvalues over: 1 are already selected. Click
Continue.
80
Steps:
5. Click Rotation. Tick Varimax and Rotated solution.
6. Click Continue.
81
Steps:
7. Click Options. Tick Suppress small coefficients and type .33 in the
Absolute value below box.
8. Click Continue. Click OK.
82
• The table shows that Bartlett’s Test of Sphericity is significant (p = .000)
and the Kaiser-Meyer-Olkin (KMO) measure of sampling
adequacy is 0.910. A Bartlett’s test result that is significant at p <
0.05 indicates that all items in the construct are suitable
for factor analysis because the correlations among the items in the construct
are adequate (Chua, 2009). A KMO value of 0.6 or above indicates that the sample size
used in a study is adequate (Hair Jr., Anderson,
Babin & Black, 2010; Pallant, 2010).
• KMO values of 0.90 are excellent; 0.80 very good; 0.70 good; 0.60 fair;
0.50 poor; and values below 0.50 are unacceptable for
carrying out factor analysis (Hair Jr., Anderson, Babin & Black, 2010).
83
KMO is used to measure sampling adequacy. According to

Hutcheson & Sofroniou (1999), values between 0.5 and 0.7 are
mediocre, values between 0.7 and 0.8 are good, values between
0.8 and 0.9 are great and values above 0.9 are superb
Bartlett’s test examines whether the population correlation
matrix resembles an identity matrix. An identity matrix indicates that
every variable correlates very badly with all other variables (all
correlation coefficients are close to zero). If no variables
correlate, then there are no clusters to find.
To check whether the correlation matrix is an identity matrix:
• If it is an identity matrix, all correlations would be zero, thus
factor analysis is inappropriate
• If Bartlett’s test is significant (p<0.05), then the correlation
matrix is not an identity matrix, thus factor analysis is
appropriate
84
85
The Anti-image Matrices output table is examined by looking at the
values marked with (a) on the diagonal of
the table. Values marked with (a) indicate the Measure of Sampling
Adequacy (MSA), which should reach certain values as follows:
86
Communality measures the percent of

variance in a given variable explained by all the
factors jointly.
As a rule of thumb, items with a communality

below 0.5 may be considered for dropping
from the factor model.
87
The Total Variance Explained table lists the eigenvalues
for components 1 to 16. The total amount of variance of the
variables in the analysis equals the number of variables (in this
example, 16)
88
These FOUR factors account for 48.976%, 10.926%, 8.097% and 6.459% of the total
variance, respectively. That is, 74.457% of the total variance is attributable to these FOUR
factors.
The remaining 12 factors together account for only approximately 25% of the variance.
89
Output from the Total Variance Explained table is used
to decide how many factors to extract to represent the data
by examining the eigenvalues associated with the factors.
An eigenvalue is a ratio between the common (shared)

variance and the specific (unique) variance explained by a
specific factor extracted.
Only factors with eigenvalues of 1 or greater are considered

to be significant, while factors with eigenvalues less than 1
are disregarded
Thus, a model with FOUR factors may be adequate to

represent the data (see slides No.88 and 89).
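The link between eigenvalues and the percentage of variance can be sketched in Python. With 16 standardized items the total variance is 16, so an eigenvalue equals the share of variance multiplied by 16. The four percentages are the ones reported in the slides; the two eigenvalues below 1 that are appended in the last line are hypothetical, added only to show the eigenvalue rule at work.

```python
N_VARIABLES = 16  # total variance equals the number of items
pct_of_variance = [48.976, 10.926, 8.097, 6.459]  # reported for the four factors

# Eigenvalue = share of total variance x number of variables.
eigenvalues = [p / 100 * N_VARIABLES for p in pct_of_variance]


def kaiser_retain(eigs):
    """Number of components with an eigenvalue of 1 or greater."""
    return sum(1 for e in eigs if e >= 1)


# Summing the rounded percentages gives 74.458; the slides report 74.457,
# a difference that comes from rounding.
cumulative = sum(pct_of_variance)
n_factors = kaiser_retain(eigenvalues + [0.9, 0.5])  # 0.9 and 0.5 are made up
```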
90
Scree Test (Scree Plot) is used to identify the optimum

number of factors that can be extracted before the amount
of unique variance begins to dominate the common variance
structure (Hair, Anderson, Tatham, & Black, 1995)
Find a point at which the shape of the curve changes

direction and becomes horizontal. Retain all factors above
the elbow, or break in the plot, as these factors contribute
the most to the explanation of the variance in the data set
(see slide No.92).
91
The point at which the curve

first begins to straighten out is
considered to indicate the
maximum number of factors
to extract.
That is, those factors above

this point of inflection are
deemed meaningful
However, this curve is difficult

to interpret because it begins
to tail off after three factors,
but there is another drop after
four factors before a stable
plateau is reached.
92
The Component Matrix
represents the
unrotated component
analysis factor matrix,
and presents the
correlations that relate
the variables to the
FOUR extracted factors.
These correlations,
called factor loadings,
indicate how closely the
variables are related to
each factor.
93
Factor loadings refer to correlation coefficients between the

variables (i.e. items) and the factors (i.e. construct) they
represent.
Variables with large loadings indicate that they are

representative of the factor. Small loadings suggest that
they are not.
Factor loadings greater than ±0.33 are considered to meet

the minimal level of practical significance.
The grouping of variables with high factor loadings should

suggest what the underlying dimension is for that factor.
94
The significance of a factor loading depends on the sample size.
Table of Critical Values (Stevens, 2002)

Sample Size Significant loadings REQUIRED
50 0.722
100 0.512
200 0.364
300 0.298
600 0.210
1000 0.162
• Factor loadings should be 0.7 or higher (Garson, 2013)
• For exploratory purposes, a lower factor loading threshold such as 0.4
is acceptable (Raubenheimer, 2014).
• Loadings above 0.6 are considered “high” and those below 0.4 “low” (Hair et al.,
1998).
95
Sources:
1. Kaiser, H. F. (1970). A second-generation little jiffy. Psychometrika, 35, 401-415.
2. Kaiser, H. F. and Rice, J. (1974). Little jiffy, Mark IV. Educational and
Psychological Measurement, 34, 111-117.
96
High cross-loadings have

occurred which make
interpretation of the factors
difficult and theoretically
less meaningful.
Examples:
• Items EI2 and EI8 have
loaded highly on Factor 1
and Factor 4
• Items EI3 and EI4 have
loaded highly on Factor
1, Factor 2 and Factor 4
• Items EI9, EI10, EI11, and
EI12 have loaded highly
on Factor 1 and Factor 3
97
What to do for the cross loading?
1. Rerun factor analysis, stipulating a smaller number of

factors to be extracted.
2. Examine the wording of the cross-loaded variables, and

based on their face-validity, assign them to the factors
that they are most conceptually/logically representative
of.
3. Delete all cross-loaded variables. This will result in
“clean” factors and will make interpretation of the
factors that much easier. This method works best when
there are only a few significant cross-loadings.
98
RELIABILITY
The reliability of a measure is an inverse function of measurement

error:
• The more error, the less reliable the measure
• Reliable measures provide consistent measurement from
occasion to occasion
• The reliability of a measuring instrument is defined as its ability
to consistently measure the phenomenon it is designed to
measure.
• The degree to which the items that make up the scale ‘hang
together’.
99
INTERNAL CONSISTENCY RELIABILITY
• Internal consistency refers to the extent to which the items in a test

measure the same construct.
• Items that measure the same phenomenon should logically cling/hang
together in some consistent manner.
• Examining the internal consistency of the test enables the researcher to
determine which items are not consistent with the test in measuring
the phenomenon under investigation.
• The objective is to remove the inconsistent items and improve the
internal consistency of the test.
• An internally consistent test increases the chances of the test being
reliable.
100
RELIABILITY TEST
Steps:
1. Click Analyze > Scale > Reliability Analysis
101
RELIABILITY TEST
2. Click items EB1 to EB6. Move them into the
Items box.
3. Type the variable name (e.g. Employee Behaviour)
in the Scale label box
4. Click Statistics. Tick Scale if item deleted and
Correlations. Click Continue.
5. Make sure Alpha is selected in the box labelled
Model. Click OK.
102
RELIABILITY TEST
103
RELIABILITY TEST (cont’)
EXPLANATION:
1. The Reliability Statistics table reports the Cronbach’s
Alpha coefficient. The Cronbach’s Alpha value of 0.807 suggests that
the scale has acceptable reliability.
104
RELIABILITY TEST (Cronbach’s Alpha)
• Cronbach’s alpha coefficient of a scale above 0.7 is considered

sufficient/adequate (DeVellis, 2003; Nunnally, 1978), which suggests
that all of the items are reliable and the entire test is internally
consistent.
• However, values above 0.8 are preferable (DeVellis, 2003), and even 0.8 may
not be high enough for applied research (Nunnally, 1978).
• Where important decisions about the fate of individuals are made on the
basis of test scores, reliability should be at least 0.90, preferably 0.95 or
better (Nunnally, 1978).
• Cronbach’s alpha values are, however, quite sensitive to the number of
items in the scale.
• With short scales (e.g. scales with fewer than ten items) it is common to
find quite low Cronbach’s alpha values (e.g. 0.5).
• The reliability of a scale can vary depending on the sample.
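For readers who want to see how the coefficient is obtained, Cronbach's alpha can be computed from item variances and the variance of the total score, as in the sketch below. The function name and the three-item data set are made up for illustration and are not the course data.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


items = [  # made-up answers of five respondents to three items
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 5],
    [4, 2, 5, 3, 4],
]
alpha = cronbach_alpha(items)
```

Because the formula contains the number of items k, adding more items that correlate with the rest tends to raise alpha, which is the sensitivity to scale length noted above.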
105
Check the Inter-Item Correlation Matrix for negative values.
All values should be positive, indicating that the items are

measuring the same underlying characteristic.
The presence of negative values could indicate that some of the

items have not been correctly reverse scored.
106
EXPLANATION:
1. The Cronbach’s Alpha if Item Deleted column shows that the alpha coefficient
would become 0.740 if item EB1 were removed from the scale. The original
alpha coefficient is 0.807 (see the Reliability Statistics table).
2. The Scale Variance if Item Deleted column shows that the total scale
variance would become 13.992 if item EB1 were removed.
3. The same reading applies if any other item is
removed from the scale, and likewise for the other columns.
107
• Item analysis is achieved through Item-Total correlation procedure.

• This procedure represents a refinement of test reliability by identifying
“problem” items in the test, i.e., those items that yield low correlations
with the sum of the scores on the remaining items.
• Item-Total correlation indicates the degree to which each item
correlates with the total score.
• Rejecting those items that are inconsistent with the rest (and retaining
those items with the highest average inter-correlations) will increase
the internal consistency of the measuring instrument.
• In deciding which item to retain or delete, the 0.33 criterion can be
used
• Low values (less than 0.3) here indicate that the item is measuring
something different from the scale as a whole.
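The item-total procedure can be sketched as the correlation between each item and the sum of the remaining items (the corrected item-total correlation). The functions and the three-item data set below are hypothetical, and the 0.3 cut-off in the last line follows the rule of thumb quoted above.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    result = []
    for i, item in enumerate(items):
        rest = [sum(scores) - scores[i] for scores in zip(*items)]
        result.append(pearson_r(item, rest))
    return result


items = [  # made-up answers of five respondents to three items
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 5],
    [4, 2, 5, 3, 4],
]
correlations = corrected_item_total(items)
weak_items = [i for i, r in enumerate(correlations) if r < 0.3]
```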
108
PEARSON’S CORRELATION ANALYSIS
Range of correlation coefficient: −1 ≤ r ≤ +1

According to Cohen (1988), the strength/size of the relationship is:
• Weak/small: r = 0.1 – 0.29
• Moderate/medium: r = 0.3 – 0.49
• Large/strong: r = 0.5 – 1.0
Sources:
2. SPSS Survival Manual: A Step by Step Guide to Data Analysis using SPSS
Program (Pallant, 2010)
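Cohen's bands can be applied to a computed coefficient as in the sketch below. The function names and data are illustrative; the label "negligible" for values below 0.1 is an assumption, since the bands listed above start at 0.1.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cohen_strength(r):
    """Label the size of r using Cohen's (1988) bands; the sign is ignored."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # below the smallest band; this label is an assumption


r_perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # made-up, perfectly linear
label = cohen_strength(0.425)
```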
109
Steps:
1. Click Analyze > Correlate > Bivariate.
110
2. Click the variables EI, JA,
POS and EB and move them
into the Variables box.
3. Make sure Pearson is selected
in the Correlation
Coefficients box
4. Make sure One-tailed is selected
in the Test of
Significance box.
5. Make sure the Flag
significant correlations box is
ticked.
6. Click Continue > OK.
111
Example interpretation:
The table shows that the correlation
coefficient, r, between the variables EI
and EB is 0.425. The positive r
indicates a direct relationship
between
EI and EB: as the level of EI
increases, the level of EB increases. The value
r = 0.425 indicates that the correlation between
EI and EB is a moderate (medium)
positive relationship.
The double asterisk (**) indicates
significance at the 0.01 level
(i.e. 99% confidence level),
while a single asterisk (*) indicates significance
at the 0.05 level (i.e. 95%
confidence level).
112
LINEAR REGRESSION
Simple Linear Regression
Simple linear regression is used when we would like to see
the impact of a single independent variable on a dependent
variable.
Multiple Linear Regression
Multiple linear regression is used when we would like to see
the impact of more than one independent variable on a
dependent variable
113
MULTIPLE LINEAR REGRESSION
ANALYSIS
1. Click Analyze > Regression > Linear
114
MULTIPLE LINEAR REGRESSION (cont’)
2. Click EB (dependent variable). Move it into the Dependent box.
3. Click EI1, JA and POS (independent variables). Move them into the
Independent(s) box.
4. Click Statistics. Tick Durbin-Watson and Collinearity diagnostics.
5. Click Continue > OK.
115
Independent errors means that for any two observations the residual terms should be
uncorrelated (or independent). This condition is sometimes described as a lack of
autocorrelation. The Durbin–Watson test is used to test for serial correlation between the
errors.
Durbin-Watson Statistic, D
• 0 ≤ D ≤ 4
• D ≈ 2: no autocorrelation
• D < 1.5: first-order positive autocorrelation
• D > 2.5: first-order negative autocorrelation
Sources:
2. Testing for serial correlation in least squares regression (Durbin & Watson,
1951)
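The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it for two made-up residual series; the function name is hypothetical.

```python
def durbin_watson(residuals):
    """D = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


d_alternating = durbin_watson([1, -1, 1, -1])  # signs flip every step
d_clustered = durbin_watson([1, 1, -1, -1])    # neighbours tend to agree
```

Residuals that alternate in sign push D above 2 (negative autocorrelation), while residuals that cluster push D below 2 (positive autocorrelation).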
116
The value in the Sig. column indicates that the regression equation is significant
117
Multicollinearity exists when there is a strong correlation between two or more
predictors in a regression model.
Identifying multicollinearity:
a) Scan a correlation matrix of all of the predictor variables and see if any
correlate very highly (i.e. r > 0.80 or 0.90) (Field, 2009).
b) Variance Inflation Factor (VIF). The VIF indicates whether a predictor has a
strong linear relationship with the other predictor(s). When the average VIF is
substantially greater than 1, multicollinearity may be biasing the regression
model (Bowerman & O’Connell, 1990). A VIF of 10 or more is cause for
concern (Myers, 1990; Pallant, 2010).
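Both checks above can be sketched in a few lines of Python using only the standard library. The sketch relies on a standard identity: the VIF of predictor j is the j-th diagonal element of the inverse of the predictors' correlation matrix. This is an illustration and not necessarily how SPSS computes the value internally. The variable names EI, JA and POS follow the slides, but the scores below are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(predictors):
    """Return {name: VIF} for a dict mapping predictor names to score lists."""
    names = list(predictors)
    corr = [[pearson_r(predictors[a], predictors[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical scores for three predictors (illustration only).
sample = {
    "EI": [3.2, 4.1, 2.8, 3.9, 4.5, 3.0, 3.7, 4.2],
    "JA": [3.0, 4.0, 2.5, 3.5, 4.4, 3.1, 3.6, 4.0],
    "POS": [2.1, 3.9, 3.3, 2.7, 4.0, 3.8, 2.5, 3.1],
}
for name, value in vif(sample).items():
    print(name, round(value, 2))
```

With only two predictors the result reduces to 1 / (1 - r²), which is a quick way to sanity-check the function.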
118
• The Adjusted R² value indicates the loss of predictive power, or shrinkage. Whereas
R² tells us how much of the variance in the DV is accounted for by the regression model
from our sample, the adjusted value tells us how much variance in the DV would be
accounted for if the model had been derived from the population from which the
sample was taken (Field, 2009).
• When a small sample is involved, the R² value in the sample tends to be a rather
optimistic overestimation of the true value in the population (Tabachnick & Fidell,
2007). The Adjusted R² ‘corrects’ this value to provide a better estimate of the true
population value.
• Studies with a small sample size are advised to report the Adjusted R² rather than the
normal R² value (Pallant, 2010).
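The correction described above can be sketched with the commonly used adjusted R² formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k the number of predictors. The function name and the example figures are assumptions for illustration. SPSS prints the adjusted value in the Model Summary table.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors fitted on n cases."""
    if n - k - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Same R-squared and number of predictors, different (hypothetical) sample sizes:
# the shrinkage is larger when the sample is small.
print(adjusted_r2(0.50, n=31, k=3))
print(adjusted_r2(0.50, n=301, k=3))
```

Comparing the two calls shows why the adjusted value matters most for small samples: the gap between R² and Adjusted R² narrows as n grows.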
119
According to Hair Jr, Hult, Ringle and Sarstedt (2017):

R² value    Level
0.75        Substantial
0.50        Moderate
0.25        Weak
120
According to Hair Jr, Hult, Ringle and Sarstedt (2017):

p-value    t-value (one-tailed)    t-value (two-tailed)
0.01       2.33                    2.57
0.05       1.65                    1.96
0.10       1.28                    1.65
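The critical values in this table correspond closely to quantiles of the standard normal distribution, which approximates the t distribution in large samples. The following minimal sketch reproduces them with Python's standard library. The function name is an assumption, and the exact two-tailed value at 0.01 is about 2.576, so the 2.57 in the table appears to be truncated rather than rounded.

```python
from statistics import NormalDist


def critical_value(alpha, tails=2):
    """Large-sample critical value for significance level alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha equally between the two tails.
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.01, 0.05, 0.10):
    one_tailed = critical_value(alpha, tails=1)
    two_tailed = critical_value(alpha, tails=2)
    print(f"{alpha:.2f}  {one_tailed:.3f}  {two_tailed:.3f}")
```

A coefficient whose absolute t-value exceeds the critical value for the chosen significance level and number of tails is treated as statistically significant.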
121
UNTIL NEXT SESSION…
THINK LEAN, DO LEAN &
CONTINUOUSLY PURSUE
PERFECTION
122
