COURSE
30TH OCTOBER 2018
Instructor:
Mr. Azim Azuan Osman
Mobilise Coaches:
Dr. Rusnifaezah Musa
Dr. Maliani Mohamad
Ms. Nor Hafida Hamzah
In Collaboration with:
1
SPSS INTERFACE
2
SPSS INTERFACE
Start-up Dialogue Box
3
SPSS INTERFACE: VARIABLE VIEW
4
SPSS INTERFACE: DATA VIEW
5
SPSS INTERFACE: OUTPUT DIALOGUE BOX
6
DATA ENTRY & DECLARING VARIABLES
4. Text alignment
7
TYPES OF DATA/VARIABLES
1. Categorical Data/Variable
a) Nominal
b) Dichotomous
c) Ordinal
2. Continuous Data/Variable
a) Ratio
b) Interval
3. String (letters/words)
8
CATEGORICAL vs CONTINUOUS
DATA/VARIABLES
9
Nominal variables are variables that have two or more categories but no
intrinsic order.
A nominal scale splits data into mutually exclusive and collectively exhaustive
categories. It is usually used for obtaining personal data such as gender or the
department in which one is working, where grouping of individuals or objects
is useful.
10
Dichotomous variables are nominal variables which have only two
categories or levels. For example, if we were looking at gender, we
would most probably categorize somebody as either "male" or
"female".
11
Ordinal variables are variables that have two or more categories, just like
nominal variables, except that the categories can also be ordered or ranked.
12
Ordinal Scale (example)
13
Ratio Scales
14
Interval Scale
• Interval variables are variables whose central characteristic is that they
can be measured along a continuum and have a numerical value.
• An interval scale is used when responses to the various items that measure a
variable can be tapped on a five-point (or seven-point, or any other number
of points) scale, which can thereafter be summated across the items.
Scale: 1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree
1. My job offers me a chance to test myself and my abilities.    1  2  3  4  5
15
HANDS-ON data entry using provided
example questionnaires
16
DATA TRANSFORM
17
DATA TRANSFORM: “RECODE” COMMAND
It lets us modify the data in a variable so that the data suit the analysis
to be carried out.
Instructions:
1. Open the SPSS data file (i.e. recode exercise)
2. Click Transform > Recode into Same Variables… and the
Recode into Same Variables window will be displayed
18
Example of negative/reverse questions
20
“RECODE” into same variables (cont’)
3. Move the variable/item Jantina (gender) into the Numeric Variables box.
Then click Old and New Values.
21
“RECODE” into same variables (cont’)
4. Type 1. Lelaki (male) in the Value box under Old Value and type 1
in the Value box under New Value. Click Add.
22
“RECODE” into same variables (cont’)
5. Type 2. Perempuan (female) in the Value box under Old Value
and type 2 in the Value box under New Value. Click Add.
23
“RECODE” into same variables (cont’)
6. Repeat the same step if the data have more than
2 categories
7. Click Continue > OK
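For readers who prefer to script this step, the recoding can be sketched in Python. This is only an illustration outside SPSS, not the course's own procedure: the sample responses, the `GENDER_CODES` mapping and the `reverse_score` helper are hypothetical, and a five-point scale is assumed for the negative (reverse-worded) item.

```python
# Hypothetical raw responses typed as labels instead of numeric codes.
gender_raw = ["1. Lelaki", "2. Perempuan", "2. Perempuan", "1. Lelaki"]

# Old value -> new value, mirroring the Old and New Values dialogue.
GENDER_CODES = {"1. Lelaki": 1, "2. Perempuan": 2}
gender = [GENDER_CODES[value] for value in gender_raw]


def reverse_score(value, low=1, high=5):
    """Reverse a score on a low..high scale (1<->5, 2<->4, 3 stays 3)."""
    return low + high - value


# Hypothetical answers to a negatively worded item on a 1-5 scale.
negative_item = [1, 2, 5, 3]
reversed_item = [reverse_score(v) for v in negative_item]

print(gender)         # [1, 2, 2, 1]
print(reversed_item)  # [5, 4, 1, 3]
```

Reverse scoring keeps the midpoint fixed, so a reversed item can be averaged with positively worded items on the same scale.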
24
DATA TRANSFORM: “COMPUTE” COMMAND
Instructions:
1. Open the SPSS data file (i.e. Data Mediation and Moderation)
2. Click Transform > Compute Variable… and the Compute Variable window
will be displayed
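Compute Variable is typically used to build a composite score from several items. A minimal Python sketch of that idea, assuming the composite is the mean of each respondent's item scores; the item names EI1 to EI3 and the data are illustrative only.

```python
from statistics import mean

# Hypothetical respondents, each with three item scores on a 1-5 scale.
respondents = [
    {"EI1": 4, "EI2": 5, "EI3": 3},
    {"EI1": 2, "EI2": 3, "EI3": 2},
]
items = ["EI1", "EI2", "EI3"]

# Same idea as a target variable EI defined as the mean of EI1, EI2 and EI3.
for row in respondents:
    row["EI"] = mean([row[item] for item in items])

print([round(row["EI"], 2) for row in respondents])
```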
25
COMPUTE VARIABLES (cont’)
26
COMPUTE VARIABLES (cont’)
27
COMPUTE VARIABLES (cont’)
28
DESCRIPTIVE ANALYSES
29
DESCRIPTIVE ANALYSES (cont’)
Gender:
1 = Male, 2 = Female
Education:
1 = Bachelor, 2 = Master, 3 = PhD
Employment:
1 = Contractual, 2 = Permanent, 3 = Others
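The coding scheme above is what a frequency table reports back as labels. A small Python sketch, using made-up coded responses for Education, shows how counts and percentages are derived from the codes; it is an illustration, not SPSS output.

```python
from collections import Counter

EDUCATION_LABELS = {1: "Bachelor", 2: "Master", 3: "PhD"}

# Hypothetical coded responses for the Education variable.
education = [1, 1, 2, 1, 3, 2, 1, 1]

counts = Counter(education)
total = len(education)

# One row per category: label, frequency, percent.
frequency_table = [
    (EDUCATION_LABELS[code], counts[code], round(100 * counts[code] / total, 1))
    for code in sorted(counts)
]

for label, freq, pct in frequency_table:
    print(f"{label:<10}{freq:>5}{pct:>8.1f}")
```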
30
DESCRIPTIVES: FREQUENCIES
31
DESCRIPTIVES: FREQUENCIES (cont’)
32
FREQUENCIES OUTPUT
33
DESCRIPTIVES: CROSSTABS
34
DESCRIPTIVES: CROSSTABS (cont’)
35
DESCRIPTIVES: CROSSTABS (cont’)
The cross-tabulation table above compares two variables, Nature of Employment and
Education. Among respondents in contractual positions, Bachelor’s degree holders form
the largest group, with 121 respondents. Among respondents in permanent positions,
Master’s degree holders form the largest group, with 12 respondents.
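A cross-tabulation is a count of cases for every combination of two coded variables. The sketch below builds one in Python from hypothetical (employment, education) pairs; the counts are invented and do not reproduce the table discussed above.

```python
from collections import Counter

EMPLOYMENT = {1: "Contractual", 2: "Permanent", 3: "Others"}
EDUCATION = {1: "Bachelor", 2: "Master", 3: "PhD"}

# Hypothetical (employment, education) code pairs, one per respondent.
cases = [(1, 1), (1, 1), (1, 2), (2, 2), (2, 2), (2, 3), (3, 1)]

cell_counts = Counter(cases)

# Rows are employment categories, columns are education categories.
print(f"{'':<12}" + "".join(f"{name:>10}" for name in EDUCATION.values()))
for emp_code, emp_name in EMPLOYMENT.items():
    row = [cell_counts[(emp_code, edu_code)] for edu_code in EDUCATION]
    print(f"{emp_name:<12}" + "".join(f"{n:>10}" for n in row))
```

Reading along a row shows which education level is most common within one employment category, which is how the interpretation above is worded.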
36
DESCRIPTIVES: CROSSTABS (cont’)
37
DESCRIPTIVES: Descriptive
Instructions:
1. Click Analyze > Descriptive Statistics > Descriptives and the
Descriptives window will be displayed
38
DESCRIPTIVES: Descriptive (cont’)
2. Select EI, EP, C, OC, JA, POS and EB. Move them into the Variable(s) box.
3. Click Options. Select Mean and Std. deviation, then click Continue.
39
DESCRIPTIVES: Descriptive (cont’)
Note: Interpret the mean score based on the items’ scale (e.g. a score of 3
may represent moderate, a score of 4 high, and a score of 5 very high).
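As a rough illustration of the two statistics selected here, the sketch below computes a mean and a sample standard deviation in Python and attaches a verbal reading to the mean. The scores are invented, and the cut-off points in `interpret` are an assumption made for the example, not values given in the course.

```python
from statistics import mean, stdev

# Hypothetical composite scores on a 1-5 scale for one variable.
scores = [3, 4, 4, 5, 4]

avg = mean(scores)
sd = stdev(scores)  # sample standard deviation (n - 1 denominator)


def interpret(mean_score):
    """Rough reading of a mean on a 1-5 scale; cut-offs are illustrative."""
    if mean_score >= 4.5:
        return "very high"
    if mean_score >= 3.5:
        return "high"
    if mean_score >= 2.5:
        return "moderate"
    return "low"


print(avg, round(sd, 3), interpret(avg))
```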
40
Missing Data/Values (Blank Responses)
41
Missing Data/Values (Blank Responses)
If there is a pattern to the missing data, choose one of the
following options:
• Option 1: Replace values with numbers that are known from prior
knowledge or from an educated guess. Easily done but can lead to
researcher bias if you are not careful.
• Option 2: Replace missing values with variable mean (or median).
The simplest option but it does lower variability and in turn can bias
results.
• Option 3: Replace missing values with a group mean (e.g. the mean for
prejudice grouped by ethnicity). The missing value is replaced with the
mean of the group that the subject belongs to. A little more
complicated but there is not as much of a reduction in the variability.
• Option 4: Using regression to predict the missing values. Other
variables act as IVs predicting the variable with the missing values
(which acts as the DV).
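Options 2 and 3 are easy to compare side by side. The following Python sketch, on invented data with `None` standing for a blank response, fills the blanks first with the overall mean and then with the mean of each case's own group; the group labels "A" and "B" are hypothetical.

```python
from statistics import mean

# Hypothetical cases: (group, score); None marks a blank response.
cases = [("A", 4), ("A", None), ("A", 2), ("B", 5), ("B", 5), ("B", None)]

observed = [score for _, score in cases if score is not None]
overall_mean = mean(observed)

# Option 2: replace every blank with the overall variable mean.
option2 = [overall_mean if score is None else score for _, score in cases]

# Option 3: replace each blank with the mean of the case's own group.
group_means = {
    group: mean([s for g, s in cases if g == group and s is not None])
    for group in {g for g, _ in cases}
}
option3 = [group_means[g] if s is None else s for g, s in cases]

print(option2)  # blanks become 4, the overall mean
print(option3)  # blanks become 3 (group A) and 5 (group B)
```

The group-mean version keeps the difference between groups A and B, which is why it reduces variability less than the overall mean.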
42
How to DETECT the missing values
43
How to DETECT the missing values (cont’)
44
How to REPLACE missing values
1. Click Transform > Replace Missing Values and the
Replace Missing Values window will be displayed
45
How to REPLACE missing values (con’t)
46
How to REPLACE missing values (con’t)
47
How to REPLACE missing values (con’t)
48
Outlier Samples/Cases
49
Outlier Samples/Cases (con’t)
50
Outlier Samples/Cases (con’t)
51
Univariate vs Multivariate
• Univariate statistics
• includes all statistical techniques for
analyzing a single variable of interest
• Multivariate statistics
• includes all statistical techniques for
analyzing two or more variables of interest
• The focus is on relationships among
variables rather than on isolated
individual factors
52
Univariate vs Multivariate Outliers
Univariate outliers are cases with very large standardized scores (absolute z
scores greater than 3.3) that are disconnected from the rest of the distribution.
• SPSS DESCRIPTIVES will give you the z scores for every case if you select
Save standardized values as variables, and SPSS FREQUENCIES will give you
histograms (use SPLIT FILE / Compare Groups under DATA for grouped data).
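The z-score rule can be checked by hand. Below is a minimal Python sketch that standardizes invented scores using the sample standard deviation and lists the cases whose absolute z score exceeds 3.3; the data and variable names are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical scores for 20 cases; the last one is deliberately extreme.
scores = [3] * 10 + [4] * 9 + [15]

avg = mean(scores)
sd = stdev(scores)  # sample standard deviation

z_scores = [(x - avg) / sd for x in scores]

# Case numbers are 1-based, as in the Data View.
outlier_cases = [i for i, z in enumerate(z_scores, start=1) if abs(z) > 3.3]
print(outlier_cases)
```

With very small samples no case can reach a z score of 3.3, so the rule is only informative when the sample is reasonably large.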
53
Detecting Univariate Outliers using
“Boxplot Diagram”
Instructions:
1. Click Graphs > Legacy Dialogs > Boxplot and the Boxplot window will be
displayed
2. Select Simple and, under Data in Chart Are, select Summaries of separate
variables. Click Define.
54
Detecting Univariate Outliers using
“Boxplot Diagram”
55
Detecting Univariate Outliers using
“Boxplot Diagram”
The boxplot shows that cases (respondents) 2, 15, 135, 136, 137, 139 and
179 are outliers for item EI7. However, no case (respondent) is identified
as an extreme outlier. If there were an extreme outlier, an asterisk (*)
would appear on the boxplot.
56
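The distinction between an outlier and an extreme outlier on a boxplot comes from distance to the box: cases more than 1.5 box lengths (interquartile ranges) beyond the box edge are outliers, and cases more than 3 box lengths beyond it are extreme. A Python sketch of that rule follows; the data are invented, and `statistics.quantiles` may compute the quartiles slightly differently from the hinges SPSS uses, so borderline cases could differ.

```python
from statistics import quantiles


def classify_boxplot_cases(values):
    """Return 1-based case numbers of outliers and extreme outliers."""
    q1, _, q3 = quantiles(values, n=4)  # default "exclusive" method
    iqr = q3 - q1
    outliers, extremes = [], []
    for case, x in enumerate(values, start=1):
        distance = max(q1 - x, x - q3)  # distance beyond the box edges
        if distance > 3 * iqr:
            extremes.append(case)
        elif distance > 1.5 * iqr:
            outliers.append(case)
    return outliers, extremes


# Hypothetical item scores; case 12 sits well below the rest.
item = [4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 1]
outliers, extremes = classify_boxplot_cases(item)
print(outliers, extremes)
```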
Detecting Multivariate Outliers using
“Mahalanobis Distance”
1. Click Analyze > Regression > Linear
57
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
2. Select EB (dependent variable). Move it into the Dependent box.
3. Select EI1, JA and POS (independent variables). Move them into the
Independent(s) box.
4. Click Save. Tick Mahalanobis and click Continue > OK.
58
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
59
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
Instructions:
1. Click Transform > Compute Variable and the Compute Variable window will be
displayed
2. In the Target Variable box, type a suitable name for the new
variable (for example p_MAH).
3. Function group: select CDF and noncentral CDF
4. Functions and Special Variables: double-click Cdf.Chisq
5. Type & Label: insert the Mahalanobis Distance item (MAH_1)
6. In the Numeric Expression box, make sure the expression reads
as follows: 1-CDF.CHISQ(MAH_1,3)
7. Click OK
NOTE: The value “3” is the degrees of freedom, which equals the number of
predictors. Predictors include IVs, mediators and moderators.
60
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
61
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
Next, to flag which cases are outliers based on the computed p_MAH
values, carry out the following steps:
Instructions:
1. Click Transform > Compute Variable and the Compute Variable window will be
displayed
2. In the Target Variable box, type a suitable name for the new
variable (for example Outliers).
3. In the Numeric Expression box, change the expression
to: p_MAH<.001 (the new variable equals 1 for cases where the condition
is true, i.e. multivariate outliers, and 0 otherwise)
4. Click OK
NOTE: “p_MAH” can also be inserted into the Numeric Expression box
by clicking the variable “p_MAH” listed in the Type & Label box
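The two Compute steps amount to converting each Mahalanobis distance into an upper-tail chi-square probability and flagging probabilities below .001. The Python sketch below reproduces that logic for the three-predictor case only: `chi2_sf_df3` is a hypothetical helper that uses the closed-form upper tail of a chi-square distribution with 3 degrees of freedom, and the distances in `mah_1` are invented.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability, 1 - CDF, of a chi-square with 3 df."""
    if x <= 0:
        return 1.0
    half = x / 2
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half)


# Hypothetical Mahalanobis distances saved by the regression (MAH_1).
mah_1 = [1.2, 3.4, 7.8, 18.0, 21.5]

# Step 1: p_MAH = 1 - CDF.CHISQ(MAH_1, 3)
p_mah = [chi2_sf_df3(d) for d in mah_1]

# Step 2: Outliers = 1 where p_MAH < .001, else 0
outlier_flags = [int(p < 0.001) for p in p_mah]

print([round(p, 4) for p in p_mah])
print(outlier_flags)
```

With a different number of predictors the degrees of freedom change, and this closed form no longer applies; a general chi-square routine would be needed instead.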
62
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
63
Normality of the Data Distribution
64
NORMALITY TEST
65
NORMALITY TEST
2. Move the variable EB into the Dependent List box
3. Click Statistics. Tick the Descriptives and
Outliers boxes.
4. Click Continue.
66
NORMALITY TEST
5. Click Plots. Tick Factor levels together,
Stem-and-leaf, Histogram, and Normality plots with tests.
6. Click Continue. Under Display, make sure Both
is selected.
67
NORMALITY TEST
68
NORMALITY TEST
Kolmogorov-Smirnov & Shapiro-Wilk
For the Kolmogorov-Smirnov and Shapiro-Wilk tests:
• p > 0.05 means the data are normally distributed
• p < 0.05 means the data are NOT normally distributed
Citations:
1. Discovering Statistics using SPSS (Field, 2009)
2. SPSS Survival Manual: A Step by Step Guide to Data Analysis using
SPSS Program (Pallant, 2010)
69
NORMALITY TEST
Skewness & Kurtosis
70
NORMALITY TEST
Histogram
71
NORMALITY TEST
Q-Q Plot
72
NORMALITY TEST
Mardia’s Multivariate Skewness & Kurtosis
73
For Mardia’s multivariate skewness and kurtosis test:
• p > 0.05 means the data are normally distributed
• p < 0.05 means the data are NOT normally distributed
Citation:
Univariate and multivariate skewness and kurtosis for measuring non-
normality: Prevalence, influence and estimation (Cain, Zhang & Yuan, 2016)
74
VALIDITY & RELIABILITY
There are two primary types of psychometric analysis:
i) Reliability and ii) Validity.
75
76
77
EXPLORATORY FACTOR ANALYSIS
Steps:
1. Click Analyze > Dimension Reduction > Factor.
78
EXPLORATORY FACTOR ANALYSIS
Steps:
2. Move items EI1 to EI16 into the Variables box, then click Descriptives.
Tick Initial solution, Anti-image and KMO and Bartlett’s test of
sphericity. Click Continue.
79
EXPLORATORY FACTOR ANALYSIS
Steps:
3. Click Extraction. Tick Scree plot. Make sure Method is Principal
components.
4. Make sure Correlation matrix and Eigenvalues over: 1 are already selected. Click
Continue.
80
EXPLORATORY FACTOR ANALYSIS
Steps:
5. Click Rotation. Tick Varimax and Rotated solution.
6. Click Continue.
81
EXPLORATORY FACTOR ANALYSIS
Steps:
7. Click Options. Tick Suppress small coefficients and type .33 in the
Absolute value below box.
8. Click Continue. Click OK.
82
EXPLORATORY FACTOR ANALYSIS
• A KMO value of 0.90 is excellent; 0.80 very good; 0.70 good; 0.60 mediocre;
0.50 poor; and below 0.50 unacceptable for conducting factor
analysis (Hair Jr., Anderson, Babin & Black, 2010).
83
EXPLORATORY FACTOR ANALYSIS
85
EXPLORATORY FACTOR ANALYSIS
86
EXPLORATORY FACTOR ANALYSIS
87
EXPLORATORY FACTOR ANALYSIS
These FOUR factors account for 48.976%, 10.926%, 8.097% and 6.459% of the total
variance, respectively. That is, 74.457% of the total variance is attributable to these FOUR
factors.
The remaining 12 factors together account for only approximately 25% of the variance.
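The cumulative figure can be verified by adding the four percentages. The short Python check below uses only the values quoted above; summing the rounded percentages gives 74.458, which differs from the reported 74.457 in the last digit, presumably because SPSS sums unrounded values.

```python
# Percentages of variance for the four extracted factors, as reported above.
explained = [48.976, 10.926, 8.097, 6.459]

cumulative = []
running = 0.0
for pct in explained:
    running += pct
    cumulative.append(round(running, 3))

# Share of variance left for the 12 components that were not extracted.
remaining = round(100 - cumulative[-1], 3)

print(cumulative)
print(remaining)
```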
89
EXPLORATORY FACTOR ANALYSIS
90
EXPLORATORY FACTOR ANALYSIS
91
EXPLORATORY FACTOR ANALYSIS
92
EXPLORATORY FACTOR ANALYSIS
Component Matrix represents the unrotated component analysis factor matrix,
and presents the correlations that relate the variables to the FOUR extracted
factors.
93
EXPLORATORY FACTOR ANALYSIS
94
EXPLORATORY FACTOR ANALYSIS
The significance of a factor loading depends on the sample size.
95
EXPLORATORY FACTOR ANALYSIS
Sources:
1. Kaiser, H. F. (1970). A second-generation little jiffy. Psychometrika, 35,
401-415.
2. Kaiser, H. F. and Rice, J. (1974). Little jiffy, Mark IV. Educational and
Psychological Measurement, 34, 111-117.
96
EXPLORATORY FACTOR ANALYSIS
Examples:
• Items EI2 and EI8 have loaded highly on Factor 1 and Factor 4
• Items EI3 and EI4 have loaded highly on Factor 1, Factor 2 and Factor 4
• Items EI9, EI10, EI11, and EI12 have loaded highly on Factor 1 and Factor 3
97
EXPLORATORY FACTOR ANALYSIS
98
RELIABILITY
99
INTERNAL CONSISTENCY RELIABILITY
100
RELIABILITY TEST
Instructions:
1. Click Analyze > Scale > Reliability Analysis
101
RELIABILITY TEST
102
RELIABILITY TEST
103
RELIABILITY TEST (cont’)
EXPLANATION:
1. The Reliability Statistics table reports the Cronbach’s Alpha
coefficient. The Cronbach’s Alpha value of 0.807 suggests that the
scale has acceptable internal consistency.
104
RELIABILITY TEST (Cronbach’s Alpha)
105
RELIABILITY TEST (cont’)
106
RELIABILITY TEST (cont’)
EXPLANATION:
1. The Cronbach’s Alpha if Item Deleted column shows that the alpha
coefficient becomes 0.740 if item EB1 is removed from the scale. The
original alpha coefficient is 0.807 (see the Reliability Statistics table).
2. The Scale Variance if Item Deleted column shows that the total scale
variance becomes 13.992 if item EB1 is removed.
3. The same reading applies when any other item is removed from
the scale, and to the other columns.
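To make the two tables less of a black box, here is a minimal Python sketch of Cronbach's alpha and of "alpha if item deleted", using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The three-item data set is invented and does not reproduce the 0.807 or 0.740 reported above.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [
        cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))
    ]


# Hypothetical scores of four respondents on three items.
scale = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]]

print(round(cronbach_alpha(scale), 3))
print([round(a, 3) for a in alpha_if_item_deleted(scale)])
```

An item whose removal raises alpha noticeably is a candidate for review; one whose removal lowers alpha, as with EB1 above, is contributing to the scale's consistency.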
107
RELIABILITY TEST (cont’)
108
PEARSON’S CORRELATION ANALYSIS
109
PEARSON’S CORRELATION ANALYSIS
Instructions:
1. Click Analyze > Correlate > Bivariate.
110
PEARSON’S CORRELATION ANALYSIS
111
PEARSON’S CORRELATION ANALYSIS
Example interpretation:
The table shows that the correlation coefficient, r, between the variables EI
and EB is 0.425. The positive r indicates a direct relationship between EI and
EB: as the level of EI increases, the level of EB increases. The value
r = 0.425 indicates a moderate (medium) positive correlation between EI and EB.
A double asterisk (**) indicates significance at the 0.01 level (i.e. a 99%
confidence level), while a single asterisk (*) indicates significance at the
0.05 level (i.e. a 95% confidence level).
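For reference, the coefficient r is the covariation of the two variables divided by the product of their spreads. A small Python sketch on invented data follows; the lists `ei` and `eb` are hypothetical and do not reproduce the r = 0.425 in the output above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical composite scores for five respondents.
ei = [1, 2, 3, 4, 5]
eb = [2, 4, 5, 4, 5]

r = pearson_r(ei, eb)
print(round(r, 3))
```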
112
LINEAR REGRESSION
113
MULTIPLE LINEAR REGRESSION
ANALYSIS
1. Click Analyze > Regression > Linear
114
MULTIPLE LINEAR REGRESSION (cont’)
2. Select EB (dependent variable). Move it into the Dependent box.
3. Select EI1, JA and POS (independent variables). Move them into the
Independent(s) box.
4. Click Statistics. Tick Durbin-Watson and Collinearity diagnostics.
5. Click Continue > OK.
115
MULTIPLE LINEAR REGRESSION (cont’)
Independent errors means that, for any two observations, the residual terms should be
uncorrelated (or independent). This condition is sometimes described as a lack of
autocorrelation. The Durbin–Watson test is used to test for serial correlation between
the errors.
Durbin-Watson Statistic, D
• 0 ≤ D ≤ 4
• D ≈ 2: no autocorrelation
• D < 1.5: first-order positive autocorrelation
• D > 2.5: first-order negative autocorrelation
Sources:
1. Discovering Statistics using SPSS (Field, 2009)
2. Testing for serial correlation in least squares regression (Durbin & Watson,
1951)
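The statistic itself is simple: the sum of squared differences between successive residuals divided by the sum of squared residuals. A minimal Python sketch with two invented residual series shows why values near 0 signal positive autocorrelation and values near 4 signal negative autocorrelation.

```python
def durbin_watson(residuals):
    """Durbin-Watson D for residuals listed in observation order."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: one series drifts slowly, the other flips sign.
drifting = [-3, -2, -1, 1, 2, 3]
alternating = [1, -1, 1, -1, 1, -1]

print(round(durbin_watson(drifting), 3))     # well below 1.5
print(round(durbin_watson(alternating), 3))  # well above 2.5
```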
116
MULTIPLE LINEAR REGRESSION (cont’)
117
MULTIPLE LINEAR REGRESSION (cont’)
Identifying multicollinearity:
a) Scan a correlation matrix of all of the predictor variables and see if any
correlate very highly (i.e. r > 0.80 or 0.90) (Field, 2009).
b) Variance Inflation Factor (VIF). The VIF indicates whether a predictor has a
strong linear relationship with the other predictor(s). When the average VIF is
substantially greater than 1, multicollinearity may be biasing the regression
model (Bowerman & O’Connell, 1990). A VIF of 10 or above is cause for
concern (Myers, 1990; Pallant, 2010).
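The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors, and tolerance is its reciprocal. The Python sketch below uses that definition to show how the two rules of thumb relate; the R² inputs are illustrative numbers, not output from the course data.

```python
def vif(r_squared):
    """VIF for a predictor, given R^2 from regressing it on the others."""
    return 1.0 / (1.0 - r_squared)


def tolerance(r_squared):
    """Tolerance is the reciprocal of the VIF."""
    return 1.0 - r_squared


# With only two predictors, R^2 is the squared correlation between them.
for r in (0.50, 0.80, 0.95):
    print(r, round(vif(r ** 2), 2), round(tolerance(r ** 2), 2))
```

A VIF of 10 corresponds to R² = 0.90, meaning 90% of a predictor's variance is shared with the other predictors.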
118
MULTIPLE LINEAR REGRESSION (cont’)
• The Adjusted R² value indicates the loss of predictive power or shrinkage. Whereas
R² tells us how much of the variance in the DV is accounted for by the regression model
from our sample, the adjusted value tells us how much variance in the DV would be
accounted for if the model had been derived from the population from which the
sample was taken (Field, 2009).
• When a small sample is involved, the R² value in the sample tends to be a rather
optimistic overestimation of the true value in the population (Tabachnick & Fidell,
2007). The Adjusted R² ‘corrects’ this value to provide a better estimate of the true
population value.
• Studies with a small sample size are advised to report the Adjusted R² rather than
the normal R² value (Pallant, 2010).
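The size of the correction depends on the sample size n and the number of predictors k. The sketch below applies the commonly used adjustment, 1 - (1 - R²)(n - 1)/(n - k - 1), in Python; the R², n and k values are invented, and the formula is assumed rather than taken from the slides.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Same R^2 and three predictors, but different sample sizes.
small_sample = adjusted_r2(0.5, n=30, k=3)
large_sample = adjusted_r2(0.5, n=300, k=3)

print(round(small_sample, 3))  # noticeable shrinkage
print(round(large_sample, 3))  # almost no shrinkage
```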
119
MULTIPLE LINEAR REGRESSION (cont’)
120
MULTIPLE LINEAR REGRESSION (cont’)
121
UNTIL NEXT SESSION…
122