
SPSS INTRODUCTION

COURSE
30TH OCTOBER 2018
Instructor:
Mr. Azim Azuan Osman
Mobilise Coaches:
Dr. Rusnifaezah Musa
Dr. Maliani Mohamad
Ms. Nor Hafida Hamzah
In Collaboration with:

1
SPSS INTERFACE

• Opening the SPSS software
• Steps: Start > All Programs > SPSS for Windows > SPSS

2
SPSS INTERFACE
Start-up Dialogue Box

3
SPSS INTERFACE: VARIABLE VIEW

4
SPSS INTERFACE: DATA VIEW

5
SPSS INTERFACE: OUTPUT DIALOGUE BOX

6
DATA ENTRY & DECLARING VARIABLES

Variable View:
1. Insert/declare variables
2. Adjust decimal points
3. Set column width
4. Text alignment

Data View:
• Key-in responses

7
TYPES OF DATA/VARIABLES

1. Categorical Data/Variable
a) Nominal
b) Dichotomous
c) Ordinal
2. Continuous Data/Variable
a) Ratio
b) Interval
3. String (letters/words)

8
CATEGORICAL vs CONTINUOUS
DATA/VARIABLES

Categorical variables are also known as discrete or qualitative variables. Categorical variables can be further categorized as either nominal, ordinal or dichotomous.

Continuous variables are also known as quantitative variables. Continuous variables can be further categorized as either interval or ratio variables.

9
Nominal variables are variables that have two or more categories but no intrinsic order.
A nominal scale splits data into mutually exclusive and collectively exhaustive categories.
It is usually used for obtaining personal data such as gender, the department in which one is working, and so on, where grouping of individuals or objects is useful, as in the examples below.

1. Gender
• Male
• Female

2. Education Level
• Bachelor
• Master
• PhD

10
Dichotomous variables are nominal variables which have only two categories or levels. For example, if we were looking at gender, we would most probably categorize somebody as either "male" or "female".

This is an example of a dichotomous variable (and also a nominal variable).

11
Ordinal variables are variables that have two or more categories, just like nominal variables, except that the categories can also be ordered or ranked.

An ordinal scale is usually used to rate preferences for, or usage of, various brands of a product by individuals, and to rank-order individuals, objects, or events, as in the example below.

Rank the following personal computers with respect to their usage in your office, assigning the number 1 to the most used system, 2 to the next most used system, and so on. If a particular system is not used at all in your office, put a 0 against it.

__ IBM PS2/30      __ Compaq
__ IBM/AT          __ AT&T
__ IBM/XT          __ Tandy 2000
__ Apple           __ Other (specify)

12
Ordinal Scale (example)

13
Ratio Scales

Ratio scales are usually used in organizational research when exact figures on objective (as opposed to subjective) factors are called for, as in the following questions.
1. How many other organizations did you work for before joining this system? ______
2. Please indicate the number of children you have in each of the following categories:
____ below 3 years of age
____ between 3 and 6
____ over 6 years but under 12
____ 12 years and over
3. How many retail outlets do you operate?

14
Interval Scale
• Interval variables are variables whose central characteristic is that they can be measured along a continuum and have a numerical value.
• An interval scale is used when responses to the various items that measure a variable can be tapped on a five-point (or seven-point, or any other number of points) scale, which can thereafter be summated across the items.

1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree

1. My job offers me a chance to test myself and my abilities.   1 2 3 4 5
2. Mastering these jobs meant a lot to me.   1 2 3 4 5

15
HANDS-ON data entry using provided
example questionnaires

16
DATA TRANSFORM

• Data stored in SPSS can be modified to suit your needs.

• Transforming the data allows us to carry out more complex data analysis operations.

• Various transformations are available, such as the Recode, Compute, and Count commands.

17
DATA TRANSFORM: “RECODE” COMMAND
Recode lets us modify the data in a variable so that it suits the analysis to be carried out.

It is usually used to convert the response scale of negative items (negative/reverse questions) into positive items,
or
to convert responses in string form (words) into numeric form (numbers/codes) and vice versa.

Steps:
1. Open the SPSS data file (i.e. recode exercise)
2. Click Transform > Recode into Same Variables… and the Recode into Same Variables window will be displayed

18
Example of negative/reverse questions

If your scale contains some items that are negatively worded (common in psychological measures), these need to be ‘reversed’ before checking reliability.
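
The reversal described above can be sketched outside SPSS. This is a minimal Python illustration, assuming a numeric Likert scale whose endpoints are known; the function name `reverse_score` and the default 1-5 range are hypothetical choices, not part of the course material.

```python
def reverse_score(value, scale_min=1, scale_max=5):
    """Reverse a Likert response: on a 1-5 scale, 1 becomes 5, 2 becomes 4, and so on."""
    if not scale_min <= value <= scale_max:
        raise ValueError(f"{value} is outside the {scale_min}-{scale_max} scale")
    # Reflecting around the scale midpoint keeps the distance between responses.
    return scale_min + scale_max - value


responses = [1, 2, 3, 4, 5]  # hypothetical answers to a negatively worded item
reversed_responses = [reverse_score(r) for r in responses]
print(reversed_responses)
```

In SPSS the same mapping is entered pair by pair in the Old and New Values dialog (1 to 5, 2 to 4, and so on).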
19
“RECODE” into same variables

20
“RECODE” into same variables (cont’)
3. Move the variable/item Jantina (gender) into the Numeric Variables box.
Then click Old and New Values.

21
“RECODE” into same variables (cont’)

4. Type 1. Lelaki (male) in the Value box under Old Value and type 1
in the Value box under New Value. Click Add.

22
“RECODE” into same variables (cont’)
5. Type 2. Perempuan (female) in the Value box under Old Value
and type 2 in the Value box under New Value. Click Add.

23
“RECODE” into same variables (cont’)
6. Repeat the same step if the data has more than 2 categories
7. Click Continue > OK

24
DATA TRANSFORM: “COMPUTE” COMMAND

Compute lets us create a new variable by manipulating several existing variables in the SPSS data file.
Here, Compute is used to obtain the total (or the mean) of questions (items) x1 to x6 in order to measure the level of respondents' perception of a variable.
In most studies, researchers use the mean to measure this level.

Steps:
1. Open the SPSS data file (i.e. Data Mediation and Moderation)
2. Click Transform > Compute Variable… and the Compute Variable window will be displayed

25
COMPUTE VARIABLES (cont’)

3. In the Target Variable box, type a suitable name for the new variable (for example, mean_EI).

26
COMPUTE VARIABLES (cont’)

4. Function Group: select Statistical
5. Functions and Special Variables: select Mean
6. Numeric Expression: enter items EI1 to EI16 as shown in the figure
7. Click OK
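
As a rough Python equivalent of the Compute step, the sketch below averages items EI1 to EI16 for one respondent. The dictionary layout, the helper name `item_mean` and the sample answers are assumptions made for illustration; skipping missing answers is intended to mirror how the SPSS MEAN function averages only the valid values.

```python
from statistics import mean


def item_mean(row, items):
    """Mean of the listed items for one respondent, skipping missing (None) answers."""
    values = [row[name] for name in items if row.get(name) is not None]
    return mean(values) if values else None


items = [f"EI{i}" for i in range(1, 17)]   # EI1 ... EI16
respondent = {name: 4 for name in items}   # hypothetical answers
respondent["EI3"] = None                   # one blank response
respondent["EI5"] = 2

mean_EI = item_mean(respondent, items)
print(round(mean_EI, 3))
```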

27
COMPUTE VARIABLES (cont’)

28
DESCRIPTIVE ANALYSES

• SPSS lets users produce various data distributions through the Frequencies, Descriptives and Crosstabs commands.

• Frequencies is used to count the number of responses. It can also produce descriptive statistics such as the mean, mode, median and variance.

• Various charts can also be produced with this command.

29
DESCRIPTIVE ANALYSES (cont’)

Gender:
1 = Male, 2 = Female

Education:
1 = Bachelor, 2 = Master, 3 = PhD

Employment:
1 = Contractual, 2 = Permanent, 3 = Others

What is the percentage of males and females (gender) in the sample?
What is the frequency of each education level category (education)?
What percentage falls in each employment category (employment)?

30
DESCRIPTIVES: FREQUENCIES

1. Click Analyze > Descriptive Statistics > Frequencies

31
DESCRIPTIVES: FREQUENCIES (cont’)

2. Click Gender. Move it into the Variable(s) box.
3. Click Charts. Select a Chart Type, then click Continue.
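
What the Frequencies command reports can be sketched in a few lines of Python. The coded responses below are invented for illustration, and `frequency_table` is a hypothetical helper; percentages are taken over valid (non-missing) responses only.

```python
from collections import Counter


def frequency_table(values, labels):
    """Count and valid percent for each coded category, ignoring missing (None) values."""
    valid = [v for v in values if v is not None]
    if not valid:
        return {}
    counts = Counter(valid)
    return {
        labels[code]: (counts[code], round(100 * counts[code] / len(valid), 1))
        for code in sorted(labels)
    }


gender = [1, 2, 2, 1, 2, None, 2, 1]  # 1 = Male, 2 = Female, None = blank response
table = frequency_table(gender, {1: "Male", 2: "Female"})
for label, (count, percent) in table.items():
    print(label, count, percent)
```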

32
FREQUENCIES OUTPUT

33
DESCRIPTIVES: CROSSTABS

Displays the joint distribution of two or more variables.

Cross-tabulations are usually presented as contingency tables in matrix format.

34
DESCRIPTIVES: CROSSTABS (cont’)

1. Click Analyze > Descriptive Statistics > Crosstabs.
2. Move Nature of Employment into the Row(s) box and Education into the Column(s) box.
3. Tick the Display clustered bar charts box and click OK.

35
DESCRIPTIVES: CROSSTABS (cont’)

The cross-tabulation table above compares two variables, Nature of Employment and Education. Among respondents in contractual positions, those with a bachelor’s degree form the largest group (121 respondents). Among respondents in permanent positions, those with a master’s degree form the largest group (12 respondents).
36
DESCRIPTIVES: CROSSTABS (cont’)

37
DESCRIPTIVES: Descriptive

Descriptive analysis is usually used to answer research questions such as:

To what extent are customers satisfied with Service A?

To what extent is 5S practice implemented in Organization X?

Steps:
1. Click Analyze > Descriptive Statistics > Descriptives and the Descriptives window will be displayed

38
DESCRIPTIVES: Descriptive (cont’)

2. Click EI, EP, C, OC, JA, POS and EB. Move them into the Variable(s) box.
3. Click Options. Select Mean and Std. deviation, then click Continue.

39
DESCRIPTIVES: Descriptive (cont’)

Note: Interpret the mean score based on the items’ scale (e.g. a score of 3 may represent moderate, 4 high, and 5 very high).

40
Missing Data/Value @ Blank Responses

How do we take care of missing responses?

• Too many missing (more than 25% missing): throw out the questionnaire.
• Less than 10%: the missing data is considered minimal (Hair et al. 2010), and the missing responses can be ignored (i.e. system missing).

Other ways of handling:
1. Use the midpoint of the scale
2. Random number replacement
3. Four systematic options (see next slide)

41
Missing Data/Value @ Blank Responses

If there is a pattern to the missing data, choose one of the following options:
• Option 1: Replace values with numbers that are known from prior knowledge or from an educated guess. Easily done, but it can lead to researcher bias if you are not careful.
• Option 2: Replace missing values with the variable mean (or median). The simplest option, but it lowers variability and in turn can bias results.
• Option 3: Replace missing values with a group mean (e.g. the mean for prejudice grouped by ethnicity). The missing value is replaced with the mean of the group that the subject belongs to. A little more complicated, but the reduction in variability is smaller.
• Option 4: Use regression to predict the missing values. Other variables act as IVs predicting the variable with the missing values (which acts as the DV).
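
Options 2 and 3 can be sketched as follows. This is an illustrative Python version with invented scores and group labels; the helper names are hypothetical, and each group is assumed to have at least one observed value.

```python
from statistics import mean


def fill_with_mean(values):
    """Option 2: replace each missing value (None) with the mean of the observed values."""
    overall = mean(v for v in values if v is not None)
    return [overall if v is None else v for v in values]


def fill_with_group_mean(values, groups):
    """Option 3: replace each missing value with the mean of the respondent's own group."""
    group_means = {}
    for g in set(groups):
        observed = [v for v, gg in zip(values, groups) if gg == g and v is not None]
        group_means[g] = mean(observed)
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


scores = [2, 4, None, 5, 5, None]         # hypothetical item scores
groups = ["A", "A", "A", "B", "B", "B"]   # hypothetical grouping variable

print(fill_with_mean(scores))
print(fill_with_group_mean(scores, groups))
```

With these numbers the overall mean pulls both blanks to the same value, while the group mean keeps the difference between groups A and B, which is why Option 3 reduces variability less.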
42
How to DETECT the missing values

1. Click Analyze > Descriptive Statistics > Frequencies and the Frequencies window will be displayed
2. Click EI1 to EI16. Move them into the Variable(s) box.
3. Make sure the Display frequency tables box is ticked, then click OK.

43
How to DETECT the missing values (cont’)

44
How to REPLACE missing values
1. Click Transform > Replace Missing Values and the Replace Missing Values window will be displayed

45
How to REPLACE missing values (con’t)

2. Click EI1 and EI4. Move them into the New Variable(s) box.
3. Click (highlight) EI1_1=SMEAN(EI1). Click Method. Select Median of nearby points, then click Change.
4. Repeat the same step for EI4_1=SMEAN(EI4)
5. Click OK

46
How to REPLACE missing values (con’t)

47
How to REPLACE missing values (con’t)

48
Outlier Samples/Cases

• Among continuous variables, whether you are searching for univariate or multivariate outliers, the method depends on whether the data is grouped or not. If you are performing analyses with ungrouped data (i.e. regression, canonical correlation, factor analysis, or structural equation modeling), univariate and multivariate outliers are sought among all cases at once.

• If you are going to perform one of the analyses with grouped data (ANOVA, ANCOVA, MANOVA, MANCOVA, profile analysis, discriminant function analysis, or logistic regression), both univariate and multivariate outliers are sought within each group separately.

49
Outlier Samples/Cases (con’t)

50
Outlier Samples/Cases (con’t)

51
Univariate vs Multivariate

• Univariate statistics
• includes all statistical techniques for
analyzing a single variable of interest

• Multivariate statistics
• includes all statistical techniques for
analyzing two or more variables of interest
• The focus is on relationships among
variables rather than on isolated
individual factors

52
Univariate vs Multivariate Outliers
Univariate outliers are those with very large standardized scores (z scores greater than 3.3) and that are disconnected from the distribution.
• SPSS DESCRIPTIVES will give you the z scores for every case if you select save standardized values as variables, and SPSS FREQUENCIES will give you histograms (use SPLIT FILE/Compare Groups under DATA for grouped data).

Multivariate outliers are found by first computing a Mahalanobis Distance for each case; once that is done, the Mahalanobis scores are screened in the same manner that univariate outliers are screened.
• To compute Mahalanobis Distance in SPSS you must use Analyze > Regression > Linear. For grouped data the Mahalanobis distances must be computed separately for each group.
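
The z-score screen for univariate outliers can be sketched in Python. The data are invented, the helper names are hypothetical, and two assumptions are made: the standard deviation uses the sample (n - 1) formula, and the 3.3 cutoff is applied to the absolute z score so that both tails are checked.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize each value using the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def univariate_outliers(values, cutoff=3.3):
    """Return the positions of cases whose absolute z score exceeds the cutoff."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > cutoff]


data = [10] * 19 + [100]  # hypothetical scores with one extreme case at position 19
print(univariate_outliers(data))
```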

53
Detecting Univariate Outliers using
“Boxplot Diagram”
Steps:
1. Click Graphs > Legacy Dialogs > Boxplot and the Boxplot window will be displayed
2. Select Simple and tick Summaries of separate variables under Data in Chart Are. Click Define.

54
Detecting Univariate Outliers using
“Boxplot Diagram”

3. Move items EI1 to EI16 into the Boxes Represent box.
4. Move the respondent ID (if any) into the Label Cases by box. Click OK.

55
Detecting Univariate Outliers using
“Boxplot Diagram”

The boxplot shows that cases (respondents) 2, 15, 135, 136, 137, 139 and 179 are outliers for item EI7. However, no case (respondent) is identified as an extreme outlier. If there were an extreme outlier, an asterisk (*) would appear on the boxplot.

56
Detecting Multivariate Outliers using
“Mahalanobis Distance”
1. Click Analyze > Regression > Linear

57
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
2. Click EB (dependent variable). Move it into the Dependent box.
3. Click EI1, JA and POS (independent variables). Move them into the Independent(s) box.
4. Click Save. Tick Mahalanobis and click Continue > OK.

58
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)

59
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)

To determine which Mahalanobis values count as outliers, the p_MAH value needs to be computed.

Steps:
1. Click Transform > Compute Variable and the Compute Variable window will be displayed
2. In the Target Variable box, type a suitable name for the new variable (for example, p_MAH).
3. Function Group: select CDF and noncentral CDF
4. Functions and Special Variables: double-click Cdf.Chisq
5. Type & Label: insert the Mahalanobis Distance variable (MAH_1)
6. In the Numeric Expression box, make sure the following equation is shown: 1-CDF.CHISQ(MAH_1,3)
7. Click OK

NOTE: The value “3” is the degrees of freedom, which equals the number of predictors. Predictors include IVs, mediators and moderators.
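
The expression 1-CDF.CHISQ(MAH_1,3) is the upper-tail chi-square probability of each Mahalanobis distance. The sketch below reproduces it in plain Python for whole-number degrees of freedom, using a standard recurrence for the chi-square tail. The function name `chi2_sf` and the sample distances are assumptions for illustration, and the p < .001 flag is the cutoff commonly applied to p_MAH.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability (1 - CDF) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then add one term for every 2 extra df.
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


mah_1 = [1.2, 7.8, 18.5]                    # hypothetical Mahalanobis distances
p_mah = [chi2_sf(d, 3) for d in mah_1]      # 3 predictors, so df = 3
outliers = [p < 0.001 for p in p_mah]       # flag cases with p_MAH < .001
print(outliers)
```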

60
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)

61
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)
Next, to confirm which of the computed p_MAH values really are outliers, carry out the following steps:

Steps:
1. Click Transform > Compute Variable and the Compute Variable window will be displayed
2. In the Target Variable box, type a suitable name for the new variable (for example, Outliers).
3. In the Numeric Expression box, change the equation to: p_MAH<.001
4. Click OK

NOTE: “p_MAH” can also be inserted into the Numeric Expression box by clicking the variable “p_MAH” shown in the Type & Label box.

62
Detecting Multivariate Outliers using
“Mahalanobis Distance” (cont’)

63
Data Normality Distribution (con’t)

Statistical tests have the advantage of making an objective judgement of normality, but are disadvantaged by sometimes not being sensitive enough at low sample sizes or overly sensitive at large sample sizes.

As such, some statisticians prefer to use their experience to make a subjective judgement about the data from plots/graphs. Graphical interpretation has the advantage of allowing good judgement to assess normality in situations when numerical tests might be over- or under-sensitive, but graphical methods do lack objectivity.

If you do not have a great deal of experience interpreting normality graphically, it is probably best to rely on the numerical methods.

64
NORMALITY TEST

1. Click Analyze > Descriptive Statistics > Explore

65
NORMALITY TEST
2. Move the variable EB into the Dependent List box
3. Click Statistics. Tick the Descriptives and Outliers boxes.
4. Click Continue.

66
NORMALITY TEST
5. Click Plots. Tick Factor levels together, Stem-and-leaf, Histogram, and Normality plots with tests.
6. Click Continue. In the Display box, make sure Both is selected.

67
NORMALITY TEST

7. Click Options. In the Missing Values box, select Exclude cases pairwise.
8. Click Continue > OK

68
NORMALITY TEST
Kolmogorov-Smirnov & Shapiro-Wilk
For the Kolmogorov-Smirnov and Shapiro-Wilk tests:
• p > 0.05 means the data are normal
• p < 0.05 means the data are NOT normal
Citations:
1. Discovering Statistics using SPSS (Field, 2009)
2. SPSS Survival Manual: A Step by Step Guide to Data Analysis using SPSS Program (Pallant, 2010)

The Sig. value (p = .006) for the Kolmogorov-Smirnov test and Sig. (p = .000) for the Shapiro-Wilk test show that the data are not normally distributed.

69
NORMALITY TEST
Skewness & Kurtosis

According to Hair Jr et al. (2010) and Byrne (2016), data is considered to be normally distributed if its z-score values fall between -2 and +2 for Skewness and between -7 and +7 for Kurtosis.

The z-score is calculated by dividing the Statistic value by the Std. Error value.

Skewness z-score = -3.629
Kurtosis z-score = 3.655
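
The z-score check can be sketched in Python. The two z-scores are the ones reported on this slide; the function names are hypothetical, and the standard-error formulas are the usual large-sample expressions for sample skewness and kurtosis, included here on the assumption that they match what the output table reports.

```python
import math


def se_skewness(n):
    """Standard error of sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of sample kurtosis for a sample of size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


def z_score(statistic, std_error):
    """z = Statistic / Std. Error, as read from the Descriptives table."""
    return statistic / std_error


def within_cutoffs(z_skew, z_kurt):
    """True if both z-scores fall inside the -2..+2 and -7..+7 ranges."""
    return -2 <= z_skew <= 2 and -7 <= z_kurt <= 7


print(within_cutoffs(-3.629, 3.655))  # z-scores reported on this slide
```

Here the skewness z-score falls outside the -2 to +2 range, so the check fails even though the kurtosis z-score is within its range.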

70
NORMALITY TEST
Histogram

The histogram does not form a perfect bell shape.

This means the data are not normally distributed.

71
NORMALITY TEST
Q-Q Plot

The Normal Q-Q Plot shows that the data points do not lie along the straight line.

This means the data are not normally distributed.

72
NORMALITY TEST
Mardia’s Multivariate Skewness & Kurtosis

1. Log on to: https://webpower.psychstat.org/models/kurtosis/

73
For Mardia’s multivariate skewness and kurtosis tests:
• p > 0.05 means the data are normal
• p < 0.05 means the data are NOT normal

A p-value of 0 for the multivariate skewness test and p = 0 for the kurtosis test show that the data are not normally distributed.

Citation:
Univariate and multivariate skewness and kurtosis for measuring nonnormality: Prevalence, influence and estimation (Cain, Zhang & Yuan, 2016)
74
VALIDITY & RELIABILITY
There are two primary types of psychometric analysis:
i) Reliability and ii) Validity.

75
76
77
EXPLORATORY FACTOR ANALYSIS

Steps:
1. Click Analyze > Dimension Reduction > Factor.

78
EXPLORATORY FACTOR ANALYSIS
Steps:
2. Move items EI1 to EI16 into the Variables box. Click Descriptives. Tick Initial solution, Anti-image and KMO and Bartlett’s test of sphericity. Click Continue.

79
EXPLORATORY FACTOR ANALYSIS
Steps:
3. Click Extraction. Tick Scree plot. Make sure the Method is Principal components.
4. Make sure Correlation matrix and Eigenvalues over: 1 are selected. Click Continue.

80
EXPLORATORY FACTOR ANALYSIS
STEPS:
5. Click Rotation. Tick Varimax and Rotated solution.
6. Click Continue.

81
EXPLORATORY FACTOR ANALYSIS
STEPS:
7. Click Options. Tick Suppress small coefficients and type .33 in the Absolute value below box.
8. Click Continue. Click OK.

82
EXPLORATORY FACTOR ANALYSIS

• The table shows that Bartlett’s test of sphericity is significant (p = .000) and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy is 0.910. A Bartlett’s test result that is significant at p < 0.05 indicates that all the items in the construct are suitable for factor analysis because the correlations between the items in the construct are adequate (Chua, 2009). A KMO value of at least 0.6 indicates that the sample size used in the study is sufficient (Hair Jr., Anderson, Babin & Black, 2010; Pallant, 2010).

• KMO values of 0.90 are excellent; 0.80 very good; 0.70 good; 0.60 mediocre; 0.50 poor; and values below 0.50 are unacceptable for factor analysis (Hair Jr., Anderson, Babin & Black, 2010).
83
EXPLORATORY FACTOR ANALYSIS

KMO is used to measure sampling adequacy. According to Hutcheson & Sofroniou (1999), values between 0.5 and 0.7 are mediocre, values between 0.7 and 0.8 are good, values between 0.8 and 0.9 are great and values above 0.9 are superb.

Bartlett’s test examines whether the population correlation matrix resembles an identity matrix. An identity matrix indicates that every variable correlates very badly with all other variables (all correlation coefficients are close to zero). If no variables correlate, there are no clusters to find.

To check whether the correlation matrix is an identity matrix:
• If it is an identity matrix, all correlations would be zero, so factor analysis is inappropriate.
• If Bartlett’s test is significant (p < 0.05), the correlation matrix is not an identity matrix, so factor analysis is appropriate.
84
EXPLORATORY FACTOR ANALYSIS

85
EXPLORATORY FACTOR ANALYSIS

The Anti-image Matrices output table is examined by looking at the values marked with (a) along the diagonal of the table. Values marked with (a) indicate the Measure of Sampling Adequacy (MSA), which should reach certain values as follows:

86
EXPLORATORY FACTOR ANALYSIS

Communality measures the percent of variance in a given variable explained by all the factors jointly.

As a rule of thumb, items with a communality below 0.5 may be considered for dropping from the factor model.

87
EXPLORATORY FACTOR ANALYSIS

The Total Variance Explained table lists the eigenvalues for components 1 to 16. The total amount of variance for the variables in the analysis equals the number of variables (in this example, 16).
88
EXPLORATORY FACTOR ANALYSIS

These FOUR factors account for 48.976%, 10.926%, 8.097% and 6.459% of the total
variance, respectively. That is, 74.457% of the total variance is attributable to these FOUR
factors.

The remaining 12 factors together account for only approximately 25% of the variance.
89
EXPLORATORY FACTOR ANALYSIS

Output from the Total Variance Explained table is used to decide how many factors to extract to represent the data, by examining the eigenvalues associated with the factors.

An eigenvalue is a ratio between the common (shared) variance and the specific (unique) variance explained by a specific factor extracted.

Only factors with eigenvalues of 1 or greater are considered to be significant, while factors with eigenvalues less than 1 are disregarded.

Thus, a model with FOUR factors may be adequate to represent the data (see slides No.88 and 89).
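
The eigenvalue rule can be checked against the percentages quoted for the four retained factors. In the Python sketch below the percentages come from the Total Variance Explained slide; the helper names are hypothetical, and the conversion assumes the total variance equals the number of variables (16), as stated for this example.

```python
N_VARIABLES = 16
PERCENT_OF_VARIANCE = [48.976, 10.926, 8.097, 6.459]  # the four retained factors


def eigenvalue_from_percent(percent, n_variables):
    """Back out an eigenvalue from its share of the total variance."""
    return percent / 100.0 * n_variables


def retained_by_eigenvalue_rule(eigenvalues, threshold=1.0):
    """Keep only factors whose eigenvalue is at least the threshold."""
    return [ev for ev in eigenvalues if ev >= threshold]


eigenvalues = [eigenvalue_from_percent(p, N_VARIABLES) for p in PERCENT_OF_VARIANCE]
cumulative = sum(PERCENT_OF_VARIANCE)

print([round(ev, 3) for ev in eigenvalues])
print(round(cumulative, 3))
```

All four back-computed eigenvalues exceed 1, and together the factors cover about 74.46% of the total variance.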

90
EXPLORATORY FACTOR ANALYSIS

The Scree Test (Scree Plot) is used to identify the optimum number of factors that can be extracted before the amount of unique variance begins to dominate the common variance structure (Hair, Anderson, Tatham, & Black, 1995).

Find the point at which the shape of the curve changes direction and becomes horizontal. Retain all factors above the elbow, or break in the plot, as these factors contribute the most to the explanation of the variance in the data set (see slide No.92).

91
EXPLORATORY FACTOR ANALYSIS

The point at which the curve first begins to straighten out is considered to indicate the maximum number of factors to extract.

That is, those factors above this point of inflection are deemed meaningful.

However, this curve is difficult to interpret because it begins to tail off after three factors, but there is another drop after four factors before a stable plateau is reached.

92
EXPLORATORY FACTOR ANALYSIS

The Component Matrix represents the unrotated component analysis factor matrix, and presents the correlations that relate the variables to the FOUR extracted factors.

These correlations, called factor loadings, indicate how closely the variables are related to each factor.

93
EXPLORATORY FACTOR ANALYSIS

Factor loadings refer to correlation coefficients between the variables (i.e. items) and the factors (i.e. constructs) they represent.

Variables with large loadings indicate that they are representative of the factor. Small loadings suggest that they are not.

Factor loadings greater than ±0.33 are considered to meet the minimal level of practical significance.

The grouping of variables with high factor loadings should suggest what the underlying dimension is for that factor.

94
EXPLORATORY FACTOR ANALYSIS
The significance of a factor loading depends on the sample size.

Table of Critical Values (Stevens, 2002)

Sample Size    Significant loading REQUIRED
50             0.722
100            0.512
200            0.364
300            0.298
600            0.210
1000           0.162

• Factor loadings should be 0.7 or higher (Garson, 2013).
• For exploratory purposes, a lower level such as 0.4 may be used (Raubenheimer, 2014).
• Loadings above 0.6 are “high” and those below 0.4 are “low” (Hair et al., 1998).

95
EXPLORATORY FACTOR ANALYSIS

Sources:
1. Kaiser, H. F. (1970). A second-generation little jiffy. Psychometrika, 35, 401-415.
2. Kaiser, H. F. and Rice, J. (1974). Little jiffy, Mark IV. Educational and Psychological Measurement, 34, 111-117.

96
EXPLORATORY FACTOR ANALYSIS

High cross-loadings have occurred, which make interpretation of the factors difficult and theoretically less meaningful.

Examples:
• Items EI2 and EI8 have loaded highly on Factor 1 and Factor 4
• Items EI3 and EI4 have loaded highly on Factor 1, Factor 2 and Factor 4
• Items EI9, EI10, EI11, and EI12 have loaded highly on Factor 1 and Factor 3

97
EXPLORATORY FACTOR ANALYSIS

What to do about cross-loadings?

1. Rerun the factor analysis, stipulating a smaller number of factors to be extracted.

2. Examine the wording of the cross-loaded variables and, based on their face validity, assign them to the factors that they are most conceptually/logically representative of.

3. Delete all cross-loaded variables. This will result in “clean” factors and will make interpretation of the factors that much easier. This method works best when there are only a few significant cross-loadings.

98
RELIABILITY

The reliability of a measure is an inverse function of measurement error:
• The more error, the less reliable the measure
• Reliable measures provide consistent measurement from
occasion to occasion
• The reliability of a measuring instrument is defined as its ability
to consistently measure the phenomenon it is designed to
measure.
• The degree to which the items that make up the scale ‘hang
together’.

99
INTERNAL CONSISTENCY RELIABILITY

• Internal consistency refers to the extent to which the items in a test measure the same construct.
• Items that measure the same phenomenon should logically cling/hang
together in some consistent manner.
• Examining the internal consistency of the test enables the researcher to
determine which items are not consistent with the test in measuring
the phenomenon under investigation.
• The objective is to remove the inconsistent items and improve the
internal consistency of the test.
• An internally consistent test increases the chances of the test being
reliable.

100
RELIABILITY TEST
Steps:
1. Click Analyze > Scale > Reliability Analysis

101
RELIABILITY TEST

2. Click items EB1 to EB6. Move them into the Items box.
3. Type the variable name (e.g. Employee Behaviour) in the Scale label box
4. Click Statistics. Tick Scale if item deleted and Correlations. Click Continue.
5. Make sure Alpha is selected in the box labelled Model. Click OK.

102
RELIABILITY TEST

103
RELIABILITY TEST (cont’)

INTERPRETATION:
1. The Reliability Statistics table reports the Cronbach’s Alpha coefficient. A Cronbach’s Alpha of 0.807 suggests that the scale is acceptably reliable.

104
RELIABILITY TEST (Cronbach’s Alpha)

• A Cronbach’s alpha coefficient above 0.7 is considered sufficient/adequate for a scale (DeVellis, 2003; Nunnally, 1978), which suggests that all of the items are reliable and the entire test is internally consistent.
• However, values above 0.8 are preferable (DeVellis, 2003), and even 0.8 may not be high enough for applied research (Nunnally, 1978).
• Where important decisions about the fate of individuals are made on the basis of test scores, reliability should be at least 0.90, preferably 0.95 or better (Nunnally, 1978).
• Cronbach’s alpha values are, however, quite sensitive to the number of items in the scale.
• With short scales (e.g. scales with fewer than ten items) it is common to find quite low Cronbach’s alpha values (e.g. 0.5).
• The reliability of a scale can vary depending on the sample.
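
For readers who want to see what the coefficient measures, here is a minimal Python sketch of the usual Cronbach’s alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the total score). The response matrix is invented and `cronbach_alpha` is a hypothetical helper; it assumes complete data with one row per respondent.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a three-item scale
rows = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(rows)
print(round(alpha, 3))
```

Alpha rises when the items move together (the total-score variance grows relative to the item variances), which is the sense in which the items ‘hang together’.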

105
RELIABILITY TEST (cont’)

Check the Inter-Item Correlation Matrix for negative values.

All values should be positive, indicating that the items are measuring the same underlying characteristic.

The presence of negative values could indicate that some of the items have not been correctly reverse scored.

106
RELIABILITY TEST (cont’)

INTERPRETATION:
1. The Cronbach’s Alpha if Item Deleted column shows that the alpha coefficient becomes 0.740 if item EB1 is removed from the scale. The original alpha coefficient is 0.807 (see the Reliability Statistics table).
2. The Scale Variance if Item Deleted column shows that the total scale variance becomes 13.992 if item EB1 is removed.
3. The values for the other items, and the other columns, are read in the same way.

107
RELIABILITY TEST (cont’)

• Item analysis is achieved through the Item-Total correlation procedure.
• This procedure represents a refinement of test reliability by identifying
“problem” items in the test, i.e., those items that yield low correlations
with the sum of the scores on the remaining items.
• Item-Total correlation indicates the degree to which each item
correlates with the total score.
• Rejecting those items that are inconsistent with the rest (and retaining
those items with the highest average inter-correlations) will increase
the internal consistency of the measuring instrument.
• In deciding which item to retain or delete, the 0.33 criterion can be
used
• Low values (less than 0.3) here indicate that the item is measuring
something different from the scale as a whole.

108
PEARSON’S CORRELATION ANALYSIS

Range of correlation coefficient: −1 ≤ r ≤ +1

According to Cohen (1988), the strength/size of the relationship is:
• Weak/small: r = 0.1 – 0.29
• Moderate/medium: r = 0.3 – 0.49
• Large/strong: r = 0.5 – 1.0
Sources:
1. Discovering Statistics using SPSS (Field, 2009)
2. SPSS Survival Manual: A Step by Step Guide to Data Analysis using SPSS Program (Pallant, 2010)
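
A small Python sketch of the coefficient and of Cohen’s size bands follows. The function names are hypothetical, and the "negligible" label for values below 0.1 is an added assumption, since the bands listed here start at 0.1.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cohen_label(r):
    """Label the size of r with Cohen's (1988) bands, ignoring its sign."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumption: label for values below the listed bands


print(cohen_label(0.425))
```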

109
PEARSON’S CORRELATION ANALYSIS
Steps:
1. Click Analyze > Correlate > Bivariate.

110
PEARSON’S CORRELATION ANALYSIS

2. Click the variables EI, JA, POS and EB and move them into the Variables box.
3. Make sure Pearson is selected in the Correlation Coefficients box.
4. Make sure One-tailed is selected in the Test of Significance box.
5. Make sure the Flag significant correlations box is ticked.
6. Click Continue > OK.

111
PEARSON’S CORRELATION ANALYSIS

Example interpretation:
The table shows that the correlation coefficient, r, between EI and EB is 0.425. The positive r indicates a direct relationship between EI and EB: as the level of EI increases, the level of EB increases. The value r = 0.425 indicates that the correlation between EI and EB is a moderate (medium) positive relationship. The double asterisk (**) indicates significance at the 0.01 level (i.e. 99% confidence level), while a single asterisk (*) indicates significance at the 0.05 level (i.e. 95% confidence level).

112
LINEAR REGRESSION

Simple Linear Regression
Simple linear regression is used when we would like to see the impact of a single independent variable on a dependent variable.

Multiple Linear Regression
Multiple linear regression is used when we would like to see the impact of more than one independent variable on a dependent variable.

113
MULTIPLE LINEAR REGRESSION
ANALYSIS
1. Click Analyze > Regression > Linear

114
MULTIPLE LINEAR REGRESSION (cont’)
2. Click EB (dependent variable). Move it into the Dependent box.
3. Click EI1, JA and POS (independent variables). Move them into the Independent(s) box.
4. Click Statistics. Tick Durbin-Watson and Collinearity diagnostics.
5. Click Continue > OK.

115
MULTIPLE LINEAR REGRESSION (cont’)

Independent errors means that for any two observations the residual terms should be uncorrelated (or independent). This is sometimes described as a lack of autocorrelation. The Durbin–Watson test is used to test for serial correlation between the errors.
Durbin-Watson Statistic, D
• 0 ≤ D ≤ 4
• D ≈ 2: no autocorrelation
• D < 1.5: first-order positive autocorrelation
• D > 2.5: first-order negative autocorrelation
Sources:
1. Discovering Statistics using SPSS (Field, 2009)
2. Testing for serial correlation in least squares regression (Durbin & Watson, 1951)
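
The statistic itself is simple to compute from the regression residuals: D is the sum of squared differences between successive residuals divided by the sum of squared residuals. The Python sketch below uses invented residual series, and `durbin_watson` is a hypothetical helper name.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


alternating = [1, -1, 1, -1]   # residuals that flip sign: negative autocorrelation
constant = [1, 1, 1, 1]        # residuals that repeat: positive autocorrelation

print(durbin_watson(alternating))
print(durbin_watson(constant))
```

Residuals that keep flipping sign push D towards 4, while residuals that repeat pull it towards 0, matching the ranges listed above.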

116
MULTIPLE LINEAR REGRESSION (cont’)

The value in the Sig. column indicates whether the regression equation is significant.

117
MULTIPLE LINEAR REGRESSION (cont’)

Multicollinearity exists when there is a strong correlation between two or more predictors in a regression model.

Identifying multicollinearity:
a) Scan a correlation matrix of all of the predictor variables and see if any correlate very highly (i.e. r > 0.80 or 0.90) (Field, 2009).
b) Variance Inflation Factor (VIF). The VIF indicates whether a predictor has a strong linear relationship with the other predictor(s). When the average VIF is substantially greater than 1, multicollinearity may be biasing the regression model (Bowerman & O’Connell, 1990). A VIF of 10 or more is cause for concern (Myers, 1990; Pallant, 2010).
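
The VIF for a predictor is 1/(1 − R²), where R² comes from regressing that predictor on the other predictors. The Python sketch below applies this definition; the function names and the R² values are hypothetical, and with only two predictors R² is simply their squared correlation.

```python
def vif_from_r_squared(r_squared):
    """VIF = 1 / (1 - R^2), with R^2 from regressing one predictor on the others."""
    return 1.0 / (1.0 - r_squared)


def tolerance_from_r_squared(r_squared):
    """Tolerance is the reciprocal of VIF."""
    return 1.0 - r_squared


# Hypothetical case: two predictors correlated at r = 0.9, so R^2 = 0.81
vif = vif_from_r_squared(0.9 ** 2)
print(round(vif, 2))
```

A predictor that is unrelated to the others has R² = 0 and therefore a VIF of exactly 1, the smallest possible value.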

118
MULTIPLE LINEAR REGRESSION (cont’)

• The Adjusted R² value indicates the loss of predictive power, or shrinkage. Whereas R² tells us how much of the variance in the DV is accounted for by the regression model from our sample, the adjusted value tells us how much variance in the DV would be accounted for if the model had been derived from the population from which the sample was taken (Field, 2009).
• When a small sample is involved, the R² value in the sample tends to be a rather optimistic overestimation of the true value in the population (Tabachnick & Fidell, 2007). The Adjusted R² ‘corrects’ this value to provide a better estimate of the true population value.
• Studies with a small sample size are advised to report the Adjusted R² rather than the normal R² value (Pallant, 2010).
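
The shrinkage can be illustrated with the common adjustment formula, Adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of predictors. The Python sketch below uses invented values, and `adjusted_r_squared` is a hypothetical helper name.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n cases."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical model: R^2 = 0.50 with 3 predictors
small_sample = adjusted_r_squared(0.50, n=21, k=3)
large_sample = adjusted_r_squared(0.50, n=201, k=3)

print(round(small_sample, 3))
print(round(large_sample, 3))
```

The same R² shrinks much more in the small sample than in the large one, which is why the adjusted value matters most for small studies.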
119
MULTIPLE LINEAR REGRESSION (cont’)

According to Hair Jr, Hult, Ringle and Sarstedt (2017):

R² value    Level
0.75        Substantial
0.50        Moderate
0.25        Weak

120
MULTIPLE LINEAR REGRESSION (cont’)

According to Hair Jr, Hult, Ringle and Sarstedt (2017):

p-value    t-value (one-tailed)    t-value (two-tailed)
0.01       2.33                    2.57
0.05       1.65                    1.96
0.10       1.28                    1.65

121
UNTIL NEXT SESSION…

THINK LEAN, DO LEAN & CONTINUOUSLY PURSUE PERFECTION

122
