
UKP6053 : DATA ANALYSIS AND INTERPRETATION

INDIVIDUAL ASSIGNMENT 2

TITLE

OUTLIERS & NORMALITY

PREPARED BY

NAME MATRIC NO.


JEYANTHI A/P RAJAGURU M20151000359

LECTURER : DR. MOHAMMED YOUSEF MAI

DATE : 25 MARCH 2016


1.0 Total Life Satisfaction
1.1 Checking for Outliers - Tlifesat

Case Processing Summary

                                         Cases
                          Valid            Missing          Total
                          N     Percent    N     Percent    N     Percent
total life satisfaction   436   99.3%      3     0.7%       439   100.0%

Descriptives: total life satisfaction

                                                 Statistic   Std. Error
Mean                                             22.38       .324
95% Confidence Interval for Mean   Lower Bound   21.74
                                   Upper Bound   23.02
5% Trimmed Mean                                  22.52
Median                                           23.00
Variance                                         45.827
Std. Deviation                                   6.770
Minimum                                          5
Maximum                                          35
Range                                            30
Interquartile Range                              9
Skewness                                         -.323       .117
Kurtosis                                         -.450       .233

Extreme Values: total life satisfaction

               Case Number   Value
Highest   1    65            35
          2    183           35
          3    195           35
          4    212           35
          5    227           35 (a)
Lowest    1    382           5
          2    344           5
          3    61            5
          4    7             5
          5    2             5

a. Only a partial list of cases with the value 35 are shown in the table of upper extremes.
Tests of Normality

                          Kolmogorov-Smirnov (a)       Shapiro-Wilk
                          Statistic   df    Sig.       Statistic   df    Sig.
total life satisfaction   .087        436   .000       .982        436   .000

a. Lilliefors Significance Correction

Results
There are no extreme values. The outlier’s score is genuine.

1.2 Normality Test - Tlifesat

Descriptives: total life satisfaction by sex

                                                 MALES                     FEMALES
                                                 Statistic   Std. Error    Statistic   Std. Error
Mean                                             21.67       .480          22.90       .436
95% Confidence Interval for Mean   Lower Bound   20.72                     22.04
                                   Upper Bound   22.62                     23.76
5% Trimmed Mean                                  21.80                     23.06
Median                                           23.00                     23.00
Variance                                         42.570                    47.762
Std. Deviation                                   6.525                     6.911
Minimum                                          5                         5
Maximum                                          35                        35
Range                                            30                        30
Interquartile Range                              9                         10
Skewness                                         -.293       .179          -.374       .154
Kurtosis                                         -.283       .355          -.519       .306

Males
Skewness z-value = -0.293 / 0.179 = -1.64 (between -1.96 and +1.96)
Kurtosis z-value = -0.283 / 0.355 = -0.80 (between -1.96 and +1.96)

Females
Skewness z-value = -0.374 / 0.154 = -2.43 (below -1.96)
Kurtosis z-value = -0.519 / 0.306 = -1.70 (between -1.96 and +1.96)

All the z-values lie between -1.96 and +1.96 except the skewness z-value for females, which is
below -1.96.
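
The z-value check above can be reproduced with a few lines of Python. This is only an illustrative sketch: the statistics and standard errors are copied from the Descriptives table for total life satisfaction, and the names z_value and within_limits are made up for this example.

```python
# Skewness and kurtosis statistics with their standard errors,
# copied from the SPSS Descriptives table for total life satisfaction.
STATS = {
    ("males", "skewness"): (-0.293, 0.179),
    ("males", "kurtosis"): (-0.283, 0.355),
    ("females", "skewness"): (-0.374, 0.154),
    ("females", "kurtosis"): (-0.519, 0.306),
}

CRITICAL_Z = 1.96  # two-tailed cut-off at the 5% level


def z_value(statistic, std_error):
    """Return the statistic divided by its standard error."""
    return statistic / std_error


def within_limits(z, critical=CRITICAL_Z):
    """True when z lies strictly between -critical and +critical."""
    return -critical < z < critical


for (group, measure), (stat, se) in STATS.items():
    z = z_value(stat, se)
    verdict = "within" if within_limits(z) else "outside"
    print(f"{group:8s} {measure:9s} z = {z:6.2f} ({verdict} +/-1.96)")
```

Only the skewness z-value for females falls outside the limits, which matches the hand calculation.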
Tests of Normality

                                    Kolmogorov-Smirnov (a)       Shapiro-Wilk
                          sex       Statistic   df    Sig.       Statistic   df    Sig.
total life satisfaction   MALES     .094        185   .000       .984        185   .035
                          FEMALES   .083        251   .000       .975        251   .000

a. Lilliefors Significance Correction

The null hypothesis for this test of normality is that the data are normally distributed.
The null hypothesis is rejected if the p-value is below 0.05.

Both Shapiro-Wilk p-values (.035 for males and .000 for females) are below 0.05, so we reject
the null hypothesis. On the basis of the Shapiro-Wilk test, we conclude that the data are not
normally distributed.
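
The decision rule can be written out as a short Python sketch. The function name reject_normality is hypothetical, and because SPSS prints ".000" for any significance value that rounds to zero at three decimals, the females' p-value is stored here as 0.0005, which is an assumed upper bound rather than the exact value.

```python
ALPHA = 0.05  # significance level used in the report

# Shapiro-Wilk significance values from the Tests of Normality table.
# ".000" in SPSS output means p < 0.0005; 0.0005 is an assumed stand-in.
P_VALUES = {"males": 0.035, "females": 0.0005}


def reject_normality(p_value, alpha=ALPHA):
    """Reject the null hypothesis of normality when p is below alpha."""
    return p_value < alpha


for group, p in P_VALUES.items():
    decision = "reject" if reject_normality(p) else "fail to reject"
    print(f"{group}: p = {p} -> {decision} the null hypothesis of normality")
```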

Results

Sample Characteristics

A Shapiro-Wilk test (p < 0.05) (Shapiro & Wilk, 1965; Razali & Wah, 2011) and a
visual inspection of the histograms, normal Q-Q plots and box plots showed that the total
life satisfaction scores were not normally distributed for either males or females, with a
skewness of -0.293 (SE = 0.179) and a kurtosis of -0.283 (SE = 0.355) for the males and a
skewness of -0.374 (SE = 0.154) and a kurtosis of -0.519 (SE = 0.306) for the females
(Cramer & Howitt, 2004; Doane & Seward, 2011).

2.0 Total Perceived Stress


2.1 Checking for Outliers - Tpstress

Case Processing Summary

                                         Cases
                          Valid            Missing          Total
                          N     Percent    N     Percent    N     Percent
total perceived stress    433   98.6%      6     1.4%       439   100.0%

Descriptives: total perceived stress

                                                 Statistic   Std. Error
Mean                                             26.73       .281
95% Confidence Interval for Mean   Lower Bound   26.18
                                   Upper Bound   27.28
5% Trimmed Mean                                  26.64
Median                                           26.00
Variance                                         34.194
Std. Deviation                                   5.848
Minimum                                          12
Maximum                                          46
Range                                            34
Interquartile Range                              8
Skewness                                         .245        .117
Kurtosis                                         .182        .234

Extreme Values: total perceived stress

               Case Number   id    Value
Highest   1    7             24    46
          2    262           157   44
          3    216           61    43
          4    190           6     42
          5    257           144   42 (a)
Lowest    1    366           404   12
          2    189           5     12
          3    247           127   13
          4    244           119   13
          5    98            301   13

a. Only a partial list of cases with the value 42 are shown in the table of upper extremes.

Tests of Normality

                          Kolmogorov-Smirnov (a)       Shapiro-Wilk
                          Statistic   df    Sig.       Statistic   df    Sig.
total perceived stress    .069        433   .000       .992        433   .021

a. Lilliefors Significance Correction

Results
There are no extreme values, but there are two outliers: ID numbers 24 and 157.
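
SPSS boxplots mark a score as an outlier when it lies more than 1.5 box lengths (interquartile ranges) beyond the edge of the box, and as an extreme value when it lies more than 3 box lengths beyond it. The sketch below illustrates that rule. The quartiles Q1 = 23 and Q3 = 31 are assumptions chosen only to match the reported interquartile range of 8; the SPSS output shown here does not print them, and the function name classify is hypothetical.

```python
def classify(score, q1, q3):
    """Label a score using boxplot fences at 1.5 and 3 box lengths."""
    iqr = q3 - q1
    if score < q1 - 3 * iqr or score > q3 + 3 * iqr:
        return "extreme"
    if score < q1 - 1.5 * iqr or score > q3 + 1.5 * iqr:
        return "outlier"
    return "ordinary"


# Hypothetical quartiles (not reported in the output); IQR = 31 - 23 = 8.
Q1, Q3 = 23, 31

# Highest and lowest scores from the Extreme Values table.
for score in (46, 44, 43, 12):
    print(score, classify(score, Q1, Q3))
```

Under these assumed quartiles, only the scores 46 and 44 (ID numbers 24 and 157) are classed as outliers and none as extreme.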

2.2 Checking for Outliers - Tpstress with outliers removed

Case Processing Summary

                                         Cases
                          Valid            Missing          Total
                          N     Percent    N     Percent    N     Percent
total perceived stress    431   98.2%      8     1.8%       439   100.0%

Descriptives: total perceived stress

                                                 Statistic   Std. Error
Mean                                             26.64       .276
95% Confidence Interval for Mean   Lower Bound   26.10
                                   Upper Bound   27.18
5% Trimmed Mean                                  26.60
Median                                           26.00
Variance                                         32.788
Std. Deviation                                   5.726
Minimum                                          12
Maximum                                          43
Range                                            31
Interquartile Range                              8
Skewness                                         .154        .118
Kurtosis                                         -.012       .235

Extreme Values: total perceived stress

               Case Number   id    Value
Highest   1    216           61    43
          2    190           6     42
          3    257           144   42
          4    339           330   42
          5    228           85    41 (a)
Lowest    1    366           404   12
          2    189           5     12
          3    247           127   13
          4    244           119   13
          5    98            301   13

a. Only a partial list of cases with the value 41 are shown in the table of upper extremes.

Tests of Normality

                          Kolmogorov-Smirnov (a)       Shapiro-Wilk
                          Statistic   df    Sig.       Statistic   df    Sig.
total perceived stress    .067        431   .000       .993        431   .057

a. Lilliefors Significance Correction

Results
The outlier’s score is genuine.

2.3 Normality Test - Tpstress
Case Processing Summary

                                                   Cases
                                    Valid            Missing          Total
                          sex       N     Percent    N     Percent    N     Percent
total perceived stress    MALES     183   98.9%      2     1.1%       185   100.0%
                          FEMALES   248   97.6%      6     2.4%       254   100.0%

Descriptives: total perceived stress by sex

                                                 MALES                     FEMALES
                                                 Statistic   Std. Error    Statistic   Std. Error
Mean                                             25.68       .386          27.35       .380
95% Confidence Interval for Mean   Lower Bound   24.92                     26.61
                                   Upper Bound   26.44                     28.10
5% Trimmed Mean                                  25.69                     27.30
Median                                           25.00                     27.00
Variance                                         27.220                    35.825
Std. Deviation                                   5.217                     5.985
Minimum                                          13                        12
Maximum                                          39                        43
Range                                            26                        31
Interquartile Range                              8                         7
Skewness                                         .046        .180          .128        .155
Kurtosis                                         -.312       .357          .018        .308


Males
Skewness z-value = 0.046 / 0.180 = 0.26 (between -1.96 and +1.96)
Kurtosis z-value = -0.312 / 0.357 = -0.87 (between -1.96 and +1.96)

Females
Skewness z-value = 0.128 / 0.155 = 0.83 (between -1.96 and +1.96)
Kurtosis z-value = 0.018 / 0.308 = 0.06 (between -1.96 and +1.96)

All the z-values lie between -1.96 and +1.96.
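
The standard errors used as divisors in these z-values depend only on the number of valid cases in each group. As a hedged sketch, the commonly cited large-sample formulas for the standard errors of skewness and kurtosis are coded below; it is an assumption that SPSS uses exactly these formulas, and the function names se_skewness and se_kurtosis are made up for this example.

```python
import math


def se_skewness(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of kurtosis for a sample of size n."""
    return 2 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


# Valid N per group, from the Case Processing Summary for total perceived stress.
GROUP_SIZES = {"males": 183, "females": 248}

for group, n in GROUP_SIZES.items():
    print(f"{group}: SE skewness = {se_skewness(n):.3f}, "
          f"SE kurtosis = {se_kurtosis(n):.3f}")
```

For 183 males and 248 females the formulas give .180 and .357, and .155 and .308, which agree with the standard errors printed in the Descriptives table.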

Tests of Normality

                                    Kolmogorov-Smirnov (a)       Shapiro-Wilk
                          sex       Statistic   df    Sig.       Statistic   df    Sig.
total perceived stress    MALES     .071        183   .026       .990        183   .245
                          FEMALES   .061        248   .025       .992        248   .218

a. Lilliefors Significance Correction

The null hypothesis for this test of normality is that the data are normally distributed.
The null hypothesis is rejected if the p-value is below 0.05.

Both Shapiro-Wilk p-values (.245 for males and .218 for females) are above 0.05, so we fail to
reject the null hypothesis. On the basis of the Shapiro-Wilk test, we can assume that the data
are approximately normally distributed.

Results

Sample Characteristics

A Shapiro-Wilk test (p > 0.05) (Shapiro & Wilk, 1965; Razali & Wah, 2011) and a
visual inspection of the histograms, normal Q-Q plots and box plots showed that the total
perceived stress scores were approximately normally distributed for both males and females,
with a skewness of 0.046 (SE = 0.180) and a kurtosis of -0.312 (SE = 0.357) for the males and
a skewness of 0.128 (SE = 0.155) and a kurtosis of 0.018 (SE = 0.308) for the females
(Cramer & Howitt, 2004; Doane & Seward, 2011).
