Lesson 4: Analyze
Objectives
After completing
this lesson, you will
be able to:
Analyze
Topic 1: Patterns of Variation
Classes of Distributions
The data obtained in the Measure phase can follow a variety of distributions, depending on the data
type and its source.
The methods used to describe the parameters for classes of distribution are:
It is based on assumed
model of distribution.
Inferential Statistics
Statistics
Probability
Types of Distributions
The two types of distribution are as follows:
Discrete Distribution
Binomial distribution
Poisson distribution
Continuous Distribution
Normal distribution
Chi-square distribution
t-distribution
F-distribution
It is important to be familiar with discrete distributions while dealing with discrete data.
Poisson distribution.
These distributions help in predicting the sample behavior that has been observed in a population.
Binomial Distribution
Binomial distribution is a probability distribution for the discrete data.
Characteristics of Binomial Distribution
$P(r) = {}^{n}C_{r}\, p^{r} (1 - p)^{n - r}$
where P(r) = probability of exactly r successes out of a sample size of n
p = probability of success; r = number of successes desired; n = sample size
Copyright 2014, Simplilearn, All rights reserved.
Term: Formula
Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{np(1 - p)}$
where n = sample size and p = probability of success
$5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$
$4! = 4 \times 3 \times 2 \times 1 = 24$
Q
A
Using the binomial distribution formula, find the probability of getting 5 heads in 8 coin tosses.
$P(5) = {}^{8}C_{5}\,(0.5)^{5}(0.5)^{3} = 56 \times 0.00390625 = 0.21875$, or about 21.9%
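The coin-toss calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is chosen here for illustration and is not part of the course material.

```python
from math import comb, sqrt


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Coin-toss example from the text: 5 heads in 8 fair tosses.
n, p = 8, 0.5
prob = binomial_pmf(5, n, p)

# Mean and standard deviation of the same binomial distribution.
mean = n * p
std_dev = sqrt(n * p * (1 - p))

print(f"P(5 heads in 8 tosses) = {prob:.5f}")
print(f"mean = {mean}, standard deviation = {std_dev:.4f}")
```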
Poisson Distribution
Poisson distribution is an application of the population knowledge to predict the sample behavior.
Characteristics of Poisson Distribution
Used to analyze situations wherein the number of trials is large
Poisson Distribution: Formula
The formula for the Poisson distribution is as follows:
$P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$
where P(x) = probability of exactly x occurrences in a Poisson distribution
λ = mean number of occurrences during the interval
x = number of occurrences desired
e = base of the natural logarithm (approximately 2.71828)
Q
A
The past records of a road junction which is accident-prone show that the mean number of accidents every
week is 5 at this junction. Assume that the number of accidents follows a Poisson distribution and calculate
the probability of any number of accidents happening in a week.
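For example, the probabilities of exactly zero accidents and exactly one accident in a week are:
$P(0) = \frac{5^{0} e^{-5}}{0!} \approx 0.0067$
$P(1) = \frac{5^{1} e^{-5}}{1!} \approx 0.0337$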
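The same Poisson probabilities can be computed for any number of accidents. The sketch below uses only the standard library; `poisson_pmf` is an illustrative name, and the mean of 5 accidents per week comes from the example above.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the mean count is lam."""
    return lam**x * exp(-lam) / factorial(x)


mean_accidents = 5.0  # mean number of accidents per week, from the example

p0 = poisson_pmf(0, mean_accidents)
p1 = poisson_pmf(1, mean_accidents)

# Print the probability of 0 to 5 accidents in a week.
for x in range(6):
    print(f"P({x}) = {poisson_pmf(x, mean_accidents):.4f}")
```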
A variable is said to be continuous if the range of possible values falls along a continuum.
Example: Loudness of cheering at a ball game, weight of cookies in a package, length of a pen,
or the time required to assemble a car.
Normal Distribution
The Normal or Gaussian distribution is a continuous
probability distribution, denoted N(μ, σ).
$Z = \frac{Y - \mu}{\sigma}$
Q
A
Suppose the time taken to resolve customer problems follows a normal distribution with the mean value of
250 hours and standard deviation value of 23 hrs. What is the probability that a problem resolution will take
more than 300 hrs?
Given:
Y = 300
μ = 250
σ = 23
Using the formula: $Z = \frac{Y - \mu}{\sigma} = \frac{300 - 250}{23} \approx 2.17$
From a normal distribution table, a Z value of 2.17 corresponds to a cumulative area of 0.98499.
Thus, the probability that a problem can be resolved in less than 300 hrs is 98.5%.
The chance of a problem resolution taking more than 300 hours is 1.5%.
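The table lookup can be cross-checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; it works with the unrounded Z score, so the result differs slightly from the table value for Z = 2.17.

```python
from statistics import NormalDist

# Resolution time modeled as normal with mean 250 hrs and SD 23 hrs (from the example).
resolution_time = NormalDist(mu=250, sigma=23)

z = (300 - 250) / 23                    # Z score for Y = 300
p_less = resolution_time.cdf(300)       # P(resolution takes less than 300 hrs)
p_more = 1 - p_less                     # P(resolution takes more than 300 hrs)

print(f"Z = {z:.2f}")
print(f"P(Y < 300) = {p_less:.4f}")
print(f"P(Y > 300) = {p_more:.4f}")
```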
Z-Table Usage
The total area under the curve is 1. For an actual value, one can compute the Z score and look up
the corresponding probability in the Z-table.
Z-Table
This Z-table gives the cumulative probability that Z is less than or equal to a given positive Z-score.
This is the most commonly used normal distribution Z-table, with the positive Z-scores.

Z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
Using Z-Table: Example
Q
A
The table is not needed to find the answer when the question is the probability that Z takes a value
less than (or equal to) zero.
First, the area under the curve is 1, and second, the curve is symmetrical about Z = 0.
Hence, there is a 0.5 (or 50%) chance that Z is above 0 and a 0.5 (or 50%) chance that Z is below 0.
Chi-Square Distribution
The Chi-square distribution (chi-squared or χ² distribution) with k degrees of freedom is the distribution
of the sum of the squares of k independent standard normal random variables.
Characteristics of χ² Distribution
Most widely used probability
distribution in inferential statistics
Chi-Square Distribution: Formula
The formula for the Chi-square distribution is as follows:
$\chi^{2}_{calculated} = \sum \frac{(f_o - f_e)^{2}}{f_e}$
where f_o = observed frequency and f_e = expected frequency
Chi-square distribution will be covered in detail in the later part of this lesson.
t-Distribution
A t-distribution is most
appropriate to be used when:
population standard
deviation is not known; and
population is approximately
normal.
F-Distribution
The F-distribution is a ratio of two Chi-square distributions, and a specific F-distribution is denoted by
the degrees of freedom for the numerator Chi-square and the degrees of freedom for the
denominator Chi-square.
$F_{calculated} = \frac{S_1^{2}}{S_2^{2}}$
If S1 > S2, the larger variance goes in the numerator (df1 = n1 − 1 and df2 = n2 − 1).
Refer to the F-table to find the critical F value at α and the degrees of freedom of the samples of the
two different processes (df1 and df2).
Analyze
Topic 2: Exploratory Data Analysis
Multi-Vari Studies
Multi-Vari studies analyze variation, investigate process stability, identify investigation areas, and
break down the variation.
Cyclical
Temporal
Decide
Sample Size
Example: Sample
size is five pieces
from each
equipment and
the frequency of
data collection is
every two hours.
Create a
Tabulation
Sheet
Example: The
tabulation sheet
with data records
contains the
columns with
time, equipment
number, and
thickness as
headers.
Link the
Observed
Values
Example: The
observed values
are linked by
appropriate lines.
The menu path in Minitab is:
Minitab > Stat > Quality
Tools > Multi-Vari Chart
0
No correlation between
the two variables
+1
Movement in both
variables is same
Correlation Levels
Correlation measures the linear
association between the output
Regression
The degree of movement of variable Y as X changes is calculated using regression.
If a high percentage of the variability in Y (r² > 70%) is explained by changes in X, the regression model can be used.
Simple Linear Regression is used for one X, and Multiple Linear Regression for more than one X.
Vital X
It is important to discover whether a statistically
significant relationship exists between Y and a
particular X by looking at p-values. Based on
regression, one can infer the vital X and eliminate the
rest.
It is important to understand if there is statistical relevance between Y and X using the metrics from
Regression Analysis. The Simple Linear Regression should be used as a Statistical Validation tool.
Minitab fits the line which has the least Sum of Squares of Error.
In a linear relationship, the points would lie on the line. Typically, the data lies off the line.
The distance from the point to line is the error distance used in the SSE calculations.
SLR: Example
A farmer wishes to predict the relationship between the amount spent on fertilizers and the annual
sales of his crops. He collects the following data of last few years and determines his expected
2009
20
2010
25
2011
34
2012
2013
11
40
2014
31
The r² value (Coefficient of Determination) conveys whether the model is good and can be used. The r²
value is 0.3797.
Refer to the Cause and Effect Matrix and study the relationship between Y and a different X variable.
The residuals between the actual value and the predicted value give an indication of how good the
model is.
If the errors are small and predictions use Xs that are within the range of the collected data, the
predictions should be fine.
SST = SSR + SSE
$r^{2} = SSR / SST$
Prioritization of Xs can be done through the SLR equation; run separate regressions on Y with each X.
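The relationship SST = SSR + SSE and the r² calculation can be illustrated with a least-squares fit written from scratch. This is a minimal sketch: the function name and the five data points are hypothetical and are not the farmer's data from the example.

```python
from statistics import mean


def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by least squares and return the sums of squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    predictions = [intercept + slope * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                       # total variation
    ssr = sum((y_hat - y_bar) ** 2 for y_hat in predictions)      # explained by the line
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predictions))  # residual error
    return slope, intercept, sst, ssr, sse


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, sst, ssr, sse = simple_linear_regression(xs, ys)
r_squared = ssr / sst

print(f"y = {intercept:.3f} + {slope:.3f} x")
print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}, r^2 = {r_squared:.4f}")
```

Running a fit like this once per candidate X gives one r² per X, which is one way to compare them.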
There is a positive
correlation between the
number of sneezes and the
deaths in the city. It cannot
be assumed that sneezing is
the cause of death though
the correlation is very strong.
Analyze
Topic 3: Hypothesis Testing
are economically significant when logical reasons are examined before implementation.
Alternate Hypothesis
Represented as H0
Represented as Ha
Challenges the null hypothesis
Example: Movie is not good
Type II Error
Probability of making one type of error can be reduced, leading to increasing the probability of
If a true null hypothesis is erroneously rejected (Type I error), a false null hypothesis may be
accepted (Type II error).
α is set at 0.05, which means the risk of committing a Type I error is 1 out of 20 experiments.
It is important to decide which type of error should be less likely and set α and β accordingly.
Power of Test
The power of a test:
is the probability of correctly rejecting the null hypothesis when it is false (1 − β).
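The sample size for continuous data can be determined by the formula:
$n = \left( \frac{Z_{1 - \alpha/2}\, \sigma}{\Delta} \right)^{2}$
where σ = population standard deviation and Δ = tolerance; for α = 0.05, $1 - \alpha/2 = 0.975$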
Q
A
The population standard deviation for the time, to resolve customer problems, is 30 hours. What should
be the size of a sample that can estimate the average problem resolution time within 5 hours tolerance
with 99% confidence?
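A sketch of this calculation, assuming the usual sample-size formula n = (Z σ / Δ)² with a two-sided Z value. The function name is illustrative, and the Z value comes from `statistics.NormalDist` rather than a printed table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma: float, tolerance: float, confidence: float) -> int:
    """Smallest n that estimates a mean within +/- tolerance at the given confidence."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided Z value
    return ceil((z * sigma / tolerance) ** 2)


# Values from the question: sigma = 30 hours, tolerance = 5 hours, 99% confidence.
n = sample_size_for_mean(sigma=30, tolerance=5, confidence=0.99)
print(f"Required sample size: {n}")
```

The result is rounded up because a fractional observation cannot be collected.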
$n = \left( \frac{1.96}{\Delta} \right)^{2} p(1 - p)$
where p = population proportion and Δ = tolerance
Q
A
The non-defective population proportion for pen manufacturing is 80%. What should be the sample size to
draw a sample that can estimate the proportion of compliant pens within 5% with an alpha of 5%?
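The proportion case can be sketched the same way, assuming the formula n = (Z / Δ)² p(1 − p) with Z taken at 1 − α/2. The function name is illustrative only.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p: float, tolerance: float, alpha: float) -> int:
    """Smallest n that estimates a proportion within +/- tolerance at significance alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return ceil((z / tolerance) ** 2 * p * (1 - p))


# Values from the question: p = 0.80, tolerance = 5%, alpha = 5%.
n_pens = sample_size_for_proportion(p=0.80, tolerance=0.05, alpha=0.05)
print(f"Required sample size: {n_pens}")
```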
[Figure: test-selection chart. Labels: discrete data, continuous data; mean, variance; comparison of two, comparison of many; σ known (Z-test), σ unknown (t-test); F-test; χ²-test.]
t-test (σ unknown)
The population SD is unknown; however, it is estimated
from the sample SD: s = 5.0
Compute $t = \frac{\bar{X} - \mu_0}{\sqrt{s^{2}/n}} = \frac{165 - 164.5}{\sqrt{5^{2}/25}} = 0.5$
Reject H0 at the α level of significance if $|t| > t_{n-1,\,\alpha/2}$
Since $t_{24,\,0.025} = 2.064$ and 0.5 < 2.064, the null hypothesis is not rejected at the
5% level of significance. Thus, based on the sample collected, there is no evidence that the
average height of North American males differs from 164.5 cm.
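The one-sample t statistic above can be reproduced as follows. This is a minimal sketch: the critical value 2.064 is the one quoted in the text, not computed, and the function name is illustrative.

```python
from math import sqrt


def one_sample_t(sample_mean: float, mu0: float, s: float, n: int) -> float:
    """t statistic for testing a sample mean against a hypothesized mean mu0."""
    return (sample_mean - mu0) / (s / sqrt(n))


t_stat = one_sample_t(sample_mean=165, mu0=164.5, s=5.0, n=25)
t_critical = 2.064  # critical value for 24 degrees of freedom, as quoted in the text

reject_h0 = abs(t_stat) > t_critical
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```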
H0: Proportion of wins in Australia or abroad is independent of the country played against
Ha: Proportion of wins in Australia or abroad is dependent on the country played against
χ² Critical = 6.251 and
χ² Calculated = 1.36
Result: Since the calculated value is less than the critical value, the proportion of wins of the Australian hockey
team is independent of the country played against or the place.
understand whether the two samples belong to the same population or a different population;
and
Two samples of sizes n1 = 125 and n2 = 110 are taken from the two populations.
$\bar{X}_1 = 167.3$, $\bar{X}_2 = 165.8$, $s_1 = 4.2$, and $s_2 = 5.0$ are the sample means and SDs respectively.
Since $t_{233,\,0.025} \approx 1.96$, the null hypothesis is rejected at the 5% level of significance.
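The slide does not show the t statistic itself. A sketch of one common way to compute it from the summary statistics, using the unpooled standard error, is given below; whether the original slide used a pooled or unpooled standard error is not stated, so this choice is an assumption.

```python
from math import sqrt


def two_sample_t(mean1, mean2, s1, s2, n1, n2):
    """t statistic for the difference of two means, using the unpooled standard error."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error


t_two = two_sample_t(mean1=167.3, mean2=165.8, s1=4.2, s2=5.0, n1=125, n2=110)
t_crit = 1.96  # critical value quoted in the text

print(f"t = {t_two:.3f}, reject H0: {abs(t_two) > t_crit}")
```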
Susan is examining the earnings of two companies. According to her, the earnings of Company A are more
volatile than those of Company B. She has obtained earnings data for the past 31 years for Company
A, and for the past 41 years for Company B. She finds that the sample standard deviation of Company A's
earnings is $4.40 and of Company B's earnings is $3.90. Determine whether the earnings of Company A
have a greater standard deviation than those of Company B at the 5% level of significance.
H0: $\sigma_A^{2} = \sigma_B^{2}$, that is, the variance of Company A's earnings is equal to the variance of Company B's earnings.
Ha: $\sigma_A^{2} \neq \sigma_B^{2}$, that is, the variance of Company A's earnings is different.
$\sigma_A^{2}$ = variance of Company A's earnings.
$\sigma_B^{2}$ = variance of Company B's earnings.
Note: $\sigma_A > \sigma_B$. In calculating the F-test statistic, always put the greater variance in the numerator.
The critical value from the F-table equals 1.74. The null hypothesis is rejected if the F-test statistic is
greater than 1.74.
Results: The F-test statistic (1.273) is not greater than the critical value (1.74). Therefore, at the 5%
significance level, the null hypothesis cannot be rejected.
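The F statistic in this example is simply the ratio of the two sample variances, with the larger one in the numerator. A minimal sketch follows; the critical value 1.74 is the one quoted in the text, and the function name is illustrative.

```python
def f_statistic(s1: float, s2: float) -> float:
    """Ratio of two sample variances, with the larger variance in the numerator."""
    var1, var2 = s1**2, s2**2
    return max(var1, var2) / min(var1, var2)


# Sample standard deviations from the example: Company A = 4.40, Company B = 3.90.
f_stat = f_statistic(4.40, 3.90)
f_critical = 1.74  # critical value from the F-table, as quoted in the text

print(f"F = {f_stat:.3f}, reject H0: {f_stat > f_critical}")
```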
of avocados in ounces.
Group A (Chef 1)
Group B (Chef 2)
4.2
4.5
4.5
7.2
6.1
5.2
8.9
5.3
5.2
6.1
F-Test
The steps for conducting an F-test in MS Excel are:
F-Test Assumptions
Before interpreting the F-test, the assumptions to be considered are as follows:
Null Hypothesis: There is no significant statistical difference between the variances of the two
groups, thus concluding any variation could be because of chance. This is Common Cause of
Variation.
Alternate Hypothesis: There is a significant statistical difference between the variances of the two
groups, thus concluding variations could be because of assignable causes also. This is Special Cause
of Variation.
F-Test Interpretations
The interpretations for the conducted F-test are as
follows:
                 Variable 1     Variable 2
Mean             6.016666667    5.016666667
Variance         3.197666667    0.517666667
Observations     6
df               5
F                6.177076626
2-Sample t-Test
The steps for conducting 2-sample t-test in MS-Excel are given below:
1. Open MS Excel, click Data, and click Data Analysis.
2. Select the 2-sample independent t-test assuming unequal variances.
3. In Variable 1 range, select the data set for Group A.
4. In Variable 2 range, select the data set for Group B.
5. Keep the Hypothesized Mean Difference as 0.
6. Click OK.
Null Hypothesis: There is no significant statistical difference between the means of the two groups,
thus concluding any variation could be because of chance. This is Common Cause of Variation.
Alternate Hypothesis: There is a significant statistical difference between the means of the two
groups, thus concluding variations could be because of assignable causes also. This is Special Cause
of Variation.
H0 : Mean of Group A = Mean of Group B
Ha : Mean of Group A ≠ Mean of Group B
The alternate hypothesis tests two conditions, Mean of A < Mean of B and Mean of A > Mean of B. Thus a
two-tailed probability needs to be used.
2-Tailed Probability
1-Tailed Probability
                               Variable 1     Variable 2
Mean                           6.016666667    5.016666667
Variance                       3.197666667    0.517666667
Observations
Hypothesized Mean Difference
df
t Stat                         1.270798616
P(T<=t) one-tail               0.122200546
t Critical one-tail            1.894578605
0.05.
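The t Stat in the Excel output can be reproduced from the means and variances alone. The sketch below assumes 6 observations per group; that count is inferred from the df of 5 in the F-test output and is not shown in the t-test table itself.

```python
from math import sqrt

# Summary statistics from the Excel output.
mean_a, mean_b = 6.016666667, 5.016666667
var_a, var_b = 3.197666667, 0.517666667
n_a = n_b = 6  # assumption: inferred from df = 5 in the F-test output

# Unequal-variance (Welch) t statistic with a hypothesized mean difference of 0.
t_welch = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b)
t_critical_one_tail = 1.894578605  # from the Excel output

print(f"t Stat = {t_welch:.6f}")
print(f"exceeds one-tail critical value: {t_welch > t_critical_one_tail}")
```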
Paired t-Test
The paired t-test is:
For example, a group of students score X in CSSGB before taking the Training program. Post the training
program, the scores are taken again.
One needs to find out if there is a statistical difference between the two sets of scores.
If there is a significant difference, the inference could be that the training was effective.
Sample Variance
Sample Variance (S²) is the average of the squared differences from the mean. Strictly, for a sample
the sum of squared differences is divided by n − 1 rather than n.
In statistics, its value is used by converting it into the standard deviation and combining that with the
mean.
Subtract each of
the value from
mean
Calculate the
square value of
the result
Take average of
the squared
differences
Sample Variance: Example
The example to calculate sample variance is as follows:
Consider the sample of weights. Suppose the mean value is 140 and when you subtract each value
from the mean, take the square value of the result, and then take the average of the squared
difference, the resulting sample variance value is 1936.
In order to get the standard deviation, take the square root of the sample variance: √1936 = 44.
The standard deviation, along with the mean, tells you how much the majority of the people
weigh.
If the mean value is 140 and the standard deviation is 44, the majority of people weigh between 96 pounds
(140 − 44) and 184 pounds (140 + 44).
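The three steps (subtract the mean, square, average) can be sketched in Python. The weights below are hypothetical and were chosen only so that the mean is 140; they do not reproduce the variance of 1936 from the example. The sketch shows both the divide-by-n and the divide-by-(n − 1) versions.

```python
from math import sqrt
from statistics import mean, pvariance, variance

# Hypothetical weights in pounds, for illustration only.
weights = [96.0, 120.0, 140.0, 160.0, 184.0]

avg = mean(weights)
squared_diffs = [(w - avg) ** 2 for w in weights]   # subtract the mean, then square

var_n = sum(squared_diffs) / len(weights)            # divide by n
var_n_minus_1 = sum(squared_diffs) / (len(weights) - 1)  # divide by n - 1 (sample variance)
std_dev = sqrt(var_n_minus_1)

print(f"mean = {avg}, variance (n) = {var_n}, variance (n - 1) = {var_n_minus_1}")
print(f"standard deviation = {std_dev:.2f}")
```

The standard library functions `statistics.pvariance` and `statistics.variance` give the same two values.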
ANOVA Example
The table shows the takeaway food delivery time of
Outlet 1
Outlet 2
Outlet 3
48
50
49
49
48
48
48
36
39
53
50
49
58
50
34
50
62
33
46
45
57
50
47
48
49
51
47
47
44
39
delivery time.
columns.
2. In the main menu, choose
Stat > ANOVA > One-Way.
3. Select delivery time as the response and outlet as the factor.
4. Click OK.
There is no significant difference between the means of delivery time for three outlets.
In one-way ANOVA, one factor has to be benchmarked unlike the two-way ANOVA.
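The one-way ANOVA F statistic can also be computed by hand. The sketch below assumes the 30 delivery times in the table are listed row by row (Outlet 1, Outlet 2, Outlet 3), giving 10 values per outlet; that reading of the flattened table is an assumption. The critical value of about 3.35 for (2, 27) degrees of freedom at α = 0.05 is taken from a standard F-table.

```python
from statistics import mean

# Delivery times, read row by row as (Outlet 1, Outlet 2, Outlet 3). Layout is assumed.
rows = [
    (48, 50, 49), (49, 48, 48), (48, 36, 39), (53, 50, 49), (58, 50, 34),
    (50, 62, 33), (46, 45, 57), (50, 47, 48), (49, 51, 47), (47, 44, 39),
]
outlets = [list(column) for column in zip(*rows)]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), k - 1, n - k


f_anova, df_between, df_within = one_way_anova_f(outlets)
f_table = 3.35  # approximate F-table value for (2, 27) df at alpha = 0.05

print(f"F = {f_anova:.3f} with ({df_between}, {df_within}) df")
print(f"significant difference between outlets: {f_anova > f_table}")
```

Under this reading of the data, F is below the critical value, which agrees with the conclusion stated above.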
Chi-Square Distribution
The Chi-square distribution (χ² distribution, or chi-squared)
with k degrees of freedom is the distribution of a sum of the squares of k independent standard
normal random variables.
$\chi^{2}_{Calculated} = \sum \frac{(f_o - f_e)^{2}}{f_e}$
Where,
χ² Calculated = chi-square index
f_o = an observed frequency
f_e = an expected frequency
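The chi-square index for a contingency table can be sketched as below, with each expected frequency taken as (row total × column total) / overall total. The 2 × 4 table of wins is hypothetical, since the observed frequencies for the hockey example are not reproduced in this text; the function name is also illustrative.

```python
def chi_square_statistic(observed):
    """Return (chi-square, degrees of freedom) for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / overall   # expected frequency
            chi2 += (f_o - f_e) ** 2 / f_e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Hypothetical wins: rows = played at home / abroad, columns = four opponents.
wins = [
    [3, 5, 4, 6],
    [2, 4, 5, 3],
]
chi2_wins, df_wins = chi_square_statistic(wins)
print(f"chi-square = {chi2_wins:.3f} with {df_wins} degrees of freedom")
```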
Chi-Square Test: Example
To analyze the Australian hockey team's wins,
the data has two classifications:
Estimated Population
Parameters
Sample Statistics
total)/overall total.
Example: Observed frequency of 3 wins
against South Africa in Australia would convert
population parameters;
contingency table.
Degrees of freedom = (2 - 1)*(4-1) = 3
Assuming α = 10%, χ² Critical = 6.251
χ² Calculated = 1.36
χ² Critical divides the region into acceptance and rejection zones, while χ² Calculated allows
accepting or rejecting the null hypothesis depending on which zone it falls in.
Analyze
Topic 4: Hypothesis Testing with Non-Normal Data
Mann-Whitney Test
Mann-Whitney or Wilcoxon Rank Sum test is a non-parametric test used to compare two unpaired
groups. In this test:
The rejection and acceptance conditions remain the same for different cases:
If p < α, reject the null hypothesis.
If p > α, do not reject the null hypothesis.
The aim of this test is to rank the entire data available for each condition and then compare the total
outcome of the two ranks.
Mann-Whitney Test
The steps to perform Mann-Whitney test are as follows:
Find the average of the
ranks for all the identical
values
Mann-Whitney Test: Example
An example of performing Mann-Whitney test is shown here.
Data
G1: 14, 2, 5, 16, 9
G2: 4, 2, 18, 14, 8

Sorted Data   Group   Rank   Final Rank
2             G1      1      1.5
2             G2      2      1.5
4             G2      3      3
5             G1      4      4
8             G2      5      5
9             G1      6      6
14            G1      7      7.5
14            G2      8      7.5
16            G1      9      9
18            G2      10     10

Tied values share the average of their ranks: ranks 1 and 2 average to 1.5, and ranks 7 and 8 average to 7.5.

G1 Rank (R1): 1.5, 4, 6, 7.5, 9; Total = 28; n1 = 5
G2 Rank (R2): 1.5, 3, 5, 7.5, 10; Total = 27; n2 = 5
$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$
$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$
In this example,
U1 = 12 and U2 = 13
To be statistically significant, the obtained U value must be equal to or less than the critical value, which is 2 here.
Since the calculated U value is 12 (not less than 2), there is no statistical difference between
the two groups.
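The ranking and the U values in this example can be reproduced with a short script. This is a minimal sketch using the data from the example; the helper name `average_ranks` is illustrative, and the critical value of 2 is the one quoted in the text.

```python
def average_ranks(values):
    """Map each value to its rank in the sorted data, averaging the ranks of ties."""
    ordered = sorted(values)
    rank_of = {}
    for value in set(ordered):
        positions = [i + 1 for i, x in enumerate(ordered) if x == value]
        rank_of[value] = sum(positions) / len(positions)
    return rank_of


g1 = [14, 2, 5, 16, 9]
g2 = [4, 2, 18, 14, 8]

rank_of = average_ranks(g1 + g2)
r1 = sum(rank_of[v] for v in g1)   # rank total for G1
r2 = sum(rank_of[v] for v in g2)   # rank total for G2
n1, n2 = len(g1), len(g2)

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u = min(u1, u2)
u_critical = 2  # critical value quoted in the text

print(f"R1 = {r1}, R2 = {r2}, U1 = {u1}, U2 = {u2}")
print(f"significant difference: {u <= u_critical}")
```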
Kruskal-Wallis Test
The Kruskal-Wallis test is also a non-parametric test used for testing the source of origin of the
samples.
Characteristics of Kruskal-Wallis test are as follows:
Medians of two or more samples are compared to find the source of origin of the sample.
Unlike the analogous one-way analysis of variance, it does not assume the normal distribution of
the residuals.
Null hypothesis is when medians of all the groups are equal, and
Alternative hypothesis is when at least one population median of one group is different than that of at
least one other group.
Form a
contingency
table
Find expected
value for each
cell
Find chi-square
value
Friedman Test
Friedman test is a form of non-parametric test that does not make any assumptions on the shape and
origin of the sample.
Unlike ANOVA, it does not require the dataset to be randomly sampled from normally distributed
populations with equal variances.
The test uses null hypothesis where the population medians of each treatment are statistically identical to
the rest of the group.
Here, H0 states the hypothesized (assumed) median of the population from which the sample is
drawn.
approximately equal.
The conclusion in this test is that if the value is on the mid-point, you can continue and accept the null
Conclusion:
α = 0.05
Quiz
QUIZ 1
Which of the following describes the population parameters based on the sample data
using a particular model?
a. Statistics
b. Inferential Statistics
c. Probability
d. Correlation
Answer: b.
Explanation: Inferential statistics describe the population parameters based on the sample
data using a particular model.
QUIZ 2
a. Poisson distribution
b. Normal distribution
c. Chi-square distribution
d. Probability distribution
Answer: a.
Explanation: Poisson distribution is an application of the population knowledge to predict
the sample behavior.
QUIZ 3
a. Correlation
b. Probability
c. F-distribution
d. Regression
Answer: d.
Explanation: The degree of movement of variable Y as X changes is calculated using
regression.
QUIZ 4
A null hypothesis states that a process has not improved as a result of some
modifications. The type II error is to conclude that:
a.
d.
Answer: b.
Explanation: A type II error means that we have failed to reject the null hypothesis (H0)
when it is false.
QUIZ 5
The test used for testing significance in an analysis of variance table is the:
a. Z-test.
b. t-test.
c. F-test.
d. Chi-square test.
Answer: c.
Explanation: The appropriate ANOVA test is the F-test. ANOVA is a test of the equality of
means.
QUIZ 6
Which of the following is the only way to analyze the variance by ranks?
a.
Friedman test
d.
Kruskal-Wallis test
Answer: d.
Explanation: The Kruskal-Wallis test is the only way to analyze the variance by ranks.
QUIZ 7
What distribution is used while making inferences about a population variance based
on a single sample from that population?
a. Chi-square distribution
b. Normal distribution
c. t-distribution
d. F-distribution
Answer: a.
Explanation: The chi-square distribution is used to compare a sample variance with a known
population variance.
QUIZ 8
If the p-value is less than the significance level, the null hypothesis has to be:
a. rejected.
b. accepted.
c. maintained as it is.
d. re-evaluated.
Answer: a.
Explanation: If the p-value is less than the significance level, the null hypothesis has to be
rejected, as the data do not support the null hypothesis and the difference is
statistically significant.
QUIZ 9
Which of the following is a nonparametric test that is used to test the equality of
medians from two or more different populations?
a. Mood's median test
b. Kruskal-Wallis test
c. Friedman test
d.
Answer: a.
Explanation: Mood's median test is a nonparametric test that is used to test the equality of
medians from two or more different populations.
QUIZ 10
a. F-distribution
b. t-distribution
c. Poisson distribution
d. Binomial distribution
Answer: a.
Explanation: The F-distribution is a ratio of two chi-square distributions.
QUIZ 11
Which of the following is the probability of correctly rejecting the null hypothesis when
it is false?
a.
b. Power of a test
c.
d.
Answer: b.
Explanation: The power of a test is the probability of correctly rejecting the null hypothesis
when it is false.
QUIZ 12
Which of the following assumes that the existing sample is randomly taken from a
population, with a symmetric frequency distribution around the median?
a.
Kruskal-Wallis test
d.
Friedman test
Answer: c.
Explanation: 1 Sample Wilcoxon test assumes that the existing sample is randomly taken
from a population, with a symmetric frequency distribution around the median.
Summary
Here is a quick
recap of what we
have learned in this
lesson:
A t-test can be run as a 1-sample or a 2-sample test; the 2-sample test is used for comparing two
means.
The Kruskal-Wallis test is used for testing the source of origin of samples.
The Mood's median test is used to test the equality of medians from two
or more different populations.
The Friedman test does not make any assumptions on the shape and
origin of the sample.
The 1 Sample Sign test is the simplest of all the non-parametric tests that