Introduction to
Non-Parametric Statistics
Kim Carmela D. Co
Email: kimcarmelaco@up.edu.ph
Learning Objectives
At the end of the session, the participants should be able to:
1. Discuss the process of hypothesis testing and statistical
significance and its required assumptions
2. Describe and differentiate distribution-free tests from strictly
non-parametric tests
3. Discuss the advantages and disadvantages of non-parametric
statistical methods
4. Discuss scenarios where non-parametric methods are useful
5. Discuss criteria in selecting statistical tests
06/08/2019
Branches of Statistics
• Descriptive
• Inferential
Hypothesis Testing
1. Statement of statistical hypotheses
(Statements about the population)
• Null hypothesis is a statement of no effect or no difference
• Alternative hypothesis indicates presence of an effect or
difference, can be directional or nondirectional
Hypothesis Testing
3. Assess the statistical significance
Decision on whether obtained difference is due to presence of
genuine effect or may be attributable to chance is based on the
computation of the p-value:
• Probability of obtaining a result as extreme or more extreme than the
sample result, given that the null hypothesis is true
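As a minimal sketch of this definition, the example below computes an exact one-sided p-value for a hypothetical experiment: 9 successes in 10 trials, with a null hypothesis that the success probability is 0.5. The function name and the numbers are illustrative assumptions, not taken from the lecture.

```python
from math import comb

def binomial_tail_p_value(successes, n, p_null=0.5):
    """One-sided p-value: P(X >= successes) when X ~ Binomial(n, p_null)."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical data: 9 successes out of 10 trials, null value p = 0.5.
# "As extreme or more extreme" here means 9 or 10 successes.
p_one_sided = binomial_tail_p_value(9, 10)
print(p_one_sided)  # 11/1024, about 0.0107
```

A small p-value such as this one says that a result this extreme would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.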
https://www.probabilitycourse.com/chapter8/8_4_2_general_setting_definitions.php
Hypothesis Testing
• Conclusions using these techniques are only valid if assumptions
can be substantiated
• For most parametric tests (z-test, t-test, F-test), the samples should:
• Be randomly drawn from a normally distributed population
• Consist of independent observations, except for paired values
• Consist of values on an interval or ratio measurement scale
• Come from populations with approximately equal variances
• Generally, normality is difficult to assume for asymmetric histograms,
especially for small sample sizes
Alternatives?
If normality cannot be assumed, then the data should at least:
• Approximately resemble a normal distribution
• Have an adequately large sample size, so that the Central Limit
Theorem can be invoked:
“Sampling distribution of the mean becomes approximately normal
regardless of the distribution of the original variable”
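The statement above can be illustrated with a small simulation. This is a sketch under assumed settings (an exponential population, samples of size 30, 2,000 repetitions, a fixed seed), none of which come from the lecture. It compares the skewness of raw observations with the skewness of sample means.

```python
import random
import statistics

def skewness(values):
    """Population skewness: the average cubed z-score."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

random.seed(0)  # fixed seed so the run is reproducible

# Raw draws from a strongly right-skewed population (exponential, mean 1).
raw = [random.expovariate(1.0) for _ in range(2000)]

# Means of 2,000 independent samples, each of size 30, from the same population.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(30)])
    for _ in range(2000)
]

skew_raw = skewness(raw)
skew_means = skewness(sample_means)
print("skewness of raw data:    ", round(skew_raw, 2))
print("skewness of sample means:", round(skew_means, 2))
```

The sample means should be far less skewed than the raw data, which is why tests on means can tolerate non-normal populations when the sample size is adequate.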
Distribution-Free Tests
• Methods based on functions of the sample observations whose
sampling distribution can be determined
without knowledge of the specific distribution of the
underlying population
• No strict assumptions regarding the underlying population
distribution or level of measurement
• Analysis methods can be applied to samples from populations
having distributions which need not belong to a specific
distribution family (particularly the normal one)
• Usually, exact p-values can be computed
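To show how an exact p-value can be obtained without assuming a population distribution, the sketch below enumerates every possible assignment of ranks to one group and counts how many give a rank sum as small as the observed one. The data are hypothetical, the function name is an assumption, and the code assumes there are no tied values.

```python
from itertools import combinations

def exact_rank_sum_p_value(group_a, group_b):
    """One-sided exact p-value: P(rank sum of A <= observed) under H0.

    Under the null hypothesis every subset of ranks is equally likely
    to belong to group A. Assumes no ties among the pooled values.
    """
    pooled = sorted(group_a + group_b)
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank_of[v] for v in group_a)
    all_ranks = range(1, len(pooled) + 1)
    rank_sums = [sum(c) for c in combinations(all_ranks, len(group_a))]
    return sum(s <= observed for s in rank_sums) / len(rank_sums)

# Hypothetical measurements for two small independent groups.
group_a = [1.1, 2.3, 0.7]
group_b = [3.8, 4.1, 2.9, 5.0]
p_exact = exact_rank_sum_p_value(group_a, group_b)
print(p_exact)  # 1/35, about 0.0286
```

Group A holds the three smallest values, so only 1 of the 35 possible rank assignments is as extreme as the observed one. Full enumeration is practical only for small samples.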
Parameters
Numerical characteristics of the population from which sample was
drawn
1. Measures of Location
2. Measures of Variability
3. Measures of Association between Variables
Non-Parametric Statistics
• Include procedures that test hypotheses which are not statements
about population parameters
• Hypothesis is concerned only with the form or shape of the
population distribution (e.g. goodness of fit tests) or with some
other characteristic of the probability distribution of the sample
data (test of randomness or trend)
Non-Parametric Statistics
• In practice, the term is generally used to refer to both
non-parametric and distribution-free tests
Advantages
• Tests of hypotheses which are not statements about parameter
values have no counterpart in parametric statistics
• Require few assumptions about the underlying populations from
which the data are obtained
• Conclusions reached in nonparametric methods do not require many
qualifiers to be considered valid
• In most cases, quick and easy to apply
• Tests often involve simple arithmetic and are easy to understand
• Can be used for data in any level of measurement
Advantages
• Scope of application is wider
• Can be used for instances where it is impractical or impossible to obtain
quantitative measurements
• Many nonparametric procedures require just the ranks of the
observations, whereas parametric procedures require the magnitudes
• Can simply be a different approach to solving standard statistical
problems
• Relatively insensitive to outlying observations (many procedures use ranks or the median rather than the mean)
• Enables the user to obtain exact P-values for tests, without relying on assumptions
that the underlying populations are normal
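The insensitivity to outlying observations can be seen in a short sketch. The data below are hypothetical, and the `ranks` helper is an illustrative assumption that only handles data without ties.

```python
import statistics

def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

clean = [12, 15, 11, 14, 13]
with_outlier = [12, 150, 11, 14, 13]  # 15 mis-recorded as 150

# The mean is pulled far upward by the single outlier.
print(statistics.mean(clean), statistics.mean(with_outlier))      # 13 40

# The median and the ranks are unchanged.
print(statistics.median(clean), statistics.median(with_outlier))  # 13 13
print(ranks(clean) == ranks(with_outlier))                        # True
```

A procedure that uses only the ranks gives the same result for both data sets, while a procedure based on the mean does not.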
Disadvantages
• Difficulty in construction of confidence intervals
• May overlook information that may have led to a better solution.
• For some tests, may require a larger sample size to achieve the same power as
parametric counterparts
Disadvantages
• May be considered by some as discarding information (conversion of
quantitative to ranks), since quantitative data is not required by
nonparametric methods
• BUT it may be argued that:
• If underlying distribution is known, and classical testing can be applied,
then there is no need to use nonparametric methods
• Usually, the nonparametric procedures are only slightly less efficient than
their normal theory competitors when the underlying populations are
normal
• Nonparametric methods can be mildly or wildly more efficient than these
competitors when the underlying populations are not normal
Example:
HIV viral load - can range from "not detected" or "below the limit of
detection" to hundreds of millions of copies. Thus, in a sample some
participants may have measures like 1,254,000 or 874,050 copies and others
are measured as "not detected."
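One way such data can still be analyzed is by ranking, since "not detected" can be ordered below every measured count even though it has no numeric value. The sketch below is an assumption about how that might be done, not a method stated in the lecture: non-detects are coded as `None`, sorted below all counts, and tied values share the average of their ranks. The value 5,200 is hypothetical; the other two counts come from the example above.

```python
def midranks(keys):
    """Assign ranks from 1 upward, giving tied keys their average rank."""
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    result = [0.0] * len(keys)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted key equals the current one.
        while end + 1 < len(order) and keys[order[end + 1]] == keys[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            result[order[pos]] = average_rank
        start = end + 1
    return result

# None stands for "not detected" (an assumed coding).
viral_loads = [None, 1_254_000, 874_050, None, 5_200]

# Map non-detects to -1 so they sort below every real count.
keys = [-1 if v is None else v for v in viral_loads]
load_ranks = midranks(keys)
print(load_ranks)  # [1.5, 5.0, 4.0, 1.5, 3.0]
```

The two non-detects share ranks 1 and 2, so each receives 1.5. The ranks can then be passed to a rank-based procedure, which a method requiring actual magnitudes could not use.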
Nonparametric Methods
Satisfy at least one of the following:
1. May be used on data with a nominal scale of measurement
2. May be used on data with an ordinal scale of measurement
3. May be used on data with an interval/ratio scale of
measurement, where the distribution function of the random
variable is unspecified
References
• Daniel, W. W. (2009). Biostatistics: A Foundation for Analysis in
the Health Sciences, 9th Edition. Wiley & Sons, Inc. USA.
• Gibbons, J. D. & Chakraborti, S. (2003). Nonparametric Statistical
Inference, 4th Edition. Marcel Dekker, Inc., USA.
• Hammel, T. (2017). Materials for STAT 464: Applied
Nonparametric Statistics. Pennsylvania State University. Accessed
from: https://onlinecourses.science.psu.edu/stat464/node/1
• Sprent, P. & Smeeton, N. C. (2001). Applied nonparametric
statistical methods, 3rd Edition. Chapman and Hall/CRC., USA.