
QABD

Sampling
Having covered
• Descriptive Statistics in Ch. 1 to 3,
• Probability in Ch. 4, and
• Probability Distributions in Ch. 5 and 6,

let us take the first baby steps towards Inferential Statistics in Ch. 7.

2017 July 26 Sampling 1 of 15



Average Height of Batch 17


• The average height of the current batch of students may be unknown, but it does not change.

• It is a population parameter.

• But if I draw a sample of, say, 7 students, then the average of those 7 students can be different each time, depending on who is in my sample.

• The sample mean is one possible statistic which describes the sample.

Sample Statistics vs. Population Parameters

• Population – consists of all items of interest in a statistical problem.
   – A population parameter may be unknown but does not change. It has a constant value because there is only one population.

• Sample – a subset of the population.
   – A sample statistic is calculated from the sample.
   – Its value can change from one sample to another, even when the number of items per sample is the same across the many samples.
   – It is used to make inferences about the population.
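A minimal Python sketch of this distinction, assuming a made-up population of 180 heights (the numbers are invented for illustration, not measured from the batch). The population mean is computed once and stays fixed, while the mean of a sample of 7 is recomputed for every new draw.

```python
import random
import statistics

rng = random.Random(17)  # fixed seed so the run is repeatable

# Hypothetical population: heights in cm of 180 students (invented data).
population = [rng.gauss(165, 8) for _ in range(180)]

# Population parameter: one population, so one fixed value.
population_mean = statistics.mean(population)

# Sample statistic: recomputed for each new sample of the same size.
sample_means = [statistics.mean(rng.sample(population, 7)) for _ in range(5)]

print(f"population mean: {population_mean:.1f} cm")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean:   {m:.1f} cm")
```

Each of the five samples has the same size, yet the printed sample means differ from one another; the population mean is printed only once because there is only one.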
Estimators
• The sample mean X̄ is one possible statistic.

• Other statistics pertaining to a sample are its mode, median, standard deviation, variance, etc.

• A particular statistic (say the mean, median or mode) can be used to estimate a population parameter (say the population mean), and is then called an estimator.

• An ‘estimator’ is NOT a person, whereas a ‘surveyor’ is a person who carries out a survey, e.g. an insurance surveyor.

Estimator vs. Estimate
• Suppose we decide to use the sample median as an estimator of the population mean,

• and we take a sample of 7 students,

• and then find their median height to be 5’5”.

• That particular value (5’5”) is called an estimate of the population mean, obtained using the estimator (viz. the sample median).

• Caveat: the difference between these two terms is often a source of confusion for students in later subjects in the MBA.
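One way to keep the two terms apart is to think of the estimator as a rule (a function) and the estimate as the number that rule returns for one particular sample. The sketch below assumes seven invented heights, in inches, chosen so that the median works out to 5’5”; the function name `sample_median` is only a label for this illustration.

```python
import statistics

def sample_median(sample):
    """The estimator: a rule that can be applied to any sample."""
    return statistics.median(sample)

# Hypothetical heights (in inches) of one sample of 7 students.
sample = [62, 63, 64, 65, 67, 68, 70]

# The estimate: the single number the estimator yields for this sample.
estimate = sample_median(sample)

feet, inches = divmod(estimate, 12)
print(f"estimate of the population mean height: {feet} ft {inches} in")
```

A different sample of 7 would go through the same estimator but would, in general, produce a different estimate.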
Characteristics of estimators
• Just as humans are described by characteristics such as height, weight and age, estimators have characteristics too.
• Here, we are interested in only one characteristic, viz. BIAS.
• BIAS is the tendency of a sample statistic to systematically over-estimate (or under-estimate) a population parameter.
• That means that when we use this particular statistic as an estimator of a particular population parameter, and do this many times,
• i.e. compute it for many samples drawn from this population,
• the long-run average of these estimates is more than (or less than) the population parameter.
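The "many samples, long-run average" idea can be tried out in a small simulation. The sketch below is an illustration under stated assumptions, not a result from the course: it invents a right-skewed population, then uses both the sample mean and the sample median as estimators of the population mean, with 2,000 samples of size 7 each. The helper `long_run_average` is a hypothetical name.

```python
import random
import statistics

rng = random.Random(2017)

# Hypothetical right-skewed population (invented values, not course data).
population = [rng.lognormvariate(0, 1) for _ in range(10_000)]
population_mean = statistics.mean(population)

def long_run_average(estimator, n=7, repeats=2_000):
    """Average the estimates obtained from many samples of size n."""
    estimates = [estimator(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.mean(estimates)

avg_of_means = long_run_average(statistics.mean)
avg_of_medians = long_run_average(statistics.median)

print(f"population mean:                   {population_mean:.2f}")
print(f"long-run average of sample mean:   {avg_of_means:.2f}")
print(f"long-run average of sample median: {avg_of_medians:.2f}")
```

In this skewed population the long-run average of the sample medians falls below the population mean, so the sample median shows bias as an estimator of that mean, while the long-run average of the sample means lands close to it.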

Selection Bias
• Suppose I stand near the basketball court and draw a sample;
• the sample could have a disproportionate number of tall people.
• The average height of players in the US basketball league was roughly 6’7”. At IFMR it may be 5’10”, which is greater than the batch average.
• So the sample statistic (whether mean or median) that I use as the estimator
• will, over the long run (i.e. if many samples are drawn),
• more often than not deliver estimates higher than the batch average height.
• My sampling process suffers from SELECTION bias,
• i.e. the way I have drawn the sample causes this to happen, whichever statistic I use as the estimator.
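A rough simulation of the basketball-court example, under loudly stated assumptions: the 180 heights are invented, and "standing near the court" is modelled crudely as drawing only from the 60 tallest students. The names `near_court` and `long_run_average` are hypothetical.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical batch of 180 heights in cm (invented data).
batch = [rng.gauss(165, 8) for _ in range(180)]
batch_mean = statistics.mean(batch)

# Assumption for illustration: the people found near the court are
# drawn only from the 60 tallest students in the batch.
near_court = sorted(batch)[-60:]

def long_run_average(frame, n=10, repeats=1_000):
    """Average of the sample means over many samples drawn from `frame`."""
    means = [statistics.mean(rng.sample(frame, n)) for _ in range(repeats)]
    return statistics.mean(means)

whole_avg = long_run_average(batch)
court_avg = long_run_average(near_court)

print(f"batch mean:               {batch_mean:.1f} cm")
print(f"sampling the whole batch: {whole_avg:.1f} cm")
print(f"sampling near the court:  {court_avg:.1f} cm")
```

The estimator (the sample mean) is the same in both cases; only the way the sample is drawn differs, and that alone pushes the long-run average above the batch mean.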
Non-response bias
• Suppose we had a guest lecture and I have to send a report to our Director.
• One part of that report is about how our students found the guest lecture. So I send a mail asking for feedback.
• Suppose most of those who found it boring or bad do not reply to my mail, AND most of those who found it interesting do reply.
• So, most of the replies I get will be LIKE.
• Whenever there is a difference between respondents and non-respondents with regard to the issue being surveyed, we say the sampling process suffers from Non-Response Bias.
• Non-response does not automatically imply non-response bias.
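The feedback-mail example can be sketched as a small simulation. Everything numeric here is an assumption made for illustration: exactly half of 180 students liked the lecture, and the reply probabilities (0.8, 0.2, 0.5) are invented. The function name `share_of_likes_in_replies` is hypothetical.

```python
import random

rng = random.Random(26)

# Hypothetical batch: exactly half of 180 students liked the lecture.
liked = [True] * 90 + [False] * 90
true_share = sum(liked) / len(liked)

def share_of_likes_in_replies(p_reply_if_liked, p_reply_if_not):
    """Fraction of LIKE among the replies received, for given reply rates."""
    replies = [
        opinion
        for opinion in liked
        if rng.random() < (p_reply_if_liked if opinion else p_reply_if_not)
    ]
    return sum(replies) / len(replies)

# Reply rates differ by opinion: non-response bias.
biased = share_of_likes_in_replies(0.8, 0.2)

# Same reply rate for everyone: non-response, but no systematic bias.
unbiased = share_of_likes_in_replies(0.5, 0.5)

print(f"true share of LIKE:                     {true_share:.2f}")
print(f"share in replies, unequal reply rates:  {biased:.2f}")
print(f"share in replies, equal reply rates:    {unbiased:.2f}")
```

In the second scenario roughly half the batch still does not reply, yet the replies remain representative, which is the point of the last bullet: non-response by itself is not the same thing as non-response bias.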
Let me summarize BIAS
• Bias: the tendency of a sample statistic to systematically over- or underestimate a population parameter.
   – Selection bias: a systematic exclusion of certain groups from consideration for the sample.
   – Non-response bias: a systematic difference in preferences between respondents and non-respondents to a survey or a poll.

Sampling
• Sampling Methods
   – A simple random sample is a sample of n observations which has the same probability of being selected from the population as any other sample of n observations.
   – Most statistical methods presume simple random samples.
   – However, in some situations other sampling methods have an advantage over simple random samples.

Two other methods of Sampling
• Stratified Random Sampling
   – Divide the population into mutually exclusive and collectively exhaustive groups, called strata.
   – Randomly select observations from each stratum, in numbers proportional to the stratum’s size.

• Cluster Sampling
   – Divide the population into mutually exclusive and collectively exhaustive groups, called clusters.
   – Randomly select clusters.
   – Sample every observation in those randomly selected clusters.


Selecting the sample – B17


• I could list the students from 1 to 180 alphabetically, by roll number or in some other order, and then randomly select 10 students.
• Or, given that I expect ladies to have a different average height than gentlemen, and given that ladies are 30% of this batch, I divide the batch (in an Excel sheet) into ladies (54) and gentlemen (126), and randomly select 3 from the 54 ladies and 7 from the 126 gentlemen.
• Or I can say that this section is in no way different from the other two, and I measure the height of everyone in this section.
• These three examples illustrate Simple Random Sampling, Stratified Random Sampling and Cluster Sampling respectively.
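The three selections above can be sketched in Python. The roster below is hypothetical: roll numbers 1 to 180, with the first 54 treated as ladies and the students spread over three sections of 60. The batch sizes (180, 54, 126) and sample sizes (10, 3, 7) come from the example; the section layout and field names are assumptions made for illustration.

```python
import random

rng = random.Random(180)

# Hypothetical roster: roll numbers 1..180, 54 ladies then 126 gentlemen,
# spread over three sections of 60 (the section layout is an assumption).
roster = [
    {"roll": i, "gender": "F" if i <= 54 else "M", "section": (i - 1) % 3}
    for i in range(1, 181)
]

# 1. Simple random sampling: any 10 of the 180, all subsets equally likely.
simple = rng.sample(roster, 10)

# 2. Stratified random sampling: 3 of 54 ladies and 7 of 126 gentlemen,
#    i.e. 10 * 54/180 and 10 * 126/180, proportional to stratum size.
ladies = [s for s in roster if s["gender"] == "F"]
gentlemen = [s for s in roster if s["gender"] == "M"]
stratified = rng.sample(ladies, 3) + rng.sample(gentlemen, 7)

# 3. Cluster sampling: pick one section at random, measure everyone in it.
chosen_section = rng.choice([0, 1, 2])
cluster = [s for s in roster if s["section"] == chosen_section]

print("simple:    ", sorted(s["roll"] for s in simple))
print("stratified:", sorted(s["roll"] for s in stratified))
print("cluster:    section", chosen_section, "with", len(cluster), "students")
```

Note that the simple random sample may contain any mix of ladies and gentlemen, whereas the stratified sample always contains exactly 3 and 7.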

Sampling
• Stratified versus Cluster Sampling

   Stratified Sampling                       Cluster Sampling
   – Sample consists of elements             – Sample consists of elements
     from each group.                          from the selected groups.
   – Preferred when the objective            – Preferred when the objective
     is to increase precision.                 is to reduce costs.

Average height of your batch
• The average height of IFMR’s first batch of MBA students may be unknown, but it does not change.
• It is a population parameter.
• But if I draw a sample of, say, 12 students, then the average of those 12 students can be different each time, depending on who is in my sample.
• If we treat the sample mean as a random variable, it will have a distribution of its own.
• That is what we call the distribution of the sample mean for n = 12.

Alternate Sample Sizes


• We could redo the exercise and get the distribution of the sample mean for n = 6.
• And we could get another distribution by taking many samples of n = 24; that would be called the distribution of the sample mean for n = 24.
• Instead of talking about many such distributions, which obviously differ due to the sample size,
• we talk about the sampling distribution of the sample mean, with ‘n’ as a characteristic of the distribution.
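The exercise of "taking many samples" for each n can be imitated with a short simulation. This is only a sketch: the 180 heights are invented, and 2,000 repeated samples per sample size stand in for "many". The function name `sample_means` is hypothetical.

```python
import random
import statistics

rng = random.Random(15)

# Hypothetical batch of 180 heights in cm (invented data).
batch = [rng.gauss(165, 8) for _ in range(180)]

def sample_means(n, repeats=2_000):
    """Means of many random samples of size n: an empirical sampling distribution."""
    return [statistics.mean(rng.sample(batch, n)) for _ in range(repeats)]

spread = {}
for n in (6, 12, 24):
    means = sample_means(n)
    spread[n] = statistics.stdev(means)
    print(f"n = {n:2d}: centre {statistics.mean(means):.1f} cm, "
          f"spread (std. dev.) {spread[n]:.2f} cm")
```

The three empirical distributions are centred at about the same place but differ in spread, with the larger samples giving the tighter distribution; that is the sense in which n is a characteristic of the sampling distribution.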