6 Sampling and Basic Descriptive Statistics

Sampling and Basic Descriptive Statistics: Basic Concepts and Techniques
Lecture 6
Leah Wild
Overview
Sampling In Quantitative Research
Basic Descriptive Statistics And Graphical Representation Of Data
Quantification, Variables And Levels Of Measurement
Sampling In Quantitative Research
Total Population
Representative sample
Probability Sampling
Non-Probability Sampling
Sample Size
Total Population
The total collection of units, elements or individuals that you want to analyse.
These can be countries, lab-rats, light bulbs, university students, banks, residents of a particular area, regional health authorities etc.
The population for a study of infant health might be all children born in the U.K. in the 1980s.
Sample
A sample is a group of units selected from a larger group (the population). By studying the sample it is hoped to draw valid conclusions about the larger group.
Using the example of the study of infant health, the sample might be all babies born on 7th May in any of the years.
Samples are selected because the population is too large to study in its entirety.
It is important that the researcher carefully and completely defines the population, including a description of the members to be included.
Representative sample
A sample whose characteristics correspond to, or reflect, those of the original population or reference population.
To ensure representativeness, the sample may be either completely random or stratified, depending upon the conceptualized population and the sampling objective (i.e., upon the decision to be made).
A thorny issue in the social sciences: is it possible to achieve?
Probability Sampling
A probability provides a quantitative description of the likely
occurrence of a particular event.
A probability sampling method is any method of sampling that uses some form of random selection. In order to have a random selection method, you must set up some process or procedure that assures that the different units in your population have equal probabilities of being chosen (Clark 2002: 37).
Most Common Types of Probability Sampling
Simple Random Sampling
Stratified Random Sampling
Systematic Random Sampling
Cluster Or Multistage Sampling
Simple Random Sampling
Where we select a group of subjects (a sample) for study from a larger group (a population). Each individual is chosen randomly and each member of the population has an equal chance of being included in the sample.
Every possible sample of a given size has the same chance of selection; that is, each member of the population is equally likely to be chosen at any stage in the sampling process (Easton & McColl 2004).
A lottery draw is a good example of simple random
sampling. A sample of 6 numbers is randomly
generated from a population of 45, with each
number having an equal chance of being selected.
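The lottery draw can be sketched in a few lines of Python. This is only an illustration: the fixed seed is an assumption added so that the draw is repeatable, and a real lottery would not use one.

```python
import random

# The population is the set of lottery numbers 1..45, as in the example above.
population = list(range(1, 46))

# random.sample draws without replacement, so every possible set of
# 6 numbers is equally likely to be chosen.
rng = random.Random(6)  # assumed seed, used only to make the draw repeatable
draw = rng.sample(population, k=6)

print(sorted(draw))
```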
Stratified Random Sampling
There are often factors which divide the population into subpopulations (groups / strata), and the measurement of interest may vary among the different subpopulations.
This has to be accounted for when we select a sample from the population to ensure our sample is representative of the population.
This is achieved by stratified sampling.
A stratified sample is obtained by taking samples from each stratum or sub-group of a population.
Suppose a farmer wishes to work out the average milk yield of each cow type in his herd, which consists of Ayrshire, Friesian, Galloway and Jersey cows. He could divide up his herd into the four sub-groups and take samples from these (Easton and McColl 2004).
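A minimal sketch of the farmer example in Python. The herd size, the milk yields and the sample size per breed are all hypothetical values invented for the illustration; only the four breeds come from the example.

```python
import random

rng = random.Random(0)  # assumed seed, for repeatability only

# Hypothetical herd: (cow id, breed, daily milk yield in litres).
# The yields are random numbers generated only to have something to average.
breeds = ["Ayrshire", "Friesian", "Galloway", "Jersey"]
herd = [(i, breeds[i % 4], round(rng.uniform(10, 30), 1)) for i in range(80)]

# Step 1: divide the population into strata (one per breed).
strata = {}
for cow in herd:
    strata.setdefault(cow[1], []).append(cow)

# Step 2: take a simple random sample from each stratum.
per_stratum = 5  # assumed sample size per breed
sample = {breed: rng.sample(cows, per_stratum) for breed, cows in strata.items()}

# Step 3: estimate the mean yield of each breed from its own sample.
for breed, cows in sample.items():
    mean_yield = sum(c[2] for c in cows) / len(cows)
    print(f"{breed}: mean yield of sample = {mean_yield:.1f} litres")
```

Sampling within each breed guarantees that every breed appears in the sample, which a simple random sample of the whole herd would not.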
Systematic Random Sampling
Systematic sampling, sometimes called interval sampling, means that there is a gap, or interval, between each selection.
Often used in industry, where an item is selected for testing from a production line (say, every fifteen minutes) to ensure that machines and equipment are working to specification.
Alternatively, the manufacturer might decide to select every 20th item on a production line to test for defects and quality. This technique requires the first item to be selected at random as a starting point for testing and, thereafter, every 20th item is chosen.
Also used when questioning people in surveys, e.g. a market researcher selecting every 10th person who enters a particular store, after selecting a person at random as a starting point; or interviewing the occupants of every 5th house in a street, after selecting a house at random as a starting point.
If the researcher wants to select a fixed-size sample, it is first necessary to know the size of the whole population from which the sample is being selected. The appropriate sampling interval, I, is then calculated by dividing the population size, N, by the required sample size, n.
For example, if a systematic sample of 500 students were to be drawn from a university with an enrolled population of 10,000, the sampling interval would be:
I = N/n = 10,000/500 = 20
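The same calculation, followed by the selection itself, can be sketched in Python. The numbered list of students is a hypothetical sampling frame, and the seed is an assumption used only for repeatability.

```python
import random

N = 10_000   # enrolled population (from the example)
n = 500      # required sample size (from the example)
I = N // n   # sampling interval

rng = random.Random(1)  # assumed seed

# Hypothetical sampling frame: students numbered 1..N.
students = list(range(1, N + 1))

# Choose a random starting point within the first interval,
# then take every I-th student after it.
start = rng.randrange(I)      # an index from 0 to I-1
sample = students[start::I]

print(I, len(sample), sample[:3])
```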
Cluster Or Multistage Sampling
Cluster sampling is a sampling technique where the entire population is divided into groups, or clusters, and a random sample of these clusters is selected. All observations in the selected clusters are included in the sample.
Every element should have a specified (equal) chance of being selected into the final sample.
Typically used when the researcher cannot get a complete list of the members of a population they wish to study but can get a complete list of groups or 'clusters' of the population.
A cheap, easy and economical method of data collection.
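A minimal sketch of one-stage cluster sampling in Python. The schools and pupils are hypothetical: they stand in for any population where a list of groups is available but a list of individual members is not.

```python
import random

rng = random.Random(2)  # assumed seed, for repeatability only

# Hypothetical population: 20 schools (the clusters), each with 20 to 40 pupils.
clusters = {
    f"school_{s}": [f"pupil_{s}_{p}" for p in range(rng.randint(20, 40))]
    for s in range(20)
}

# Randomly select some clusters from the complete list of clusters.
chosen = rng.sample(sorted(clusters), k=4)

# Include every unit in each selected cluster in the sample.
sample = [pupil for name in chosen for pupil in clusters[name]]

print(chosen, len(sample))
```

In a multistage design, the last step would instead draw a further random sample within each selected cluster.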
Non-Probability Sampling
Main Types
Convenience/opportunity/accidental sampling
Purposive/ judgemental sampling
Quota sampling
Snowball sampling
Convenience/opportunity/accidental sampling
Volunteer samples.
Sometimes access is through contacts or gatekeepers.
An easy-to-reach population.
Purposive/ judgemental sampling
Involves selecting a group of people because they have particular traits that the researcher wants to study, e.g. consumers of a particular product or service in some types of market research.
My own questionnaire research on New-Age Travellers.
Quota sampling
Widely used in opinion polls and market research.
Interviewers are given a quota of subjects of a specified type to attempt to recruit.
E.g. an interviewer might be told to go out and select 20 male smokers and 20 female smokers so that they could interview them about their health and smoking behaviours.
Snowball sampling
Involves two main steps.
1. Identify a few key individuals.
2. Ask these individuals to volunteer to distribute the questionnaire to people they know who fit the traits of the desired sample (e.g. my research on Travellers).
Sample Size
In general, the larger the sample size (selected with the use of probability techniques) the better. The more heterogeneous a population is on a variety of characteristics (e.g. race, age, sexual orientation, religion), the larger the sample needed to reflect that diversity (Papadopoulos 2003).
Response rates vary with the type of survey (e.g. mail surveys, telephone surveys). Response rates under 60 or 70 per cent may compromise the integrity of the random sample (ibid).
Basic Descriptive Statistics And Graphical Representation Of Data
Statistics can be divided into two types:
Descriptive
Inferential
Some authors suggest a third type: Associative (Downey 1975)
Descriptive Statistics
Statistics which describe attributes of a sample or population.
Include measures of central tendency (e.g. mean, median, mode), frequencies, percentages, minimum, maximum and range for a data set, variance, etc.
Organise and summarise a set of data.
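The measures listed above can be computed with Python's standard `statistics` module. The data set below is hypothetical (a made-up count of GP visits for nine patients) and is there only to give the functions something to summarise.

```python
import statistics

# Hypothetical data set: number of GP visits in a year for 9 patients.
visits = [2, 4, 4, 5, 7, 1, 4, 9, 3]

mean = statistics.mean(visits)
median = statistics.median(visits)
mode = statistics.mode(visits)
minimum, maximum = min(visits), max(visits)
value_range = maximum - minimum
variance = statistics.variance(visits)  # sample variance (divides by n - 1)

print("mean:", round(mean, 2))
print("median:", median)
print("mode:", mode)
print("min, max, range:", minimum, maximum, value_range)
print("variance:", round(variance, 2))
```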
Inferential Statistics
Used to make inferences or judgments about a larger population based on the data collected from a small sample drawn from the population.
E.g. exit polling, used during US elections to determine how the population of voters voted.
A key component of inferential statistics is the calculation of the statistical significance of a research finding.
Used to determine whether changes in a dependent variable are caused by an independent variable (Clark 2004).
(HOMEWORK: WHAT ARE SOME OF THE PROBLEMS ASSOCIATED WITH THESE KINDS OF STATISTICS?)
Quantification, Variables And Levels Of Measurement
Rowntree (2000) distinguishes between category variables and quantity variables.
Category variables can be nominal or ordinal.
Quantity variables can be discrete or continuous.
Examples: Nominal Data
Type of bicycle: mountain bike, road bike, chopper, folding, BMX.
Ethnicity: White British, Afro-Caribbean, Asian, Chinese, other, etc. (note problems with these categories).
Smoking status: smoker, non-smoker.
Ordinal Data
A type of categorical data in which order is important.
Class of degree: 1st class, 2:1, 2:2, 3rd class, fail.
Degree of illness: none, mild, moderate, acute, chronic.
Opinion of students about stats classes: very unhappy, unhappy, neutral, happy, ecstatic!
Discrete Data
Only certain values are possible (there are
gaps between the possible values). Implies
counting.
Continuous Data
Theoretically, any value within a range is possible, given a fine enough measuring device. Implies measuring.
Relationships between Variables (Source: Rowntree 2000: 33)
Variables
  Category
    Nominal
    Ordinal (ordered categories, ranks)
  Quantity
    Discrete (counting)
    Continuous (measuring)
Quantification, Variables And Levels Of Measurement
Fielding and Gilbert (2000: 15) distinguish between four levels of measurement.
1. Nominal
2. Ordinal
3. Interval
4. Ratio
Interval and ratio variables
According to Fielding & Gilbert (2000: 15), these terms are often used interchangeably, and incorrectly, by social scientists.
Interval: ordered categories with no inherent concept of zero (Clark 2004). We can calculate a meaningful distance between categories. There are few real examples of interval variables in the social sciences (Fielding & Gilbert 2000: 15).
Ratio: a meaningful zero amount (e.g. income). It is possible to calculate ratios, and a ratio variable also has the interval property (e.g. someone earning 20,000 earns twice as much as someone who earns 10,000) (ibid).
The difference between interval and ratio is usually not important for statistical analysis (ibid).
Interval variables- Examples
Fahrenheit temperature scale: zero is arbitrary, so 40 degrees is not twice as hot as 20 degrees.
IQ tests: there is no such thing as zero IQ, and an IQ of 120 is not twice as intelligent as an IQ of 60.
Question: can we assume that attitudinal data represents real, quantifiable, measured categories (i.e. that "very happy" is twice as happy as plain "happy", or that "very unhappy" means no happiness at all)?
Statisticians are not in agreement on this.
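The Fahrenheit point can be checked numerically. The sketch below converts both temperatures to kelvin, a scale with a true zero (the choice of kelvin as the comparison scale is an addition, not part of the slides), and compares the two ratios.

```python
def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvin, a scale with a true zero."""
    return (f - 32) * 5 / 9 + 273.15

# Ratio of the raw Fahrenheit readings: looks like "twice as hot".
ratio_f = 40 / 20

# Ratio of the same two temperatures measured from absolute zero.
ratio_k = fahrenheit_to_kelvin(40) / fahrenheit_to_kelvin(20)

print(ratio_f)            # 2.0
print(round(ratio_k, 3))  # about 1.042
```

The ratio of the readings depends on where the scale puts its zero, which is why ratios are not meaningful for interval variables.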
Ratio variables-Examples
Can be discrete or continuous data.
The distance between any two adjacent units of measurement (intervals) is the same and there is a meaningful zero point (Papadopoulos 2001).
Income: someone earning 20,000 earns twice as much as someone who earns 10,000.
Height.
Unemployment rate: measured as the number of jobseekers as a percentage of the labour force (ibid).
IMPORTANT! SEE
TYPES OF DATA
REVISION SLIDES ON
MY WEBSITE FOR
EXTRA INFORMATION
ON TYPES OF DATA
Frequencies and Distributions
Frequency: the number of times a value is observed in a distribution, or the number of times a particular event occurs.
Distribution: when the observed values are arranged in order they are called a rank order distribution or an array. Distributions demonstrate how the frequencies of observations are distributed across a range of values.
Example
Look at the distribution below.
This distribution shows the recorded ages of patients receiving treatment for heart disease in the Stroud district. There are 50 observed values. We can easily see how often each value occurs. What is the frequency of the following values: 79, 81, 94? What is the range of this distribution (r = h - l, the highest value minus the lowest)? What is the mode? What is the median? From this distribution we can also tell that most of the values tend to cluster around the middle of the range.
62 64 65 66 68 70 71 71 72 72
73 74 74 74 75 75 76 77 77 78
78 78 79 79 79 80 80 80 81 81
81 81 81 82 82 82 83 83 85 85
86 87 87 88 89 90 90 92 94 96
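One way to check your answers to the questions above is a short Python script using the 50 ages from the slide. The variable names are illustrative only.

```python
from collections import Counter
import statistics

# The 50 recorded ages from the distribution above.
ages = [
    62, 64, 65, 66, 68, 70, 71, 71, 72, 72,
    73, 74, 74, 74, 75, 75, 76, 77, 77, 78,
    78, 78, 79, 79, 79, 80, 80, 80, 81, 81,
    81, 81, 81, 82, 82, 82, 83, 83, 85, 85,
    86, 87, 87, 88, 89, 90, 90, 92, 94, 96,
]

# Frequency: how many times each value is observed.
freq = Counter(ages)
print("frequencies of 79, 81, 94:", freq[79], freq[81], freq[94])

# Range: highest value minus lowest value (r = h - l).
value_range = max(ages) - min(ages)
print("range:", value_range)

# Mode: the most frequently observed value.
mode = statistics.mode(ages)
print("mode:", mode)

# Median: the middle of the ordered values. With 50 values it is the
# mean of the 25th and 26th.
median = statistics.median(ages)
print("median:", median)
```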
Two elements to a distribution
Scale with a number of values (usually arrange the scores from the highest to lowest).
Corresponding observations: tally up the scores and convert them into frequencies.
Types of Distribution
Frequency distribution
Class Intervals
Relative (proportional or percentage) distributions
Cumulative distributions.
Frequency Distributions
Shows the number of cases having each of the attributes of a particular variable. Divided into two types:
1. Ungrouped distribution. Scores are not collapsed into categories; each score is represented as a separate value.
2. Grouped distribution. Scores are collapsed into categories so that several scores are presented together as a group. Each group is usually referred to as a class interval.
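As an illustration, the patient ages from the earlier example can be collapsed into a grouped distribution. The class interval width of 5 years is an assumed choice; the slides do not specify one.

```python
from collections import Counter

# The 50 recorded ages from the heart disease example.
ages = [
    62, 64, 65, 66, 68, 70, 71, 71, 72, 72,
    73, 74, 74, 74, 75, 75, 76, 77, 77, 78,
    78, 78, 79, 79, 79, 80, 80, 80, 81, 81,
    81, 81, 81, 82, 82, 82, 83, 83, 85, 85,
    86, 87, 87, 88, 89, 90, 90, 92, 94, 96,
]

width = 5  # assumed class interval width

# Map each age to the lower bound of its class interval (60, 65, 70, ...)
# and count how many ages fall in each interval.
grouped = Counter((age // width) * width for age in ages)

for lower in sorted(grouped):
    print(f"{lower}-{lower + width - 1}: {grouped[lower]}")
```

The ungrouped distribution is the special case where every distinct age is its own category.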
Relative (proportional or percentage) distributions
The proportion of cases in the whole distribution observed at each score or value.
Cumulative distribution
The number of cases up to and including the scale value. Can appear in grouped or ungrouped format.
The cumulative relative distribution for any particular value is the total proportion of cases up to, and including, that value.
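A sketch of how the relative, cumulative and cumulative relative distributions could be tabulated in ungrouped format, again using the patient ages from the earlier example. The column layout is an assumption made for readability.

```python
from collections import Counter
from itertools import accumulate

# The 50 recorded ages from the heart disease example.
ages = [
    62, 64, 65, 66, 68, 70, 71, 71, 72, 72,
    73, 74, 74, 74, 75, 75, 76, 77, 77, 78,
    78, 78, 79, 79, 79, 80, 80, 80, 81, 81,
    81, 81, 81, 82, 82, 82, 83, 83, 85, 85,
    86, 87, 87, 88, 89, 90, 90, 92, 94, 96,
]

freq = Counter(ages)
values = sorted(freq)                 # the scale, lowest to highest
counts = [freq[v] for v in values]    # frequency at each value
n = len(ages)

relative = [c / n for c in counts]               # proportion at each value
cumulative = list(accumulate(counts))            # cases up to and including each value
cumulative_relative = [c / n for c in cumulative]

print("age  f   rel  cum  cum.rel")
for row in zip(values, counts, relative, cumulative, cumulative_relative):
    print("{:>3} {:>2} {:>5.2f} {:>4} {:>8.2f}".format(*row))
```

Multiplying the relative columns by 100 gives the percentage form of the same distributions.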
