Discussion For Today: Probability Sampling, Non-Probability Sampling, Questionnaire

Lecture 5
Discussion for Today

Probability sampling
Non-probability sampling
Questionnaire
Probability sampling-the types
1- Random Sampling or Simple Random Sampling
Each and every unit of the population has an equal probability of
being included in the sample. Example: a lottery system.
When to use a simple random sample
1. You have an accurate and easily accessible sampling frame that lists the entire
population, preferably stored on a computer.
2. Not suitable for face-to-face data collection methods if the population
covers a large geographical area.
3. Prefer this sampling method whenever possible.
4. It minimizes bias.
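A minimal sketch of the lottery idea, assuming the sampling frame is a simple list held in memory. The frame contents, the function name and the seed are illustrative assumptions, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit is equally likely to be chosen."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: ID numbers of 1,000 households.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, n=50, seed=42)
print(len(sample), "units drawn,", len(set(sample)), "distinct")
```

Because the draw is made without replacement, no household can appear twice in the sample.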
2- Stratified Random Sampling
This is a form of random sampling in which units are divided into homogeneous groups or
categories that are mutually exclusive. These groups are called
strata.
Within each stratum, a simple or systematic random sample is selected.
Example: grouping by age or sex.
Advantages:
a- It provides a more accurate picture of the population.
b- It is an improvement over simple random sampling when the population is more
heterogeneous.
Disadvantages:
a- If the strata are not properly designed (for example, if they overlap), the accuracy of the results
decreases.
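The procedure can be sketched as follows, assuming proportional allocation (the same sampling fraction in every stratum), which is one common choice and is not specified in the lecture. The population, the function name and the fraction are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Split units into mutually exclusive strata, then draw a simple
    random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # each unit lands in exactly one stratum
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of (person_id, sex): 600 women and 400 men.
population = [(i, "F") for i in range(600)] + [(i, "M") for i in range(600, 1000)]
sample = stratified_sample(population, stratum_of=lambda u: u[1], fraction=0.1, seed=1)
counts = {s: sum(1 for u in sample if u[1] == s) for s in ("F", "M")}
print(counts)  # {'F': 60, 'M': 40}
```

The sample reproduces the 60/40 split of the population exactly, which simple random sampling would only do on average.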
3- Systematic sampling
A form of random sampling that follows a system: units are selected at a fixed gap (interval),
and the units between two selected units are not sampled.
When to use systematic sampling
It is used when the population that we want to study is connected to an identified site, e.g.
I. Patients attending a clinic.
II. Houses that are ordered along a road.
III. Customers who walk one by one through an entrance.
Advantages:
1. Sufficiently random to obtain reliable estimates.
Disadvantages:
1. It is not fully random because after the first step each unit is selected at a fixed
interval.
2. It could be problematic if a characteristic recurs at the same interval. For example, every 10th house
in the sector may be a corner house.
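A small sketch of the selection rule, assuming the units are already in their natural order (for example, houses along a road) and that the interval is the population size divided by the sample size, rounded down. The function name and the data are illustrative.

```python
import random

def systematic_sample(units, n, seed=None):
    """Pick a random start inside the first interval, then every k-th unit."""
    k = len(units) // n                 # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)            # the only random step
    return [units[start + i * k] for i in range(n)]

# Hypothetical road with houses numbered 1 to 100.
houses = list(range(1, 101))
sample = systematic_sample(houses, n=10, seed=7)
print(sample)
```

Only the starting point is random. If the ordering has a pattern with the same period as the interval, every selected unit shares that pattern.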
4- Cluster/Area Sampling
Clusters are formed by breaking down the area to be surveyed into
smaller areas.
Then a few of the smaller areas are selected randomly.
If a selected cluster is small, all its respondents are interviewed; otherwise,
the units/respondents within it are selected randomly.
When to use:
It is used when the population is widely dispersed across regions. For
example: universities, villages.
Advantages:
I. When there is no suitable sampling frame, this is the suitable method.
II. Time and money are saved by avoiding travel.
III. A complete frame of the population is not needed, only a complete list of
clusters.
Disadvantages:
1. A cluster may contain similar units.
Note: a stratum is homogeneous, whereas a cluster should be as heterogeneous as possible.
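The two steps above (pick clusters at random, then interview everyone in small clusters or a random subset in large ones) can be sketched like this. The threshold `max_per_cluster`, the village data and the function name are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, max_per_cluster, seed=None):
    """Randomly choose clusters; take all members of small clusters,
    a random subset of large ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # needs only a list of clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if len(members) <= max_per_cluster:
            sample[name] = list(members)                # small cluster: interview everyone
        else:
            sample[name] = rng.sample(members, max_per_cluster)
    return sample

# Hypothetical villages of different sizes.
villages = {
    f"village_{v}": [f"v{v}_r{r}" for r in range(size)]
    for v, size in enumerate([5, 40, 12, 8, 60, 25])
}
sample = cluster_sample(villages, n_clusters=3, max_per_cluster=10, seed=3)
for name, respondents in sample.items():
    print(name, len(respondents))
```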
Multistage cluster sampling
It is a combination of the methods of random sampling.
Sampling of the population is carried out in a number of stages.
It can give high representativeness for the survey.
It is also one of the most complex methods.
Simply speaking, it is a series of samples taken at successive stages.
Normally used to overcome problems associated with a geographically
dispersed population when face-to-face contact is needed.
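As a rough illustration of "a series of samples taken at successive stages", the sketch below uses three stages (regions, then villages, then households) with simple random sampling at each stage. The number of stages, the names and the sizes are all assumptions; a real design may mix methods across stages.

```python
import random

def multistage_sample(regions, n_regions, n_villages, n_households, seed=None):
    """Three successive random samples: regions, villages within them,
    households within those villages."""
    rng = random.Random(seed)
    selected = []
    for region in rng.sample(sorted(regions), n_regions):            # stage 1
        villages = regions[region]
        for village in rng.sample(sorted(villages), n_villages):     # stage 2
            for household in rng.sample(villages[village], n_households):  # stage 3
                selected.append((region, village, household))
    return selected

# Hypothetical frame: 4 regions, 5 villages each, 30 households per village.
regions = {
    f"region_{r}": {
        f"village_{r}_{v}": [f"hh_{r}_{v}_{h}" for h in range(30)]
        for v in range(5)
    }
    for r in range(4)
}
sample = multistage_sample(regions, n_regions=2, n_villages=2, n_households=5, seed=11)
print(len(sample))  # 2 * 2 * 5 = 20
```

Interviewers only need to travel to the 4 selected villages instead of all 20.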
Non-Probability Sampling
It is a process in which personal judgment, rather than a statistical
procedure, determines which unit is to be selected. It is also called non-random sampling.
1- Quota Sampling: In this technique, the interviewer is asked to select persons
with certain characteristics.
The purpose is to make the sample more representative of the population.
Advantages:
I. An alternative when there is no suitable sampling frame for random sampling.
II. Lower cost, as the survey is carried out rapidly.
Disadvantages:
I. Identifying the units is difficult.
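A sketch of how an interviewer might fill quotas, assuming people are taken in the order they are met until each quota is full. The respondents, the quota sizes and the function name are hypothetical. Note that nothing here is random, which is what makes this a non-probability method.

```python
def quota_sample(respondents, characteristic, quotas):
    """Accept people in order of encounter until every quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        group = characteristic(person)
        if remaining.get(group, 0) > 0:     # this group's quota is still open
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):     # all quotas filled
            break
    return sample

# Hypothetical passers-by as (id, sex), in the order the interviewer meets them.
passers_by = [("p1", "M"), ("p2", "M"), ("p3", "F"), ("p4", "M"),
              ("p5", "F"), ("p6", "F"), ("p7", "M")]
sample = quota_sample(passers_by, lambda p: p[1], {"M": 2, "F": 2})
print(sample)  # [('p1', 'M'), ('p2', 'M'), ('p3', 'F'), ('p5', 'F')]
```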
2- Snowball sampling:
Used when the population is hidden, for example sex workers
and drug addicts.
First, key informants are identified who help in reaching the
respondents.
With the help of those respondents, further respondents are contacted.
The sample grows as it rolls along, like a snowball.
The process continues until the required sample size is reached.
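The referral chain can be sketched as a walk over a "who can introduce whom" mapping. The mapping, the names and the choice to count the key informants as part of the sample are assumptions made for this illustration.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Start from key informants and follow referrals until the target is met."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:          # do not contact anyone twice
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network.
referrals = {
    "informant_A": ["r1", "r2"],
    "r1": ["r3"],
    "r2": ["r3", "r4"],
    "r3": ["r5"],
}
sample = snowball_sample(referrals, ["informant_A"], target_size=5)
print(sample)  # ['informant_A', 'r1', 'r2', 'r3', 'r4']
```

The process also stops early if the referrals run out before the target size is reached.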
Which technique to use
There is no rule of thumb. The choice depends on:
Purpose of the researcher
Resources
Time
Nature of the study
SUMMARY
QUESTIONNAIRE
A QUESTIONNAIRE IS ONLY AS GOOD AS THE QUESTIONS IT ASKS
Questionnaire
What is a Questionnaire?
A series of written questions in a fixed, rational order, designed to generate the
statistical information from a specific population that is needed to accomplish the
research objectives.
Purposes of the Questionnaire
Ensures standardization and comparability of the data across interviews:
everyone is asked the same questions.
Allows the researcher to collect the relevant information necessary to
address the management decision problem.
Criteria to consider
Does it provide the necessary information?
Does it consider the respondent?
Does it meet editing, coding and data processing requirements?
Questionnaire Design
1- List variables
I. Focus Groups that include
II. Key Informants
III. Theory or Conceptual Framework
IV. Expert opinion
2- Borrow from other Instruments
A. Save development effort (avoid reinventing the wheel)
B. Borrow reliability, validity
C. Facilitate comparison with previous studies
3- Solicit input from colleagues and friends
Correlation
What Correlation is:
It measures the degree of relationship/association between
variables.
The measure of correlation is called the correlation coefficient, r.
1- It can be positive as well as negative.
2- Its range is -1 ≤ r ≤ +1. (DIAGRAM)
3- It is symmetrical in nature; that is, the coefficient of correlation
between X and Y (r_XY) is the same as that between Y and X (r_YX).
4- It is independent of the origin and scale (see notes).
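The properties listed above can be checked numerically. This is a minimal sketch using the usual Pearson product-moment formula, which the lecture does not spell out; the data values are made up. Independence of scale holds for a positive scale factor, since a negative factor flips the sign of r.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)                       # symmetry: same value
x_shifted = [10 + 3 * a for a in x]          # change of origin (+10) and scale (x3)
r_shifted = pearson_r(x_shifted, y)          # unchanged
r_negative = pearson_r(x, [-b for b in y])   # a negative relationship

print(round(r_xy, 4), round(r_yx, 4), round(r_shifted, 4), round(r_negative, 4))
```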
Causation versus correlation

Causation
1- Cause and effect
2- Asymmetric: Y = f(X) is not equal to X = f(Y)
3- Causation necessarily implies correlation

Correlation
1- Degree of association
2- Symmetric: r_XY = r_YX
3- Correlation does not necessarily imply causation
Notation
Dependent variable      Independent variable
Explained variable      Explanatory variable
Predictand              Predictor
Regressand              Regressor
Response                Stimulus
Endogenous              Exogenous
Outcome                 Covariate
Controlled variable     Control variable
LHS                     RHS
Regression
History: Francis Galton
Tall parents tend to have tall children.
However, the average height of their children is less than that of the parents.
Short parents tend to have short children.
However, the average height of their children is greater than that of the parents.
The average height of children tends to move, or regress, toward the
average height of the population as a whole. This is Galton's law of universal
regression.
Karl Pearson verified it by collecting data from 1000 people.
Galton called it regression to mediocrity.
Modern concept
Regression analysis is concerned with the study of the dependence of
one variable (the dependent variable, DV) on one or more other variables
(the explanatory variables, EV), with a view to estimating or predicting the
average/mean value of the DV in terms of the given/fixed values of
the known EV.
Example 1: sons' heights and fathers' heights
Example 2: height at different age levels
Note that the regression line has a positive slope but the slope is less
than 1, which is in conformity with Galton's regression to
mediocrity.
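A slope between 0 and 1 is what produces regression toward the average. The simulation below is a sketch on synthetic data only: the population mean, the spreads and the "true" slope of 0.5 are assumptions chosen for illustration and are not Galton's or Pearson's figures. The line is fitted by ordinary least squares, a technique the lecture has not introduced by name.

```python
import random

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic heights in inches; every parameter here is an assumption.
POP_MEAN, POP_SD, TRUE_SLOPE = 68.0, 2.5, 0.5
rng = random.Random(0)
fathers = [rng.gauss(POP_MEAN, POP_SD) for _ in range(5000)]
sons = [POP_MEAN + TRUE_SLOPE * (f - POP_MEAN) + rng.gauss(0, 2.0) for f in fathers]

intercept, slope = ols_line(fathers, sons)

# Compare tall fathers (more than one SD above the mean) with their sons.
tall_pairs = [(f, s) for f, s in zip(fathers, sons) if f > POP_MEAN + POP_SD]
mean_tall_fathers = sum(f for f, _ in tall_pairs) / len(tall_pairs)
mean_their_sons = sum(s for _, s in tall_pairs) / len(tall_pairs)

print(f"fitted slope: {slope:.2f}")
print(f"tall fathers average {mean_tall_fathers:.1f}, their sons average {mean_their_sons:.1f}")
```

The sons of tall fathers are still taller than the population average, but on average shorter than their fathers.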
Statistical Versus Deterministic Relationship

Regression is concerned with statistical relationships, not the functional or
deterministic dependence among variables found in, for example, physics.
Example 1: Dependency of crop yield
Y = f(temperature, sunshine, rainfall, fertilizers, ...)
Because of measurement error and many other variables, the prediction is not 100% correct.
Example 2: Newton's law of gravity
F = k (m1 m2 / r^2)
F becomes random if measurement error arises in k.
Statistical versus Deterministic Relationship

Statistical
Concerned with statistical dependency of variables
Variables are random
Cannot be predicted with accuracy
Example: Crop yield

Functional or Deterministic
Concerned with deterministic or functional dependency of variables
Variables are non-random
Can be predicted accurately
Example: Newton's law
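The contrast can be made concrete with a short sketch. The gravity function is the standard inverse-square law; the crop-yield function, its coefficients and the size of the random disturbance are invented purely for illustration.

```python
import random

G = 6.674e-11  # gravitational constant k, in N m^2 / kg^2

def gravity_force(m1, m2, r, k=G):
    """Deterministic: the same inputs always give the same force."""
    return k * m1 * m2 / r ** 2

def crop_yield(rainfall, rng):
    """Statistical: a systematic part plus a random disturbance
    (hypothetical coefficients)."""
    return 2.0 + 0.01 * rainfall + rng.gauss(0, 0.3)

rng = random.Random(5)
forces = [gravity_force(5.0, 10.0, 2.0) for _ in range(3)]
yields = [crop_yield(500.0, rng) for _ in range(3)]
# With measurement error in k, the force itself becomes a random variable.
noisy_forces = [gravity_force(5.0, 10.0, 2.0, k=G * (1 + rng.gauss(0, 0.01)))
                for _ in range(3)]

print(forces)        # three identical values
print(yields)        # three different values for the same rainfall
print(noisy_forces)  # three different values
```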
Regression versus causation

Although regression analysis deals with the dependence of one
variable on other variables,
it does not necessarily imply causation.
A statistical relationship, however strong, can never establish a causal
connection.
For example, there is no statistical reason to assume that rainfall does not
depend on crop yield.
Our ideas of causation must come from outside statistics, ultimately
from some theory or other information.
Key Point: a statistical relationship in itself cannot logically imply
causation.
Simple or Bivariate Regression
Regression analysis is largely concerned with estimating and/or predicting
the (population) mean value of the dependent variable on the basis of the
known or fixed values of the explanatory variable(s).
Example: EXPENDITURE-INCOME
Conditional Mean: E(Y | X)
Unconditional Mean: E(Y)
The population regression line is simply the locus of the conditional means of
the dependent variable for the fixed values of the explanatory variable.
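The difference between the conditional and the unconditional mean can be seen on a toy expenditure-income population. The income levels and expenditure figures below are made up for illustration.

```python
from statistics import mean

# Hypothetical population: weekly income X -> weekly expenditures Y
# of the families at that income level.
population = {
    80: [50, 60, 70],
    100: [65, 75, 85],
    120: [80, 90, 100],
}

# Conditional mean: average Y among families with a given X.
conditional_means = {x: mean(ys) for x, ys in population.items()}

# Unconditional mean: average Y over all families, ignoring X.
unconditional_mean = mean(y for ys in population.values() for y in ys)

print(conditional_means)   # {80: 60, 100: 75, 120: 90}
print(unconditional_mean)  # 75
```

Joining the points (80, 60), (100, 75) and (120, 90) traces the population regression line for this toy population.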
Population Regression Function (PRF)
E(Y | Xi) = f(Xi)    (A)
The above equation is called the conditional expectation function (CEF) or
population regression function (PRF).
What form does f(Xi) assume? This is an important question.
If we assume that the PRF is linear in Xi:
E(Y | Xi) = B1 + B2 Xi    (B)
B1 and B2 are unknown but fixed parameters known as regression
coefficients.
B1 and B2 are also known as the intercept and slope coefficients, respectively.
The terms regression, regression equation, and regression model are
used synonymously.
The purpose of regression is to estimate the values of the unknown parameters
B1 and B2.
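One common way to estimate B1 and B2 is ordinary least squares (OLS). The lecture does not name an estimator at this point, so OLS is an assumption here, and the expenditure-income data are invented.

```python
def estimate_b1_b2(x, y):
    """Least-squares estimates of the intercept (B1) and slope (B2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b2 = sxy / sxx
    b1 = mean_y - b2 * mean_x
    return b1, b2

# Hypothetical data: three families at each of three income levels.
income      = [80, 80, 80, 100, 100, 100, 120, 120, 120]
expenditure = [50, 60, 70,  65,  75,  85,  80,  90, 100]

b1, b2 = estimate_b1_b2(income, expenditure)
print(b1, b2)  # 0.0 0.75
```

In this toy data the average expenditure at each income level lies exactly on a straight line, so the fitted line passes through those averages: each extra unit of income is associated with 0.75 units of extra expenditure on average.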
Summary
Correlation
Correlation and causation
Regression
Regression and causation
