Chapter 1
Introduction
and Data
Collection
Example
Jane is considering enrolling in a degree
programme at a New Zealand university.
Before doing so she wants to know what
difference a university degree might
make to her earning potential.
What information could she use to base
her decision on?
Key Definitions
A population consists of all the members of
a group about which you want to draw a
conclusion
A sample is the portion of the population
selected for analysis
A parameter is a numerical measure that
describes a characteristic of a population
A statistic is a numerical measure that
describes a characteristic of a sample
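The four definitions above can be illustrated with a short sketch. The population here is simulated (hypothetical weekly incomes), and the names `population`, `sample`, `population_mean` and `sample_mean` are chosen only for this illustration.

```python
import random
import statistics

# Hypothetical population: weekly incomes (dollars) for 10,000 people.
# The figures are simulated purely for illustration.
rng = random.Random(1)
population = [round(rng.gauss(600, 150), 2) for _ in range(10_000)]

# Parameter: a numerical measure describing the whole population.
population_mean = statistics.mean(population)

# Sample: the portion of the population selected for analysis.
sample = rng.sample(population, 100)

# Statistic: the same numerical measure, computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is computed from a sample.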
[Diagram: course overview. Chapters 1-3: sample; Chapter 4: probability; Chapter 5 onwards: inferential statistics]
Descriptive Statistics
Collect data
e.g. Survey
Present data
e.g. Tables and graphs
Characterise data
e.g. Sample mean $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
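As a minimal sketch of "present" and "characterise", the snippet below builds a frequency table and computes a sample mean (the sum of the observed values divided by the number of observations n). The six responses are illustrative values, not real survey data.

```python
from collections import Counter

# Hypothetical survey responses: weekly income (dollars) and income source.
incomes = [234, 399, 304, 200, 298, 679]
sources = ["wage", "wage", "self-employed", "govt transfer", "wage", "other"]

# Present data: a simple frequency table of income source.
for source, count in Counter(sources).most_common():
    print(f"{source:15s} {count}")

# Characterise data: sample mean = (sum of the X values) / n.
n = len(incomes)
sample_mean = sum(incomes) / n
print(f"n = {n}, sample mean = {sample_mean:.2f}")
```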
Inferential Statistics
Estimation
e.g. Estimate the population mean income
(parameter) using the sample mean income
(statistic)
Hypothesis testing
e.g. Test the claim that the population mean
income of those with a degree is higher than the
population mean income of those without a degree.
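A rough sketch of where both ideas start, assuming two small hypothetical samples of weekly income (the figures are invented for illustration). It only computes the estimates and the observed difference. A formal hypothesis test would also have to account for sampling variability.

```python
import statistics

# Hypothetical sample data: weekly incomes (dollars) by degree status.
with_degree = [820, 760, 910, 680, 1005, 870]
without_degree = [640, 590, 720, 610, 555, 700]

# Estimation: each sample mean (a statistic) is used to estimate
# the corresponding population mean (a parameter).
mean_with = statistics.mean(with_degree)
mean_without = statistics.mean(without_degree)

# Hypothesis testing begins from the observed difference between
# the two sample means; on its own this difference proves nothing.
observed_difference = mean_with - mean_without

print(f"Estimated mean income, degree:    {mean_with:.2f}")
print(f"Estimated mean income, no degree: {mean_without:.2f}")
print(f"Observed difference:              {observed_difference:.2f}")
```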
See CAST
1.2.1
More vocabulary
Qualification  Income source  Income from source  Total income
underC         wage           $234                $234
schoolC        wage           $399                $399
sixthform      Self employ    $196                $304
highersch      Govt trans     $150                $200
sixthform      wage           $298                $298
highersch      other          $50                 $679

Bivariate: 2 variables
Multivariate: 2 or more variables
Types of Variables
Data
  Categorical Variables (defined categories)
    Nominal
    Ordinal (ordered categories)
  Numeric Variables (quantitative)
    Discrete (counted items)
    Continuous (measured characteristics)
Types of Variables
Categorical (Nominal)
Classifies data into categories with no natural order e.g. income source
(wage, self-employment, government transfer)
Categorical (Ordinal)
Classifies data into ordered categories e.g. letter grades, tennis
rankings, Likert scales
Numerical (Discrete)
Counted items (values can be listed, typically whole numbers) e.g. number of children,
number of people who have type O blood
Numerical (Continuous)
Measured characteristics (any value within an interval is possible) e.g. weight,
height, temperature, income
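The variable type determines which summaries make sense. The sketch below uses made-up responses from five people. The variable names and the grade ordering are assumptions for illustration.

```python
from collections import Counter
import statistics

# Hypothetical responses from five people.
blood_type = ["O", "A", "O", "B", "AB"]          # categorical (nominal)
grade = ["B", "A", "C", "A", "B"]                # categorical (ordinal)
children = [0, 2, 1, 0, 3]                       # numerical (discrete)
height_cm = [172.4, 165.0, 180.2, 158.7, 169.9]  # numerical (continuous)

# Nominal: categories can be counted but not ranked.
blood_counts = Counter(blood_type)
print("Blood type counts:", dict(blood_counts))

# Ordinal: categories can also be ranked, given an explicit order.
grade_order = ["C", "B", "A"]  # assumed order, lowest to highest
highest_grade = max(grade, key=grade_order.index)
print("Highest grade:", highest_grade)

# Discrete and continuous: arithmetic such as a mean is meaningful.
mean_children = statistics.mean(children)
mean_height = statistics.mean(height_cm)
print(f"Mean children: {mean_children:.1f}, mean height: {mean_height:.1f} cm")
```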
Collecting Data
Important sources:
Data distributed by organisation or individual
Designed experiment
Survey
Observational study
Experiments
Experimentation allows us to study the
specific treatments that are of interest.
Control for other (confounding) variables.
Draw conclusions about the effect of one
variable on another.
Study the combined effects of several
factors simultaneously.
EXAMPLE: Does shelf height affect sales of a
certain supermarket item?
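One common way to control for confounding variables in a designed experiment is to assign treatments at random. The source does not describe a design for the shelf-height example, so the 12 stores and three shelf heights below are purely hypothetical.

```python
import random

rng = random.Random(11)  # fixed seed so the example is repeatable

# Hypothetical experimental units and treatment levels (assumptions).
stores = [f"store_{i:02d}" for i in range(1, 13)]
treatments = ["low shelf", "eye-level shelf", "high shelf"]

# Shuffle the stores, then deal them out evenly: 4 stores per shelf height.
shuffled = stores[:]
rng.shuffle(shuffled)
assignment = {
    treatment: sorted(shuffled[i::len(treatments)])
    for i, treatment in enumerate(treatments)
}

for treatment, group in assignment.items():
    print(f"{treatment:16s} {group}")
```

Because chance alone decides which store gets which shelf height, store characteristics such as size or location should be balanced across treatments on average.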
Observational Data
In an observational study the researcher
observes subjects (usually in a natural
setting) and measures variables of interest
but does not impose any treatment.
EXAMPLES
Last month's sales figures
Focus groups
Surveys
No control over behaviour
Questions asked
Summary questions
Suppose I am interested in knowing the characteristics
of students who are enrolled this semester at Massey
University. As well as using information from the
Massey University database I select a random sample of
250 students enrolled in at least one paper this semester
and ask each student to complete a questionnaire.
The population of interest is:
All students enrolled at Massey University in Semester 1 2015
The sample is:
The 250 randomly selected students who were asked to complete the questionnaire
Categorical nominal
Numerical continuous
Categorical nominal
[Diagram: population, sampling frame and sampled population]
Population
Sampling frame
Long-term residents of homes for the elderly, hospitals and psychiatric institutions
Sampled population: the 29,000 who completed the questionnaire
Nonresponse: those selected in the sample who failed to respond (young males are often in this group)
Ineligible
Parameter
Sample
Statistic
Univariate
Bivariate
Multivariate
Categorical nominal
Categorical ordinal
Numeric discrete
Numeric continuous
Experiment
Survey
Observational data
Secondary data
Samples
  Non-probability samples
    Judgement
    Quota
    Chunk
    Convenience
  Probability samples (items in the sample are chosen on the basis of known probabilities)
    Simple random
    Stratified
    Systematic
    Cluster
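A simple random sample is the most basic probability sample. As a minimal sketch, assuming a hypothetical sampling frame of 64 numbered individuals, Python's `random.sample` draws without replacement so that every subset of size n is equally likely.

```python
import random

rng = random.Random(2015)  # fixed seed so the example is repeatable

# Hypothetical sampling frame: ID numbers for N = 64 individuals.
frame = list(range(1, 65))
n = 8

# Draw n individuals without replacement; each has the same chance of selection.
srs = sorted(rng.sample(frame, n))
print("Simple random sample:", srs)
```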
Systematic Samples
Divide the N = 64 individuals into n = 8 groups of k = N/n = 8 individuals each.
Randomly choose a position in the first group (in this case the 3rd), then
select the 3rd individual in each group.
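The systematic procedure can be sketched as follows, using the same sizes (N = 64, n = 8, k = 8). The frame of ID numbers is hypothetical and is assumed to be listed in a fixed order.

```python
import random

N, n = 64, 8
k = N // n  # group size (sampling interval): 64 / 8 = 8
frame = list(range(1, N + 1))  # hypothetical IDs 1..64 in frame order

rng = random.Random(7)
start = rng.randint(1, k)  # random position within the first group

# Take the individual at that position in every group of k.
systematic_sample = frame[start - 1::k]
print(f"start = {start}: {systematic_sample}")

# With the 3rd position, as in the example above:
third_position_sample = frame[3 - 1::k]
print(third_position_sample)  # [3, 11, 19, 27, 35, 43, 51, 59]
```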
Stratified Samples
Population divided into 4 strata.
A sample is drawn from every stratum.
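A minimal sketch of stratified sampling with 4 strata. The stratum sizes (16 each) and the allocation (2 per stratum) are assumptions made for illustration. Other allocations, such as proportional to stratum size, are also possible.

```python
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical population of 64 individuals divided into 4 strata of 16.
strata = {
    "stratum_1": list(range(1, 17)),
    "stratum_2": list(range(17, 33)),
    "stratum_3": list(range(33, 49)),
    "stratum_4": list(range(49, 65)),
}

per_stratum = 2  # assumed allocation: the same number from each stratum

# Take a simple random sample within every stratum.
stratified_sample = {
    name: sorted(rng.sample(members, per_stratum))
    for name, members in strata.items()
}

for name, chosen in stratified_sample.items():
    print(name, chosen)
```

Every stratum contributes to the sample, which is what guarantees representation across the population.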
Cluster Samples
Population divided into 16 clusters.
Clusters are randomly selected, and the individuals
in the selected clusters form the sample.
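A minimal sketch of cluster sampling with 16 clusters. The cluster size (4 individuals) and the number of clusters selected (4) are assumptions for illustration, and every individual in a selected cluster is included.

```python
import random

rng = random.Random(3)  # fixed seed so the example is repeatable

# Hypothetical population of 64 individuals divided into 16 clusters of 4.
clusters = {c: list(range(4 * c + 1, 4 * c + 5)) for c in range(16)}

# Randomly select whole clusters (assumed here: 4 of the 16).
chosen_clusters = sorted(rng.sample(list(clusters), 4))

# The sample is everyone in the chosen clusters.
cluster_sample = [person for c in chosen_clusters for person in clusters[c]]

print("Chosen clusters:", chosen_clusters)
print("Sample:", cluster_sample)
```

Unlike stratified sampling, most clusters contribute nobody to the sample. Only the randomly chosen clusters are observed.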
Advantages and
Disadvantages
Simple random sample and systematic sample
Simple to use
May not be a good representation of the
population's underlying characteristics
Stratified sample
Ensures representation of individuals across the
entire population
Cluster sample
More cost-effective
Less efficient (need larger sample to acquire the
same level of precision as SRS)
Sampling Errors
Non-sampling Errors
See CAST
1.4.6 1.4.8
Excluded from
frame
Follow up on nonresponses
Bad or leading
question
Sampling Errors
See CAST
1.3.3
To Do:
Complete background questionnaire and
maths quizzes on Stream
Sign up for Computer tutorials and
workshops (on Stream)
Complete the first part of the Computer
tutorial
Attempt questions for this section in study
guide (solutions on Stream)
Start Assignment 1 question 1