
115.101 Statistics for Business


NOTICES
Missed the first lecture?
You can download the lecture slides from the
115.101 Stream site (stream.massey.ac.nz)



If you can't access Stream
1. You need to check your enrolment on the Massey
website (using MyEnrolment) and make sure that
you have confirmed your enrolment.
2. Otherwise you should go to Massey Contact on
Level 1 of Quad A to get help.
Bottom line
You can't function in this paper unless you are on
Stream, and you have to be enrolled to get on
Stream!



Please complete the questionnaire and Maths
quizzes on Stream as soon as possible!
Remember to sign up for computer tutorials
(soon) and workshops (later in the week) on
Stream.
Start working on the first part of the computer
tutorial (download from Stream) at home or in a
computer lab.
Please contact me if you spot any errors or
inconsistencies on Stream

Your Statistics Background


If you have done some Statistics before (e.g.
NZ high school) then much of the material
covered in the first three weeks will be familiar,
but with a focus on using Excel;
If you haven't done Statistics before
(e.g. overseas high school) then you should use
the Study Guide and textbook for background
reading and exercises;
EVERYONE needs to understand what
descriptive statistics are, why they are important
and how to use them correctly.

Are you a new BBS student? Yes?


You are expected to attend

When: Wednesday 25th February


Where: Come to SNW Foyer @ 8.30am
Time: Busy all day and party @ Fergs at night
Huh? No classes on
What: Fun day full of activities
Why: Success! Be part of it
Experience and engage
Ignite your BBS degree

Chapter 1
Introduction and Data Collection

Learning Outcomes (a)


By the end of this chapter you should be able
to:
Appreciate the meaning and importance of
statistical thinking
Recognise sources of data and understand data
types
Understand and use correctly the terms:
parameter, statistic, population and sample

Learning Outcomes (b)


By the end of this chapter you should be
able to:
Select a probability sample using random
numbers
Critique a survey design, identifying sources of
error

Basic Concepts of Statistics


Statistics is concerned with:
Processing and analysing data
Collecting, presenting and transforming
data to assist decision-makers

Why is data so important?


Data capture and storage capabilities have
EXPLODED in recent years
Everyone is talking about BIG DATA but
very few people know how to analyse it
sensibly
WATCH THE HAL VARIAN VIDEO ON
STREAM

Example
Jane is considering enrolling in a degree
programme at a New Zealand university.
Before doing so she wants to know what
difference a university degree might
make to her earning potential.
What information could she use to base
her decision on?
1. Advice from friends, relatives, social media etc.
2. Search relevant media?
3. Objective information source

Statistics NZ conducts a survey of individuals' incomes called the
New Zealand Income Survey (NZIS)

Key Definitions
A population consists of all the members of
a group about which you want to draw a
conclusion
A sample is the portion of the population
selected for analysis
A parameter is a numerical measure that
describes a characteristic of a population
A statistic is a numerical measure that
describes a characteristic of a sample
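
The four definitions above can be illustrated with a minimal Python sketch. The population here is simulated (it is not NZIS data), and the names `population`, `sample`, `population_mean` and `sample_mean` are chosen only for this illustration.

```python
import random
import statistics

# Hypothetical population: weekly incomes (in dollars) for 10,000 people.
# These values are simulated for illustration; they are not NZIS data.
rng = random.Random(42)
population = [rng.uniform(200, 1500) for _ in range(10_000)]

# Parameter: a numerical measure computed from the whole population.
population_mean = statistics.mean(population)

# Sample: the portion of the population selected for analysis.
sample = rng.sample(population, 250)

# Statistic: the same measure computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.2f}")
print(f"Statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is used to estimate it.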

Population vs. Sample

Population: All people in New Zealand aged 15 and over
Sample: The 29,000 people in New Zealand aged 15 and over who participated in the NZIS

Measures used to describe a population are called parameters, e.g. mean income of population
Measures computed from sample data are called statistics, e.g. mean income of sample

Two Branches of Statistics

Descriptive statistics (Chapters 1-3)
Collecting, summarising and presenting data

[Probability] (Chapter 4)

Inferential statistics (Chapter 5 onwards)
Drawing conclusions about a population based on sample data, i.e. estimating a
parameter based on a statistic

Descriptive Statistics
Collect data
e.g. Survey

Present data
e.g. Tables and graphs

Characterise data
e.g. Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
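
To make the sample mean concrete: add up the observed values and divide by the number of observations n. A short Python sketch follows; the five income values are invented for illustration.

```python
# Hypothetical weekly incomes (dollars) for a sample of n = 5 people.
incomes = [200, 300, 400, 500, 600]

n = len(incomes)
sample_mean = sum(incomes) / n   # (sum of the X values) / n

print(sample_mean)   # 400.0
```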

Inferential Statistics
Estimation
e.g. Estimate the population mean income
(parameter) using the sample mean income
(statistic)

Hypothesis testing
e.g. Test the claim that the population mean
income of those with a degree is higher than
population mean income of those without a degree.

Drawing conclusions about a population based on sample results.

See CAST 1.2.1

More vocabulary

Variables: do we look at them individually*, or consider relationships between them**?
*Univariate (1 variable)
**Bivariate (2 variables)
**Multivariate (2 or more variables)

Qualification   Income source   Income from source   Total income
underC          wage            $234                 $234
schoolC         wage            $399                 $399
sixthform       Self employ     $196                 $304
highersch       Govt trans      $150                 $200
sixthform       wage            $298                 $298
highersch       other           $50                  $679

Types of Variables

Data
  Categorical variables (defined categories)
    Nominal
    Ordinal (ordered categories)
  Numeric variables (quantitative)
    Discrete (counted items)
    Continuous (measured characteristics)

Types of Variables
See CAST 1.2.2
Berenson 1.4

Categorical (Nominal)
Simply classifies data into categories e.g. marital status, hair colour, gender

Categorical (Ordinal)
Classifies data into ordered categories e.g. letter grades, tennis
rankings, Likert scales

Numerical (Discrete)
Counted items (finite number of items) e.g. number of children,
number of people who have type O blood

Numerical (Continuous)
Measured characteristics (infinite number of items) e.g. weight,
height, temperature, income
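
One way to keep the four variable types straight is to record each variable's type explicitly before analysing it. The sketch below uses examples from the slide; the names `VariableType`, `examples` and `is_categorical` are hypothetical and exist only for this illustration.

```python
from enum import Enum

class VariableType(Enum):
    NOMINAL = "categorical (nominal)"
    ORDINAL = "categorical (ordinal)"
    DISCRETE = "numerical (discrete)"
    CONTINUOUS = "numerical (continuous)"

# Examples taken from the slide; the variable names are illustrative.
examples = {
    "marital_status": VariableType.NOMINAL,
    "letter_grade": VariableType.ORDINAL,
    "number_of_children": VariableType.DISCRETE,
    "height": VariableType.CONTINUOUS,
}

def is_categorical(vtype):
    """True for nominal and ordinal variables, False for numerical ones."""
    return vtype in (VariableType.NOMINAL, VariableType.ORDINAL)

for name, vtype in examples.items():
    kind = "categorical" if is_categorical(vtype) else "numerical"
    print(f"{name}: {vtype.value} -> {kind}")
```

The distinction matters because summaries such as the mean only make sense for numerical variables.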

Collecting Data
See CAST 1.2.6
Berenson 1.3

1. Important sources:
Data distributed by organisation or individual
Designed experiment
Survey
Observational study

2. Data sources can be classified as:
Primary sources: collected for your problem
Secondary sources: collected for someone else's problem

Using Available Data (secondary sources)
Print or Electronic
Usually produced for some other purpose
but may help us answer a present
question.
e.g. What is the average weekly income
for those with degrees?
Use published New Zealand Income
Survey data from Statistics NZ

Experiments
Experimentation allows us to study the
specific treatments that are of interest.
Control for other (confounding) variables.
Draw conclusions about the effect of one
variable on another.
Study the combined effects of several
factors simultaneously
EXAMPLE: Does shelf height affect sales of a
certain supermarket item?

Observational Data
In an observational study the researcher
observes subjects (usually in a natural
setting) and measures variables of interest
but does not impose any treatment.
EXAMPLES
Last month's sales figures
Focus groups

Surveys
No control over behaviour
Questions asked

Summary questions
Suppose I am interested in knowing the characteristics
of students who are enrolled this semester at Massey
University. As well as using information from the
Massey University database I select a random sample of
250 students enrolled in at least one paper this semester
and ask each student to complete a questionnaire.
The population of interest is:
All students enrolled at Massey University in Semester 1 2015
The sample is:

The 250 students surveyed

Summary questions continued

The average (mean) age of all Semester 1, 2015 students is calculated. This is a:
parameter
Suppose one question asked in the survey was about weekly income. The average weekly income
of the 250 students is a:
statistic
The average weekly income and a graph of weekly income are an example of:
Descriptive statistics
It is reported that the majority of students own a car. This is most likely an example of:
Inferential statistics
Two sources of data are used. What are they?
(Secondary) electronic and (primary) survey

Classify the following datasets as categorical nominal, categorical ordinal,
numerical discrete or numerical continuous.
Age of students: Numerical continuous
Owning a car: Categorical nominal
Income: Numerical continuous
The number of papers a student is enrolled in: Numerical discrete
The time required to make each coffee in a café: Numerical continuous
The ranking of four espresso machines after they have been designated as
excellent, good, satisfactory or poor: Categorical ordinal
Telephone area codes: Categorical nominal

Reasons for Drawing a Sample
See CAST 1.3.1
Berenson 7.4

Less time-consuming than a census
(census = survey of the entire population)
Less costly to administer than a census
Less cumbersome and more practical to
administer than a census of the targeted
population

Population & Sampling Frame


The Target Population is the complete
set of individuals, objects, etc that we
want information about.
The Sampling Frame is a list of
individuals, objects, etc in the population
from which the sample will be drawn.
Ideally this list includes all of (and only)
the Target Population. Often they are
NOT the same.

Population vs Sampling Frame

Target population: all independent people in NZ aged 15 and over
Sampling frame
Sampled population: the 29,000 who completed the questionnaire
Nonresponse: those selected in the sample who failed to respond (young males are often in this group)
Ineligible: long-term residents of homes for the elderly, hospitals and psychiatric institutions

Summary: Terminology of Data Collection

Population
Parameter
Sample
Statistic

Univariate
Bivariate
Multivariate

Categorical nominal
Categorical ordinal
Numeric discrete
Numeric continuous

Experiment
Survey
Observational data
Secondary data

Types of Samples Used


Non-probability sample
Items included are chosen without
regard to their probability of
occurrence

Probability sample
Items in the sample are chosen on
the basis of known probabilities

Types of Samples Used

Samples
  Non-probability samples (not studied in this paper)
    Judgement
    Quota
    Chunk
    Convenience
  Probability samples
    Simple random
    Stratified
    Systematic
    Cluster

Probability Sampling
Items in the sample are chosen based
on known probabilities
Probability samples: simple random, systematic, stratified, cluster

Simple Random Sample (SRS)


Every individual or item from the frame
has an equal chance of being selected
Selection may be with replacement or
without replacement
Samples obtained from table of random
numbers or computer random number
generators

Choosing a Simple Random Sample
See CAST 1.3.4
Berenson 7.4

1. Number each individual in the population
2. Randomly choose the required number of individuals
Could use the random number generator on a calculator (random or ran#)
Or use =RAND() or =RANDBETWEEN(a,b) in Excel
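
The two steps above can also be sketched in Python using the standard `random` module. The frame size of 64 and the sample size of 8 are assumptions made for this illustration, as is the fixed seed.

```python
import random

# Step 1: number each individual in the (hypothetical) frame, 1..N.
N = 64
frame = list(range(1, N + 1))

rng = random.Random(2015)   # fixed seed so the example is repeatable

# Step 2a: without replacement, so no individual can be chosen twice.
srs_without = rng.sample(frame, 8)

# Step 2b: with replacement, so the same individual may appear more than once.
srs_with = [rng.choice(frame) for _ in range(8)]

print(sorted(srs_without))
print(sorted(srs_with))
```

Every individual in the frame has the same chance of selection on each draw, which is what makes this a simple random sample.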

Systematic Samples
See CAST 1.4.5
Berenson 7.4

Decide on sample size n
Divide frame of N individuals into groups of k individuals: k = N/n
Randomly select one individual from the 1st group
Select every kth individual thereafter

Systematic Samples
Example: N = 64, n = 8, k = 8
Randomly choose a position in the first group (in this case the 3rd), then
select the 3rd individual in each group
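
The systematic procedure can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes N is an exact multiple of n, as in the N = 64, n = 8 example.

```python
import random

def systematic_sample(frame, n, rng):
    """Pick one random position in the first group, then every k-th item."""
    k = len(frame) // n            # group size, k = N / n
    start = rng.randrange(k)       # random position (0-based) in the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 65))         # N = 64 individuals numbered 1..64
sample = systematic_sample(frame, n=8, rng=random.Random(1))
print(sample)
```

Only the starting position is random; once it is chosen, the rest of the sample is fixed.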

Stratified Samples
See CAST 1.4.1
Berenson 7.4

Divide population into two or more subgroups (called strata) according to some
common characteristic
A simple random sample is selected from each subgroup, with sample sizes
proportional to strata sizes
Samples from subgroups are combined into one

Stratified Samples
[Figure: population divided into 4 strata; sample]

Stratified Samples - Example

A researcher wants to survey company Internet
presence, and believes that the size of the
company is an important factor.
Stratifies companies into 3 sizes: Large,
Medium, Small
Chooses random samples of 3 Large, 10 Medium
and 30 Small companies to survey
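
A proportional stratified sample like the one in this example can be sketched in Python. The stratum sizes (30 Large, 100 Medium, 300 Small) are assumptions chosen so that a sample of 43 reproduces the 3, 10 and 30 on the slide; the slide itself does not give the stratum sizes.

```python
import random

# Hypothetical strata sizes (assumed; the slide gives only the sample sizes).
strata = {
    "Large": [f"L{i}" for i in range(30)],
    "Medium": [f"M{i}" for i in range(100)],
    "Small": [f"S{i}" for i in range(300)],
}
total = sum(len(members) for members in strata.values())   # 430 companies
n = 43                                                      # overall sample size

rng = random.Random(7)
sample = []
sizes = {}
for name, members in strata.items():
    # Sample size proportional to the stratum's share of the population.
    n_h = round(n * len(members) / total)
    sizes[name] = n_h
    sample.extend(rng.sample(members, n_h))   # simple random sample within the stratum

print(sizes)   # {'Large': 3, 'Medium': 10, 'Small': 30}
```

With other stratum sizes the rounded allocations may not add up exactly to n, so a real design would need a rule for adjusting them.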

Cluster Samples
See CAST 1.4.2
Berenson 7.4

Population is divided into several clusters, each representative of the population
A simple random sample of clusters is selected
All items in the selected clusters can be used, or items
can be chosen from a cluster using another probability
sampling technique (two-stage sampling)

Cluster Samples
[Figure: population divided into 16 clusters; randomly selected clusters form the sample]

Cluster Samples - Example

A nationwide bank employs a market research
company to conduct an independent face-to-face
satisfaction survey of customer services managers
Researcher chooses ten bank branches at
random and conducts survey interviews of all
customer services managers at those branches
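
The bank example can be sketched in Python as a one-stage cluster sample. The number of branches (50) and the number of managers per branch (2 to 5) are invented for this illustration; only the choice of ten branches comes from the slide.

```python
import random

rng = random.Random(3)

# Hypothetical bank: 50 branches (clusters), each with 2-5 managers.
branches = {
    f"branch_{b}": [f"manager_{b}_{m}" for m in range(rng.randint(2, 5))]
    for b in range(50)
}

# Simple random sample of clusters (branches), not of individual managers.
chosen = rng.sample(sorted(branches), 10)

# One-stage cluster sample: survey everyone in the chosen clusters.
sample = [person for b in chosen for person in branches[b]]

print(len(chosen), len(sample))
```

A two-stage version would replace the last step with a further random sample of managers within each chosen branch.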

Advantages and Disadvantages
Simple random sample and systematic sample
Simple to use
May not be a good representation of the
population's underlying characteristics

Advantages and Disadvantages
See CAST 1.4.4
Berenson 7.4

Stratified sample
Ensures representation of individuals across the
entire population

Cluster sample
More cost-effective
Less efficient (need larger sample to acquire the
same level of precision as SRS)

Evaluating Survey Worthiness
See Berenson 7.5

What is the purpose of the survey?
Is the survey based on a probability sample?
Survey errors are of two types:
Non-sampling errors
Sampling errors

Non-sampling Errors
See CAST 1.4.6-1.4.8

Coverage error (leads to selection bias)
Occurs if some groups are excluded from the frame and have no chance of
being selected

Non-response error (leads to non-response bias)
People who do not respond may be different from those who do respond
Follow up on nonresponses
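
Non-response bias can be illustrated with a small simulation. Everything in this Python sketch is invented: two groups with different average incomes, and the assumption that the lower-income group responds less often.

```python
import random
import statistics

rng = random.Random(11)

# Hypothetical population: (group, weekly income). Group "A" earns less on
# average and is assumed to respond less often; all numbers are invented.
population = (
    [("A", rng.gauss(500, 100)) for _ in range(5_000)]
    + [("B", rng.gauss(800, 100)) for _ in range(5_000)]
)
response_rate = {"A": 0.3, "B": 0.9}

# Select a simple random sample, then keep only those who respond.
selected = rng.sample(population, 1_000)
respondents = [p for p in selected if rng.random() < response_rate[p[0]]]

true_mean = statistics.mean(income for _, income in population)
respondent_mean = statistics.mean(income for _, income in respondents)

print(f"Population mean:  {true_mean:.0f}")
print(f"Respondent mean:  {respondent_mean:.0f}")
```

Because the people who respond differ from those who do not, the respondent mean overstates the population mean even though the initial selection was random.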

Non-sampling Errors (continued)

Measurement or instrument error (good questions elicit good responses)
Ambiguous wording of questions
Bad or leading question
Halo effect
Respondent error

Clerical or recording errors may also occur

Sampling Errors
See CAST 1.3.3

Sampling error (margin of error) always exists
Variation from sample to sample will always exist
Reduced by taking a larger sample
[Figure: random differences from sample to sample]
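
The claim that a larger sample reduces sampling error can be checked with a simulation. In this Python sketch the population is simulated, and the sample sizes (25 and 400) and the helper name `spread_of_sample_means` are choices made only for this illustration.

```python
import random
import statistics

rng = random.Random(0)
# Hypothetical population of weekly incomes (simulated, not real data).
population = [rng.gauss(700, 200) for _ in range(20_000)]

def spread_of_sample_means(n, repeats=500):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

small = spread_of_sample_means(25)
large = spread_of_sample_means(400)
print(f"Spread of sample means, n=25:  {small:.1f}")
print(f"Spread of sample means, n=400: {large:.1f}")
```

The sample means still vary from sample to sample at n = 400, but much less than at n = 25.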

To Do:
Complete background questionnaire and
maths quizzes on Stream
Sign up for Computer tutorials and
workshops (on Stream)
Complete the first part of the Computer
tutorial
Attempt questions for this section in study
guide (solutions on Stream)
Start Assignment 1 question 1
