April 2016

UCAM MASTER IN BUSINESS ADMINISTRATION (MBA)


STATISTICS BASIC CONCEPTS

1. QUALITATIVE AND QUANTITATIVE RESEARCH

2. DESCRIPTIVE STATISTICS AND INFERENTIAL STATISTICS

3. POPULATION AND SAMPLE

4. SAMPLING METHODS

5. SAMPLE SIZE

6. TYPES OF VARIABLES

To study an issue, problem or phenomenon, you can choose either a qualitative or a quantitative methodology.

Qualitative Research vs Quantitative Research

Different methods, tools and procedures to analyse information.

QUALITATIVE RESEARCH

Objective: understanding the deeply hidden nature of phenomena.

Obtaining knowledge about emotions, sensitivity thresholds, barriers, attitudes, evaluations, desires and needs of a target group.

Qualitative research is inductive (used to start the research process).

What matters is what was said, not how many times: processes and meanings are rigorously examined, but not measured in terms of quantity, amount or frequency.

QUANTITATIVE RESEARCH

Objective: determining the relationship between an independent variable and a dependent one.

Quantitative research is deductive (hypotheses are identified before research begins).

Quantitative research often requires recruiting hundreds of participants (to reduce the likelihood of bias).

Complementary approaches:

Qualitative Research: research subject definition. Hypotheses definition.

Quantitative Research: research hypotheses tests. Generalizable conclusions.

Quantitative enumerates, and qualitative explains.

"Measure what can be measured, and make measurable what cannot be measured." Galileo Galilei.

Use qualitative research if most of these conditions apply:

You have no existing research data on this topic.

You are exploring the reasons why people do or believe something.

The most appropriate unit of measurement is not certain (Individuals? Households? Organizations?).

The concept is assessed with no clear demarcation points.

Use quantitative research if most of these conditions apply:

The research is confirmatory rather than exploratory (i.e. this is a frequently researched topic, and numerical data from earlier research is available).

You are trying to measure a trend.

There is no ambiguity about the concepts being measured, and only one way to measure each concept.

Statistics is not only putting numbers into formulas or computers.

Statistics is concerned with the collection, organization and description of a dataset, and the use of probability theory to make predictions that are useful for making decisions under uncertainty.

Statistics is about learning from data.


Descriptive statistics: describing the basic features of the data in a study.

Tables (Frequency Distribution)

Graphs

Statistics (Calculations)

Inferential statistics: drawing conclusions about a population from the analysis of the properties of a data sample drawn from it.

"Inference: using facts you have to learn about facts you don't have." (Gary King)


Population (universe): the entire set of all individuals, items, or subjects whose characteristics are being studied. The size of the population is referred to as N.

Parameter: measurable characteristic of a population. For example, the mean of a population is denoted by the symbol μ.

Sample: subset of items drawn from a population, and used to test hypotheses about that population. The size of the sample is referred to as n.

Statistic: measurable characteristic of a sample. Statistics vary from sample to sample. For example, the mean of a sample is denoted by the symbol x̄.
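The difference between a parameter and a statistic can be illustrated with a short Python sketch. The population below is made-up data (hypothetical monthly incomes), and the sizes N = 1,000 and n = 50 are assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: monthly incomes of N = 1,000 people (made-up data).
population = [random.gauss(2000, 500) for _ in range(1000)]
N = len(population)
mu = statistics.mean(population)  # parameter: population mean

# Draw a sample of n = 50 individuals without replacement.
sample = random.sample(population, 50)
n = len(sample)
x_bar = statistics.mean(sample)  # statistic: sample mean

print(f"N = {N}, population mean = {mu:.2f}")
print(f"n = {n}, sample mean = {x_bar:.2f}")
```

Running the sketch with a different seed changes the sample mean but not the population mean, which is the sense in which statistics vary from sample to sample.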


Sometimes the entire population can be studied (e.g. all members of a small association), and then there is no inaccuracy or error.

But researchers often rely on samples!

Why choose a sample (instead of the entire population)?

Budget or time restrictions (e.g. unemployed people).

Impossibility of identifying and accessing all population members (e.g. people who may suffer insomnia).

Sometimes analyzing an item means destroying it (e.g. bulbs produced by a certain factory).


WHICH individuals? Sampling methods

HOW MANY individuals? Sample size

4. SAMPLING METHODS

Objective: obtain a sample that is representative of the population,

so that our findings could be generalized to the whole group.

SAMPLING METHODS: Probability and Non-Probability

Probability sampling: every element of the population has a known non-zero probability of being selected. Sampling errors can be calculated, and inference can be undertaken.

Non-probability sampling: some elements of the population have no chance of selection, or the probability of selection can't be accurately calculated. Selection relies on assumptions regarding the population. Hence, this sampling does not allow the estimation of sampling errors, and inference cannot be undertaken.


SAMPLING

Probability: Simple Random Sampling, Systematic Sampling, Stratified Sampling, Cluster Sampling

Non-Probability: Convenience Sampling, Judgement Sampling, Quota Sampling

Probability Sampling: Simple Random Sampling (SRS)

Each element of the population has an equal and known probability of being selected.

Each one of them is assigned a number, and the sample is determined by generating random numbers.

Applicable when the population is small, homogeneous and readily available.

It requires a complete and accurate record of the population.

It can only be done with small populations where all individuals are identified.
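As a minimal sketch of simple random sampling, the Python snippet below draws elements from a complete list of the population without replacement. The member list, the sample size of 20 and the function name are hypothetical, introduced only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct elements, each with the same chance of selection."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, so no element appears twice.
    return rng.sample(population, n)

# Hypothetical complete record of a small population (200 association members).
members = [f"member_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(members, 20, seed=1)
print(sample)
```

The sketch makes the requirement explicit: the full list `members` must exist before any element can be drawn.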

Probability Sampling: Systematic Sampling

The population is arranged according to some ordering scheme, a random start is chosen, and then elements are selected at regular intervals (every kth element from then onwards) through that ordered list.

It is simple to carry out (low effort and cost).

It requires a complete and ordered record of the population.

It can produce biased findings if the population data presents any hidden order, periodicity or pattern.

Example:

A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10').
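The 'every 10th' example can be sketched in Python as follows. The directory contents and the function name are hypothetical; the random start is drawn from the first k positions, which is one common way to apply the method.

```python
import random

def systematic_sample(ordered_population, k, seed=None):
    """Pick a random start among the first k elements, then every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start: an index from 0 to k - 1
    return ordered_population[start::k]

# Hypothetical ordered list standing in for a telephone directory.
directory = [f"name_{i:04d}" for i in range(1000)]
sample = systematic_sample(directory, 10, seed=7)
print(len(sample), sample[:3])
```

If the list has a hidden periodicity that matches k, every selected element shares the same position in the cycle, which is the source of bias mentioned above.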


Probability Sampling: Stratified Sampling

The population is divided into heterogeneous non-overlapping groups (strata), which contain fairly homogeneous individuals (e.g. age groups, genders).

Then each stratum is sampled as an independent sub-population, out of which individual elements can be randomly selected, and have the same chance of being selected.

It allows estimating population parameters for each stratum.

It requires prior knowledge of the population characteristics.

Example:

A company has a staff of 180 people, classified by gender and working time. A sample of 40 staff members is needed, using stratified proportional sampling according to those categories.

The first step is to calculate the weight of each group in the total staff:

Stratum              Staff (Ni)   Weight (%)            Sample (ni)
Male, full time      90           (90/180)x100 = 50     20
Male, part time      18           (18/180)x100 = 10     4
Female, full time    9            (9/180)x100 = 5       2
Female, part time    63           (63/180)x100 = 35     14
Total                180          100                   40

50% of the sample individuals should be male full time (20 people), 10% should be male part time (4 people), 5% should be female full time (2 people), and 35% should be female part time (14 people).

Then an SRS within each stratum would be conducted.
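The proportional allocation in this example can be reproduced with a few lines of Python. The stratum counts (90, 18, 9 and 63 out of 180) come from the example; the total sample size of 40 is inferred from the stated allocations (20 + 4 + 2 + 14), and the variable names are illustrative.

```python
# Staff counts per stratum, taken from the example above.
staff = {
    "male full time": 90,
    "male part time": 18,
    "female full time": 9,
    "female part time": 63,
}
sample_size = 40  # inferred: 20 + 4 + 2 + 14
total = sum(staff.values())

# Weight of each stratum in the total staff, in percent.
weights = {group: 100 * count / total for group, count in staff.items()}

# Proportional allocation: each stratum gets its share of the sample.
allocation = {group: round(sample_size * count / total)
              for group, count in staff.items()}

for group in staff:
    print(f"{group}: {weights[group]:.0f}% -> {allocation[group]} people")
```

With other figures the rounded allocations may not add up exactly to the desired sample size, so a final adjustment of one or two units can be needed.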


Probability Sampling: Cluster Sampling

First, the population is subdivided into groups (clusters) that are expected to be homogeneous among each other but heterogeneous internally, so that each of them is as representative of the population as possible.

In a second step, a random sample of these clusters is selected, and either all observations in the selected clusters are included in the sample (one-stage clustering), or a random sample of elements is selected within each of these groups (two-stage clustering).

Simple when the population shows a natural arrangement (e.g. geographical).

Actual clusters are not completely homogeneous, so the sample may not be representative.

Example:

A chain of hardware stores wants to know the buying profile of its customers.

Since it may not be possible to list all of the customers of the chain, it is possible instead to randomly select a subset of stores (stage 1 of cluster sampling) and then interview a random sample of customers who visit those stores (stage 2 of cluster sampling).
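The hardware store example can be sketched as a two-stage procedure in Python. The store and customer names, the numbers of stores and customers, and the function name are all hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick elements within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1
    sample = {}
    for name in chosen:
        members = clusters[name]
        size = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, size)  # stage 2
    return sample

# Hypothetical chain: 12 stores, each with 50 visiting customers.
stores = {
    f"store_{s}": [f"customer_{s}_{c}" for c in range(50)]
    for s in range(12)
}
sample = two_stage_cluster_sample(stores, n_clusters=3, n_per_cluster=10, seed=3)
for store, customers in sample.items():
    print(store, customers[:3])
```

Only the list of stores is needed up front; customer lists are required just for the selected stores, which is what makes the method practical when no complete customer list exists.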

Non-Probability Sampling: Convenience Sampling

Elements are selected for convenience or ease: they are readily available or at hand to the researcher.

Elements are selected arbitrarily from the population, so the sample is not representative of the population.

There is no randomness and the likelihood of bias is high, so it is only adequate for subjective assessments or pilot studies.

Example:

An interviewer conducts a survey at a single location. She goes early in the morning on a given day, so the people she could interview would be limited to those present there at that time. They would not represent the views of other members of society in the area, as they would if the survey were conducted at different times of day and several times per week.


Non-Probability Sampling: Judgement Sampling

Sample selection is based on the researcher's belief that the chosen elements are appropriate for the study.

Often used in political polling: some districts are chosen because their pattern has in the past provided a good idea of outcomes for the whole electorate.

Used very often since it involves low cost and little time.

Elements are selected arbitrarily from the population, so the sample is not representative of the population.

There is no randomness and the likelihood of bias is high, so it is only adequate for subjective assessments or pilot studies.


Non-Probability Sampling: Quota Sampling

First, the population is segmented into mutually exclusive subgroups (just as in stratified sampling), following one or more criteria such as age, income, frequency of purchase, or usage patterns.

Then, in a second step, convenience or the judgment of the researcher is used to select individuals within each group (the sample size from each category is proportional to its weight in the whole population).

It is this second step which makes the technique one of non-probability sampling.

The structure of the population must be known ex-ante in order to obtain a similar structure for the sample.

As a non-probability technique, inference cannot be undertaken.


The choice of sampling method depends on:

Research objectives

Need for statistical analysis and degree of accuracy required.

Available resources (time and funds)

Knowledge regarding the target population

5. SAMPLE SIZE

HOW MANY INDIVIDUALS COMPRISE THE SAMPLE?

Is sample size so important?

Tested on 26 women & men

Tested on 23 women

Tested on 18 women

Tested on 20 men


Sample information is not as accurate and truthful as population information. So, the bigger the sample is, the more precise the information it offers.

But the bigger the sample is, the more expensive the sampling process is.

Insufficient size: no scientific results.

Excessive size: waste of resources.

Sample size depends on:

Population size, N.

Population homogeneity: degree of variability of the measured variable.

Confidence level: percentage of all possible samples that can be expected to include the true population parameter. (It tells you how sure you can be.)

Sampling error (precision required): difference between the population parameter and its sample estimate.

Sampling method.

Statistical technique.
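To show how these factors interact, the sketch below uses a standard textbook formula for estimating a proportion under simple random sampling (Cochran's formula with a finite population correction). This formula is not given in the slides; it is brought in here as an assumption, and the function name and default p = 0.5 (maximum variability) are illustrative choices.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence, margin_of_error,
                               population_size=None, p=0.5):
    """Sample size needed to estimate a proportion under SRS.

    confidence       e.g. 0.95 for a 95% confidence level
    margin_of_error  sampling error tolerated, e.g. 0.05 for +/- 5 points
    population_size  N; if None, the population is treated as very large
    p                expected proportion; 0.5 is the most variable case
    """
    # z-value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    if population_size is None:
        return math.ceil(n0)
    # Finite population correction: small populations need fewer units.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

print(sample_size_for_proportion(0.95, 0.05))
print(sample_size_for_proportion(0.95, 0.05, population_size=180))
```

Raising the confidence level, tightening the margin of error or moving p towards 0.5 increases the required size, while a small N reduces it.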


G*Power 3.1

http://www.psycho.uniduesseldorf.de/abteilungen/aap/gpower3/download-and-register

http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize

ST Plan

https://biostatistics.mdanderson.org/SoftwareDownload/SingleSoftware.aspx?Software_Id=41

http://stat.ubc.ca/~rollin/stats/ssize/

http://www.stat.uiowa.edu/~rlenth/Power/index.html

http://www.raosoft.com/samplesize.html

http://epitools.ausvet.com.au/content.php?page=SampleSize

http://statpages.org/index.html#Power

6. TYPES OF VARIABLES

Variable: any characteristic or attribute that differs for different subjects.

Variables are classified according to their nature or measurement scale.


According to their nature, variables are qualitative or quantitative.

QUALITATIVE or CATEGORICAL

Represent characteristics that cannot be measured or quantified. Such a characteristic is not a number and, if it is coded as a number, it cannot be used for calculations.

Dichotomous (binary or dummy): only two categories are defined.

Gender (male/female), consumer (yes/no)

Polytomous: more than two categories are defined.

Marital status, religious group, ZIP code


QUANTITATIVE or NUMERICAL

Represent characteristics that can be measured or quantified.

Discrete: variable that can assume only a countable number of values.

Number of children in a household, times a place has been visited

Continuous: variable that can assume any numerical value over one or several intervals.

Weight, temperature, salary


Codification: assigning a certain number to each category of the qualitative variable, i.e. using numbers to describe the outcomes.

Gender

a) Male

b) Female

Those numbers do not have any meaning, so they cannot be used for calculations.

Discretisation: converting a quantitative variable into a qualitative variable, according to whether or not the quantitative variable exceeds a critical threshold. It implies a loss of information.

For the variable Monthly income, we can consider the following categories:

Monthly income >= 5,000: high income

2,000 <= monthly income < 5,000: medium-high income

1,000 <= monthly income < 2,000: medium-low income

Monthly income < 1,000: low income
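Codification and discretisation can both be sketched in a few lines of Python. The numeric gender codes (1 and 2) are hypothetical, and the income categories are assumed to be contiguous, with boundaries at 1,000, 2,000 and 5,000.

```python
# Codification: arbitrary numeric labels for the categories (hypothetical codes).
GENDER_CODES = {"Male": 1, "Female": 2}

def income_category(monthly_income):
    """Discretisation: map a monthly income to an ordered category."""
    if monthly_income >= 5000:
        return "high income"
    if monthly_income >= 2000:
        return "medium-high income"
    if monthly_income >= 1000:
        return "medium-low income"
    return "low income"

respondents = [("Male", 5200.0), ("Female", 2100.0), ("Female", 950.0)]
for gender, income in respondents:
    print(GENDER_CODES[gender], income_category(income))
```

Incomes of 2,100 and 4,900 end up in the same category, which is the loss of information that discretisation implies. Likewise, averaging the gender codes would produce a number with no interpretation.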


According to their measurement scale, variables are nominal or ordinal (if qualitative), or interval or ratio (if quantitative).

NOMINAL

Numbers serve only as labels for identifying individuals, but they are randomly assigned. Categories cannot be rank ordered.

Gender, marital status

ORDINAL

Categories can be ordered in a hierarchical fashion, but values cannot provide relative distance.

Ranking of sportsmen, socioeconomic status, opinion


INTERVAL

It provides assignment, order and distance properties, which allows comparison between different individuals. Origin (zero point) is arbitrary.

Time, temperature (°C)

RATIO

It provides assignment, order, distance and origin properties. Origin (zero) has a meaning of absence.

household

Measurement scale determines which statistical techniques

can be applied.
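One common textbook illustration of this point is the choice of a measure of central tendency: mode for nominal data, median for ordinal data, mean for interval or ratio data. That mapping is a widely used convention rather than something stated in the slides, and the function name and sample data below are hypothetical.

```python
import statistics

def central_tendency(values, scale):
    """Pick a summary statistic that is meaningful for the given scale."""
    if scale == "nominal":
        # Only counting is meaningful: report the most frequent category.
        return statistics.mode(values)
    if scale == "ordinal":
        # Order is meaningful, distance is not: report the middle value.
        # Values must be coded so that sorting follows the category rank.
        return statistics.median_low(values)
    if scale in ("interval", "ratio"):
        # Distances are meaningful: the arithmetic mean can be used.
        return statistics.mean(values)
    raise ValueError(f"unknown measurement scale: {scale}")

print(central_tendency(["single", "married", "married"], "nominal"))
print(central_tendency([1, 2, 2, 3, 5], "ordinal"))
print(central_tendency([1000, 2000, 3000], "ratio"))
```

Statistics valid for a lower scale remain valid for a higher one (a mode can be computed for ratio data), but not the other way round.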


Statistics works with RANDOM VARIABLES.

Realisations of a random variable hinge on probability.

A sample/dataset is a collection of realised random variables.


