
Chapter 1

INTRODUCTION TO
STATISTICS

What is Statistics?
Statistics
Statistics is the science of conducting studies to collect, organize,
summarize, analyze, and draw conclusions from data.
Types of statistics:
i) Descriptive statistics
Describes a phenomenon.
Consists of the collection, organization, summarization, and
presentation of the data.
ii) Inferential statistics
Consists of generalizing from samples to populations, performing
estimations and hypothesis tests, determining relationships among
variables, and making predictions.
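
The two branches can be contrasted with a minimal sketch. The exam scores below are hypothetical, and treating the sample mean as a point estimate with a standard error is only one simple illustration of inference, not a full procedure.

```python
import statistics

# Hypothetical sample: exam scores of 8 students drawn from a larger class.
scores = [72, 85, 90, 66, 78, 85, 94, 70]

# Descriptive statistics: summarize the data that was actually collected.
sample_mean = statistics.mean(scores)
sample_median = statistics.median(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

# Inferential statistics: generalize from the sample to the population.
# The sample mean acts as a point estimate of the population mean, and the
# standard error gives a rough idea of how much that estimate could vary.
standard_error = sample_sd / len(scores) ** 0.5

print(f"Descriptive: mean={sample_mean}, median={sample_median}")
print(f"Inferential: population mean estimated as {sample_mean} "
      f"(standard error {standard_error:.2f})")
```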

Definitions
Population
The collection of all outcomes,
responses, measurements, or
counts that are of interest.

Sample
The collection of data from a subset of
the population.

What is Data?
Data
The responses, counts, measurements, or
observations that have been collected.

Data can be classified as one of 2 types:


1. Qualitative Data
2. Quantitative Data

Qualitative Data
Qualitative Data:
Variables that can be placed into distinct
categories, according to some characteristic
or attribute.
Non-numerical measurements.
Examples:
Gender (male or female)
Geographic locations
Eye color
etc.

Quantitative Data
Quantitative data:
Numerical measurements that can be ordered
or ranked.
Examples:
Age
Weights
Temperature
Heights

Quantitative Data:
Discrete vs. Continuous
Discrete data:
A finite or countable number of possible data values,
such as 0, 1, 2, 3, 4.
Ex: Number of classes a student is taking

Continuous data:
infinite number of possible data values on a
continuous scale.
Often include fractions and decimals.
Ex: Weight of a baby

Measuring Variables
To establish relationships between variables,
researchers must observe the variables and
record their observations. This requires that the
variables be measured.
The process of measuring a variable requires a
set of categories called a scale of
measurement (or measurement scales) and a
process that classifies each individual into one
category.

4 Types of Measurement Scales

1. Nominal scale
Classifies data into mutually exclusive (nonoverlapping),
exhaustive categories in which no order or ranking can be
imposed on the data.
An unordered set of categories identified only by name.
Nominal measurements only permit you to determine
whether two individuals are the same or different.

Example:
1. Classified according to subject taught:
History, Mathematics, English, Psychology
2. Classifying survey subjects as male or female
3. Marital status: Single, Married, Divorced, Separated

2. Ordinal scale
Classifies data into categories that can be ordered or
ranked.
Ordinal measurements tell you the direction of
difference between two individuals.
Example:
1. Student evaluation (from 1 to 5)
2. Rating of a guest speaker (superior, average, or poor)
3. Sample size judged as small, medium, or large

3. Interval scale
Classifies data into categories that can be ranked.
An ordered series of equal-sized categories.
Interval measurements identify the direction and magnitude
of a difference.
There is no meaningful zero.
Example:
1. Temperature, since there is a meaningful difference of 1°F
between each unit, such as 72°F and 73°F (slightly hotter). 0°F
does not mean no heat at all.
2. IQ, since there is a meaningful difference of 1 point
between an IQ of 109 and an IQ of 110. IQ tests do not
measure people who have no intelligence.


4. Ratio scale
An interval scale with a true zero: a value of zero indicates
none of the variable.
Ratio measurements identify the direction and magnitude of
differences and allow ratio comparisons of measurements.
Example:
1. Height
2. Weight
3. Number of phone calls received
4. Salary
5. Age
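
The difference between the interval and ratio scales can be checked numerically. The sketch below uses hypothetical temperatures and weights; the helper names are illustrative only. A ratio is meaningful only if it survives a change of units, which holds for weight (true zero) but not for temperature in °F or °C (arbitrary zero).

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

def kg_to_lb(kg):
    """Convert a weight from kilograms to pounds (approximate factor)."""
    return kg * 2.20462

# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
temp_a_f, temp_b_f = 80.0, 40.0
ratio_f = temp_a_f / temp_b_f
ratio_c = fahrenheit_to_celsius(temp_a_f) / fahrenheit_to_celsius(temp_b_f)

# Ratio scale: zero means "none of the variable", so ratios are unit-free.
weight_a_kg, weight_b_kg = 80.0, 40.0
ratio_kg = weight_a_kg / weight_b_kg
ratio_lb = kg_to_lb(weight_a_kg) / kg_to_lb(weight_b_kg)

print(f"Temperature ratio in F: {ratio_f:.1f}, in C: {ratio_c:.1f}")
print(f"Weight ratio in kg: {ratio_kg:.1f}, in lb: {ratio_lb:.1f}")
```

The temperature ratios disagree between units, so "twice as hot" is not a valid statement on an interval scale, while "twice as heavy" holds in any unit.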

Methods of Collecting Data

Observational study
Survey
Experiment
Simulation


Methods of Collecting Data


Observational study
A researcher observes or measures
characteristics of interest of part of a
population but does not change any existing
conditions.

Experiment
A treatment is applied to part of a population
and responses are observed.

Methods of Collecting Data


Survey
An investigation of one or more characteristics of
a population, usually by asking people questions.
Commonly done by interview, mail, or telephone.

Simulation
Uses a mathematical or physical model to
reproduce the conditions of a situation or
process. Often involves the use of computers.
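
As an illustration of a computer simulation, the sketch below estimates the probability that two fair dice sum to 7 by rolling them many times in software. The function name, trial count, and seed are arbitrary choices made for this example.

```python
import random

def estimate_probability_of_seven(trials, rng):
    """Simulate rolling two fair dice `trials` times and return the
    fraction of rolls whose sum is 7."""
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials

# A seeded generator makes the simulation reproducible.
estimate = estimate_probability_of_seven(20_000, random.Random(1))
print(f"Simulated: {estimate:.3f}  Exact: {1 / 6:.3f}")
```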

Sampling Techniques

Random versus Non-Random Samples


Convenience Samples
Simple Random Samples
Systematic Sampling
Stratified Sampling
Cluster Sampling


Random and Non-Random Sampling

Random Sampling
Every member of the population has an equal
chance of being selected.

Non-Random Sampling
Some members of the population have no
chance of being picked. Often leads to
biased samples.


Convenience Samples
Data that is readily available and easy to get
is collected.

Self-selected surveys or voluntary response
surveys (online surveys, magazine surveys)

Often biased in some way, such as self-selection
bias, which arises when people choose to
participate because they have an interest in
the issue in question.

Simple Random Sample


A random sample where every member of the
population and every group of the same size
has an equal chance of being selected.
This produces an unbiased sample which we
hope is representative.
However, it can be difficult and expensive to
take a simple random sample when dealing
with people.
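
A simple random sample is straightforward to draw when the whole population is available as a list. This is a minimal sketch using Python's standard `random.sample`; the population of 100 numbered members and the seed are hypothetical.

```python
import random

# Hypothetical population: 100 members identified by the numbers 1..100.
population = list(range(1, 101))

# random.sample draws without replacement, so every group of 10 members
# is equally likely to be chosen.
rng = random.Random(42)  # seeded only so the example is reproducible
simple_random_sample = rng.sample(population, 10)

print(sorted(simple_random_sample))
```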


Systematic Sampling
Choose a starting value or starting point at
random. Then, choose every kth member of
the population.
Example: Select every 3rd patient who enters
the hospital.
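
The patient example can be sketched in a few lines. The function name and the list of 30 numbered patients are assumptions made for illustration.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then take every kth."""
    start = rng.randrange(k)  # random starting index in 0..k-1
    return population[start::k]

# Hypothetical arrival order of 30 patients, numbered 1..30.
patients = list(range(1, 31))
selected = systematic_sample(patients, 3, random.Random(0))

print(selected)
```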


Stratified Sampling
Divide a population into at least 2 different
subgroups (strata) whose members share the same
characteristics (age, gender, ethnicity,
income, etc.) and select a random sample
from each group.
Advantages:
Unbiased
Good random representative sample
Obtain more information
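
A minimal sketch of stratified sampling follows. The population of 60 people, the two age groups, and the choice of 5 members per stratum are hypothetical; in practice the per-stratum sample size is often made proportional to the stratum size.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, n_per_stratum, rng):
    """Group records into strata, then draw a simple random sample from each."""
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for members in strata.values():
        # Guard against strata smaller than the requested sample size.
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

# Hypothetical population: (person_id, age_group) pairs.
people = [(i, "under 30" if i % 3 else "30 and over") for i in range(1, 61)]
chosen = stratified_sample(people, lambda p: p[1], 5, random.Random(7))

print(chosen)
```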


Cluster Sampling
Divide the population into many similar
subgroups (clusters); randomly select some
of those clusters, and then select all of the
members of those clusters to be in the
sample.
Advantage: useful for geographically separated
populations
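
Cluster sampling differs from stratified sampling in that whole clusters are chosen at random and every member of a chosen cluster is included. The hospital wards and patient labels below are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters cluster names, then keep ALL members of each."""
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen_names for m in clusters[name]]
    return chosen_names, members

# Hypothetical clusters: patients grouped by hospital ward.
clusters = {
    "Ward A": ["A1", "A2", "A3"],
    "Ward B": ["B1", "B2"],
    "Ward C": ["C1", "C2", "C3", "C4"],
    "Ward D": ["D1", "D2"],
}
chosen_names, members = cluster_sample(clusters, 2, random.Random(3))

print(chosen_names, members)
```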


Exercises
Classify each of the following as nominal-level,
ordinal level, interval-level, or ratio-level
measurement.
1. Pages in the city of Malaysia telephone book.
2. Rankings of tennis players.
3. Weights of cupboards.

6. Ages of students in a classroom.

7. Marital status of the patients in a physician's office.

8. Temperatures inside 10 refrigerators.


Exercises
Classify each variable as discrete or continuous.
1. Number of doughnuts sold each day by Doughnut
Heaven.

2. Weights of cats.


4. Lifetime (in hours) of 15 flashlight batteries.


5. Number of cheeseburgers sold each day by a
hamburger stand on a college campus.
6. Number of DVDs rented each day by a video store.


Exercises
Classify each variable as qualitative or quantitative.
1. Number of bicycles sold in 1 month by a store.
2. Colors of balloons at a party.
3. Times it takes to drive to school.
4. Capacity in cubic feet of six truck beds.

5. Classification of children in a day-care center (infant,
toddler, preschool).
Qualitative
6. Weights of fish caught in Lake Mystery.
Quantitative

