Management
Term II
4 credits
MGT 408
Business Statistics
A First Course
David M. Levine
Kathryn A. Szabat
David F. Stephan
P. K. Viswanathan
PEARSON PUBLICATIONS, 7e
Additional Readings
• Statistics for Business and Economics - Anderson, Sweeney, Williams
• A survey reported that women were more likely than men to cite seeing photos
or videos, sharing with many people at once, seeing entertaining or funny
posts, learning about ways to help others, and receiving support from
people in their network as reasons to use Facebook.
• Data are facts about the world and are constantly reported as numbers
by an ever-increasing number of sources.
• They can count on other people’s summaries of data and hope they
are correct.
• They can develop their own capability and insight into data by
learning about statistics and its application to business.
Statistics Is Evolving So Businesses Can Use The
Vast Amount Of Data Available
Data → Information
The word statistics is derived from the Latin word ‘status’,
meaning a state.
Statistics is a tool for creating new understanding from a set of
numbers.
Statistics – A way of thinking
Methods that allow you to work with data effectively
Methods that help you make better decisions
DEFINITION
STATISTICS: COLLECTION, COMPILATION, CLASSIFICATION, PRESENTATION, ANALYSIS & INTERPRETATION OF DATA
Statistics
• Art and Science of Collecting and Understanding DATA:
• DATA = Recorded Information
• e.g., Sales, Productivity, Quality, Costs, Return, …
• Why? Because you want:
• Best use of imperfect information:
• e.g., 50,000 customers, 1,600 workers, 386,000 transactions,…
• Good decisions in uncertain conditions:
• e.g., new product launch: Fail? OK? Make you rich?
• Competitive Edge
• e.g., for you and your business!
To Properly Apply Statistics Follow A Framework To Minimize
Possible Errors
DCOVA
• Big data
• Collections of data that cannot be easily browsed or analyzed using traditional
methods.
• Use information systems’ methods to collect and process data sets of all sizes,
including very large data sets that would otherwise be hard to examine efficiently
* The total number of data values in a complete data set is the number of
elements multiplied by the number of variables.
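As a quick check of the rule above, here is a minimal sketch in Python. The company names and figures are hypothetical, invented only for illustration; the variable names are assumptions, not part of the course material.

```python
# Hypothetical data set: each key is an element (a company),
# each inner dict is that element's observation (one value per variable).
observations = {
    "Alpha Corp": {"exchange": "NYSE", "annual_sales_musd": 120.5, "earnings_per_share": 1.10},
    "Beta Ltd": {"exchange": "NASDAQ", "annual_sales_musd": 85.0, "earnings_per_share": 0.72},
    "Gamma Inc": {"exchange": "NYSE", "annual_sales_musd": 60.3, "earnings_per_share": 2.05},
}

n_elements = len(observations)

# Collect the distinct variable names that appear in the observations.
variables = sorted({name for obs in observations.values() for name in obs})
n_variables = len(variables)

# Count every recorded value directly, then compare with the rule.
n_data_values = sum(len(obs) for obs in observations.values())

print(n_elements, n_variables, n_data_values)
print(n_data_values == n_elements * n_variables)
```

The rule holds only for a complete data set, that is, when every element has a value for every variable.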
Data, Data Sets,
Elements, Variables, and Observations
[Figure: a data set shown as a table. Each row is an element (a company name) with its observation; the columns are the variables Stock Exchange, Annual Sales ($M), and Earn/Share ($).]
How Many Variables?
• Univariate data set: One variable measured for each
elementary unit
• e.g., Sales for the top 30 computer companies.
• Can do: Typical summary, diversity, special features
• Bivariate data set: Two variables
• e.g., Sales and # Employees for top 30 computer firms
• Can also do: relationship, prediction
• Multivariate data set: Three or more variables
• e.g., Sales, # Employees, Inventories, Profits, …
• Can also do: predict one from all other variables
Types of Variables
Categorical (qualitative) variables have values that can only be placed
into categories, such as “yes” and “no.”
Scales of measurement: nominal, ordinal, interval, ratio.
The scale determines the amount of information contained in the data.
The scale indicates the data summarization and statistical analyses that are most appropriate.
Levels of Data Measurement
• Students of a university are classified by the school in which they are enrolled using a nonnumeric
label such as Business, Humanities, Education, and so on.
• Alternatively, a numeric code could be used for the school variable (e.g. 1 denotes Business, 2
denotes Humanities, 3 denotes Education, and so on).
• Students of a university are classified by their class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior.
• Alternatively, a numeric code could be used for the class standing variable (e.g. 1 denotes
Freshman, 2 denotes Sophomore, and so on).
Example of Ordinal Measurement
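[Figure: entrants numbered 1 to 6 at a finish line; the order of finish is an ordinal measurement.]
Ordinal Data
[Figure: a five-point scale, 1 2 3 4 5.]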
Numbers or Categories?
• Quantitative Variable: Meaningful numbers
• e.g., Sales, # Employees
• Can add, rank, count
• Qualitative Variable: Categories
• Ordinal Variable: Categories with meaningful ordering
• e.g., Bond rating (AA, A, B, …), Diamonds (VSI, SI, …)
• Can rank, count
• Nominal Variable: categories without meaningful ordering
• e.g., State, Type of business, Field of study
• Can count
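The "can count / can rank / can add" distinction above can be sketched in Python as follows. All values are hypothetical, and the rank order assigned to the bond ratings is an assumption made for this illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical values, invented for illustration only.
field_of_study = ["Business", "Education", "Business", "Humanities", "Business"]  # nominal
bond_rating = ["AA", "A", "B", "A", "AA"]                                         # ordinal
sales = [120.0, 85.5, 60.0, 92.5, 110.0]                                          # quantitative

# Nominal: counting is the only meaningful operation.
field_counts = Counter(field_of_study)
most_common_field = field_counts.most_common(1)[0][0]

# Ordinal: counting and ranking are meaningful, so map each category to a rank.
rating_order = ["B", "A", "AA"]  # assumed order, lowest to highest
rank_of = {rating: rank for rank, rating in enumerate(rating_order)}
sorted_ratings = sorted(bond_rating, key=rank_of.get)
median_rating = sorted_ratings[len(sorted_ratings) // 2]  # middle item of an odd-length list

# Quantitative: arithmetic is meaningful as well.
mean_sales = mean(sales)

print(most_common_field, median_rating, mean_sales)
```

Averaging the rank numbers of the bond ratings would run without error, but the result would not be meaningful, because the distance between adjacent ratings is not defined.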
Interval Level Data
• Distances between consecutive integers are equal
• The data have the properties of ordinal data, and the interval between observations is
expressed in terms of a fixed unit of measure.
• Interval data are always numeric.
Data can be further classified as being categorical or quantitative.
The statistical analysis that is appropriate depends on whether the data for the variable are categorical or quantitative.
In general, there are more alternatives for statistical analysis when the data are quantitative.
Categorical Data
• Labels or names used to identify an attribute of each element
• Often referred to as qualitative data
• Use either the nominal or ordinal scale of measurement
• Can be either numeric or nonnumeric
• Appropriate statistical analyses are rather limited
Quantitative Data
• Quantitative data indicate how many or how much:
• discrete, if measuring how many
• continuous, if measuring how much
• Quantitative data are always numeric.
• Ordinary arithmetic operations are meaningful for quantitative data.
Scales of Measurement
[Diagram: Data divide into Categorical (nominal or ordinal scale) and Quantitative (interval or ratio scale).]
Types of Data
Data
• Categorical (defined categories)
• Examples: Marital Status, Political Party, Eye Color
• Numerical
• Discrete (counted items)
• Examples: Number of Children, Defects per hour
• Continuous (measured characteristics)
• Examples: Weight, Voltage
Data Level, Operations, and Statistical Methods
[Table with columns: Data Level, Meaningful Operations, Statistical Methods]
Cross-sectional data are collected at the same or approximately the same point in time.
Example: data detailing the number of building permits issued in February 2010 in each of the counties of Ohio
Time Series Data
Time series data are collected over several time periods.
Example: data detailing the number of building permits issued in Lucas County, Ohio in each of the last 36 months
Time-Series or Cross-Sectional?
• Time-Series Data: Data values recorded in meaningful sequence
• Elementary units might be days or quarters or years
• e.g., Daily Dow-Jones stock market average close for the past 90 days
• e.g., Your firm’s quarterly sales over the past 5 years
• Cross-Sectional Data: No meaningful sequence
• e.g., Sales of 30 companies
• e.g., Productivity of each sales division
• Easier than time series!
Example
Year Unemployment Rate
2003 5.7%
2004 5.4%
2005 4.9%
2006 4.4%
2007 5.0%
2008 7.3%
2009 9.9%
2010 9.4%
Time series: the elementary unit is defined by “year”; the unemployment rate is quantitative data.
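Because the values in a time series come in a meaningful sequence, one natural summary is the change from one period to the next. The sketch below uses the unemployment rates from the table above; the variable names are assumptions chosen for this illustration.

```python
# Unemployment rate (%) by year, copied from the table above.
unemployment_rate = {
    2003: 5.7, 2004: 5.4, 2005: 4.9, 2006: 4.4,
    2007: 5.0, 2008: 7.3, 2009: 9.9, 2010: 9.4,
}

years = sorted(unemployment_rate)

# Year-over-year change in percentage points, rounded to one decimal place.
changes = {
    year: round(unemployment_rate[year] - unemployment_rate[previous], 1)
    for previous, year in zip(years, years[1:])
}

# The year with the largest increase over the previous year.
largest_rise_year = max(changes, key=changes.get)

for year, change in changes.items():
    print(year, change)
print("Largest rise:", largest_rise_year)
```

This calculation only makes sense because the years are ordered. Shuffling the rows of a cross-sectional data set loses nothing, whereas shuffling a time series destroys the sequence that the changes depend on.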
Stock Market – Time Series
• Dow Jones Stock Index, monthly since 1928
[Figure: Dow Jones Industrial Stock Market Index, monthly from 1928 to early 2011. Vertical axis from 0 to 16,000; horizontal axis: Year, 1920 to 2010.]
Basic Vocabulary of Statistics
POPULATION
A population consists of all the items or individuals about which
you want to draw a conclusion.
SAMPLE
A sample is the portion of a population selected for analysis.
PARAMETER
A parameter is a numerical measure that describes a
characteristic of a population.
STATISTIC
A statistic is a numerical measure that describes a characteristic of
a sample.
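The four terms can be illustrated with a small simulation. This is a sketch under stated assumptions: the population of 10,000 weights is generated artificially (normal with mean 120 and standard deviation 15, both chosen arbitrarily), and the sample size of 100 is also an arbitrary choice.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so that the run is repeatable

# POPULATION: every item we want to draw a conclusion about (hypothetical weights).
population = [random.gauss(120, 15) for _ in range(10_000)]

# PARAMETER: a numerical measure computed from the whole population.
population_mean = mean(population)

# SAMPLE: the portion of the population selected for analysis.
sample = random.sample(population, 100)

# STATISTIC: the same numerical measure computed from the sample only.
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the population is usually too large or too costly to measure in full, so the parameter is unknown and the statistic is used to estimate it.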
Population vs. Sample
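[Diagram: the Sample is a subset of the Population. The Population is described by a Parameter; the Sample is described by a Statistic.]
Populations have parameters; samples have statistics.
Parameters are descriptive measures of a population; statistics are descriptive measures of a sample.
$\sigma^2$ denotes population variance
$\sigma$ denotes population standard deviation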
Symbols for
Sample Statistics
• Collect data
• e.g., Survey
• Present data
• e.g., Tables and graphs
• Characterize data
• e.g., Sample mean = $\frac{\sum X_i}{n}$
Inferential Statistics
• Estimation
• e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis testing
• e.g., Test the claim that the population mean weight is 120 pounds
Drawing conclusions about a large group of individuals based on a subset of the
large group.
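Both examples above can be sketched with a few lines of Python. The ten sample weights are hypothetical, and the one-sample t test used here is one common way to test such a claim, not necessarily the method the course has in mind. The critical value 2.262 is the two-sided 5% value for 9 degrees of freedom, taken from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of weights in pounds, invented for illustration.
sample_weights = [118, 125, 121, 130, 115, 122, 127, 119, 124, 129]
claimed_mean = 120  # the claim being tested: population mean weight is 120 pounds

n = len(sample_weights)

# Estimation: the sample mean is the point estimate of the population mean.
sample_mean = mean(sample_weights)

# Hypothesis testing: how many standard errors is the sample mean from the claim?
sample_sd = stdev(sample_weights)  # sample standard deviation (divides by n - 1)
standard_error = sample_sd / sqrt(n)
t_statistic = (sample_mean - claimed_mean) / standard_error

# Two-sided critical value at the 5% level for n - 1 = 9 degrees of freedom.
critical_value = 2.262
reject_claim = abs(t_statistic) > critical_value

print(f"estimate of population mean: {sample_mean}")
print(f"t statistic: {t_statistic:.3f}")
print("reject the claim" if reject_claim else "do not reject the claim")
```

With these made-up numbers the sample mean is above 120, but not by enough, relative to the sampling variability, to reject the claim at the 5% level.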
Descriptive Statistics
[Diagram: select a random sample from the population, compute a statistic from the sample, and use inference to draw a conclusion about the population parameter.]
Sources of data collection
Collecting Data Correctly Is A Critical Task
Need to avoid data flawed by biases,
ambiguities, or other types of errors.
Secondary Sources: The person performing data analysis is not the data collector
Analyzing census data
Examining data from print journals or data published on the internet.
Government data: economics and demographics
Media reports – TV, newspapers, Internet
Companies that specialize in gathering data
Sources of data fall into five categories
• Data distributed by an organization or an individual
In experimental studies the variable of interest is first identified. Then one or more other variables are identified and controlled so that data can be obtained about how they influence the variable of interest.
The largest experimental study ever conducted is believed to be the 1954 Public Health Service experiment for the Salk polio vaccine. Nearly two million U.S. children (grades 1-3) were selected.
Examples of Survey Data
• A survey asking people which laundry detergent has
the best stain-removing abilities
Studies of smokers and nonsmokers are observational studies because researchers do not determine or control who will smoke and who will not smoke.
Examples of Data Collected From Ongoing
Business Activities
• A bank studies years of financial transactions to help
them identify patterns of fraud.
Time Requirement
• Searching for information can be time consuming.
• Information may no longer be useful by the time it
is available.
Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
Data Errors
• Using any data that happen to be available or were
acquired with little care can lead to misleading
information.
Examples of Types of Variables