Introduction to Statistics
Chapter 1
Our journey begins with a basic question:
What is Statistics?
“Statistics is a discipline that is dedicated to drawing actionable insights from available data.”

Statistics
Definition
• Science of gathering, presenting, analyzing, and interpreting data
• Uses mathematics and probability

Parameter vs. Statistic
• Parameter: descriptive measure of the population
o Usually represented by Greek letters
  μ denotes population mean
  σ² denotes population variance
  σ denotes population standard deviation
• Statistic: descriptive measure of a sample
o Usually represented by Roman letters
  x̄ denotes sample mean
  s² denotes sample variance
  s denotes sample standard deviation
Copyright 2011 John Wiley & Sons, Inc.
Parameter vs. Statistic
[Diagram: a Sample is a Subset of the Population. The Population has a Parameter; the Sample has a Statistic.]
Populations have Parameters, Samples have Statistics.
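The distinction can be illustrated with a minimal Python sketch. The population of ten worker ages and the choice of the first five as the sample are hypothetical, made up only for illustration. The standard library's `statistics` module provides separate functions for population measures (`pvariance`, `pstdev`) and sample measures (`variance`, `stdev`).

```python
import statistics

# Hypothetical population: ages of all 10 workers in a small firm.
population = [25, 53, 35, 42, 27, 31, 46, 58, 39, 44]

# Parameters describe the whole population (variance divides by N).
mu = statistics.mean(population)
sigma_sq = statistics.pvariance(population)
sigma = statistics.pstdev(population)

# Hypothetical sample: the first five workers.
sample = population[:5]

# Statistics describe the sample (variance divides by n - 1).
x_bar = statistics.mean(sample)
s_sq = statistics.variance(sample)
s = statistics.stdev(sample)

print(f"Population: mu={mu}, sigma^2={sigma_sq:.2f}, sigma={sigma:.2f}")
print(f"Sample:     x_bar={x_bar}, s^2={s_sq:.2f}, s={s:.2f}")
```

The sample values differ from the population values because the sample contains only part of the population.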
Indian Census
• Every 10 years, the government attempts to count all persons living in the country.
• The Census 2011 was the 15th national census survey conducted by the Census Organization of India.
Branches of Statistics
Descriptive Statistics
If a business analyst uses data gathered on a group to describe or reach conclusions about that same group, the statistics are called descriptive statistics. The methods include:
• Graphical techniques
• Numerical techniques

The actual method used depends on what information we would like to extract. Are we interested in:
• measure(s) of central location? and/or
• measure(s) of variability (dispersion)?
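As a small sketch of the numerical techniques, the following Python snippet computes two measures of central location and two measures of variability. It uses the student marks that appear as example data in this chapter. The choice of mean, median, range and sample standard deviation is an assumption made for illustration; other measures would serve equally well.

```python
import statistics

# Student marks taken from the example data in this chapter.
marks = [67, 74, 71, 83, 93, 55, 48]

# Measures of central location.
mean_mark = statistics.mean(marks)
median_mark = statistics.median(marks)

# Measures of variability (dispersion).
mark_range = max(marks) - min(marks)
std_dev = statistics.stdev(marks)  # sample standard deviation

print(f"mean={mean_mark:.2f}, median={median_mark}")
print(f"range={mark_range}, sample std dev={std_dev:.2f}")
```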

Inferential Statistics
Descriptive statistics describe the data set being analyzed, but they do not allow us to draw conclusions or make inferences beyond that data. Hence we need another branch of statistics: inferential statistics.

Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.

Statistical Inference
Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.
[Diagram: a Sample is drawn from the Population; the Sample's Statistic is used to make an Inference about the Population's Parameter.]
What can we infer about a Population’s Parameters based on a Sample’s Statistics?
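One possible way to make this concrete is the sketch below, which estimates a population mean from a sample. Everything in it is an assumption for illustration: the simulated population of store sales, the sample size of 50, and the use of a normal approximation (the 1.96 multiplier) for a rough 95% interval. It is not a prescribed method from this chapter.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: monthly sales (in units) of 10,000 stores.
population = [random.gauss(500, 80) for _ in range(10_000)]

# In practice only a sample is observed.
sample = random.sample(population, 50)

# Statistics computed from the sample.
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

# Approximate 95% interval for the population mean (normal approximation).
std_error = s / math.sqrt(len(sample))
lower = x_bar - 1.96 * std_error
upper = x_bar + 1.96 * std_error

print(f"Point estimate of the population mean: {x_bar:.1f}")
print(f"Approximate 95% interval: ({lower:.1f}, {upper:.1f})")
```

The sample mean serves as the estimate of the unknown parameter, and the interval expresses how uncertain that estimate is.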
Data and Data Sets
• Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
• All the data collected in a particular study are referred to as the data set for the study.
Definitions
A variable is a characteristic of any entity being studied that is capable of taking different values.
E.g. student grades, age of a worker, return on investment, total sales.
Typically denoted with a capital letter: X, Y, Z…

The values of a variable are the range of possible values for that variable.
E.g. student marks (0..100)
age of a worker (18..65)

Data are the observed values of a variable.

E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
age of a worker: {25, 53, 35, 42, 27}
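The three definitions can be sketched in Python as follows. The dictionary layout and the helper name `out_of_range` are hypothetical choices for illustration; the ranges and observations are the ones listed above.

```python
# Possible values of each variable, as (minimum, maximum).
possible_values = {
    "student marks": (0, 100),
    "age of a worker": (18, 65),
}

# Data: the observed values of each variable.
data = {
    "student marks": [67, 74, 71, 83, 93, 55, 48],
    "age of a worker": [25, 53, 35, 42, 27],
}


def out_of_range(variable):
    """Return the observations that fall outside the variable's possible values."""
    low, high = possible_values[variable]
    return [x for x in data[variable] if not low <= x <= high]


for variable in data:
    print(variable, "-> out-of-range observations:", out_of_range(variable))
```

A check like this is a simple way to spot recording errors, since every observed value should lie within the variable's possible values.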
Types of Data
Data Type                Information Type                               Measurement Type
Categorical              Do you practice data?                          Yes / No
Numerical (Discrete)     How many books do you have in your library?    Number
Numerical (Continuous)   What is your height?                           Centimeters or inches
Scales of Measurement
• Measurement is the use of a standard process to assign numbers to particular attributes or characteristics of a variable.
• Many measurements are obvious, such as the time a customer spends shopping in a store or the age of a worker.
• However, some measurements, such as customer satisfaction or return on investment, have to be defined by a business researcher.
• Once such measurements are recorded and stored, they can be denoted as data.
Four levels of data measurement are commonly used: nominal, ordinal, interval, and ratio.
Nominal Level
• At the nominal level of measurement, observations of a qualitative variable can only be classified and counted.
• Example: In which of the following departments do you work?

1. Marketing
2. HR
3. Information Technology
4. Operations
5. Finance and Accounting
6. Any other (please specify)
OTHER EXAMPLES
• Gender
• Religion
• Geographic location
• Place of Birth
• Telephone number
• Employee ID number
The numbers assigned in a nominal scale cannot be added, subtracted, multiplied or divided.
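A minimal sketch of what "classified and counted" means in practice, using the department codes from the nominal-level example. The ten survey responses are hypothetical, invented only for illustration.

```python
from collections import Counter

# Department codes from the nominal-level example.
labels = {1: "Marketing", 2: "HR", 3: "Information Technology",
          4: "Operations", 5: "Finance and Accounting"}

# Hypothetical survey responses, recorded as department codes.
responses = [1, 3, 3, 2, 5, 3, 1, 4, 3, 5]

# Counting is meaningful for nominal data.
counts = Counter(responses)
for code, n in sorted(counts.items()):
    print(f"{labels[code]}: {n}")

# The mode (most frequent category) is a valid summary.
mode_code, mode_count = counts.most_common(1)[0]
print("Most common department:", labels[mode_code])

# The mean of the codes can be computed, but it has no meaning:
# the codes are labels, not quantities.
meaningless_mean = sum(responses) / len(responses)
```

Python will happily average the codes, so it is up to the analyst to recognize that the result says nothing about the departments.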
ORDINAL LEVEL
• The next higher level of data is the ordinal level
• Example: rating of a finance professor

RATING        FREQUENCY
5 SUPERIOR    6
4 GOOD        28
3 AVERAGE     25
2 POOR        12
1 INFERIOR    3

One classification is “higher” or “better” than the next one. However, we are not able to distinguish the magnitude of the differences between groups.
Examples of Ordinal Scale
• Mutual funds as investments are sometimes rated as high, medium and low risk.
  High Risk is assigned 3
  Medium Risk is assigned 2
  Low Risk is assigned 1
• Ranking of the top 10 companies in the Fortune 500 list published by Fortune magazine in 2015. Companies are ranked by total revenues for their respective fiscal years.

Rank   Company
1      Walmart
2      Exxon Mobil
3      Chevron
4      Berkshire Hathaway
5      Apple
6      General Motors
7      Phillips 66
8      General Electric
9      Ford Motor
10     CVS Health

One can compute median, percentiles and quartiles of the distribution.
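As an illustration, the median of ordinal data can be computed from a frequency table. The sketch below uses the finance professor rating frequencies from the ordinal-level example. Using `statistics.median_low` is a design choice made here so that the result is always one of the actual rating categories rather than an average of two.

```python
import statistics

# Rating frequencies from the finance professor example.
frequency = {5: 6, 4: 28, 3: 25, 2: 12, 1: 3}

# Expand the frequency table into one rating per respondent.
ratings = [rating for rating, count in frequency.items() for _ in range(count)]

# median_low always returns an actual rating, never an average of two.
median_rating = statistics.median_low(ratings)

# The mode is also meaningful for ordinal data.
mode_rating = statistics.mode(ratings)

print(f"respondents={len(ratings)}, median={median_rating}, mode={mode_rating}")
```

The median falls in the AVERAGE category even though GOOD is the most frequent rating, because the order of the categories matters and not only their counts.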


INTERVAL LEVEL DATA
• This is the next highest level of data measurement.
• It includes all the characteristics of the ordinal level, but in addition, the difference between values is a constant size.
• Example: The high temperatures on three consecutive summer days in Delhi are 42, 44 and 43 degrees Celsius.
• These temperatures can be easily ranked, but we can also determine the difference between temperatures.
Important: Zero is just a point on the scale. It does not represent the absence of the condition.

One can compute the arithmetic mean, standard deviation and correlation coefficient, and conduct a t-test, Z-test, regression analysis and many more.
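The point about zero can be checked with a short sketch. It converts the Delhi temperatures to Fahrenheit (the conversion to a second scale is an addition made here for illustration) and shows that differences correspond consistently across scales while ratios do not.

```python
import statistics

# High temperatures on three consecutive summer days in Delhi (degrees Celsius).
celsius = [42, 44, 43]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

# Differences are meaningful: a 2 degree C gap is always a 3.6 degree F gap.
diff_c = celsius[1] - celsius[0]
diff_f = fahrenheit[1] - fahrenheit[0]

# Ratios are not: the "ratio" depends on where the scale puts its zero.
ratio_c = celsius[1] / celsius[0]
ratio_f = fahrenheit[1] / fahrenheit[0]

mean_c = statistics.mean(celsius)

print(f"difference: {diff_c} C = {diff_f:.1f} F")
print(f"ratio on Celsius scale: {ratio_c:.4f}; on Fahrenheit scale: {ratio_f:.4f}")
print(f"mean high temperature: {mean_c} C")
```

Because the two scales give different ratios for the same pair of days, a statement such as "44 degrees is 5% hotter than 42 degrees" has no fixed meaning.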
RATIO LEVEL DATA
• The ratio level is the “highest” level of measurement. It has all the characteristics of the interval level, but in addition, the 0 point is meaningful and the ratio between two numbers is meaningful.
• Examples: Wages, weight, changes in stock prices, distance between branch offices, and height.
• Father – Son Income Combination

NAME      FATHER      SON
Laheyo    $ 80,000    $ 40,000
Nale      90,000      30,000
Rho       60,000      120,000
Steele    75,000      130,000
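A brief sketch of why ratios are meaningful here, using the father and son incomes from the table. The dictionary layout is an illustrative choice.

```python
# Father and son incomes (in dollars) from the table above.
incomes = {
    "Laheyo": (80_000, 40_000),
    "Nale": (90_000, 30_000),
    "Rho": (60_000, 120_000),
    "Steele": (75_000, 130_000),
}

# Because zero income means "no income", ratios are meaningful.
son_to_father = {name: son / father for name, (father, son) in incomes.items()}

for name, ratio in son_to_father.items():
    print(f"{name}: son earns {ratio:.2f} times the father's income")
```

Statements such as "Rho's son earns twice what his father earns" are valid only because income has a true zero point.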
Examples of Ratio Level
• Ratio scales are usually used in organizational research when exact figures on objective factors are desired.
1. How many other organizations did you work for before joining this job?
2. Please indicate the number of children you have in each category:
   • Over 6 years but under 12
   • 12 years and over
3. How many retail outlets do you operate?
Data Level, Operations, and Statistical Methods
Data Level    Meaningful Operations
Nominal       Classifying and Counting
Ordinal       All of the above plus Ranking
Interval      All of the above plus Addition, Subtraction, Multiplication, and Division (including means, standard deviations, etc.)
Ratio         All of the above, plus meaningful ratios between two values (the 0 point is meaningful)


Qualitative and Quantitative Data
• Data can be further classified as being qualitative or
quantitative.
• The statistical analysis that is appropriate depends on whether
the data for the variable are qualitative or quantitative.
• In general, there are more alternatives for statistical analysis
when the data are quantitative.
Qualitative Data
• Labels or names used to identify an attribute of each element.
• Often referred to as categorical data.
• Use either the nominal or ordinal scale of measurement
• Can be either numeric or nonnumeric
• Appropriate statistical analyses are rather limited
Quantitative Data
• Quantitative data indicate how many or how much:
  discrete, if measuring how many
  continuous, if measuring how much
• Quantitative data are always numeric.
• Ordinary arithmetic operations are meaningful for quantitative data.
Scales of Measurement
Data
  Qualitative
    Numerical: Nominal, Ordinal
    Nonnumerical: Nominal, Ordinal
  Quantitative
    Numerical: Interval, Ratio


Cross-sectional versus Time Series Data
• Cross-sectional data are collected at the same or approximately the same point in time.
  Example: data detailing the number of building (warehouse) permits issued in Jan 2015 in each of the states of India.
• Time series data are collected over several time periods.
  Example: data detailing the number of building (warehouse) permits issued in Delhi/NCR in each of the last 36 months.
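The difference in shape between the two kinds of data can be sketched as follows. All permit counts, the three states and the month labels are hypothetical values invented for illustration; only six of the 36 months are shown.

```python
# Cross-sectional data: many units observed at one point in time.
# Hypothetical permit counts for January 2015 (illustrative values only).
permits_jan_2015 = {"Gujarat": 120, "Maharashtra": 210, "Karnataka": 95}

# Time series data: one unit observed over several periods.
# Hypothetical monthly permit counts for Delhi/NCR (6 of the 36 months).
permits_delhi_ncr = [
    ("2015-01", 120), ("2015-02", 132), ("2015-03", 128),
    ("2015-04", 141), ("2015-05", 150), ("2015-06", 147),
]

# A cross-section supports comparisons across units at a fixed time...
top_state = max(permits_jan_2015, key=permits_jan_2015.get)

# ...while a time series supports comparisons across time for one unit.
changes = [later[1] - earlier[1]
           for earlier, later in zip(permits_delhi_ncr, permits_delhi_ncr[1:])]

print("State with most permits in Jan 2015:", top_state)
print("Month-to-month changes in Delhi/NCR:", changes)
```

In the cross-section the order of the entries carries no information, whereas in the time series the order is essential.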
Data Sources
• Existing Sources
o Data needed for a particular application might already exist within a firm. Detailed information is often kept on customers, suppliers, and employees, for example.
o Substantial amounts of business and economic data are available from organizations that specialize in collecting and maintaining data.
o Government agencies are another important source of data.
o Data are also available from a variety of industry associations and special-interest organizations.
• Internet
o The Internet has become an important source of data.
o Most government agencies, like the Bureau of the Census (www.census.co.in), make their data available through a web site.
o More and more companies are creating web sites and
providing public access to them.
o A number of companies now specialize in making
information available over the Internet.
• Some Economic & Corporate Databases
o Indiastat.com
o Economic Outlook Database: CMIE
o PROWESS Database
o CRISIL Database
Applications in Business and Economics
Accounting
Public accounting firms use
statistical sampling procedures
when conducting audits for their
clients.

Economics
Economists use statistical
information in making forecasts
about the future of the economy or
some aspect of it.
Marketing
Electronic point-of-sale scanners at
retail checkout counters are used to
collect data for a variety of
marketing research applications.

Production
A variety of statistical quality control
charts are used to monitor the
output of a production process.

Finance
Financial advisors use price-earnings ratios and
dividend yields to guide their investment
recommendations.
