
In the Name of ALLAH, the beneficent, the Merciful,

“O Allah, send your salutations upon Muhammad (PBUH) & on the Family
of Muhammad (PBUH) as you sent your salutations upon Ibrahim & on the
Family of Ibrahim verily you are Most Praiseworthy & Glorious…”

Quantitative Methods for Decision Making Techniques
A Practical and Philosophical approach
By,
Rahim Anwer Ali
Faculty, IOBM
Why Decision System?
What is Statistics (A science or an art?)
• The activity of obtaining data and then compiling, summarizing, presenting, analyzing and interpreting it, and finally drawing conclusions, is called Statistics.
In short, it is:
Data → Process → Information/Conclusions
• Statistics is a mixture of science and art: up to the processing stage it is a SCIENCE, while drawing conclusions is an individual's ART.
What is DATA (A word or a Keyword?)
• DATA is a group of raw facts and figures which may VARY from person to person, object to object, distance to distance and time to time.
• Only the absence of VARIATION can produce a CONSTANT, and that does not exist in our physical world. Only spiritualism can define a CONSTANT.
Data v/s Variable
• A variable is the storage of data and is represented by letters such as X, Y, Z. There are two types of variables:

• Qualitative Variable: It deals with data that vary by kind. It provides labels, or names, for categories of like items, i.e. a set of observations where any single observation is a word or code that represents a class or category.
• Gender, Complexion, Weather and Type are some examples of qualitative variables.

• Quantitative Variable: It deals with numeric data, which measure either how much or how many of something, i.e. a set of observations where any single observation is a number that represents an amount or a count.
• Age, Height, Number and Price are some examples of quantitative variables.

Source: http://www.microbiologybytes.com/maths/1011-17.html
Inactivity breaker …
Objective: Take a blank page from your writing material and divide it into two columns in the following manner:

Qualitative Variables    Quantitative Variables
1- Gender                1- Age
2- Complexion            2- Height
3- Qualification         3- Weight
4- Weather               4- Price
…                        …
20.                      20.

Try to write at least 20 variables in each column by observing several fields like management, agriculture, medical, engineering, geology etc. Submit the same sheet with your full name written at the top.
Data Sources
There are three major sources of data:
1. Survey/Census:
An official, usually periodic enumeration of a population,
often including the collection of related demographic
information, is called census. Survey means to inspect and
determine the conditions of interest. www.surveymonkey.com
2. Experiment:
An activity that is usually conducted in an isolated
environment and produces results is called an
experiment.
3. Simulation:
An artificial way of data collection.
Question of the Day….
What do you think about Quality of
the following in IBA??
1- Teaching 1,2,3,4,5
2- Administration 1,2,3,4,5
3- Structure 1,2,3,4,5
Where 1-Very Poor 5-Excellent
Data Collection/compilation
• Teaching Ranks where 1-Very Poor, 5-Excellent
4.5 3.7 4.3 3.3 2.7 4.7
3.8 4.5 3.4 4.0 3.8 2.7
4.3 3.4 3.2 3.7 3.9 3.8
3.8 3.7 3.6 5.0 4.2 4.1
4.2 4.1 3.9 4.5 5.0 3.7
4.8 3.2 4.2 4.5 4.2 5.0
2.9
• Data collection/compilation is needed to capture the actual behavior of the variable.
Note: The above data is a simulated version of the actual data.
Data Tabulation (Grouping Exercise)
Step # 01: Finding the range
Range = Max. – Min = 5 – 2.7 = 2.3

Step # 02: Finding the number of classes
No. of classes = 1 + 3.3 log(n) = 1 + 3.3 x log(37) = 6.175 …

Step # 03: Finding the width (h)
h = Range/No. of classes = 2.3/6.175 ≈ 0.4

Class Intervals              Frequency
Min ______ Min+h
Min+h ______ Min+h+h
Min+h+h ______ ….
… ______ Max
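The three steps above can be sketched in a few lines of Python. This is only an illustration using the simulated teaching ranks listed earlier; the variable names are chosen here and are not part of the slides, and the logarithm is assumed to be base 10, which is what reproduces the 6.175 shown above.

```python
import math

# Simulated teaching ranks from the data-collection slide (37 observations).
ranks = [
    4.5, 3.7, 4.3, 3.3, 2.7, 4.7,
    3.8, 4.5, 3.4, 4.0, 3.8, 2.7,
    4.3, 3.4, 3.2, 3.7, 3.9, 3.8,
    3.8, 3.7, 3.6, 5.0, 4.2, 4.1,
    4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
    4.8, 3.2, 4.2, 4.5, 4.2, 5.0,
    2.9,
]

n = len(ranks)
data_range = max(ranks) - min(ranks)        # Step 1: range
num_classes = 1 + 3.3 * math.log10(n)       # Step 2: number of classes (base-10 log assumed)
width = round(data_range / num_classes, 1)  # Step 3: class width, rounded to one decimal

print(n, round(data_range, 1), round(num_classes, 3), width)
```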
Data → Process → Information

Ranks    Frequency
2.7      3
3.1      5
3.4      10
3.8      9
4.2      5
4.6      5

[Histogram: Ranks on the x-axis (2.7 to 4.6), Frequency on the y-axis (0 to 12)]

The frequency distribution table and the histogram above reveal the shape of the opinions held by the students. If we discover a corresponding mathematical model, it will be called a probability distribution.
Data → Process → Information

Ranks    Frequency
2.7      7
3.1      11
3.4      9
3.8      6
4.2      3
4.6      1

[Histogram: Ranks on the x-axis (2.7 to 4.6), Frequency on the y-axis (0 to 12)]

The frequency distribution table and the histogram above reveal the shape of the opinions held by the students. If we discover a corresponding mathematical model, it will be called a probability distribution.
Grouping the data (MSEXCEL)

The Data Analysis option is located in the "Data" menu. If it is not present there, we can activate it through the Add-Ins in "Excel Options".
Grouping the data (MSEXCEL) cont…
After providing the "data-range" and selecting the Labels and Chart-output options, we can find the histogram either in a new worksheet or in a specified place of the existing sheet.

Bin numbers: These numbers represent the intervals that you want the Histogram tool to use for measuring the input data in the data analysis.
Statistical Measures (An introduction)
• The phrase "descriptive statistics" is used generically in place of statistical measures.
• These statistics describe or summarize the qualities of data.
• Another name is "summary statistics", which we mostly use to enrich our reports/cases/research.
• They are beneficial when a graphical summary is not sufficient for the final conclusions.

Data → Processing by Graph → Processing by Measure → Conclusions
Statistical Measures (An Example)
Consider the following grouped data:

Class        Frequency    Relative Frequency    Cumulative Relative
Intervals                 (R.F.)                Frequency (C.R.F.)
2-4          2            2/25 = 0.08           0.08
4-6          5            5/25 = 0.20           0.28
6-8          9            9/25 = 0.36           0.64
8-10         7            7/25 = 0.28           0.92
10-12        2            2/25 = 0.08           1.00
             Σf = 25      ΣR.F. = 1

The above data show the income, in 1000's of Rupees, of some individuals in the late 1980's.
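As a small illustration, the relative and cumulative relative frequency columns of the income table can be reproduced as follows. The variable names are illustrative only.

```python
from itertools import accumulate

classes = ["2-4", "4-6", "6-8", "8-10", "10-12"]
freqs = [2, 5, 9, 7, 2]

total = sum(freqs)                         # sum of f
rel_freq = [f / total for f in freqs]      # R.F. = f / sum of f
cum_rel_freq = list(accumulate(rel_freq))  # running total of R.F.

for c, f, rf, crf in zip(classes, freqs, rel_freq, cum_rel_freq):
    print(f"{c:>6} {f:>3} {rf:.2f} {crf:.2f}")
```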
Statistical Measures (Quartiles)
• These are 3 values, represented by Q1, Q2 and Q3 respectively, which divide the data into 4 equal parts.
• Each part contains 25% of the observations.
• Quartiles usually highlight 4 different classes, i.e. Lower class, Lower Middle, Upper Middle and Upper class.
25% 25% 25% 25%
Lower Lower Upper Upper
Class Middle Middle Class

Min Q1 Q2 Q3 Max
Computing Quartiles
In order to compute quartile values, we need to consider the same frequency distribution with an additional column of Cumulative Frequency.

Class        Frequency    Cumulative
Intervals                 Frequency (C.F.)
2-4          2            2
4-6          5            7
6-8          9            16
8-10         7            23
10-12        2            25
             Σf = 25
Computing Quartiles (Procedure)
For any grouped data, quartiles can be computed by following two simple steps:
Step-1: Finding the location of the ith Quartile (where i = 1, 2 and 3):
$$\frac{i \times \sum f}{4}$$
Step-2: Finding the value of the ith Quartile:
$$Q_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{4} - C.F.\right)$$
Where l = lower limit of the captured class (the first class whose cumulative frequency reaches the Step-1 location), h = class width, f = class frequency, C.F. = previous class C.F.
Computing Quartiles (Demo)
Class        Frequency    Cumulative
Intervals                 Frequency (C.F.)
2-4          2            2
4-6          5            7       <- 1st Quartile Class
6-8          9            16
8-10         7            23
10-12        2            25
             Σf = 25

Step-1 (For Q1): (1 x 25) / 4 = 6.25
Step-2: Q1 = 4 + (2/5)(6.25 - 2) = 5.7
Note: Class width = h = 2
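The two-step procedure can be checked with a short sketch. `grouped_quartile` is an illustrative helper name, not something from the slides, and it assumes equal class widths and that the captured class is the first one whose cumulative frequency reaches the Step-1 location.

```python
def grouped_quartile(i, lower_limits, freqs, h):
    """Return the i-th quartile (i = 1, 2, 3) of a grouped frequency distribution."""
    location = i * sum(freqs) / 4   # Step 1: location of the quartile
    cum = 0                         # C.F. of the classes before the current one
    for l, f in zip(lower_limits, freqs):
        if cum + f >= location:     # captured class found
            return l + (h / f) * (location - cum)   # Step 2: interpolate inside it
        cum += f
    raise ValueError("location lies beyond the last class")


lower_limits = [2, 4, 6, 8, 10]
freqs = [2, 5, 9, 7, 2]

q1, q2, q3 = (grouped_quartile(i, lower_limits, freqs, h=2) for i in (1, 2, 3))
print(round(q1, 3), round(q2, 3), round(q3, 3))
```

Multiplying the results by 1000 gives the income values in Rupees.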
Quartiles (Income Classes)
25% 25% 25% 25%
Lower Lower Upper Upper
Class Middle Middle Class

Min Q1 Q2 Q3 Max

2000 5700 7222 8786 12000

Quartiles can also be computed using MS Excel, which requires the data in ungrouped form. The syntax is given below:
=QUARTILE(Data Range,i) where i = 1, 2, 3 is the quartile number.
Computing Quartiles from Ungrouped Data
• We must sort the data before proceeding, e.g.
2     2.5    3     5
1st   2nd    3rd   4th
Hence we can obtain our quartile values by following two simple steps:
Step-1: For Q1 => 1(4+1)/4 = 1.25th Value.
Step-2: Q1 = 1st value + fraction (2nd Value – 1st Value)
Q1 = 2 + 0.25 (2.5 – 2) = 2.125
Computing Quartiles (Contd)
• Consider the same sorted data :
2 2.5 3 5
1st 2nd 3rd 4th
Step-1: For Q2 ; 2(4+1)/4 = 2.5th Value
Step-2: Q2=2nd value + 0.5 ( 3rd – 2nd )
Q2= 2.5 + 0.5 (3 – 2.5) = 2.75
Finally, for Q3; 3(4+1)/4 = 3.75th Value.
Q3=3 + 0.75 (5 – 3) = 4.5
Where, Min=2 and Max. = 5
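A minimal sketch of the position rule i(n+1)/4 with linear interpolation is given below. The function name is hypothetical and chosen only for this illustration. Spreadsheet and library quartile functions may follow a different interpolation convention, so their results can differ slightly from this rule.

```python
import math


def quartile_n_plus_1(sorted_data, i):
    """i-th quartile using the i(n+1)/4 position rule on already sorted data."""
    n = len(sorted_data)
    position = i * (n + 1) / 4      # Step 1: 1-based (possibly fractional) position
    if position <= 1:
        return sorted_data[0]
    if position >= n:
        return sorted_data[-1]
    k = math.floor(position)        # whole part: the k-th value
    fraction = position - k         # fractional part
    # Step 2: k-th value + fraction * ((k+1)-th value - k-th value)
    return sorted_data[k - 1] + fraction * (sorted_data[k] - sorted_data[k - 1])


data = sorted([2, 2.5, 3, 5])
quartiles = [quartile_n_plus_1(data, i) for i in (1, 2, 3)]
print(quartiles)
```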
Computing Quartiles (Contd)
• Following is the Box-Plot of Treatment A:

[Figure: Box-Plot of Treatment A, marking Min = 2, Q1 = 2.125, Q2 = 2.75, Q3 = 4.5 and Max = 5]
Quartiles, Deciles and Percentiles
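Quartiles: To divide the data into 4 equal parts. Quartiles are three values Q1, Q2 and Q3.
Step-1: $\frac{i \times \sum f}{4}$, i = 1, 2, 3
Step-2: $Q_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{4} - C.F.\right)$

Deciles: To divide the data into 10 equal parts. Deciles are nine values D1, D2, D3 … D9.
Step-1: $\frac{i \times \sum f}{10}$, i = 1, 2, 3, …, 9
Step-2: $D_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{10} - C.F.\right)$

Percentiles: To divide the data into 100 equal parts. Percentiles are ninety-nine values P1, P2, … P99.
Step-1: $\frac{i \times \sum f}{100}$, i = 1, 2, 3, …, 99
Step-2: $P_i = l + \frac{h}{f}\left(\frac{i \times \sum f}{100} - C.F.\right)$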
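Since quartiles, deciles and percentiles differ only in the divisor (4, 10 or 100), one function can cover all three. The sketch below uses the income frequency distribution from the earlier slides. `grouped_fractile` and its `parts` parameter are illustrative names, and equal class widths are assumed.

```python
def grouped_fractile(i, parts, lower_limits, freqs, h):
    """i-th fractile of grouped data split into `parts` equal parts
    (parts = 4 for quartiles, 10 for deciles, 100 for percentiles)."""
    location = i * sum(freqs) / parts   # Step 1
    cum = 0                             # C.F. of the previous classes
    for l, f in zip(lower_limits, freqs):
        if cum + f >= location:
            return l + (h / f) * (location - cum)   # Step 2
        cum += f
    raise ValueError("location lies beyond the last class")


lower_limits = [2, 4, 6, 8, 10]
freqs = [2, 5, 9, 7, 2]

q2 = grouped_fractile(2, 4, lower_limits, freqs, h=2)
d5 = grouped_fractile(5, 10, lower_limits, freqs, h=2)
p50 = grouped_fractile(50, 100, lower_limits, freqs, h=2)
d9 = grouped_fractile(9, 10, lower_limits, freqs, h=2)
p90 = grouped_fractile(90, 100, lower_limits, freqs, h=2)
print(round(q2, 3), round(d5, 3), round(p50, 3), round(d9, 3), round(p90, 3))
```

Q2, D5 and P50 all describe the median, and D9 coincides with P90, which is a quick consistency check on the three formulas.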


Practice Questions
Q. What should be the interval of income which covers the middle 50% of individuals?
Ans. 5700 to 8786

Q. What should be the interval of income which covers the middle 40% of individuals?

100%
30%      40%      30%
Min    D3       D7    Max

Q. What should be the interval of income which covers the middle 30% of individuals?
Exploratory Data Analysis (EDA) by Sir John Wilder Tukey
There are two types of studies:
• Hypothetical Study
• Exploratory Study
In an exploratory study, we can perform our analysis without relying on conventional methodologies. In EDA, we observe the trend of the data by applying different processes to it.
• The Box-plot is a very useful part of EDA.
The Box-Plot
[Figure: Boxplot of Teaching. Axis: Teaching Ranks (3 to 5). Marked points: Min, Q1, Q2, Q3, Max]

Inter-quartile Range = Q3 - Q1
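The five values a box-plot is built from, together with the inter-quartile range, can be obtained for the simulated teaching ranks as sketched below. This uses Python's standard `statistics.quantiles`, whose default "exclusive" method matches the i(n+1)/4 position rule used in these slides. The numbers are not read off the original figure.

```python
import statistics

# Simulated teaching ranks from the data-collection slide (37 observations).
ranks = [
    4.5, 3.7, 4.3, 3.3, 2.7, 4.7,
    3.8, 4.5, 3.4, 4.0, 3.8, 2.7,
    4.3, 3.4, 3.2, 3.7, 3.9, 3.8,
    3.8, 3.7, 3.6, 5.0, 4.2, 4.1,
    4.2, 4.1, 3.9, 4.5, 5.0, 3.7,
    4.8, 3.2, 4.2, 4.5, 4.2, 5.0,
    2.9,
]

q1, q2, q3 = statistics.quantiles(ranks, n=4)   # three cut points: Q1, Q2, Q3
five_number_summary = (min(ranks), q1, q2, q3, max(ranks))
iqr = q3 - q1                                   # Inter-quartile Range = Q3 - Q1

print([round(v, 2) for v in five_number_summary], round(iqr, 2))
```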
Processing Data using Box-Plots
[Figure: Boxplots of Female Ages - Male Ages (means are indicated by solid circles); age axis from 25 to 45]

• Female Age: more variable, less consistent, heterogeneous, more diverse.
• Male Age: less variable, more consistent, homogeneous, less diverse.
• Males are younger than Females.
Exploratory Analysis for Quality ranks from Aventis Field Managers
[Figure: Boxplots of Teaching, Administration & Structure (means are indicated by solid circles)]
Statistical Measures (Central Tendency)
(Mean, Median and Mode)
• The mean is the average of the values. Its main problem is that it is sensitive to outliers.
• The median is simply the middle value among the sorted scores of a variable. It is the 2nd Quartile (Q2) of any data.
• The mode is the most frequent response or value for a variable. Multiple modes are possible: bimodal or multimodal.
Mean, Median and Mode
Measurements are on the x-axis and frequencies are on the y-axis.

The Mode is based on the principle of democracy, while the median (Q2) follows the rule of moderation. The Mean takes its place after being influenced by the higher values of the measurements. The above-mentioned distribution is +vely skewed.
Mean and Mode (Computations)
Class        Frequency    Mid-Points         fi × xi
Intervals    fi           xi
2-4          2            (2+4)/2 = 3        2×3
4-6          f1 = 5       (4+6)/2 = 5        5×5
6-8          fm = 9       (6+8)/2 = 7        9×7      <- Modal Class
8-10         f2 = 7       (8+10)/2 = 9       7×9
10-12        2            (10+12)/2 = 11     2×11
             Σfi = 25                        Σfi × xi = 179

Mode = 7.333 = 7333/- is the Majority's Income
Mean = 7.160 = 7160/- is the Average Income
Empirical relationship b/w
Mean, Median and Mode
• Following are the values for Mean, Median and Mode obtained from the Income data:
$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{179}{25} = 7.160$$
$$\text{Median} = Q_2 = 7.222$$
$$\text{Mode} = l + \left(\frac{f_m - f_1}{2 f_m - f_1 - f_2}\right) \times h = 7.333$$
Mean < Median < Mode (thus the data is slightly -vely skewed)
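The mean and mode of the income data can be recomputed with the following sketch. Variable names are illustrative, and the mode formula as coded assumes the modal class is not the first or last class, so that f1 and f2 both exist.

```python
lower_limits = [2, 4, 6, 8, 10]
freqs = [2, 5, 9, 7, 2]
h = 2

midpoints = [l + h / 2 for l in lower_limits]   # x_i
mean = sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

m = freqs.index(max(freqs))                     # index of the modal class
f_m, f_1, f_2 = freqs[m], freqs[m - 1], freqs[m + 1]   # modal, previous and next frequencies
mode = lower_limits[m] + (f_m - f_1) / (2 * f_m - f_1 - f_2) * h

print(round(mean, 3), round(mode, 3))
```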


Arithmetic Mean, Geometric Mean and Harmonic Mean
• For any ungrouped data, the Arithmetic Mean is:
$$A.M. = \frac{\sum x_i}{n}$$
where xi are the observations and n is the sample size.
• For any ungrouped data, the Geometric Mean is:
$$G.M. = \left(\prod x_i\right)^{1/n}$$
• For any ungrouped data, the Harmonic Mean is:
$$H.M. = \frac{n}{\sum (1/x_i)}$$

Arithmetic Mean, Geometric Mean and Harmonic Mean
• Consider the following ungrouped data and compute A.M., G.M. and H.M.:
xi: 1, 2, 3, 4, 5 (n = 5)

• A.M. = (1+2+3+4+5)/5 = 15/5 = 3.0
• G.M. = (1 x 2 x 3 x 4 x 5)^(1/5) = (120)^(1/5) = 2.6052
• H.M. = 5 / (1/1+1/2+1/3+…+1/5) = 5/2.28333 = 2.1898
Theorems related to AM, GM & HM
Empirically verify the following theorems:

Theorem No. 1:
AM ≥ GM ≥ HM (for positive observations)
3.0 > 2.6052 > 2.1898
Theorem No. 2:
AM x HM ≈ GM^2 (exactly equal when there are only two observations)
3.0 x 2.1898 ≈ 2.6052^2
6.569 ≈ 6.787 (diff. = 0.22)
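A quick empirical check of both statements, using the `statistics` module from the Python standard library (Python 3.8 or later is assumed for `geometric_mean`). The two-value sample [2, 8] is an extra illustration added here, not data from the slides.

```python
import statistics

x = [1, 2, 3, 4, 5]
am = statistics.mean(x)
gm = statistics.geometric_mean(x)
hm = statistics.harmonic_mean(x)

print(round(am, 4), round(gm, 4), round(hm, 4))
print(am > gm > hm)                           # Theorem No. 1 on this sample
print(round(am * hm, 4), round(gm ** 2, 4))   # Theorem No. 2: close, but not equal

# With only two observations, AM x HM and GM^2 agree up to rounding.
pair = [2, 8]
pair_product = statistics.mean(pair) * statistics.harmonic_mean(pair)
pair_gm_squared = statistics.geometric_mean(pair) ** 2
print(round(pair_product, 6), round(pair_gm_squared, 6))
```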
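Arithmetic Mean, Geometric Mean and Harmonic Mean for Grouped Data
• For any grouped data, the Arithmetic Mean is:
$$A.M. = \frac{\sum f_i x_i}{\sum f_i}$$
where xi are the mid-points and fi are the class frequencies.
• For any grouped data, the Geometric Mean is:
$$G.M. = \left(\prod x_i^{f_i}\right)^{1/\sum f_i}$$
• For any grouped data, the Harmonic Mean is:
$$H.M. = \frac{\sum f_i}{\sum (f_i / x_i)}$$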


AM, GM & HM (Computations)
                                         For A.M.            For G.M.     For H.M.
Class        Frequency    Mid-Points     fi × xi             xi^fi        fi / xi
Intervals    fi           xi
2-4          2            3              2×3                 3^2          2/3
4-6          5            5              5×5                 5^5          5/5
6-8          9            7              9×7                 7^9          9/7
8-10         7            9              7×9                 9^7          7/9
10-12        2            11             2×11                11^2         2/11
             Σfi = 25                    Σfi × xi = 179      Π xi^fi      Σ fi / xi
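The three column totals lead to the grouped A.M., G.M. and H.M. as sketched below. The variable names are illustrative. The product of xi^fi is evaluated through logarithms, which is an implementation choice made here to avoid very large intermediate numbers.

```python
import math

midpoints = [3, 5, 7, 9, 11]   # x_i
freqs = [2, 5, 9, 7, 2]        # f_i
n = sum(freqs)

am = sum(f * x for f, x in zip(freqs, midpoints)) / n
# (product of x_i ** f_i) ** (1 / n), computed as exp(sum(f_i * ln x_i) / n)
gm = math.exp(sum(f * math.log(x) for f, x in zip(freqs, midpoints)) / n)
hm = n / sum(f / x for f, x in zip(freqs, midpoints))

print(round(am, 3), round(gm, 3), round(hm, 3))
```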
Mean, Median and Mode
• MS Excel syntaxes for finding the three measures of central tendency are:
• =Average(Data Range) For Mean
• =Quartile(Data Range,2) For Median
• =Mode(Data Range) For Mode
Statistical Measures (Dispersion)
What is DISPERSION??
A dart-game can help us in this…
Based on visual observation, we can declare Player A the winner because:
• Player A is more consistent / less variable / homogeneous / less dispersed, and
• Player B is less consistent / more variable / heterogeneous / more dispersed.
Measures of Dispersion
Some Important Measures of Dispersion are:
• Range=Max-Min
• Variance
• Standard Deviation
• Mean Deviation
• Inter-quartile Range
• Coefficient of Variation (C.V.)
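Dispersion Measures (Cont…)
$$\text{Variance} = V(X) = \frac{\sum (x_i - \bar{x})^2}{n}$$
Variance of the following ungrouped data:
X: 1, 2, 3, 4, 5
Mean = 3
$$V(X) = \frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5} = \frac{10}{5} = 2$$
$$\text{Standard Deviation} = \sqrt{V(X)} = \sigma = \sqrt{2} = 1.414$$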
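The same computation in Python, as a minimal sketch. The slide divides by n, which corresponds to `statistics.pvariance`. The standard library's `statistics.variance` divides by n - 1 instead and would give a different value.

```python
import statistics

x = [1, 2, 3, 4, 5]
mean = sum(x) / len(x)
variance = sum((xi - mean) ** 2 for xi in x) / len(x)   # divisor n, as on the slide
std_dev = variance ** 0.5

print(mean, variance, round(std_dev, 3))
print(statistics.pvariance(x), statistics.variance(x))  # divisor n versus divisor n - 1
```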
Coefficient of Variation (Consistency Check)

• In order to check whether the variable is consistent or not, we need to compute the coefficient of variation,
$$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{\sigma}{\mu} \times 100$$
• For any consistent variable, C.V. < 100%
• C.V. is a unit-less measure of dispersion.
Variance & Standard deviation (group-data)
Class        Frequency    Mid-Points         fi × xi     fi (xi - mean)^2
Intervals    fi           xi
2-4          2            (2+4)/2 = 3        2×3         2(3 - 7.16)^2 = 34.61
4-6          5            (4+6)/2 = 5        5×5         5(5 - 7.16)^2 = 23.33
6-8          9            (6+8)/2 = 7        9×7         9(7 - 7.16)^2 = 0.23
8-10         7            (8+10)/2 = 9       7×9         7(9 - 7.16)^2 = 23.70
10-12        2            (10+12)/2 = 11     2×11        2(11 - 7.16)^2 = 29.49
             Σfi = 25                        Σfi × xi = 179    Σ = 111.36

$$\text{Variance} = V(X) = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{111.36}{25} = 4.45$$
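The grouped variance can be recomputed without intermediate rounding as sketched below. Names are illustrative. Small differences in the last digit against hand-rounded column entries are expected.

```python
midpoints = [3, 5, 7, 9, 11]   # x_i
freqs = [2, 5, 9, 7, 2]        # f_i
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
squared_dev_total = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
variance = squared_dev_total / n   # divisor is the sum of f
std_dev = variance ** 0.5

print(round(squared_dev_total, 2), round(variance, 2), round(std_dev, 3))
```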
Variable Comparison (Property of C.V.)

• Coefficient of Variation for 1, 2, 3, 4, 5 (n = 5) is,
$$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{1.414}{3} \times 100 = 47.1\%$$
• And for the Income-data (Σfi = 25), it is,
$$C.V. = \frac{\sqrt{V(X)}}{\bar{X}} \times 100 = \frac{2.111}{7.16} \times 100 = 29.48\%$$
• So technically, the Income data is more consistent than the first five natural numbers.
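Both comparisons can be reproduced with one small helper. `coefficient_of_variation` is a hypothetical function written for this illustration. It treats ungrouped data as grouped data in which every frequency equals 1, and it divides by n (or the sum of f) as the slides do.

```python
def coefficient_of_variation(values, freqs=None):
    """C.V. in percent, using the standard deviation with divisor n (or sum of f)."""
    if freqs is None:
        freqs = [1] * len(values)   # ungrouped data: each value counted once
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n
    return variance ** 0.5 / mean * 100


cv_numbers = coefficient_of_variation([1, 2, 3, 4, 5])
cv_income = coefficient_of_variation([3, 5, 7, 9, 11], [2, 5, 9, 7, 2])
print(round(cv_numbers, 1), round(cv_income, 2))
```

The variable with the smaller C.V. is the more consistent one.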
Hand-Profile Analysis
(An exploratory approach)
[Figure: outline of a hand marking the measurements Thumb (X1), X2, X3, X4, X5, Span (X6) and Length (X7), all in cms]

S.No.    Measurements (X)
1        X1
2        X2
3        X3
4        X4
5        X5
6        X6
7        X7

Determine the Mean, Standard deviation and Coefficient of Variation.
Computing Mean and Standard Deviation
Using Scientific Calculators
New Models (ES Series)
• Press MODE
• Select STAT
• Select 1-Var
• Enter the data in the data column that appears…
• For finding Mean and Standard Deviation: press Shift and then press 1
• Select VAR
• Select x̄ for mean
• Select xσn for Standard Deviation

Prev. Models (MS Series)
• Press MODE
• Select SD
• Entering the data: Obs1 M+, Obs2 M+, Obs3 M+; do it for all remaining data observations.
• For finding Mean and Stand. Dev.: press Shift and press 2
• Select x̄ for mean
• Select xσn for Standard Deviation
