Boston International College | MBA Class of 2021 | Trimester 1, 2019

Business Statistics | Madhav Adhikari | August 19, 2019

Data: Any raw information

Data can be of two types:

1. Quantitative (numeric)
a. Discrete
i. Takes only whole-number values.
b. Continuous
i. Can take any value within a range.
2. Qualitative (categorical)
a. Nominal (categories have no natural order)
b. Ordinal (categories have a natural order)

Correlation (Bivariate Analysis)

If a change in the value of one variable is accompanied by a change in the value of another variable,
then the two variables are said to be correlated.

Bivariate analysis is the analysis of any two variables and correlation is the study of the relationship
between the variables.

Multivariate Analysis: analysis of the effect of more than two variables.

e.g.

1) Price and sales of a commodity. (negative correlation)
2) Income and expenditure of a family. (positive correlation)
3) Amount deposited in a bank and the profit earned. (positive correlation)
4) Age of a child and the child's weight. (usually, though not always, true) (positive correlation)
5) Speed of a motorbike and the probability of an accident. (positive correlation)
6) Pressure and volume of a gas. (negative correlation)

Types of Correlation

A. Positive and negative correlation.


Positive correlation:
When one variable increases and the other increases as well, the correlation is positive.
Likewise, when one variable decreases and the other decreases as well, the correlation is
positive.
The two variables change in the same direction.

Negative correlation:
When one variable increases and the other decreases, the correlation is negative.
Likewise, when one variable decreases and the other increases, the correlation is negative.
The two variables change in opposite directions.

B. Linear and non-linear correlation.

Linear Correlation:
If y changes by a constant amount for each unit change in x (that is, at the same rate), the
correlation is linear.
x    5     6     7     8     9
y    50    100   150   200   250

Non-linear Correlation:
If the amount of change in y per unit change in x varies, the correlation is non-linear.
x    5     6      7       8        9
y    50    1000   10050   200700   250800
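
The difference between the two tables above can be checked numerically. The following is a minimal sketch, not part of the lecture: the helper names `first_differences` and `changes_at_constant_rate` are illustrative, and the check assumes that consecutive x values are distinct.

```python
def first_differences(values):
    """Return the change between each pair of consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def changes_at_constant_rate(x, y, tol=1e-9):
    """Return True if y changes by the same amount per unit change in x."""
    dx = first_differences(x)
    dy = first_differences(y)
    # Rate of change of y with respect to x between consecutive observations.
    rates = [step_y / step_x for step_x, step_y in zip(dx, dy)]
    return all(abs(rate - rates[0]) <= tol for rate in rates)


# The two tables from the notes.
x = [5, 6, 7, 8, 9]
y_linear = [50, 100, 150, 200, 250]
y_nonlinear = [50, 1000, 10050, 200700, 250800]

print(first_differences(y_linear))               # [50, 50, 50, 50]
print(changes_at_constant_rate(x, y_linear))     # True
print(changes_at_constant_rate(x, y_nonlinear))  # False
```

In the first table y rises by 50 for every unit of x, so the rate is constant; in the second the step size keeps changing.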

C. Partial, multiple, and simple correlation.

Simple Correlation – If we only have two variables in our study then it is simple correlation.

Partial Correlation - If we have several variables in our study but examine the relationship
between only two of them (or a subset, but not all), keeping the remaining variables constant,
then it is partial correlation.

e.g. sales depend on price, advertisement, and quality.

Correlation between sales and price, keeping advertisement and quality constant.
Correlation between sales and (price, advertisement), keeping quality constant.

Multiple Correlation – If we have several variables in our study and take all of them and their
correlations into account together, then it is multiple correlation.
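
The notes give no formula for partial correlation. As a hedged illustration only, the sketch below uses the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), to compare the simple correlation between sales and price with the partial correlation that holds advertisement constant. The figures are made up for the example, and the function names are assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Simple (two-variable) correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y with the linear effect of z held constant."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical figures, invented purely for illustration.
price = [10, 12, 14, 16, 18, 20]
advertisement = [5, 3, 6, 2, 7, 4]
sales = [100, 88, 85, 70, 72, 55]

simple = pearson_r(sales, price)
partial = partial_r(sales, price, advertisement)
print(f"simple  r(sales, price)                 = {simple:.3f}")
print(f"partial r(sales, price | advertisement) = {partial:.3f}")
```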

Methods of measuring Correlation:

1. Scatter Diagram
2. Karl Pearson Correlation Coefficient
3. Spearman Rank Correlation Coefficient (not required for this course)

We will only study methods 1 and 2.

1. Scatter Diagram (only determines linear or non-linear correlation)


a. Highly positively correlated
When y tends to increase as x increases, but the increase is not perfectly regular.

b. Highly negatively correlated
When y tends to decrease as x increases, but the decrease is not perfectly regular.

c. Perfect positive correlation. (Not usually possible in real life).

When there is a constant increase in y when x is increased.

d. Perfect negative correlation.
When there is a constant decrease in y when x is increased.

e. Uncorrelated
When there is no discernible relationship between the changes in x and the changes in y.
(e.g. points forming a wave or a circle)

2. Karl Pearson Correlation Coefficient


The Karl Pearson correlation coefficient is a mathematical measure of the linear relationship
between two variables. It measures the degree (high, low, perfect) and direction
(positive, negative) of the relationship between the variables.

It is denoted by r and is given by

\[ r = \frac{\operatorname{cov}(x, y)}{\sqrt{v(x)}\,\sqrt{v(y)}} \]

v(x) = variance of x
v(y) = variance of y
cov(x, y) = covariance of x and y
and

\[ v(x) = \frac{1}{n} \sum (x - \bar{x})^2 \]

\[ v(y) = \frac{1}{n} \sum (y - \bar{y})^2 \]

\[ \operatorname{cov}(x, y) = \frac{1}{n} \sum (x - \bar{x})(y - \bar{y}) \]

Substituting the values of v(x), v(y), and cov(x, y) into the initial equation:
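
\[ r = \frac{\frac{1}{n} \sum (x - \bar{x})(y - \bar{y})}{\sqrt{\frac{1}{n} \sum (x - \bar{x})^2}\,\sqrt{\frac{1}{n} \sum (y - \bar{y})^2}} \]

\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} \]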
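
The definition above translates directly into code. This is a minimal sketch, assuming the population (divide-by-n) forms of variance and covariance used in these notes; the function names are illustrative. It is applied to the linear table from the earlier section.

```python
from math import sqrt


def variance(values):
    """v(x) = (1/n) * sum of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((value - mean) ** 2 for value in values) / n


def covariance(x, y):
    """cov(x, y) = (1/n) * sum of products of deviations from the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def pearson_r(x, y):
    """r = cov(x, y) / (sqrt(v(x)) * sqrt(v(y)))."""
    return covariance(x, y) / (sqrt(variance(x)) * sqrt(variance(y)))


x = [5, 6, 7, 8, 9]
y = [50, 100, 150, 200, 250]

print(variance(x))                # 2.0
print(variance(y))                # 5000.0
print(covariance(x, y))           # 100.0
print(round(pearson_r(x, y), 6))  # 1.0 (perfect positive correlation)
```

Because y rises by exactly 50 for every unit of x, the coefficient comes out as +1.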

Final form, simplified for calculation:

\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - \left(\sum x\right)^2}\,\sqrt{n \sum y^2 - \left(\sum y\right)^2}} \]

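
As a worked example of the simplified form, the sketch below computes each sum for the linear table from the earlier section (x = 5 to 9, y = 50 to 250). The variable names are illustrative only.

```python
from math import sqrt

x = [5, 6, 7, 8, 9]
y = [50, 100, 150, 200, 250]
n = len(x)

# The five sums that the simplified formula needs.
sum_x = sum(x)                              # 35
sum_y = sum(y)                              # 750
sum_xy = sum(a * b for a, b in zip(x, y))   # 5750
sum_x2 = sum(a * a for a in x)              # 255
sum_y2 = sum(b * b for b in y)              # 137500

numerator = n * sum_xy - sum_x * sum_y      # 5*5750 - 35*750 = 2500
denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)

r = numerator / denominator
print(round(r, 6))  # 1.0
```

This form avoids computing the means and deviations first, which is why it is convenient for hand calculation from a table of x, y, xy, x^2 and y^2.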

Properties of r
1. The value of r is always between -1 and +1.
2. r is symmetric, i.e. \( r_{xy} = r_{yx} \).
3. The correlation coefficient r is independent of change of origin (adding or subtracting a constant)
and of scale (multiplying or dividing by a positive constant).
x     y      U = (x - 75)/10    V = (y - 75)/10
50    125    -2.5               5.0
60    135    -1.5               6.0
70    120    -0.5               4.5

By property 3, the correlation between x and y equals the correlation between U and V.

4. The correlation coefficient r is the geometric mean of the two regression coefficients,
taking the sign that the regression coefficients share:

\[ r = \pm\sqrt{b_{yx} \times b_{xy}} \]
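
Properties 2 to 4 can be checked numerically on the small x, y table given under property 3. This is a sketch under stated assumptions: the notes do not define the regression coefficients, so the standard definitions b_yx = cov(x, y) / v(x) and b_xy = cov(x, y) / v(y) are assumed here, and all function names are illustrative.

```python
from math import copysign, sqrt


def covariance(x, y):
    """cov(x, y) with the divide-by-n convention; covariance(x, x) is v(x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def pearson_r(x, y):
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


x = [50, 60, 70]
y = [125, 135, 120]

# Property 2: r is symmetric.
r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)

# Property 3: shifting the origin by 75 and dividing by 10 leaves r unchanged.
u = [(a - 75) / 10 for a in x]
v = [(b - 75) / 10 for b in y]
r_uv = pearson_r(u, v)

# Property 4: r is the geometric mean of the regression coefficients
# (assumed definitions), carrying their common sign.
b_yx = covariance(x, y) / covariance(x, x)
b_xy = covariance(x, y) / covariance(y, y)
r_from_b = copysign(sqrt(b_yx * b_xy), b_yx)

# All four values agree at approximately -0.3273.
print(round(r_xy, 4), round(r_yx, 4), round(r_uv, 4), round(r_from_b, 4))
```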

Interpretation of r

1. If r = 0, uncorrelated.
2. If r = +1, perfect positive correlation.
3. If r = -1, perfect negative correlation.
4. If r is close to +1, highly positively correlated.
5. If r is close to -1, highly negatively correlated.
6. If r is close to 0, weakly positively or negatively correlated.
7. If r is close to +0.5 or -0.5, moderately positively or negatively correlated.
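
The rules above can be turned into a small helper. The notes only say "near", so the cut-offs used here (0.75 for high, 0.25 for low) are assumptions chosen for illustration, as is the function name `interpret_r`.

```python
def interpret_r(r, high=0.75, low=0.25, tol=1e-9):
    """Describe a correlation coefficient in words.

    The `high` and `low` cut-offs are illustrative assumptions, not values
    given in the notes.
    """
    if not -1 - tol <= r <= 1 + tol:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)
    direction = "positive" if r > 0 else "negative"

    if size <= tol:
        return "uncorrelated"
    if size >= 1 - tol:
        return f"perfect {direction} correlation"
    if size >= high:
        return f"high {direction} correlation"
    if size >= low:
        return f"moderate {direction} correlation"
    return f"low {direction} correlation"


for value in (0, 1, -1, 0.9, -0.9, 0.5, -0.1):
    print(value, "->", interpret_r(value))
```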
