
PROJECT REPORT

ON
REGRESSION
ANALYSIS
[IMBA]

PRESENTED
BY

NAME: - D.SRIKANTH

ENROLL NO: - 6NI14059


INTRODUCTION:-
Trevor Bull - Managing Director

Mr. Trevor Bull joined Tata AIG Life as Managing Director in January 2006. Prior to this, Trevor was Senior Vice President and General Manager at American International Assurance in Korea.
Tata AIG Life Insurance Company Ltd. and Tata AIG General Insurance Company Ltd. (collectively "Tata AIG") are joint venture companies formed by the Tata Group and American International Group, Inc. (AIG). Tata AIG combines the power and integrity of the Tata Group with AIG's international expertise and financial strength. The Tata Group holds a 74 per cent stake in the two insurance ventures, with AIG holding the balance 26 per cent stake.

Tata AIG Life Insurance Company Ltd. provides insurance solutions to individuals and corporates. Tata AIG Life Insurance Company was licensed to operate in India on February 12, 2001 and started operations in April 2001. Tata AIG Life offers a broad array of life insurance coverage to both individuals and groups, providing various types of add-ons and options on basic life products to give consumers flexibility and choice.

Tata AIG Life Insurance Company offers products in Ahmedabad, Bangalore, Chandigarh, Chennai, Guwahati, Hyderabad, Jaipur, Jamshedpur, Jodhpur, Kochi, Kolkata, Mangalore, Mumbai, New Delhi, Pune, Rajkot, Trichy, Vijayawada and Lucknow.

Objective of the Study

The objective of this study is to apply the regression analysis method to data collected from TATA AIG in the city of Hyderabad.

Questionnaire Development

For the purpose of this study, a structured questionnaire was developed. At this stage, an exploratory study was carried out using personal and focus group interviews.

Collection of Data

The above-mentioned questionnaire was used to collect the primary data. For secondary data, research papers, journals and magazines were consulted.
Regression analysis

In statistics, regression analysis is a collective name for techniques for the modeling and analysis of numerical data consisting of values of a dependent variable (also called a response variable or measurement) and of one or more independent variables (also known as explanatory variables or predictors). The dependent variable in the regression equation is modeled as a function of the independent variables, corresponding parameters ("constants"), and an error term.

The error term is treated as a random variable. It represents unexplained variation in the dependent variable. The parameters are estimated so as to give a "best fit" of the data. Most commonly the best fit is evaluated by using the least squares method, but other criteria have also been used.

Regression can be used for prediction (including forecasting of time-series data), inference, hypothesis testing, and modeling of causal relationships. These uses of regression rely heavily on the underlying assumptions being satisfied. Regression analysis has been criticized as being misused for these purposes in many cases where the appropriate assumptions cannot be verified to hold. One factor contributing to the misuse of regression is that it can take considerably more skill to critique a model than to fit a model.

Underlying assumptions

Classical assumptions for regression analysis include:

- The sample must be representative of the population for the inference/prediction.
- The error is assumed to be a random variable with a mean of zero conditional on the explanatory variables.
- The independent variables are error-free. If this is not so, modeling may be done using errors-in-variables model techniques.
- The predictors must be linearly independent, i.e. it must not be possible to express any predictor as a linear combination of the others. See Multicollinearity.
- The errors are uncorrelated, that is, the variance-covariance matrix of the errors is diagonal and each non-zero element is the variance of the error.
- The variance of the error is constant across observations (homoscedasticity). If not, weighted least squares or other methods might be used.

These are sufficient (but not all necessary) conditions for the least-squares estimator to possess desirable properties; in particular, these assumptions imply that the parameter estimates will be unbiased, consistent, and efficient in the class of linear unbiased estimators. Many of these assumptions may be relaxed in more advanced treatments.

Regression analysis that involves two variables is termed bi-variate linear regression analysis. Regression analysis that involves more than two variables is termed multiple regression analysis.

Bi-variate linear regression analysis involves analyzing the straight-line relationship between two continuous variables. The bi-variate linear regression can be expressed as:

Y = α + βX

Where,

Y represents the dependent variable

X is the independent variable

α and β are two constants which are known as the regression coefficients.

β is the slope coefficient. β can be symbolically represented as ΔY/ΔX:

β = ΔYi/ΔXi = (Yi - Yj)/(Xi - Xj)
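As a quick illustration of the slope formula, the short Python sketch below (using made-up coefficient values, not estimates from this study) shows that the ratio (Yi - Yj)/(Xi - Xj) between any two points lying on the line Y = α + βX recovers β.

```python
# Illustrative sketch: the slope of Y = alpha + beta * X equals the change
# in Y divided by the change in X between any two points on the line.
alpha, beta = 2.0, 1.5          # hypothetical regression coefficients

def line(x):
    return alpha + beta * x     # Y = alpha + beta * X

xi, xj = 4.0, 9.0               # any two distinct X values
slope = (line(xi) - line(xj)) / (xi - xj)
print(slope)                    # prints 1.5, i.e. beta
```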

Least squares method

The method of least squares or ordinary least squares (OLS) is used to solve overdetermined systems. Least squares is often applied in statistical contexts, particularly regression analysis.

Least squares can be interpreted as a method of fitting data. The best fit in the least-squares sense is that instance of the model for which the sum of squared residuals has its least value, a residual being the difference between an observed value and the value given by the model. The method was first described by Carl Friedrich Gauss around 1794.[1] Least squares corresponds to the maximum likelihood criterion if the experimental errors have a normal distribution, and can also be derived as a method of moments estimator. Regression analysis is available in most statistical software packages.
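As a minimal sketch of what solving an overdetermined system by least squares looks like in practice (the data and the use of NumPy here are illustrative assumptions, not part of the study), the snippet below builds a design matrix with an intercept column and solves for the line that minimises the sum of squared residuals.

```python
import numpy as np

# Made-up observations for illustration: more equations (rows) than unknowns,
# i.e. an overdetermined system  a + b*x_i ≈ y_i
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

A = np.column_stack([np.ones_like(x), x])      # design matrix with intercept column
coeffs, residual_ss, rank, _ = np.linalg.lstsq(A, y, rcond=None)
a, b = coeffs
print(f"intercept a = {a:.3f}, slope b = {b:.3f}, SSE = {residual_ss[0]:.3f}")
```

np.polyfit(x, y, 1) would give the same slope and intercept; the explicit design-matrix form is shown only to make the overdetermined system visible.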

The relationship between the amount spent on advertisement per month and the number of customers who visited because of the advertisements given by TATA AIG Life Insurance Co. can be examined with this method.

The equation of the regression line estimated by least squares is shown below:

Y = a + bX + ei

Where,

Y is the dependent variable

X is the independent variable

a is the Y intercept

b is the slope of the line

The table below shows the amount spent on advertisement and the number of customers who visited because of the advertisement.

MONTH | AMOUNT SPENT ON ADVERTISING (IN CRORES) [X] | NO. OF CUSTOMERS VISITED (IN 000S) [Y]
JAN | 3.6 | 9.3
FEB | 4.8 | 10.2
MAR | 2.4 | 9.7
APR | 7.2 | 11.5
MAY | 6.9 | 12
JUN | 8.4 | 14.2
JUL | 10.7 | 18.6
AUG | 11.2 | 28.4
SEP | 6.1 | 13.2
OCT | 7.9 | 10.8
NOV | 9.5 | 22.7
DEC | 5.4 | 12.3

The constant b can be calculated using the formula:

b = [nΣXY - (ΣX)(ΣY)] / [nΣX² - (ΣX)²]

Where,

X is the independent variable

Y is the dependent variable

n is the number of observations

a is calculated as shown below:

a = Ȳ - bX̄

Where,

Ȳ = the mean value of the dependent variable

X̄ = the mean value of the independent variable
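These two formulas map directly onto code. The sketch below is a plain-Python illustration (the function name fit_line is ours, not from any library) of computing b and then a from paired observations.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line Y = a + bX, using
    b = (n*SumXY - SumX*SumY) / (n*SumX2 - (SumX)^2) and a = Ybar - b*Xbar."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)

    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * (sum_x / n)   # a = Ybar - b * Xbar
    return a, b
```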


ei is the error term, also called the residual.

The criterion for the least squares method is to minimize

Σ ei²   (summed over i = 1 to n)

Where,

ei = Yi - Ŷi

Yi is the actual value of the dependent variable

Ŷi is the corresponding value lying on the estimated regression line.
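To make the criterion concrete, the following sketch (with made-up data, not this study's observations) computes the sum of squared residuals for the least-squares line and for a slightly perturbed slope; the perturbed line always gives a larger sum.

```python
# Illustrative data only (not the study's observations)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.8, 8.3, 9.9]

def sse(a, b):
    """Least-squares criterion: sum of squared residuals e_i = Y_i - (a + b*X_i)."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Closed-form least-squares coefficients for comparison
n = len(x)
b_ls = ((n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y))
        / (n * sum(xi * xi for xi in x) - sum(x) ** 2))
a_ls = sum(y) / n - b_ls * sum(x) / n

print(sse(a_ls, b_ls))          # the minimum of the criterion
print(sse(a_ls, b_ls + 0.3))    # any other slope gives a larger sum of squares
```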
Let us solve the example previously discussed using the least squares method.

We need to determine the constants a and b to develop the regression equation. The required calculations for determining the constants are shown in the table below.

AMOUNT SPENT ON ADVERTISING (IN CRORES) [X] | NO. OF CUSTOMERS VISITED (IN 000S) [Y] | XY | X²
3.6 | 9.3 | 33.48 | 12.96
4.8 | 10.2 | 48.96 | 23.04
2.4 | 9.7 | 23.28 | 5.76
7.2 | 11.5 | 82.8 | 51.84
6.9 | 12 | 82.8 | 47.61
8.4 | 14.2 | 119.28 | 70.56
10.7 | 18.6 | 199.02 | 114.49
11.2 | 28.4 | 318.08 | 125.44
6.1 | 13.2 | 80.52 | 37.21
7.9 | 10.8 | 85.32 | 62.41
9.5 | 22.7 | 215.65 | 90.25
5.4 | 12.3 | 66.42 | 29.16
ΣX = 84.1 | ΣY = 172.9 | ΣXY = 1355.61 | ΣX² = 670.73

b = [12(1355.61) - (84.1)(172.9)] / [12(670.73) - (84.1)²]

  = 1.768
The next step is to calculate a. To calculate the value of a, we first need to determine the means of the variables X and Y:

X̄ = 84.1/12 = 7.0

Ȳ = 172.9/12 = 14.40

Substituting these values in the equation:

a = 14.40 - (1.768)(7.0)
  = 14.40 - 12.39
  = 2.01
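The hand calculation can be verified with a short script. The sketch below uses the twelve (X, Y) pairs from the table above; because the manual steps round intermediate values, the computed figures come out at roughly b ≈ 1.769 and a ≈ 2.01, in line with the report's b = 1.768 and a = 2.01.

```python
# Advertising spend (in crores) and customers visited (in 000s), Jan-Dec
x = [3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4]
y = [9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3]

n = len(x)
sum_x, sum_y = sum(x), sum(y)                       # 84.1 and 172.9
sum_xy = sum(xi * yi for xi, yi in zip(x, y))       # 1355.61
sum_x2 = sum(xi ** 2 for xi in x)                   # 670.73

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)
print(f"b = {b:.3f}, a = {a:.3f}")                  # roughly b = 1.769, a = 2.011
```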

We now develop the estimated regression equation by substituting the values of a and b into the equation:

Ŷ = 2.01 + 1.768X

Ŷ represents the estimated value of the dependent variable for a given value of X.
The Strength of Association: R²

R² can be calculated using the following formula:

R² = explained variance / total variance

Total variance = explained variance + unexplained variance

Explained variance = total variance - unexplained variance

Therefore,

R² = (total variance - unexplained variance) / total variance

R² = 1 - unexplained variance / total variance

The unexplained variance is given by Σ(Yi - Ŷi)²

The total variance is given by Σ(Yi - Ȳ)²

R² = 1 - Σ(Yi - Ŷi)² / Σ(Yi - Ȳ)²

X | Y | XY | X² | Ŷ | Y - Ŷ | (Y - Ŷ)² | (Ŷ - Ȳ)² | (Y - Ȳ)²
3.6 | 9.3 | 33.48 | 12.96 | 8.3748 | 0.9252 | 0.8560 | 36.3030 | 26.01
4.8 | 10.2 | 48.96 | 23.04 | 10.4964 | -0.2964 | 0.0879 | 15.2381 | 17.64
2.4 | 9.7 | 23.28 | 5.76 | 6.2532 | 3.4468 | 11.8804 | 66.3704 | 22.09
7.2 | 11.5 | 82.8 | 51.84 | 14.7396 | -3.2396 | 10.4950 | 0.1153 | 8.41
6.9 | 12 | 82.8 | 47.61 | 14.2092 | -2.2092 | 4.8806 | 0.0364 | 5.76
8.4 | 14.2 | 119.28 | 70.56 | 16.8612 | -2.6612 | 7.0820 | 6.0575 | 0.04
10.7 | 18.6 | 199.02 | 114.49 | 20.9276 | -2.3276 | 5.4177 | 42.6096 | 17.64
11.2 | 28.4 | 318.08 | 125.44 | 21.8116 | 6.5884 | 43.4070 | 54.9318 | 196
6.1 | 13.2 | 80.52 | 37.21 | 12.7948 | 0.4052 | 0.1642 | 2.5767 | 1.44
7.9 | 10.8 | 85.32 | 62.41 | 15.9772 | -5.1772 | 26.8034 | 2.4876 | 12.96
9.5 | 22.7 | 215.65 | 90.25 | 18.806 | 3.894 | 15.1632 | 19.4128 | 68.89
5.4 | 12.3 | 66.42 | 29.16 | 11.514 | 0.786 | 0.6178 | 8.329 | 4.41

ΣX = 84.1, ΣY = 172.9, ΣXY = 1355.61, ΣX² = 670.73
Σ(Y - Ŷ)² = 126.855, Σ(Ŷ - Ȳ)² = 254.468, Σ(Y - Ȳ)² = 381.29
X̄ = 7.0, Ȳ = 14.40

Therefore,

R² = 1 - Σ(Yi - Ŷi)² / Σ(Yi - Ȳ)²

   = 1 - 126.855/381.29

   = 1 - 0.33

   = 0.67

   = 67%
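As a check on this figure, the short sketch below recomputes R² from the twelve observations and the fitted line Ŷ = 2.01 + 1.768X (the rounded coefficients from the report); it also arrives at approximately 0.67.

```python
x = [3.6, 4.8, 2.4, 7.2, 6.9, 8.4, 10.7, 11.2, 6.1, 7.9, 9.5, 5.4]
y = [9.3, 10.2, 9.7, 11.5, 12.0, 14.2, 18.6, 28.4, 13.2, 10.8, 22.7, 12.3]

a, b = 2.01, 1.768                                  # coefficients from the report
y_hat = [a + b * xi for xi in x]                    # estimated values on the line
y_bar = sum(y) / len(y)                             # mean of the observed Y values

unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # sum of (Yi - Yhat_i)^2
total = sum((yi - y_bar) ** 2 for yi in y)                      # sum of (Yi - Ybar)^2

r_squared = 1 - unexplained / total
print(f"R^2 = {r_squared:.2f}")                     # close to 0.67
```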
Conclusion

This implies that, of the total variation in Y, nearly 67% is explained by the variation in X.

Hence there is a strong linear relationship between the two variables.
