The coefficient of determination, R2, is used in the context of statistical models whose
main purpose is the prediction of future outcomes on the basis of other related
information. It is the proportion of variability in a data set that is accounted for by the
statistical model. It provides a measure of how well future outcomes are likely to be
predicted by the model. R2 measures the strength of the relationship between the
dependent and independent variables, or the strength of the fit between the data and
the regression model; for simple linear regression, it equals the square of the sample
correlation coefficient between the two variables.

There are several different definitions of R2, which are only sometimes equivalent.
One class of cases in which they agree is linear regression. In this case, R2 is simply the
square of the sample correlation coefficient between the outcomes and their predicted
values, or in the case of simple linear regression, between the outcome and the values
being used for prediction. In such cases, the values vary from 0 to 1. Important cases
where the computational definition of R2 can yield negative values, depending on the
definition used, arise where the predictions being compared to the corresponding
outcomes have not been derived from a model-fitting procedure using those data.
The coefficient of determination (also denoted by r2) is a key output of regression
analysis. It is interpreted as the proportion of the variance in the dependent variable
that is predictable from the independent variable.
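
The equivalence for simple linear regression can be checked numerically. The sketch below assumes the common definition R2 = 1 - SS_res / SS_tot (one of the several definitions mentioned above) and uses a small made-up data set; the function names and the numbers are illustrative only.

```python
from statistics import mean


def r_squared(observed, predicted):
    """R2 = 1 - SS_res / SS_tot (assumed definition)."""
    y_bar = mean(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correlation(x, y):
    """Sample (Pearson) correlation coefficient."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up illustrative data, not taken from the text.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
predicted = [intercept + slope * a for a in x]

r2 = r_squared(y, predicted)
r = correlation(x, y)
print(r2, r ** 2)  # the two values agree up to rounding error
```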

R2 is a statistic that gives some information about the goodness of fit of a model.
In regression, the R2 coefficient of determination is a statistical measure of how well
the regression line approximates the real data points. An R2 of 1.0 indicates that the
regression line perfectly fits the data. Values of R2 outside the range 0 to 1 can occur
where it is used to measure the agreement between observed and modeled values,
where the "modeled" values are not obtained by linear regression, and depending on
which formulation of R2 is used.
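
A small sketch of how a value below 0 can arise, again assuming the formulation R2 = 1 - SS_res / SS_tot. The "external" predictions here are hypothetical values that were not fitted to the observations, which is the situation described above.

```python
from statistics import mean


def r_squared(observed, predicted):
    """R2 = 1 - SS_res / SS_tot (assumed formulation)."""
    y_bar = mean(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Made-up observations and predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
external = [5.0, 4.0, 3.0, 2.0, 1.0]          # not fitted to the observations
mean_only = [mean(observed)] * len(observed)  # always predicts the mean

r2_external = r_squared(observed, external)   # SS_res = 40, SS_tot = 10
r2_mean = r_squared(observed, mean_only)      # SS_res = SS_tot = 10

print(r2_external)  # -3.0
print(r2_mean)      # 0.0
```

Under this formulation, a negative value simply means the predictions do worse than always predicting the mean of the observations.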

R2 does not tell whether the independent variables are a true cause of the changes in
the dependent variable, whether the correct regression was used, or whether the most
appropriate set of independent variables has been chosen.

R2 is a statistical measure of relative variation that describes the variation in one
value that occurs in proportion to variation in another.

Adjusted R2

Adjusted R2 (sometimes written as R̄2) is a modification of R2 that adjusts for the
number of explanatory terms in a model. Unlike R2, the adjusted R2 increases only if
the new term improves the model more than would be expected by chance. The
adjusted R2 can be negative, and will always be less than or equal to R2.
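
As a minimal sketch, the usual formula for the adjustment is 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of explanatory terms. The formula and the parameter names n and p are not given in the text above and are assumed here; the input values are illustrative.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1).

    n: number of observations, p: number of explanatory terms
    (names assumed for this sketch).
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Illustrative values only.
adj_good = adjusted_r_squared(0.50, n=50, p=3)
adj_weak = adjusted_r_squared(0.02, n=50, p=3)

print(adj_good)  # slightly below 0.50
print(adj_weak)  # below zero: a weak fit is penalised for its 3 terms
```

Because (n - 1)/(n - p - 1) is at least 1, the adjusted value can never exceed R2, and it drops below 0 when R2 is small relative to the number of terms.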

Relationship between F and R2

The relationship between the coefficient of determination R2 and the F statistic is
one-to-one. Both can be used as measures of goodness of fit, but both should be
interpreted in relation to the degrees of freedom.
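
For reference, the mapping can be written out explicitly. This is the standard identity for the overall F test of a linear regression with an intercept; the symbols k (number of independent variables) and n (number of observations) are notation assumed here.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
\qquad\Longleftrightarrow\qquad
R^2 = \frac{k F}{k F + (n - k - 1)}
```

Here F has k numerator and n - k - 1 denominator degrees of freedom, so the same R2 corresponds to different F values as k or n changes.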

Take the regression example from the article, with 3 independent variables and a
sample of 50 observations. The F distribution then has 3 numerator and 46
denominator degrees of freedom, and at the 5% level of significance its critical value
is 2.8068, which corresponds to R2 > 0.1547 (equivalently, adjusted R2 > 0.0996).
This is the minimum value of R2 required at the 5% significance level. If the sample
size increases or the number of independent variables decreases, less explanatory
power is required to achieve a given significance level. Significance indicates that a
relationship between the independent and dependent variables is unlikely to be due
to chance, but not that the relationship is strong.
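
The threshold in this example can be reproduced with a few lines of arithmetic. The sketch assumes the standard identity R2 = kF / (kF + n - k - 1) and the usual adjusted R2 formula, and takes the critical value 2.8068 from the text rather than computing it. Working the numbers through, that critical F maps to an R2 of about 0.1547, while 0.0996 is the corresponding adjusted R2.

```python
def r2_from_f(f, k, n):
    """R2 implied by an overall F statistic with k and n - k - 1 d.f."""
    df2 = n - k - 1
    return k * f / (k * f + df2)


def f_from_r2(r2, k, n):
    """Inverse mapping: overall F statistic implied by R2."""
    df2 = n - k - 1
    return (r2 / k) / ((1.0 - r2) / df2)


def adjusted_r_squared(r2, n, k):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


k, n = 3, 50       # independent variables and observations, from the example
f_crit = 2.8068    # 5% critical value of F(3, 46), as quoted in the text

r2_min = r2_from_f(f_crit, k, n)
adj_min = adjusted_r_squared(r2_min, n, k)

print(round(r2_min, 4))   # 0.1547
print(round(adj_min, 4))  # 0.0996
```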

Even so, studies do conclude that regression is the best way to learn about the
relationship between variables and to obtain statistically significant answers.