
Demand Estimation by Using Regression Analysis

Regression analysis is a statistical method used to establish a relationship between one variable (the dependent variable) and the other factors that affect it (the independent variables). This relationship can be expressed in functional form:

$Q = a_0 + a_1 A + a_2 B + a_3 C$

Demand estimation for a product or service using regression analysis is important in the business world, especially to corporate executives and managers, because it enables them to make reasonable forecasts for their goods and services in the near future. Managers can narrow down the factors that most influence their sales and then formulate appropriate strategies or policies to achieve their management objectives.

The actual process of regression analysis can be very complex, but it can be summarized in FOUR important steps:

1. Model specification: set the objective and identify the important variables that influence the dependent variable.
2. Data collection for all the variables specified.
3. Choice of a functional form, e.g. linear or non-linear.
4. Estimation and interpretation of results.

1. Model Specification

If we want to study the factors affecting the demand for automobiles (Qx) in the country, we must identify the most important variables that are believed to affect that demand, e.g.

a) Price of the automobile (Px)
b) Per capita income (Yc)
c) Number of people in the working population (L)
d) Rate of interest (I), etc.

$Q_x = f(P_x, Y_c, L, I, \ldots)$

2. Data Collection

Data must be collected on each of the variables. There are two types of data:

a) Time-series data: data are collected for each variable over time (yearly, quarterly, monthly, daily, etc.).
b) Cross-sectional data: data are collected for the same time period but from different sections or geographical areas of society.

The type of data used depends on the availability of data:

a) Primary data: data collected from the field through market surveys, sampling, etc.
b) Secondary data: data published by a relevant authority, such as the Statistical Department, economic reports, etc.

3. Specifying the Form of the Equation

i) Linear model. The simplest model to deal with, and often also the most realistic, is the linear model, e.g.

$Q_x = a_0 + a_1 P_x + a_2 Y_c + a_3 L + a_4 I + \ldots + e$

where a0, a1, ..., a4 are parameters (coefficients) to be estimated and e is the disturbance (error) term.

ii) Non-linear model. Sometimes a non-linear form may fit the data better than a linear equation, e.g. the power function:

$Q_x = a_0 P_x^{a_1} Y_c^{a_2} L^{a_3} I^{a_4}$

4. Testing the (Econometric) Results

To evaluate the regression results, several statistics are examined:

a) The sign of each estimated coefficient, which must be checked to see whether it conforms to what is expected on theoretical grounds
b) The coefficient of determination, $R^2$
c) The t-tests on the coefficients
d) The Durbin-Watson statistic, etc.
e) The F-statistic (F-stat)

Note: The statistical procedure for solving multiple regression problems can be very complicated. Fortunately, many software packages are available for the purpose, e.g. TSP (Time Series Processor) or SPSS.
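One common way to estimate the power function in 3(ii) with linear regression tools is to take natural logarithms of both sides, which makes the equation linear in the parameters. The following is a sketch of that transformation, assuming all variables are strictly positive. The multiplicative error term $e^{u}$ is an assumption added here for illustration and does not appear in the notes above.

```latex
\begin{align*}
Q_x     &= a_0 \, P_x^{a_1} \, Y_c^{a_2} \, L^{a_3} \, I^{a_4} \, e^{u} \\
\ln Q_x &= \ln a_0 + a_1 \ln P_x + a_2 \ln Y_c + a_3 \ln L + a_4 \ln I + u
\end{align*}
```

In this log-linear form, each exponent can be read as an elasticity: $a_1$, for instance, is the percentage change in quantity demanded associated with a one percent change in price.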

REGRESSION ANALYSIS

Regression analysis describes the way in which one variable is related to another. It derives an equation that can be used to estimate the unknown values of one variable on the basis of known values of another variable.

(a) Simple Regression Analysis

$Y = a + bX$

where Y is sales volume and X is advertising expenditure.

Example 1 (taken from ECO556 Manual, Table 4.1, page 136)

Year    Sales (Y)            Advertising Expenditure (X)
        (million dollars)    (million dollars)
1997    44                   10
1998    58                   13
1999    48                   11
2000    46                   12
2001    42                   11
2002    60                   15
2003    52                   12
2004    54                   13
2005    56                   14
2006    40                    9

The result from the computer printout:

LS // Dependent variable is SAL
SMPL range: 1986 - 1995
Number of observations: 10

Variable    Coefficient    Std. Error    T-Stat       2-Tail Sig.
C           7.6000000      6.332345      1.2001912    0.264
ADV         3.5333333      0.5222813     6.751919     0.000

R-squared             0.851212     Mean of dependent var    50.00000
Adjusted R-squared    0.832614     S.D. of dependent var    6.992059
S.E. of regression    2.860653     Sum of squared resid     65.46667
Durbin-Watson stat    1.224915     F-statistic              45.76782
Log likelihood       -23.58417

$\hat{Y} = \hat{a} + \hat{b}X \;\Rightarrow\; \hat{Y} = 7.6 + 3.53X$
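The coefficients in the printout above can be checked by hand. The following is a minimal sketch, assuming the printout was produced by ordinary least squares (suggested by the "LS" header), in which the slope is the sum of cross-products of deviations divided by the sum of squared deviations of X. The variable names are illustrative only.

```python
# Ordinary least squares for Y = a + bX, using the Example 1 data.
adv = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]       # X: advertising expenditure
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]    # Y: sales

n = len(adv)
mean_x = sum(adv) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products of deviations from the means.
s_xx = sum((x - mean_x) ** 2 for x in adv)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(adv, sales))

b = s_xy / s_xx             # slope: marginal effect of advertising on sales
a = mean_y - b * mean_x     # intercept

print(f"Y-hat = {a:.2f} + {b:.2f} X")
```

With the Example 1 data this gives an intercept of about 7.6 and a slope of about 3.53, in line with the printout.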

(b) Multiple Regression Analysis

$Y = a_1 + b_1 X_1 + b_2 X_2$

where
Y is sales volume
X1 is advertising expenditure
X2 is the price of the product
a1 is the intercept
b1 is $\Delta Y / \Delta X_1$, the marginal effect of advertising on sales
b2 is $\Delta Y / \Delta X_2$, the marginal effect of price on sales

Example 2 (taken from ECO556 Manual, Table 4.3, page 141)

Year    Sales (Y)            Advertising Expenditure (X1)    Price (X2)
        (million dollars)    (million dollars)               (million dollars)
1997    44                   10                              1.0
1998    58                   13                              1.2
1999    48                   11                              2.0
2000    46                   12                              1.8
2001    42                   11                              2.1
2002    60                   15                              0.8
2003    52                   12                              1.4
2004    54                   13                              2.0
2005    56                   14                              1.5
2006    40                    9                              1.0

The result from the computer printout:

LS // Dependent variable is SAL
SMPL range: 1986 - 1995
Number of observations: 10

Variable    Coefficient    Std. Error    T-Stat        2-Tail Sig.
C           11.60403       6.9633945     1.6665152     0.140
ADV         3.4936051      0.5078770     6.8788413     0.000
P           -2.3836921     1.9495316     -1.2226999    0.261

R-squared             0.877397     Mean of dependent var    50.00000
Adjusted R-squared    0.842367     S.D. of dependent var    6.992059
S.E. of regression    2.776058     Sum of squared resid     53.94549
Durbin-Watson stat    1.41         F-statistic              25.04734

$\hat{Y} = \hat{a}_1 + \hat{b}_1 X_1 + \hat{b}_2 X_2 \;\Rightarrow\; \hat{Y} = 11.60 + 3.49X_1 - 2.38X_2$
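The two-regressor estimates above can also be reproduced without a statistics package. This is a rough sketch, assuming ordinary least squares and using the closed-form solution of the normal equations for two independent variables. The helper names `mean` and `cross` are hypothetical and introduced only for this illustration.

```python
# OLS for Y = a1 + b1*X1 + b2*X2, using the Example 2 data.
adv = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]                  # X1
price = [1.0, 1.2, 2.0, 1.8, 2.1, 0.8, 1.4, 2.0, 1.5, 1.0]     # X2
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]               # Y


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))


s11, s22, s12 = cross(adv, adv), cross(price, price), cross(adv, price)
s1y, s2y = cross(adv, sales), cross(price, sales)

# Solve the 2x2 normal equations in deviation form (Cramer's rule).
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
a1 = mean(sales) - b1 * mean(adv) - b2 * mean(price)

# Sum of squared residuals, for comparison with the printout.
fitted = [a1 + b1 * x1 + b2 * x2 for x1, x2 in zip(adv, price)]
ssr = sum((y - f) ** 2 for y, f in zip(sales, fitted))

print(f"Y-hat = {a1:.2f} + ({b1:.2f}) X1 + ({b2:.2f}) X2, SSR = {ssr:.3f}")
```

For more than two independent variables the same normal equations are usually solved in matrix form, which is what regression software does internally.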

Evaluation of Results (Computer Printouts)

The following important statistics should be interpreted:

a. The sign of each estimated coefficient
b. Coefficient of determination ($R^2$)
c. Standard error of estimate (Se)
d. The t-statistics (t-stats)
e. The Durbin-Watson statistic
f. The F-statistic (F-stat)

Interpretation:

a. The sign of each estimated coefficient must be checked to see whether it conforms to what is expected on theoretical grounds. From Example 1:

$\hat{Y} = 7.6 + 3.53X$

The estimated coefficient on X is positive (+3.53), so it conforms to what economic theory leads us to expect. If we spend an additional $1 million on advertising (X), sales (Y) will increase by $3.53 million.

b. Coefficient of determination ($R^2$)

The value of $R^2$ ranges from 0 to 1:

$R^2 = 0$: none of the changes in the dependent variable are explained by the independent variables.
$R^2 = 1$: all of the changes in the dependent variable are explained by the variation in the independent variables.
$R^2 = 0.85$: 85% of the changes in the dependent variable (sales) are explained by the variation in the independent variable, advertising expenditure. The other 15% cannot be explained by the regression. This may be due to the omission of some important independent variables.
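To make the 0.85 figure concrete, the sketch below recomputes it for Example 1 as one minus the ratio of the unexplained variation (sum of squared residuals, taken from the printout) to the total variation of sales around its mean. The adjusted value uses the usual degrees-of-freedom correction, which is an assumption about how the printout computes it.

```python
# R-squared and adjusted R-squared for Example 1.
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]   # Y from Table 4.1
ssr = 65.46667      # sum of squared residuals, from the printout
n = len(sales)      # number of observations
k = 2               # coefficients estimated (intercept and slope)

mean_y = sum(sales) / n
tss = sum((y - mean_y) ** 2 for y in sales)   # total variation in Y

r_squared = 1 - ssr / tss
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)

print(f"R-squared = {r_squared:.6f}, adjusted = {adj_r_squared:.6f}")
```

The adjusted figure is slightly lower because it penalizes each additional coefficient estimated, which matters when comparing Example 1 with Example 2.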

c. Standard error of estimate (Se)

It is a measure of the dispersion of the data points around the line of best fit (regression line). Actual points do not lie on the regression line but are dispersed above and below it, so a value predicted by the regression line is subject to error. Se measures the probable error in the predicted value.

For example, in the data of Table 4.1, when advertising expenditure is $9 million, actual sales are $40 million. The regression equation predicts sales of $39.37 million, so the predicted value is in error.

The standard error of estimate is calculated with the following formula:

$$S_e = \sqrt{\frac{\sum_{t=1}^{n} (Y_t - \hat{Y}_t)^2}{n - k}}$$

Se is useful for estimating the range within which the dependent variable will lie at a specified probability. At 95% probability, the dependent variable will lie in the interval

$$\hat{Y} \pm t_{n-k} \cdot S_e$$

where $\hat{Y}$ is the value of the dependent variable predicted by the regression, n - k is the degrees of freedom (df) used to look up the critical value of Student's t distribution, n is the number of observations and k is the number of coefficients estimated.

Example: Se = 2.8 and $\hat{Y}$ = 39.37 when advertising expenditure (X) = 9. The 95% confidence interval for sales is then

$$\hat{Y} \pm t_{n-k} \cdot S_e = 39.37 \pm (2.306)(2.8) = 39.37 \pm 6.457$$

Thus, at 95% confidence, when advertising expenditure is $9 million, sales range from $32.913 million to $45.827 million.

d. t-statistics

The t-statistic is used in the t-test to determine whether there is a significant relationship between the dependent variable and each independent variable. To carry out the test, we need the standard error of the coefficient (Sb) to calculate the t value. We then compare the calculated t value with the critical t value from the Student's t distribution table.

The t value is calculated by dividing the coefficient (b) by Sb:

$$t_{\text{calculated}} = \frac{b}{S_b}$$

i.e.

$$t_{\text{calculated}} = \frac{3.53}{0.52} = 6.79$$

To find the critical value from the Student's t distribution table: df = n - k = 10 - 2 = 8, and at the 95% confidence level the critical t = 2.306.

Since the computed t (6.79) > critical t (2.306), advertising expenditure is statistically significant in explaining the variation in sales at the 95% confidence level.

Note: if there is more than one independent variable, the significance test must be carried out for each independent variable.
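The standard error, interval and t-test calculations above can be strung together in a few lines. This is a sketch only: the critical value 2.306 is copied from the notes rather than computed, and Se is left unrounded here, so the interval comes out slightly wider than the one obtained with Se = 2.8.

```python
import math

# Inputs taken from the Example 1 printout and the notes.
ssr = 65.46667        # sum of squared residuals
n, k = 10, 2          # observations, coefficients estimated
a, b = 7.6, 3.53      # estimated intercept and slope (rounded as in the notes)
s_b = 0.5222813       # standard error of the slope coefficient
t_critical = 2.306    # Student's t, 8 df, 95% confidence (from the t table)

# Standard error of estimate: square root of SSR over degrees of freedom.
s_e = math.sqrt(ssr / (n - k))

# Predicted sales and interval when advertising expenditure is 9.
y_hat = a + b * 9
margin = t_critical * s_e
interval_low, interval_high = y_hat - margin, y_hat + margin

# t-test on the advertising coefficient.
t_calculated = b / s_b
is_significant = abs(t_calculated) > t_critical

print(f"Se = {s_e:.4f}, Y-hat = {y_hat:.2f}")
print(f"interval = ({interval_low:.3f}, {interval_high:.3f})")
print(f"t = {t_calculated:.2f}, significant: {is_significant}")
```

The actual sales figure of 40 for that year falls inside the computed interval, and the t-test reaches the same conclusion as the hand calculation.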

e. Durbin-Watson statistic

It indicates the presence or absence of autocorrelation, a problem that can arise in regression analysis with time-series data. There are three situations in which autocorrelation or multicollinearity problems can arise:

- when independent variables are interrelated or duplicated (the multicollinearity case);
- when independent variables have been mis-specified;
- when important independent variables are missing.
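The Durbin-Watson figure in the Example 1 printout can be reproduced from the residuals. The sketch below assumes the standard definition of the statistic, the sum of squared changes in successive residuals divided by the sum of squared residuals, and uses the unrounded slope 106/30 so that the result lines up with the printout.

```python
# Durbin-Watson statistic for the Example 1 regression.
adv = [10, 13, 11, 12, 11, 15, 12, 13, 14, 9]       # X, in time order
sales = [44, 58, 48, 46, 42, 60, 52, 54, 56, 40]    # Y, in time order

a, b = 7.6, 106 / 30    # Example 1 coefficients (slope unrounded)

# Residual for each year: actual minus fitted sales.
residuals = [y - (a + b * x) for x, y in zip(adv, sales)]

# Squared changes between successive residuals, over squared residuals.
numerator = sum(
    (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
)
denominator = sum(e ** 2 for e in residuals)
durbin_watson = numerator / denominator

print(f"Durbin-Watson = {durbin_watson:.6f}")
```

As a rule of thumb, a value near 2 suggests no first-order autocorrelation, values toward 0 suggest positive autocorrelation and values toward 4 suggest negative autocorrelation. A formal conclusion requires comparing the statistic with tabulated lower and upper bounds.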

f. F-statistic

It is another test of the overall explanatory power of the regression. (Refer to page 147 of the manual.)
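The F-statistics in both printouts can be recovered from the reported R-squared values. This is a sketch assuming the usual relationship between the two, with k counting all estimated coefficients including the intercept. The function name `f_statistic` is hypothetical.

```python
def f_statistic(r_squared, n, k):
    """Explained variation per regressor over unexplained variation per df."""
    explained = r_squared / (k - 1)
    unexplained = (1 - r_squared) / (n - k)
    return explained / unexplained


# R-squared values as reported in the two printouts.
f_example_1 = f_statistic(0.851212, n=10, k=2)   # simple regression
f_example_2 = f_statistic(0.877397, n=10, k=3)   # multiple regression

print(f"Example 1: F = {f_example_1:.2f}")
print(f"Example 2: F = {f_example_2:.2f}")
```

Each value is then compared with the critical F value for (k - 1, n - k) degrees of freedom from an F table. A computed F above the critical value indicates that the independent variables, taken together, explain a significant share of the variation in sales.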

----end of short notes on demand estimation----
