
WELCOME

Regression Cost Model


Introduction

Regression analysis

A statistical method by which estimates are made of the value of a variable from a knowledge of the values of one or more other variables, and by which the errors involved in this estimating process are measured.
Normally used in situations where the relationship between the variables is not unique.
Main types:
o Simple Linear Regression Analysis
o Multiple Linear Regression Analysis

Assumptions
The standard deviation of the error associated with the dependent variable (cost) remains constant throughout the domain.
This error is normally distributed.
The effect of any variable is always expressed as a fixed cost increase or decrease, irrespective of project size or type.

Simple Linear Regression Analysis

Two-variable linear regression describes the relationship between two variables by computing a straight line through the data obtained.
o Dependent variable (y): the value to be estimated
o Independent variable (x): the factor from which the estimates are made
o Constant (a): the value of y when the independent variable is zero
o The coefficient of x (b): the slope of the line

Expression for a straight line

y = a + bx

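[Figure: the line y = a + bx plotted with the dependent variable on the vertical axis and the independent variable on the horizontal axis. The intercept is a, and the slope b is the tangent of the angle the line makes with the horizontal axis.]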

Prediction within the range of values in the dataset is known as interpolation.
Prediction outside this range of the data is known as extrapolation.
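As a minimal sketch of the straight-line model above, the constant a and the coefficient b can be estimated by least squares, and each prediction labelled as interpolation or extrapolation. The function names `fit_line` and `predict` and the data values are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a (intercept) and b (slope) in y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x, xs):
    """Predict y at x and say whether x lies inside the observed range."""
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return a + b * x, kind


# Hypothetical data lying exactly on y = 2 + 3x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 8.0, 11.0, 14.0]
a, b = fit_line(xs, ys)

inside = predict(a, b, 2.5, xs)    # x within the data range
outside = predict(a, b, 10.0, xs)  # x beyond the data range
```

Extrapolated values deserve more caution than interpolated ones, since nothing in the data confirms that the straight line continues to hold outside the observed range.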

Steps of SLR Model


Specification

Begins with theoretical reasoning on the relationship between variables.
Form equations to represent the relationships between variables.
Since the population parameters are unknown, a sample is considered and the model is built with estimated values.

Estimation

The least squares estimation procedure is used most of the time.
Includes a series of statistical tests to make sure that the estimated model is a good representation of the postulated relationship.


Validation

Evaluate the quality of the model.
Evaluated on the basis of the following statistics:
o Coefficient of determination
o Standard error
o F-ratio test: the ratio of the regression mean square to the residual mean square
o t-ratio test: the ratio of the coefficient to its standard error
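The four validation statistics listed above can be sketched for a simple linear regression as follows. This is an illustrative calculation only: the function name `slr_statistics` and the data are hypothetical, and the degrees of freedom (1 for the regression, n - 2 for the residual) are the standard ones for a two-parameter straight-line fit.

```python
import math


def slr_statistics(xs, ys):
    """Validation statistics for a least-squares fit of y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x

    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_reg = ss_total - ss_resid

    resid_ms = ss_resid / (n - 2)  # residual mean square
    reg_ms = ss_reg / 1            # regression mean square (1 degree of freedom)
    return {
        "r_squared": ss_reg / ss_total,             # coefficient of determination
        "std_error": math.sqrt(resid_ms),           # standard error of estimate
        "f_ratio": reg_ms / resid_ms,               # regression MS / residual MS
        "t_ratio": b / math.sqrt(resid_ms / sxx),   # slope / its standard error
    }


# Hypothetical data with some scatter about the line
stats = slr_statistics([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

In the single-variable case the F ratio equals the square of the t ratio of the slope, so the two tests carry the same information; they differ only once several explanatory variables are present.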

Forecasting

Forecasts should be satisfactory to the users.
The accuracy required depends on the amount of error that is acceptable for the model.

Multiple Linear Regression Analysis

This aims to relate the dependent variable to several independent variables.

Dependent / Response variable: y
Independent / Explanatory variables: x1, x2, x3, ..., xn

y = a + b1x1 + b2x2 + b3x3 + ... + bnxn + e
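A minimal sketch of how the coefficients a, b1, ..., bn in the expression above might be estimated by least squares, by forming and solving the normal equations. The helper names `solve` and `fit_mlr` and the data are hypothetical; statistical packages and spreadsheets use more robust numerical routines for the same purpose.

```python
def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def fit_mlr(rows, ys):
    """rows: one [x1, ..., xn] list per observation. Returns [a, b1, ..., bn]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the constant a
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 with no error term
rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coeffs = fit_mlr(rows, ys)  # approximately [1, 2, 3]
```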

Steps of MLR Model

Specification

Begins with theoretical reasoning on the relationship between variables.
Selecting a full set of explanatory (independent) variables.

Estimation

Determining the correlation coefficients between all possible pairs.
Resolving multicollinearity.
Eliminating non-significant variables one at a time until all the remaining variables are significant.
o Use of the t-ratio: a large t-ratio is desirable.
o Use of the F-ratio: tests the significance of the overall dependence of y on the variables (x1, x2, ..., xn).


Estimation (cont.)

Constructing a multiple linear regression model.
o To estimate the coefficients in the regression model, the method of least squares is used due to its simplicity.
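The elimination of non-significant variables one at a time can be sketched as a backward-elimination loop driven by t-ratios. This is an illustration under stated assumptions, not the procedure of any particular package: the cut-off `t_critical=2.0`, the function names, and the data are all hypothetical, and the variable with the smallest absolute t-ratio is dropped at each run until every remaining variable passes the cut-off.

```python
import math


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def t_ratios(columns, ys):
    """columns: dict of name -> values. Returns name -> coefficient / std. error."""
    names = list(columns)
    n = len(ys)
    design = [[1.0] + [columns[nm][i] for nm in names] for i in range(n)]
    k = len(names) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid_ss = sum((y - sum(b * v for b, v in zip(beta, d))) ** 2
                   for d, y in zip(design, ys))
    resid_ms = resid_ss / (n - k)  # residual mean square
    return {nm: beta[i + 1] / math.sqrt(resid_ms * inv[i + 1][i + 1])
            for i, nm in enumerate(names)}


def backward_eliminate(columns, ys, t_critical=2.0):
    """Drop the weakest variable each run until all |t| >= t_critical."""
    remaining = dict(columns)
    while remaining:
        ratios = t_ratios(remaining, ys)
        weakest = min(ratios, key=lambda nm: abs(ratios[nm]))
        if abs(ratios[weakest]) >= t_critical:
            break
        del remaining[weakest]
    return list(remaining)


# Hypothetical data: y depends on "area" only; "dummy" is unrelated to y
columns = {
    "area": [1, 2, 3, 4, 5, 6, 7, 8],
    "dummy": [1, -1, -1, 1, 1, -1, -1, 1],
}
errors = [0.1, -0.1, 0.1, -0.1, -0.1, 0.1, -0.1, 0.1]
ys = [3 + 2 * x + e for x, e in zip(columns["area"], errors)]
kept = backward_eliminate(columns, ys)
```

In practice the cut-off would come from t-tables for the chosen significance level and the residual degrees of freedom rather than from a fixed value.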

Validation

Validation is done before practical use in the construction industry, using another actual project.

Forecasting

If validation is successful, the model is put to practical use on construction projects.

Application in Construction Industry

Simple Linear Regression Analysis

Case Study: Consider the possible sample values of bricklayer hours and areas of brickwork from 10 fictitious contracts in the following table

[Figure: plotted scatter diagram of bricklayer-hours against area of brickwork]

Scatter is caused by factors other than area which affect the hours required.
o Bricklayer-hours: dependent / response variable
o Area of brickwork: independent / regressor variable

To avoid individual judgement in constructing the line, the method of least squares is used.
In fitting the regression line to a set of data, several parameters are estimated, which need to be tested for significance before being accepted.
As an overall guide to the strength of association between the two variables, the correlation coefficient is calculated.

Perfect correlation = 1

Calculated correlation coefficient = 0.998

This shows an excellent degree of correlation, which usually cannot be achieved using one variable only.
The standard error of estimate, i.e. the anticipated difference between the actual values and what the regression line predicts, should also be calculated.
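For illustration, the correlation coefficient between two variables can be computed as below. The areas and bricklayer-hours used here are hypothetical values, not the ten contracts of the case study, so the result differs from the 0.998 reported above; the function name `correlation` is likewise an assumption.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical areas of brickwork and bricklayer-hours
areas = [10.0, 20.0, 30.0, 40.0, 50.0]
hours = [12.0, 19.0, 31.0, 42.0, 49.0]
r = correlation(areas, hours)

# Points lying exactly on a rising straight line give perfect correlation
r_perfect = correlation(areas, [2 * a + 1 for a in areas])
```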

Application in Construction Industry

Multiple Linear Regression Analysis


Case Study: Obtaining a model that will estimate productivity rates of concrete operations. For this, a wastewater project in the North-East of Scotland (Project A) was observed. The regression methodology used in this study is stepwise regression by backward elimination.

First, the independent variables of Project A are listed:
o Type of pour
o Total volume (m3)
o Number of trucks on job
o Average volume of load (m3)
o Start time
o Average truck cycle time (minutes)
o Number of loads
o Weather
o Concrete mix

Calculating the correlation coefficients between all possible pairs by using the built-in functions of Microsoft Excel.

Resolving multicollinearity by removing one variable (Total volume) of the two highly correlated variables (i.e. Total volume and Number of loads).
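The same screening for multicollinearity can be sketched outside a spreadsheet: compute the correlation coefficient for every pair of explanatory variables and flag the pairs above a chosen threshold. The threshold of 0.9, the function names, and the observations below are hypothetical; the case study's own data and cut-off are not given in the slides.

```python
import math
from itertools import combinations


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = correlation(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Hypothetical observations: total volume is six times the number of loads
columns = {
    "total_volume": [30.0, 60.0, 42.0, 90.0, 54.0],
    "number_of_loads": [5, 10, 7, 15, 9],
    "trucks_on_job": [3, 2, 4, 3, 2],
}
flagged = highly_correlated_pairs(columns)
```

One variable from each flagged pair would then be a candidate for removal before the regression runs.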

Estimating partial regression coefficients and the corresponding t-statistics from the regression on actual productivity for all eight explanatory variables.
Insignificant variables have small absolute t-statistics and should be eliminated.
Carrying out two further runs, eliminating the insignificant variables concrete mix (t-statistic = 0.97) and start time (t-statistic = 1.72) from the regression model.

An important assumption is that the variability of the data does not change for different levels of the response or explanatory variables.
o This is checked by examining residual plots.

Constructing a multiple linear regression model for actual productivity for a single server concrete system.

Pactual = 1.31Tp + 1.75Va + 0.56Tn + 0.59W - 0.01Ct - 0.37Ln - 6.95

Tp = Type of pour
Va = Average volume of concrete
Tn = Number of trucks on job
W = Weather
Ct = Average cycle time
Ln = Number of loads

Validation is done using actual concrete pours from another wastewater project in Scotland carried out by a different contractor (Project B). The actual productivities achieved on 32 operations observed on Project B are compared with the productivities predicted by the derived regression model.
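One possible way to summarise such a comparison of actual and predicted productivities is the mean absolute percentage error. The slides do not state which measure was used in the study, so the choice of measure, the function name, and the productivity figures below are all assumptions made for illustration.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed as a percentage."""
    ratios = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical productivities for three operations
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
mape = mean_absolute_percentage_error(actual, predicted)
```

Whether a given error level counts as a successful validation depends on the accuracy the users of the forecast are prepared to accept.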

Drawbacks in regression

Multicollinearity

If the explanatory variables in multiple regression are correlated, and the correlation coefficient (positive or negative) is high, it is difficult to separate their effects on the dependent variable. This leads to poorly estimated partial regression coefficients.

Omitted variables

If independent variables that have significant relationships with the dependent variable are left out of the model, the results will not be satisfactory.
E.g. location, quality, etc. cannot be quantified.
Bias in selecting the independent variables.

Endogeneity

Changes in the dependent variable cause changes in the independent variable.

Development of Regression Model

Development of technologies in computing, accessing, processing and storing data


All major statistical software packages perform least squares regression analysis.

Simple linear and multiple regression using least squares can be done in some spreadsheet applications and on some calculators.
Specialized regression software has been developed for use in fields such as survey analysis and neuroimaging.
The Constructive Cost Model (COCOMO) is an example of an algorithmic software cost estimation model developed using a basic regression formula.

Conclusions

Regression analysis falls under the Algorithmic Cost Model, which uses mathematical formulae linking costs/inputs with metrics to produce an estimated output.
It is used not only for estimating costs but also for forecasting productivity, time and any other parameter.
It is a widely used method, and not just in the construction industry.
When there is only one major factor affecting the response, SLR can be used.
When there is more than one major factor affecting the response, MLR can be used.
There are several drawbacks and limitations in this method.
Knowledge of how to use regression analysis in specialized cost estimation software, in spreadsheets and on a calculator is beneficial for the Quantity Surveyor.
