
PERFORMING STATISTICAL ANALYSIS ON EARNED VALUE DATA

Eric Druker, Dan Demangos (Booz Allen Hamilton); Richard Coleman (Northrop Grumman Information Systems)

Some Earned Value methods, particularly those described in the equations on the DAU Gold Card, suffer from the shortcomings that they are backwards-looking (they make no prediction of the final CPI) and do not allow for inferential or descriptive statistics. This leads to a propensity for these estimates to tail-chase as the CPI changes over time, meaning that the EAC for an overrunning program will systematically lag in predicting the overrun, and the EAC for an underrunning program will systematically lag in predicting the underrun. Additionally, the Estimates at Completion (EACs) they yield cannot be evaluated for bias or uncertainty, nor can statistical significance tests be applied to them. Lastly, without quantified uncertainty measures, there is no way to perform risk analysis on these estimates without relying on subjective methods. The purpose of this paper is to present a method by which statistical analysis techniques can be applied to Earned Value data to better predict the final cost of in-progress programs. EACs developed using statistical methods make a prediction of the final CPI and thus do not tail-chase. Also, since they rely on historical data, they are testable (that is, they can be subjected to statistical significance tests) and thus defensible. Estimates produced using statistical analysis techniques are unbiased, and the descriptive statistics developed as a byproduct of the method allow uncertainty to be quantified for risk analysis purposes. For demonstration, the method is applied to a set of representative data. The paper continues with an example of the paradigm shift this type of analysis caused when it was implemented across a production facility. The conclusion discusses the recommended process and the types of data needed to implement this type of analysis.

Performing Statistical Analysis on Earned Value Data


Eric Druker, Dan Demangos Booz Allen Hamilton Richard Coleman Northrop Grumman Information Systems
This document is confidential and is intended solely for the use and information of the client to whom it is addressed.

Introduction
- Problem Statement
- Performing Statistical Analysis on EVM Data
- Why Statistics are Rarely Used With EVM Data

Druker_Eric@bah.com

Introduction: Problem Statement


Currently, Earned Value Management calculations suffer from several shortcomings that lessen their viability as a cost estimating tool
1. Estimates developed using most EVM equations are subject to tail-chasing whenever the CPI changes throughout the life of a program
- Tail-chasing is when the EAC for an overrunning program systematically lags in predicting the overrun, and vice-versa
- This occurs because these equations are backwards-looking with regard to CPI; they cannot predict changes in the CPI looking forward, and fail to perceive trends
- Tail-chasing is thus inevitable because, as Christensen wrote: "in most cases, the cumulative CPI only worsens as a contract proceeds to completion."1

2. Since the traditional EVM equations are simple algebra, not based on statistical analysis, estimates developed using them are not unbiased, testable, or defensible
- Bias is the difference between the true value being estimated and the prediction from the estimator
- Testable estimates are those that can be subjected to decisions based on measures of statistical significance

3. Quantitative cost risk analysis cannot be performed on EVM data without subjective inputs
1 Christensen, David S. (1994, Spring). "Using Performance Indices to Evaluate the Estimate At Completion." Journal of Cost Analysis and Management, pp. 17-24.

Introduction: Performing Statistical Analysis on EVM Data


Performing statistical analysis on EVM data solves all of the aforementioned shortcomings
1. EACs developed using statistics include a forecast for the final CPI and thus are not subject to tail-chasing
2. EACs developed using statistics are based on historical data, and are therefore testable and defensible
- Statistical significance can be used to defend the estimate
3. Statistical methods produce unbiased estimates that include the uncertainty measures needed for risk analysis
4. Statistical methodologies can be applied alongside traditional earned value methods and easily incorporated into the EVM process
- They provide an independent cross-check of the calculated estimates
- Once the statistical analysis has been performed the first time, it can be updated with very little recurring effort

Although not discussed in this paper, similar methods can be applied to the SPI to develop statistically based schedule estimates using EVM data


Introduction: Why Statistics are Rarely Used With EVM Data
A pre-requisite for just about any defensible cost estimate, statistical techniques have yet to be widely applied to EVM data, for several reasons
- EVM traditionally falls within the realm of program management or financial controls, not the realm of cost analysis
- EVM was developed as a program management technique for measuring progress in an objective manner
- From a cost estimator's perspective, it is difficult to acquire the data needed to perform statistical EVM analysis
- There aren't many databases dedicated to historical EVM data
- Data gathering/normalization is often the most time-consuming part of statistical analysis
- The techniques needed to perform statistical analysis on EVM data can be complicated, especially when events such as re-baselining are involved
- Patterns within EVM data are generally not obvious just by looking at trends on a scatter plot
Despite the difficulties in applying statistical analysis techniques to EVM data, the ability to produce defensible, unbiased estimates that include risk analysis is well worth the effort


Table Of Contents
- Introduction
- Performing Statistical Analysis on EVM Data
- A Real World Example: Progress-Based EACs
- Conclusion


Performing Statistical Analysis on EVM Data


Performing Statistical Analysis on EVM Data: Goals


The theory behind statistical EVM analysis is that programs of a similar nature, or performed by a similar contractor, can be used as a basis to project patterns in the CPI over time
- Example: for ship production programs, the cost of 1% of progress rises (and thus the CPI drops) over time
- This occurs as ships move from the shop, to the blocks, to the water, and, e.g., workers move from welding at their feet to welding above their heads
- Looking only at the current, or average, CPI, estimates for these ship production programs would always tail-chase
The results of this analysis provide program managers and decision makers with:
- An EAC that is historically based, unbiased, testable, and defensible
- "Testable" refers to the ability to apply statistical significance tests to a relationship
- The statistical uncertainty around the EAC, for use in risk analysis and portfolio management
An example using representative data follows on the next several slides


Performing Statistical Analysis on EVM Data: Example


[Figure: CPI vs. % Progress - CPI plotted against % reported progress for Programs 1 through 7]

Program 7 latest EVM report: BCWP $20, BAC $100, % Progress 20%, ACWP $22, CPI 0.91

The above graph shows the CPI vs. % reported progress for 7 different programs
- Examining the lines, it is not apparent that there is any trend that could be applied to the in-progress program (Program 7)
- Data from Program 7's latest EVM report is shown above


Performing Statistical Analysis on EVM Data: Example


[Figure: CPI at 20% progress vs. final CPI (100% progress) for Programs 1-6, with linear fit: y = 3.2877x - 2.3027, R² = 0.8246, significance ~ 0.012]

Program 7: BCWP $20, BAC $100, % Progress 20%, ACWP $22, CPI 0.91

A closer look at the data reveals a significant relationship between a program's CPI at 20% progress and its final CPI
- This implies that a program's CPI at 20% progress can be used to estimate its final CPI, and thus its EAC
- This relationship (and others like it) will be used to develop a new estimate for Program 7
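The fit above can be reproduced with any least-squares routine. Below is a minimal sketch using `scipy.stats.linregress`; the six (CPI at 20%, final CPI) pairs are illustrative stand-ins, not the data behind the chart, so the fitted coefficients will differ from the y = 3.2877x - 2.3027 shown.

```python
from scipy import stats

# Illustrative (CPI at 20% progress, final CPI) pairs for six completed
# programs -- stand-in values, not the slide's actual data.
cpi_20 = [0.98, 0.96, 0.95, 0.93, 0.92, 0.91]
cpi_final = [0.90, 0.85, 0.78, 0.72, 0.70, 0.66]

# Regress final CPI on CPI at 20% progress.
fit = stats.linregress(cpi_20, cpi_final)
print(f"final CPI = {fit.slope:.4f} * CPI_20 + {fit.intercept:.4f}")
print(f"R^2 = {fit.rvalue**2:.4f}, significance (p-value) = {fit.pvalue:.4f}")

# Apply the relationship to Program 7 (CPI = 0.91 at 20% progress):
bac = 100.0
predicted_final_cpi = fit.slope * 0.91 + fit.intercept
eac = bac / predicted_final_cpi
print(f"Predicted final CPI: {predicted_final_cpi:.2f}, EAC: ${eac:.2f}")
```

The p-value on the slope is what makes the resulting EAC "testable": if it is not significant at the chosen level, the relationship is not used.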


Performing Statistical Analysis on EVM Data: Example


[Figure: Program 7 cost distribution - cumulative probability vs. final cost ($50 to $250), showing the prediction-interval distribution, the point estimate, the traditional EVM EAC, the CV, and the 80th percentile:
- Downside: $157.65 (80% confidence level)
- Most likely EAC: $145.10 (50% confidence level)
- Traditional EVM: $109.89 (1.5% confidence level)
- CV ~ 20%]

Program 7: BCWP $20, BAC $100, % Progress 20%, ACWP $22, CPI 0.91
Traditional EVM EAC: $109.89
Statistical EVM analysis: predicted final CPI 0.69, EAC $145.11

Using the knowledge gained from the regression analysis, a predicted final CPI of 0.69 (rather than the current reported CPI of 0.91) is applied to the BAC
- This EAC differs dramatically from that produced using traditional EVM
- More importantly, it is statistically significant and unbiased
- Because statistics were used to develop the estimate, the risk curve is a byproduct of the estimate
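The risk curve comes from the regression's prediction interval. As a rough sketch of the idea (not the actual distribution behind the slide's percentiles), one can place a distribution around the point EAC using the reported ~20% CV; a normal distribution is assumed here purely for simplicity, so the resulting percentiles will not match the slide's exactly.

```python
from scipy import stats

point_eac = 145.11   # statistical EAC from the regression (slide value)
cv = 0.20            # coefficient of variation reported on the slide

# Assume a normal prediction distribution around the point estimate
# (an assumption for illustration; the slide uses its own interval).
risk = stats.norm(loc=point_eac, scale=cv * point_eac)

downside_80 = risk.ppf(0.80)   # 80th-percentile "downside" EAC
trad_pct = risk.cdf(109.89)    # percentile of the traditional EVM EAC
print(f"80th percentile: ${downside_80:.2f}")
print(f"Traditional EVM EAC falls at the {trad_pct:.1%} level")
```

Whatever the distributional form, the key point survives: the traditional EVM EAC sits far down in the left tail of the statistically derived risk curve.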


Performing Statistical Analysis on EVM Data: Example


Program 7 EVM data and EACs vs. % progress:

% Progress              20%       40%       60%       80%       90%      100%
BCWP                   $ 20      $ 40      $ 60      $ 80      $ 90     $ 100
BAC                   $ 100     $ 100     $ 100     $ 100     $ 100     $ 100
ACWP                   $ 22      $ 54      $ 96     $ 131     $ 136     $ 145
CPI                    0.91      0.74      0.63      0.61      0.66      0.69
Traditional EVM EAC $ 109.89  $ 135.94  $ 159.36  $ 164.05  $ 150.68  $ 145.11
Predicted Final CPI    0.69      0.69      0.69      0.69      0.69      0.69
Statistical EAC     $ 145.11  $ 145.11  $ 145.11  $ 145.11  $ 145.11  $ 145.11

[Figure: EACs vs. % Progress - Traditional EVM EAC and Statistical EVM Analysis EAC plotted against % progress]

In the chart above, EACs developed using the Gold Card equations change with each data drop
- This is an example of EVM producing biased estimates
- Statistical analysis uncovers that the CPI exhibits predictable trends over time, and thus some changes in the CPI can be anticipated
- Since these shifts in the CPI are predictable, the data can be normalized to yield an unbiased EAC that will not change so long as Program 7 behaves similarly to the historical programs


Performing Statistical Analysis on EVM Data: Data Requirements


This analysis requires EVM data from completed programs of a similar nature
- Programs performed by the same contractor as is performing the work in question
- Programs that would be considered a close enough analogy to include in a CER
Examples of progressing data:
- Earned value reports
- Dated cost reports with an estimated completion date
- Any data that allows a measure of progress to be developed will work (e.g., percent of estimated schedule, percent of final schedule, BCWP/BAC, milestones such as PDR, CDR, etc.)
- The best form of data is a measure, such as first flight or launch, that is a dependable indicator of progress
The most difficult step in this method is not data collection but data analysis
- Analysis tools such as dummy variables can be used to handle re-baselinings within the data


Performing Statistical Analysis on EVM Data: The Process


The aforementioned techniques can easily be incorporated into the EVM process
- Due to the comparatively high start-up cost of developing statistically-based EVM estimates (generally 1-3 weeks after the collection of historical data is complete), these methods are best applied when there is low confidence in the currently available estimates
- This could be because the calculated EAC demonstrates tail-chasing, or because there is significant variance between the grassroots estimate and the calculated EAC
- Once the statistically-based estimate is available, it provides an independent cross-check of the available estimates
- Once the statistical analysis is complete, the recurring cost to update the estimate is minimal (4 hours - 1 day)
- Updating the estimate may not be needed if it verifies the calculated EAC
The following slides will show the success of this method when applied to an actual program


A Real World Example: Progress Based EACs


From the paper: "Ending the EAC Tail-Chase: An Unbiased EAC Predictor Using Progress Metrics," Druker, Eric; Coleman, Richard; Boyadjis, Elisabeth; Jaekle, Jeffrey; SCEA Conference, June 2006, New Orleans, LA


Introduction
A client was facing a two-fold problem in estimating production units at their facility
- Estimates developed using EVM were found to tail-chase and were viewed with wide skepticism by their government client
- By "tail-chase" it is meant that by the time an EAC was reported, the latest EVM metrics would already indicate an increase above and beyond that EAC
- A natural disaster had occurred at the production facility, causing a sharp and prolonged decrease in productivity
The PM for one of the programs at this facility reached out to see if there was a way to produce more accurate and defensible estimates than those currently available
- The resulting analysis represented the authors' first experience with performing statistical analysis on EVM data
- This specific implementation is known as the Progress-Based EAC method
- This analysis differs from the previous example in that the final cost was regressed against the ACWP at various progress points, as opposed to the final CPI being regressed against the CPI at various progress points


The Key Graphic


[Figure: ACWP vs. reported progress for all units produced at the facility]

As-reported EVM data was gathered for all units of the same type as the one being estimated that had been produced at the facility
- The ACWP at intervals of 10% progress was scatter-plotted to see if any patterns were visible
- It became immediately apparent that the pattern in the points representing the final cost of each unit became visible as early as 30% progress


The Key Graphic Continued


The graph to the right focuses on units 12 through 20, a period when the facility experienced unexplained cost growth on many of its units
- In all cases, this growth was not recognized until the unit was significantly along in its production cycle
- From this graph it is apparent that, had the facility compared the ACWP of any two units at equal percent progress, it would have been able to predict at least relative cost growth
- This chart led to regression analysis being performed on the EVM data: could the final cost of a unit be predicted knowing only its ACWP at a certain percent progress?


Regression Results
At each 10% increment of reported progress, the final cost was regressed against the ACWP
- At 20%, the first significant regression was found, with an unbiased error of 4%
- Conclusion: by 20% progress, the facility could predict the cost of any unit, unbiased, within 4%
- The further along the unit, the smaller the error

Regression summary (final cost vs. ACWP at 20% progress):
- Multiple R: 0.956; R Square: 0.914; Adjusted R Square: 0.910
- Standard Error: 173,979; Observations: 21
- ANOVA: F = 202.8; Significance F = 1.37E-11
- Intercept: 152,908.77 (std. error 262,941.11; t = 0.58; p = 0.568)
- ACWP-at-20% coefficient: 6.611 (std. error 0.464; t = 14.24; p = 1.37E-11)


Regression Results: Error Tracking


[Figure: Progress-Based EAC error (%) vs. % complete for 2 unit types; series: Type 1 ME (%), Type 1 SEE (%), Type 2 ME (%), Type 2 SEE (%)]


Regression Analysis Continued


With the success of the regression analysis, further work was done to gain more insights
- The next step was to perform a regression of regressions
- Each of the previous regressions was of the form: Final Cost = A * ACWP(% Progress) + C
- After examining the results, the intercept C was removed from the regression to produce the equation: Final Cost = A * ACWP(% Progress)
- A represents a multiplier used to extract the final cost of any unit from an ACWP; 1/A represents the true percent progress in terms of cost
- C was removed because it was unstable and degraded the utility of the model; with C removed, the other terms proved sufficiently stable
With the regressions complete, the A term was charted against its associated % reported progress
- These plots were developed for two types of units with different schedules, costs, and physical parameters
- The lines representing the A multiplier for the two types of units were found to be exactly the same
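A zero-intercept ("through the origin") fit of Final Cost = A * ACWP at a single progress point can be sketched as follows; the unit data below is hypothetical, used only to show the mechanics:

```python
import numpy as np

# Hypothetical (ACWP at 30% progress, final cost) pairs for completed units
# -- illustrative values, not the facility's data.
acwp_30 = np.array([2.0, 2.2, 2.5, 2.8, 3.0])
final = np.array([9.1, 10.0, 11.2, 12.9, 13.4])

# Least-squares slope with the intercept C forced to zero:
# Final Cost = A * ACWP  =>  A = (x . y) / (x . x)
A = float(acwp_30 @ final / (acwp_30 @ acwp_30))
true_progress = 1.0 / A  # 1/A: "true" percent progress in terms of cost
print(f"A = {A:.3f}, true cost progress at 30% reported: {true_progress:.1%}")
```

Dropping C trades a small amount of in-sample fit for a far more stable multiplier, which is exactly the trade the slide describes.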


Regression Analysis Continued

Several breakthrough insights were gained from the above graph
1. Since the % complete (in terms of cost) vs. % reported progress line is non-linear, the facility's EACs (using traditional EVM) must tail-chase, as the CPI is always degrading
2. The A multiplier for both types of units produced by the facility follows the same curve, meaning the analysis can be used to estimate units of types not included in the data
3. Each % of progress costs progressively more as the unit moves along in production


Estimating Final Cost

To estimate the final cost of a unit, the A multiplier for the current % progress was read from the chart above
- A was then applied to the current ACWP to find the EAC
- For example, an ACWP of $50 at 10% progress would yield an estimate of: $50 * 13.2 = $660


Implications
Since the multiplier lines for two different programs overlay each other, the facility's progress points are standard across unit types and directly related to cost
- This implied that the method could be applied to any unit produced by the facility, even those that were not part of the historical analysis
- This proved to be true over the next two years
As the cost per 1% of progress rises throughout construction, traditional EVM would never produce an accurate EAC
- The degrading CPI would lead to consistent tail-chasing
- This degradation, however, is predictable a priori, which is why the method works
The multiplier curves can be used to predict the ACWP at a future % reported progress
- Comparing the actual ACWP to this prediction provides a method by which productivity can be monitored


Summary
This method is a wholly data-based method of EAC projection that relies upon progress and man-hour (MH) data alone. The model is:
- Able to project EACs for all unit types at the facility within about 2%-5% after about the 20% progress point
- Able to work incrementally, projecting work remaining given MH expended
- Able to include uncertainty with the estimate, because it is statistically based
- Unbiased: the error is symmetric; specifically, it does not result in a tail-chase
In the case of short-term effects, the model, because it is progress-based, is able to separate out specific effects, such as additional costs due to a fire or other exogenous event, for units that were at least 20% complete before the event
- This "effect cost" is obtained by subtracting the as-would-have-been cost from the actual end cost
In the case of long-term effects, because of its incremental ability, the model is able to add actuals up to an event and, since it can predict an ETC after any post-event increment of about 20% of progress has occurred, can predict ETCs after the event


Since the Analysis


The previous analysis was nothing short of a revelation for the client, who had programs that had experienced multiple re-baselinings
- To date, the method has correctly estimated the final cost of all 4 units to which it has been applied
- Midway through the production effort of one of these units (in 2006), the Progress-Based EAC method forecasted 60% growth in the final cost
- This cost growth was predicted before the latest program estimate recognized a single dollar of cost risk
- After significant resistance, it took a full 2 years (2008) before the program team recognized that 60% cost growth was even feasible
- It took another 6 months (2009) before the program team recognized that 60% cost growth was, in fact, accurate
Following this success, the method was expanded
- The analysis is now performed on all in-progress programs, and the results are presented to executive management regularly
- The method is also used to monitor productivity on all in-progress programs


Conclusion


Conclusion
Performing statistical analysis on EVM data provides an invaluable capability in that:
- CPI forecasts can be developed, thus avoiding the tail-chasing that results when estimates are developed using only backwards-looking equations
- The EACs developed using statistical methods are unbiased, testable, and defensible
- The uncertainty in the estimate, for use in risk analysis, is automatically included with statistically-based EACs
- The analysis can be incorporated into the EVM process to provide a third data point in addition to the calculated EAC and the grassroots estimate
Despite the utility of methods such as these, there are still hurdles to overcome before they can be widely implemented
- EVM data from completed programs must be compiled and provided to cost estimators
- Cost estimators must be involved in the EVM process


Backup Slides


Variations on Progress-Based EAC Analysis


More often than not, the program being estimated will be at risk of being re-baselined
- Re-baselinings will almost always be contained in the historical data as well
- In these cases, the regression analysis is performed with a dummy variable for re-baselinings
- Example: Final Cost = A * ACWP + B * (# Re-baselinings)
- This adjusts the final cost based on the number of re-baselinings a program has been through
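A sketch of that dummy-variable regression, fit with `numpy.linalg.lstsq`; the historical unit data below is hypothetical, chosen only to illustrate the form Final Cost = A * ACWP + B * (# re-baselinings):

```python
import numpy as np

# Hypothetical historical units: ACWP at the chosen progress point,
# number of re-baselinings, and final cost (illustrative values).
acwp = np.array([2.0, 2.5, 3.0, 2.2, 2.7, 3.1])
rebase = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 2.0])
final = np.array([9.0, 11.2, 13.5, 10.9, 13.2, 15.9])

# Design matrix for: Final Cost = A * ACWP + B * (# re-baselinings)
X = np.column_stack([acwp, rebase])
(A, B), *_ = np.linalg.lstsq(X, final, rcond=None)
print(f"A = {A:.2f}, B = {B:.2f} (added final cost per re-baselining)")
```

The B coefficient absorbs the systematic shift each re-baselining introduces, so the A multiplier remains comparable across programs with different re-baselining histories.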


The Gold Card Equations and What They Reveal About EVM


ESTIMATE AT COMPLETION
EAC_CPI = ACWP_cum + [(BAC - BCWP_cum) / CPI_cum] = BAC / CPI_cum

The Gold Card Equations

DoD TRIPWIRE METRICS (favorable is > 1.0, unfavorable is < 1.0)
Cost Efficiency: CPI = BCWP / ACWP
Schedule Efficiency: SPI = BCWP / BCWS

The Gold Card equations reveal several important traits inherent to estimates developed using EVM
- Cost Efficiency (CPI): the CPI is a cumulative performance metric that measures average productivity
- Cumulative measures are volatile early on, and slow to respond to changes once some data has been accumulated
- Estimate at Completion (EAC): the EAC applies the cumulative CPI to the budget to develop an estimate
- The EAC equation never uses historical data, so its result cannot be tested against actuals from completed programs
The next slide will show how using a cumulative performance metric can lead to incorrect conclusions
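The algebraic identity behind the Gold Card EAC (applying the cumulative CPI to the remaining budget collapses to BAC / CPI) can be checked directly, here with Program 7's 20%-progress figures from the earlier example:

```python
bcwp, acwp, bac = 20.0, 22.0, 100.0

cpi = bcwp / acwp                      # cumulative CPI
eac_long = acwp + (bac - bcwp) / cpi   # EAC = ACWP + (BAC - BCWP) / CPI
eac_short = bac / cpi                  # equivalent short form

print(f"CPI = {cpi:.2f}, EAC = {eac_short:.2f}")  # ~0.91 and ~110
```

The two forms are identical because (BAC - BCWP)/CPI = (BAC - BCWP) * ACWP/BCWP, and adding ACWP gives BAC * ACWP/BCWP = BAC/CPI; the whole estimate therefore rests on the cumulative CPI alone.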


Cumulative Performance Metrics: Miles Per Hour Example


Suppose I am driving from Colorado Springs to St. Louis (approximately 800 miles)
- The first 60 miles of the drive are on rural country roads and take about 2 hours, so I am averaging 30 mph
- Based on this metric, my EAC is 26 hours 40 minutes (800 miles / 30 mph)
- The next 740 miles of driving take place on highways (average speed 80 mph)
- 4 hours into the trip, I have traveled 60 miles + 2 hours * 80 mph = 220 miles, so I am now averaging 55 mph
- Based on this metric, my EAC is 14 hours 32 minutes
- The trip ends up taking 12 hours, but my mph-based EAC does not reach this value until I actually arrive in St. Louis
- Because mph is cumulative and includes segments of varying speed, its EAC is always tail-chasing my true velocity
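The tail-chase is easy to see in a small hour-by-hour simulation of the trip (constant per-segment speeds assumed for simplicity):

```python
TRIP_MILES = 800
speeds = [30, 30] + [80] * 10   # 2 hours of country roads, then highway

miles = 0.0
eacs = []                        # cumulative-metric ETA after each hour
for hour, mph in enumerate(speeds, start=1):
    miles = min(miles + mph, TRIP_MILES)   # cap at arrival
    avg_mph = miles / hour                 # cumulative average speed
    eacs.append(TRIP_MILES / avg_mph)      # EAC = distance / average speed
    if miles >= TRIP_MILES:
        break

print(f"EAC after 2 h: {eacs[1]:.1f} h; after 4 h: {eacs[3]:.1f} h; "
      f"arrival at hour {hour}: EAC finally equals {eacs[-1]:.1f} h")
```

The EAC starts at 26.7 hours, drops every hour, and only matches the true duration at the moment of arrival: the cumulative average never forgets the slow first two hours.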


Cumulative Performance Metrics: Miles Per Hour Example


[Figure: Colorado to St. Louis - instantaneous mph, average mph, and EAC (hours) vs. hours elapsed]

The above graph demonstrates this effect: because the average mph contains the memory of the first two hours, my EAC continues to drop until I actually arrive in St. Louis
- The EAC is said to "tail-chase"
- What would happen if, 6 hours into the drive, I hit a snowstorm and had to slow to 20 mph for 3 hours?


Cumulative Performance Metrics: Miles Per Hour Example


[Figure: Colorado to St. Louis, with snowstorm - instantaneous mph, average mph, and EAC (hours) vs. hours elapsed]

One hour after the snowstorm hits:
- My average mph has dropped by only 5 mph
- Only 1 hour 23 minutes has been added to my trip estimate
- Notice how slowly the EAC responds, since the average mph is a cumulative measure


Cumulative Performance Metrics: Miles Per Hour Example


In reality, one would never estimate an ETA using only the average mph
- In fact, when I took this trip, I calculated the ETA by separating the trip into segments of different productivity
- When I hit the snowstorm, I used my phone to see where the storm ended and re-estimated my EAC using the 80 mph assumption from there
- Many GPS systems break the trip into segments with different speed limits and use that knowledge to calculate the ETA
If we wouldn't use a cumulative metric to calculate something as simple as how long a car trip will take, why would we use one to estimate the cost and schedule of a complicated program?


The Gold Card Equations and What They Reveal About EVM
In fairness to EVM, the budget is set up with the intended purpose of accounting for phases with varying rates of spending
- Much as, if I were estimating how long a drive would take, I would break it into phases with different speed limits
- Unfortunately, despite these best efforts, EACs still tail-chase
- Re-baselinings can also cause dramatic, discrete jumps in the EAC
What is needed is a technique that uses EVM data in a way that avoids the pitfalls inherent in cumulative performance metrics
The following section will examine how applying statistical analysis to EVM data can provide insights into program performance unavailable using the Gold Card equations


Productivity Shifts Equation


If productivity changes show a trend, or a trend is expected, the final EAC can be adjusted more accurately
- This requires productivity monitoring which, using this method, is not difficult
- This equation produces the ACWP for an interval over which productivity improves linearly from one % complete to another
- SEAC = hypothetical final cost at the starting productivity
- FEAC = hypothetical final cost at the ending productivity
- SR = % complete at which the improvement begins
- FR = % complete at which the improvement ends
- Γ = production curve function

ACWP(SR to FR) = lim_{d->0} SUM_{i=1..(FR-SR)/d} { [SEAC - (SEAC-FEAC)*i*d/(FR-SR)] * Γ(SR + i*d) - [SEAC - (SEAC-FEAC)*(i-1)*d/(FR-SR)] * Γ(SR + (i-1)*d) }

Where: Γ(x) = ax^b + c


Backup & Productivity Monitoring


Performing Statistical Analysis on EVM Data: Example


Although the example on the previous slides used a representative data set, it demonstrates that:
- Data with no apparent relationships can yield significant results when examined from another perspective
- In theory at least, statistics can be used to develop unbiased, statistically significant EACs; the real-world example demonstrates this on an actual program
Because the EAC was developed using statistical methods:
- It can be tested, and thus is defensible
- It is unbiased and immune to tail-chasing
- Uncertainty measures for use in risk analysis are included with the estimate
- With each new EVM report, the EAC can be quickly re-evaluated to detect subtle changes in productivity
The next several slides will show how this method can be used to monitor productivity, beginning with a slide on cumulative performance metrics


ESTIMATE AT COMPLETION
EAC_CPI = ACWP_cum + [(BAC - BCWP_cum) / CPI_cum] = BAC / CPI_cum

DoD TRIPWIRE METRICS (favorable is > 1.0, unfavorable is < 1.0)

Cost Efficiency: CPI = BCWP / ACWP
Schedule Efficiency: SPI = BCWP / BCWS

The CPI can be a poor measure of performance for two reasons:
1. It is not statistically unbiased
2. It is a cumulative performance metric

Because the CPI is not unbiased, one cannot distinguish between CPI shifts that are typical on similar programs and actual changes in performance
Cumulative metrics are:
- Volatile early on
- Slow to respond to changes once some data has been accumulated
- This means that shifts in productivity will be slow to show up in the data
- Once they do show up, EACs using the CPI will include a mix of past and future productivity, but will be representative of neither
- If the recent change in productivity can be expected to continue in the future, the EAC will always tail-chase

To avoid the use of cumulative metrics, EVM data from intervals of progress can be used


Performing Statistical Analysis on EVM Data: Productivity Monitoring


[Figure: CPI for the 20%-40% interval vs. CPI for the 0%-20% interval, Programs 1-6, with linear fit: y = 4.4056x - 3.397, R² = 0.8983, significance ~ 0.004]

Example: Program 2 - BCWP(20%) $20.00, ACWP(20%) $20.89, CPI(0%-20%) 0.957; BCWP(40%) $40.00, ACWP(40%) $43.98; BCWP(20%-40%) $20.00, ACWP(20%-40%) $23.09, CPI(20%-40%) 0.866; cumulative CPI(0%-40%) 0.909

Just as the cumulative CPI can be used to predict the final CPI, the CPI for any interval can be predicted using the CPI from any other interval
- For example, the CPI for the 0% to 20% progress interval can be used to predict the CPI for the 20% to 40% progress interval
- The next slide will show how the CPI can be projected for future intervals
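Interval CPIs fall straight out of the cumulative figures: the interval CPI is the earned value gained during the interval divided by the actual cost incurred during it. Using Program 2's numbers from the example above:

```python
# Program 2 cumulative figures (from the slide)
bcwp_20, acwp_20 = 20.00, 20.89   # at 20% progress
bcwp_40, acwp_40 = 40.00, 43.98   # at 40% progress

# CPI for the 20%-40% interval: EV earned / cost incurred in the interval
cpi_interval = (bcwp_40 - bcwp_20) / (acwp_40 - acwp_20)

# Cumulative CPI at 40% blends both intervals and masks the drop
cpi_cum = bcwp_40 / acwp_40

print(f"interval CPI = {cpi_interval:.3f}, cumulative CPI = {cpi_cum:.3f}")
```

The interval CPI (0.866) reveals a productivity drop that the cumulative CPI (0.909) smooths over, which is why the regressions here are built on intervals rather than cumulative values.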


Performing Statistical Analysis on EVM Data: Productivity Monitoring


[Figure: Interpolated and actual interval CPIs (for the previous 20% of progress) vs. % progress for Program 7]

Program 7 EVM data vs. % progress:

% Progress                  20%       40%       60%       80%
BCWP                       $ 20      $ 40      $ 60      $ 80
BAC                       $ 100     $ 100     $ 100     $ 100
ACWP                       $ 22      $ 54      $ 96     $ 150
CPI                        0.91      0.74      0.63      0.53
Traditional EVM EAC    $ 109.89  $ 135.94  $ 159.36  $ 187.50
Predicted Final CPI        0.69      0.69      0.69      0.57
Statistical EAC        $ 145.11  $ 145.11  $ 145.11  $ 176.26
Predicted Interval CPI      N/A      0.61      0.49      0.58
Actual Interval CPI         N/A      0.62      0.48      0.37

The above chart shows the actual and predicted interval CPIs, using regressions similar to the one on the previous slide
- Towards the end of the program, Program 7's CPI deviated from the predictions
- In the historical data, programs typically experience an improvement in the CPI after 60% progress; Program 7's CPI instead decreased
- This is a sign that there has been a shift in productivity


Performing Statistical Analysis on EVM Data: Productivity Monitoring


The productivity monitoring ability of the analysis (length of interval needed, uncertainty in the regressions) is determined by the variability within the data
- In this case, the data has purposely been given a tight, but not unrealistic, fit in order to demonstrate the technique
The next step would be to determine what occurred during the 60% to 80% interval that caused the drop in productivity
- Once the root cause analysis is performed, a revised estimate can be developed using the knowledge gained about the program's new productivity
- Is the drop in productivity caused by a discrete event and/or temporary?
- Is the drop in productivity pervasive and expected to continue into the future?
Charts similar to the ones presented on previous slides were used to support the analysis in the real-world example
- A discussion of how to develop revised EACs based on productivity changes is included with that example


Productivity Monitoring

Because the production curve is known (and the same) for all units, a final cost can be extrapolated from any interval of progress:

                     ACWP     Derived Final Cost
30%                  2,218    10,000
40%                  3,233    10,670
30%-40% Interval     1,016    12,500

For example, the data up to 30% predicts a final cost of 10,000, while the data up to 40% predicts a final cost of 10,670. Examining the 10% interval between 30% and 40% reveals a productivity shift, equivalent to 2,500 additional hours per whole unit, that occurred within the interval. The equation for extracting the final cost from an interval is:

Final Cost = (ACWP2 - ACWP1) / [f(%2) - f(%1)]

where f(x) = ax^b + c is the fraction of final cost incurred at x% progress.
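A sketch of this calculation; the two curve fractions below are assumed values (chosen so the slide's numbers are reproduced), standing in for the fitted f(x) = ax^b + c:

```python
# Assumed fraction of final cost incurred at each progress point, as read off
# the fitted production curve f(x) = a*x**b + c (illustrative values only)
f = {0.30: 0.2218, 0.40: 0.3030}

acwp_30, acwp_40 = 2218.0, 3233.0

# Final cost implied by cumulative data up to each progress point
final_30 = acwp_30 / f[0.30]                                # 10,000
final_40 = acwp_40 / f[0.40]                                # ~10,670

# Final cost implied by the 30%-40% interval alone
interval_final = (acwp_40 - acwp_30) / (f[0.40] - f[0.30])  # 12,500
```

The interval-only figure (12,500) exceeding the cumulative figures is exactly the signal of a productivity shift within that interval.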

The Progress-Based EAC method thus provides a way to revise EACs for productivity shifts. Two examples follow.


Productivity Monitoring: Cost of an Event


The detection of a shift in productivity in this model could signal several different things. The first is a specific event causing a temporary decrease in productivity; in this case, the hours attributable to that event can be isolated. Example:

                     ACWP     Derived Final Cost
30%                  2,218    10,000
40%                  3,233    10,670
Interval             1,016    12,500
Predicted Interval     812    10,000
Cost of Event          203

By subtracting the expected interval from the actual interval, the cost of the event is isolated. This is useful for insurance purposes or for a Request for Equitable Adjustment. Once productivity has returned to normal, the hours attributable to the event are added to the pre-event EAC. The second possibility is a work stoppage (if time is used as the progress variable); in this event, the progress % is simply adjusted to normalize the data. The third is a permanent and pervasive change in productivity, which requires further analysis.
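The event-cost isolation can be sketched as follows; the curve fractions are assumed illustrative values consistent with the table above:

```python
# Assumed fractions of final cost at 30% and 40% progress (illustrative)
f = {0.30: 0.2218, 0.40: 0.3030}

pre_event_eac = 10000.0          # final cost derived from data up to 30%
acwp_30, acwp_40 = 2218.0, 3233.0

actual_interval = acwp_40 - acwp_30                        # ~1,015 hours spent
predicted_interval = pre_event_eac * (f[0.40] - f[0.30])   # ~812 hours expected

cost_of_event = actual_interval - predicted_interval       # ~203 hours
```

Once productivity returns to normal, this isolated cost is simply added to the pre-event EAC.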


Productivity Monitoring: Piecewise Analysis


[Charts: production curves (cumulative ACWP vs. % progress, 0% to 100%) for three units with different final costs (10,000; 12,500; 15,000), alongside a hybrid production curve mixing segments of the three]

Because all of the curves have the same equation and are wholly defined by their final cost, pieces of different curves can be added together to create one conflated production curve. Above, three separate production regimes have been added together to create one curve. In the previous example, the unit experienced a temporary productivity shift; during the shift, the unit was being built with the same productivity as a unit with an EAC of 12,500. To calculate a new EAC, a segment representing the lower-productivity period can be added to the current production curve.
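The piecewise construction can be sketched as follows. The curve shape below (exponent 1.6, no offset) is an assumed illustrative form; the real facility curve would come from the fitted regressions:

```python
# Assumed production curve shape: fraction of final cost at progress x.
# a, b, c are illustrative; in practice the curve is fit from historical data.
def f(x, a=1.0, b=1.6, c=0.0):
    return a * x ** b + c

# Each segment is built at the productivity of a curve with a given final cost,
# so its cost contribution is final_cost * (f(end) - f(start)).
segments = [
    (0.0, 0.3, 10000.0),   # normal productivity
    (0.3, 0.4, 12500.0),   # temporary low-productivity period
    (0.4, 1.0, 10000.0),   # productivity returns to normal
]

revised_eac = sum(fc * (f(x2) - f(x1)) for x1, x2, fc in segments)
```

The revised EAC lands between the two pure-curve final costs, reflecting that only one 10% slice was built at the degraded productivity.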


Productivity Shifts: Continuous Analysis Example


A natural disaster hits the facility, and productivity drops to 50% of its original value. At the point the disaster occurred, a unit was at 50% progress. Productivity is expected to improve linearly to its steady-state value over the next several months; according to the revised schedule, the unit will be 80% complete at the end of the productivity recovery. In reality, the revised schedule will depend on this analysis and will thus have to be iterated; for this example it is assumed that 80% is the correct value. The next slide visualizes how productivity shifts can be handled using this method. The equations are included in backup.


Productivity Shifts: Continuous Analysis Example


[Chart: production curve (cumulative ACWP vs. % progress) for a unit with phased recovery, showing the expected curve, the post-disaster curve, the point where the earthquake struck, and the point where recovery finishes]

                  Final Cost
Expected          10,000
Actual            11,861
Cost of Event      1,861

In this case, productivity improves linearly until 80% progress, at which point the cost per 1% of progress runs parallel to the originally predicted line. This method was implemented in exactly this way to estimate an in-progress unit following a natural disaster at the facility.


Performing Statistical Analysis on Earned Value Data


June 2009

Booz Allen Hamilton Eric R. Druker Dan J. Demangos Northrop Grumman Information Systems Richard L. Coleman

Biographies
Eric R. Druker CCE/A graduated from the College of William and Mary with a B.S. in Applied Mathematics in 2005, concentrating in both Operations Research and Probability & Statistics, with a minor in Economics. He is employed by Booz Allen Hamilton as a Senior Consultant and currently serves on the St. Louis SCEA Chapter's Board of Directors. Mr. Druker performs cost and risk analysis on several programs within the Intelligence and DoD communities and NASA. He was a recipient of the 2005 Northrop Grumman IT Sector's President's award and the 2008 TASC President's award for his work on Independent Cost Evaluations (ICEs), during which he developed the risk process currently used by Northrop Grumman's ICE teams. In addition to multiple SCEA conferences, Eric has also presented papers at the Naval Postgraduate School's Acquisition Research Symposium, DoDCAS and the NASA PM Challenge. He has also performed decision tree analysis for Northrop Grumman corporate law, built schedule and cost growth models for Hurricane Katrina impact studies and served as lead author of the Regression and Cost/Schedule Risk modules for the 2008 CostPROF update.

Dan J. Demangos has over 16 years of professional experience in Earned Value Management (EVM) setup and implementation, planning & scheduling, schedule/cost integration, and Project Management training development and delivery. Mr. Demangos' clients include the FBI, where he manages the implementation of an ANSI-748 compliant EVMS; DHS, where he led a team that supports budget, acquisition and EVM efforts for the Office of the CIO; and the U.S. Marine Corps, where he led the effort to implement EVM throughout the Budget Department of Marine Corps Headquarters. Mr. Demangos has also led a team to rewrite Booz Allen's Management Control System Description (MCSD) and has developed a phased training program that is offered to those in the Firm who practice EVM. Mr. Demangos was also a key contributor to the industry-recognized AACE Earned Value Professional (EVP) certification and serves on the Certification Board. Prior to coming to Booz Allen, Mr. Demangos led the Project Controls effort for a telecommunications provider, implemented schedule/cost integration efforts for a Project Management software vendor and helped to manage remediation sites for the Department of Energy.

Richard L. Coleman is a 1968 Naval Academy graduate, received an M.S. with Distinction from the U.S. Naval Postgraduate School and retired from active duty as a Captain, USN, in 1993. His service included tours as Commanding Officer of USS Dewey (DDG 45), and as Director, Naval Center for Cost Analysis. He has worked extensively in cost, CAIV, and risk for the Missile Defense Agency (MDA), Navy ARO, the intelligence community, NAVAIR, and the DD(X) Design Agent team. He has supported numerous ship programs including DD(X), the DDG 51 class, Deepwater, LHD 8 and LHA 6, the LPD 17 class, Virginia class submarines, CVN 77, and CVN 78. He is the Director of the Cost and Price Analysis Center of Excellence and conducts Independent Cost Evaluations on Northrop Grumman programs. He has more than 65 professional papers to his credit, including five ISPA/SCEA and SCEA Best Paper Awards and two DoDCAS Outstanding Contributed Papers. He was author of the Risk Module and a senior reviewer for all the SCEA CostPROF modules and its successor, CEBoK. He has served as Regional and National Vice President of SCEA and is currently a Board Member. In 2008, he received the SCEA Award for Lifetime Achievement.

Performing Statistical Analysis on Earned Value Data February 2009

Abstract
Traditional Earned Value Methods, such as those described in the equations on the DAU Gold Card, suffer from the shortcoming that they do not allow for inferential or descriptive statistics. The Estimates at Completion (EAC) they yield therefore cannot be evaluated for bias or uncertainty, nor can statistical significance tests be applied to them. This leads to the propensity for these estimates to tail-chase, meaning that the EAC for an overrunning program will systematically lag in predicting the overrun, and the EAC for an underrunning program will systematically lag in predicting the underrun. Lastly, without quantified uncertainty measures, there is no method by which to perform risk analysis on these estimates without relying on subjective methods. The purpose of this paper is to present a method by which statistical analysis techniques can be applied to Earned Value data to better predict the final cost and schedule of in-progress programs. EACs developed using statistical methods rely on historical data and are thus testable, that is to say that they can be subjected to statistical significance tests, and are thus defensible. Estimates produced using statistical analysis techniques will be unbiased, and the descriptive statistics developed as a byproduct of the method will allow uncertainty to be quantified for risk analysis purposes. Lastly, because this method normalizes out shifts in the CPI (or SPI for schedule estimates) that appear to be pervasive among similar programs, productivity can be monitored with a high degree of accuracy. These methods can be applied at any level, from the program office to the CAIG, to develop more accurate estimates. For demonstration, the method will be applied to a set of representative data. The paper will continue with an example of the paradigm shift this type of analysis caused when it was implemented across a production facility. The conclusion will discuss the types of data needed to implement this type of analysis and process.


Introduction
Currently, Earned Value Management calculations suffer from several shortcomings that lessen their viability as a cost estimating tool. Most importantly, estimates developed using most EVM equations are subject to tail-chasing whenever the CPI changes throughout the life of a program. Tail-chasing is when the EAC for an overrunning program systematically lags in predicting the overrun, and vice-versa. This occurs because these equations are backwards looking in regard to the CPI; they lack the ability to predict changes in the CPI looking forward and thus fail to perceive trends. Tail-chasing is thus inevitable because, as Christensen wrote, "in most cases, the cumulative CPI only worsens as a contract proceeds to completion."1 In addition to their propensity to produce EACs that tail-chase, earned value calculations also suffer from the shortcoming that they use simple algebra, not statistical analysis; thus their EACs are not unbiased, testable or defensible. Bias is the systematic difference between the true value being estimated and the prediction made by the estimator. Unless an estimate is proven to be unbiased, it cannot be asserted that it is truly the most likely cost estimate. Testable estimates are those which can be subjected to decisions based on measures of statistical significance. When a desired level of statistical significance has been reached, the estimate is defensible. Lastly, because traditional earned value calculations do not rely on statistical analysis, there is no quantitative measure of the uncertainty in the estimate. Because of this, the only way to perform quantitative cost risk analysis around estimates made using these calculations is to use subjective inputs.

Performing Statistical Analysis on Earned Value Data


Performing statistical analysis on earned value data solves all of the aforementioned shortcomings. Most importantly, EACs developed using statistics include a forecast of the final CPI, are unbiased, and thus are not subject to tail-chasing. Similarly, because estimates developed using statistical techniques are based on historical data, they are testable and defensible: measures of statistical significance can be used to defend the estimate. Quantitative risk analysis can be performed using the uncertainty measures that are byproducts of the statistical analysis. Lastly, the statistical methodologies discussed in this paper can be applied alongside traditional earned value methods and easily incorporated into the EVM process as a cross-check of the calculated estimates. Once the statistical analysis has been performed, the estimates it produces can be updated with very little recurring effort. Although not discussed within the context of this paper, similar methods can be applied to the SPI to develop statistically based schedule estimates.

The authors believe there are several reasons that the types of analysis about to be discussed have yet to be widely applied to EVM data. First, because EVM was developed as a program management technique for measuring progress in an objective manner, it traditionally falls within the realm of program management or financial controls, not within the realm of cost analysis. Unfortunately, the techniques needed to perform statistical analysis, which are typically performed by cost analysts, can be mathematically complicated, especially when there are events such as rebaselining involved. Additionally, the patterns this type of analysis looks for within the EVM data are generally not visible using standard techniques such as scatter plotting. Lastly, from a cost estimator's perspective, it is difficult to acquire the data needed to perform statistical EVM analysis because there aren't many databases dedicated solely to historical EVM data; data collection and normalization often end up being the most time-consuming part of this type of analysis. Despite the difficulties in applying statistical analysis techniques to EVM data, the ability to produce defensible, unbiased estimates that include risk analysis and don't tail-chase is well worth the effort.

1 (Christensen, 1994)

The Methodology
The overarching theory behind statistical EVM analysis is that programs of a similar nature, or performed by a similar contractor, can be used as a basis to project patterns in the CPI over time. Perhaps the programs that best exemplify this are ship production programs. One thing common to all ship production programs is that the CPI almost always drops over time. This occurs because, as the ship moves from the shop, to the blocks, to the yard, work becomes more difficult to complete: workers move from working at a workbench (in the shop), to working at their feet when the unit is being assembled upside-down, to working above their heads when the ship is in dry-dock or the water. Looking only at the current, or average, CPI, estimates for these ship production programs would always tail-chase. If one could somehow predict the systemic CPI drop prior to it occurring, a more accurate EAC could be developed. To see how to do this, an example analysis using representative data will be performed. Following this, results from a real-world implementation of this method at a production facility will be examined.

Figure 1 charts the CPI over the lifespan of 6 completed programs as well as the latest reported CPI for in-progress program 7. Examining the lines, it is not apparent that there is any trend that would yield any information applicable to estimating program 7. Figure 1 also shows the latest EVM report for program 7.

Program 7
BCWP    BAC     % Progress    ACWP    CPI
$ 20    $ 100   20%           $ 22    0.91

Figure 1 CPI Over Lifespan for Historical Programs

When regression analysis is performed on the historical data (programs 1 through 6), it is revealed that there is a significant relationship between a program's final CPI and its CPI at 20% progress. This implies that a program's CPI at 20% progress can be used to estimate its final CPI. This CPI, along with the most up-to-date BAC, can be used to derive a statistically based EAC. The next step is to apply this relationship (shown in Figure 2) to the latest available EVM data from program 7.
[Chart: CPI at 20% progress vs. final CPI (100% progress) for Programs 1-6, with fitted line y = 3.2877x - 2.3027, R² = 0.8246]

Figure 2 Regression of Final CPI vs. CPI at 20% Progress

Using the knowledge gained from the regression analysis, a predicted final CPI of 0.69 (rather than the current reported CPI of 0.91) is applied to the BAC. This EAC differs dramatically from that produced using traditional EVM calculations. More importantly, because regression analysis was used, the EAC is statistically significant, unbiased, and includes the uncertainty measures needed for quantitative risk analysis. Figure 3 shows the probabilistic cost estimate (S-Curve) for program 7, as well as the estimate developed using the traditional calculation from the Gold Card.
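Applied as code, using the slope and intercept read off Figure 2 (in practice the regression itself would be fit on the historical programs' data):

```python
# Fitted relationship from Figure 2: final CPI = 3.2877 * (CPI at 20%) - 2.3027
slope, intercept = 3.2877, -2.3027

cpi_20 = 0.91    # Program 7's reported CPI at 20% progress
bac = 100.0

predicted_final_cpi = slope * cpi_20 + intercept   # ~0.69
statistical_eac = bac / predicted_final_cpi        # ~145.1

# Traditional CPI-based EAC for comparison; with CPI = BCWP/ACWP this is
# algebraically identical to ACWP + (BAC - BCWP)/CPI
traditional_eac = bac / cpi_20                     # ~109.89
```

The two estimates differ by more than 30% at only 20% progress, which is the whole point: the regression anticipates the CPI degradation instead of waiting for it to appear in the actuals.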

[Chart: Program 7 cost distribution (S-curve), cumulative probability vs. final cost, annotated with the 80% confidence downside of $157.65, the most likely EAC of $145.10 at the 50% confidence level, the Traditional EVM EAC of $109.89 at the 1.5% confidence level, and a CV of ~20%]

Program 7
BCWP                    $ 20
BAC                     $ 100
% Progress              20%
ACWP                    $ 22
CPI                     0.91
Traditional EVM EAC     $ 109.89
Statistical EVM Analysis
  Predicted Final CPI   0.69
  EAC                   $ 145.11

Figure 3 Probabilistic Cost Estimate Including Uncertainty Measures from Regression Statistics

Statistically Based EACs vs. Calculated EACs


The previous section provided an example of how estimates can be developed using statistical analysis. Comparing how this estimate changes over time to how a traditional, calculated EAC changes over time as new data is released demonstrates how using statistical analysis eliminates the propensity for estimates to tail-chase. Although this data is representative, the results demonstrated are similar to what has been experienced when the method is applied to real-world programs. Figure 4 shows how the statistically derived EAC compares to the calculated EAC (using the Gold Card best case equation2) at each 20% of progress. Notice how the statistically derived EAC remains stable while the calculated EAC changes from release to release. This is an example of the calculated EAC tail-chasing because it is not an unbiased estimator. The statistically derived EAC does not change because the regression analysis performed in the previous section has revealed trends in the CPI throughout the life of the program. The analysis then normalizes the estimate for changes in the CPI that should be anticipated, ensuring an unbiased EAC. This means that so long as Program 7 behaves similarly to the historical programs in its CPI trends, the statistically derived estimate will not change.

Program 7

% Progress                20%        40%        60%        80%        90%        100%
BCWP                    $ 20       $ 40       $ 60       $ 80       $ 90       $ 100
BAC                     $ 100      $ 100      $ 100      $ 100      $ 100      $ 100
ACWP                    $ 22       $ 54       $ 96       $ 131      $ 136      $ 145
CPI                      0.91       0.74       0.63       0.61       0.66       0.69
Traditional EVM
  EAC                   $ 109.89   $ 135.94   $ 159.36   $ 164.05   $ 150.68   $ 145.11
Statistical EVM Analysis
  Predicted Final CPI    0.69       0.69       0.69       0.69       0.69       0.69
  EAC                   $ 145.11   $ 145.11   $ 145.11   $ 145.11   $ 145.11   $ 145.11

[Chart: EACs vs. % Progress, plotting the Traditional EVM EAC and the Statistical EVM Analysis EAC from 0% to 100% progress]

Figure 4 Statistically-based EACs vs. Calculated EACs Over Time

Data Requirements
In order to perform statistical analysis on EVM data, there are several important data requirements. The critical requirement is progressing data from completed programs of a similar nature. Examples of similar programs include those performed by the same contractor as the work in question, as well as those that would be considered a close enough analogy to include in a CER. The most obvious set of progressing data is earned value reports, but any dated cost report with an estimated completion date will do. The key is that the data allows a measure of progress to be developed (e.g., percent of estimated schedule, percent of final schedule, BCWP/BAC, or milestones such as PDR and CDR). The ideal data includes dependable progress markers such as first flight or launch. The most difficult step in this method, besides data collection, is data analysis. Tools such as dummy variables can be used to handle events, such as rebaselinings, that are common among programs but can lead to discrepancies in the analysis if not handled correctly.
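The dummy-variable idea can be sketched as follows; every number below is synthetic, invented purely to illustrate how a 0/1 rebaseline flag enters the regression alongside the CPI at 20% progress:

```python
import numpy as np

# Synthetic history: CPI at 20% progress, a 0/1 flag for programs that were
# rebaselined, and the final CPI. All values are made up for illustration.
cpi_20      = np.array([0.95, 0.92, 0.98, 0.90, 0.96, 0.93])
rebaselined = np.array([0.0,  1.0,  0.0,  1.0,  0.0,  0.0])
final_cpi   = np.array([0.82, 0.68, 0.91, 0.62, 0.85, 0.76])

# Design matrix: intercept, CPI at 20%, rebaseline dummy
X = np.column_stack([np.ones_like(cpi_20), cpi_20, rebaselined])
beta, *_ = np.linalg.lstsq(X, final_cpi, rcond=None)

# beta[2] captures the systematic final-CPI penalty on rebaselined programs,
# so the CPI trend itself is estimated without that distortion.
pred = float(beta @ np.array([1.0, 0.91, 0.0]))  # non-rebaselined program
```

Without the dummy column, the rebaselined programs would drag the fitted CPI trend downward and bias predictions for programs that were never rebaselined.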

2 Gold Card Best Case Equation: EAC = ACWP + (BAC - BCWP)/CPI


Statistical Analysis and the EVM Process


The aforementioned techniques can easily be incorporated into the EVM process. Due to the comparably high start-up cost of developing statistically based EVM estimates (generally 1-3 weeks after the collection of historical data is complete), these methods are best applied when there is low confidence in the currently available estimates. This could be caused by the EAC demonstrating tail-chasing tendencies, or by a significant variance between the grassroots estimate and the calculated EAC. Once the statistically based estimate is available, it provides an independent cross-check of the available estimates. After the statistical analysis is complete, the recurring cost to update the estimate is minimal (4 hours to 1 day). Updating the statistically based estimate may not even be necessary if it verifies one of the original EACs when it is first developed. The next section will show the success of this method when applied across a production facility. It is taken in large part from the paper "Ending the EAC Tail-Chase: An Unbiased EAC Predictor Using Progress Metrics."

A Real World Example: Progress-Based EACs


In 2006, a client was facing a two-fold problem in estimating the final cost of production units at their facility. First, estimates using the EVM calculations were found to tail-chase (every time an EAC was reported, the latest EVM metrics would already yield an increase above and beyond that EAC) and were viewed with wide skepticism both within the company and by their government client. Making matters even more difficult, a natural disaster had recently occurred at the production facility, causing a sharp and prolonged decrease in productivity. The PM for one of the programs at this facility reached out to see if there was a way to develop more accurate and defensible estimates than were currently available. The resulting analysis represented the authors' first experience with performing statistical analysis on EVM data. This specific implementation is known as the Progress-Based EAC method. The key difference between this analysis and the example in the previous section is that, in this case, the final cost was regressed against ACWPs at various progress points (as opposed to the final CPI being regressed against the CPI at various progress points). To begin, as-reported EVM data was gathered for all previously produced units of the same type as those being estimated. The ACWP at intervals of 10% progress was scatter plotted on a chart to see if any patterns were visible (Figure 5).



Figure 5 Cumulative Labor by % Progress for 25 Units

When the above scatter plot was examined more closely, it became immediately apparent that the pattern in the points representing the final cost of each unit was visible as early as 30% progress. This can be seen when units 12 through 20 are examined closely in Figure 6. These units were in production during a period in which the facility experienced unexplained cost growth on many of its units. In many of these cases, this growth was not recognized until the unit was significantly along in its production cycle. From this graph it is apparent that, had the facility compared the ACWP of any two units at equal percent progress, they would have been able to predict at least relative cost growth. This chart led to regression analysis being performed on the EVM data.

Figure 6 Zoomed in Cumulative Labor by % Progress for Units 11 through 20

Regression analysis was performed to answer the question: can the final cost of a unit be predicted knowing only its ACWP at a certain percent progress? At each 10% increment of reported progress, the final cost was regressed against the ACWP. At 20%, the first significant regression was found, with an unbiased error of 4%. This led the team to conclude that by 20% progress, the facility could predict the cost of any unit, unbiased, to within about 4%. Additionally, the further along a unit was in its production, the smaller the error. A sample regression is shown in Figure 7.


SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9562
R Square             0.9143
Adjusted R Square    0.9098
Standard Error       173,979
Observations         21

ANOVA
              df    SS            MS           F          Significance F
Regression     1    6.139E+12     6.139E+12    202.80     1.367E-11
Residual      19    5.751E+11     3.027E+10
Total         20    6.714E+12

              Coefficients   Standard Error   t Stat     P-value     Lower 95%      Upper 95%
Intercept     152,908.77     262,941.11       0.5815     0.5677      -397,433.30    703,250.83
20%           6.6108         0.4642           14.2409    1.367E-11   5.6392         7.5824

Figure 7 - Sample Regression
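The fitted coefficients from Figure 7 can be applied directly to produce an EAC; the ACWP value below is hypothetical:

```python
# Coefficients from the Figure 7 regression output:
# Final Cost = intercept + slope * (ACWP at 20% reported progress)
intercept = 152908.7692
slope = 6.610824914

acwp_20 = 500_000.0              # hypothetical unit's ACWP at 20% progress
eac = intercept + slope * acwp_20
```

Note the slope of roughly 6.6 implies that only about 1/6.6, or 15%, of the final cost has actually been incurred at 20% reported progress, which is itself a hint that reported progress outpaces cost early in production.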

With the success of the regression analysis, further work was done to gain more insight into the results. The next step was to perform a regression of regressions. Each of the previous regressions was of the form: Final Cost = A * ACWP(% Progress) + C. After examining the results, the intercept was removed from the regression3 to produce the equation: Final Cost = A * ACWP(% Progress). In essence, A represents a multiplier that can be used to extract the final cost of any unit given an ACWP and its associated progress; 1/A also represents the true percent progress in terms of cost. With the above regressions replicated for each 10% of progress, the A term was charted against its associated % reported progress (Figure 8). These plots were developed for two types of units with different schedules, costs and physical parameters. Strikingly, the lines representing the A multiplier for the two types of units were found to be exactly the same.

Figure 8 - A Multiplier and % Complete (Cost) vs. % Progress

Several breakthrough insights were gained from the above graph:

1. Because the % complete (in terms of cost) vs. % reported progress line is non-linear, the facility's EACs (using traditional EVM) must tail-chase, as the CPI is always degrading.
2. The A multiplier for both types of units produced at the facility follows the exact same curve, meaning that the same curve can be used to estimate multiple unit types, even unit types not included in the analysis. This fact was proven true over the next two years.
3. Each % of progress costs progressively more as the unit moves along in production.

To estimate the final cost of a unit, the A multiplier for the current % progress is found from Figure 8. The current ACWP is then multiplied by A to find the EAC. For example, an ACWP of $50 at 10% progress would yield an estimate of $50 * 13.2 = $660. As the cost per 1% progress rises throughout construction, traditional EVM calculations can never produce an accurate EAC, because the degrading CPI leads to consistent tail-chasing. Fortunately, this degradation is predictable a priori (using the previous regression method), which is why the method works. The multiplier curves can also be used to predict the ACWP at a future % reported progress; comparing the actual ACWP to this prediction provides a method by which productivity can be monitored.

3 In almost every single case, the authors strongly advise against removing the intercept from a regression. It was done specifically in this case to allow the slopes (A) to be regressed against the % progress for the regression of regressions.
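The no-intercept fit and the multiplier's two uses (EAC projection and productivity monitoring) can be sketched as follows; all of the unit data is synthetic, and the 20% multiplier is an assumed value of the kind read off a curve like Figure 8:

```python
import numpy as np

# Synthetic history: ACWP at 10% reported progress vs. final cost for 6 units
acwp_10 = np.array([740., 800., 910., 760., 880., 850.])
final   = np.array([9800., 10500., 12100., 10000., 11700., 11200.])

# No-intercept least squares: A = sum(x*y) / sum(x*x)
A_10 = float((acwp_10 * final).sum() / (acwp_10 ** 2).sum())   # ~13.2

# EAC projection: multiply the current ACWP by the multiplier for its progress
eac = A_10 * 750.0

# Productivity monitoring: with an assumed multiplier A_20 for 20% progress,
# the ACWP the unit *should* report at 20% is eac / A_20; a material gap
# between the actual ACWP and this prediction flags a productivity shift.
A_20 = 5.0
predicted_acwp_20 = eac / A_20
```

The closed-form no-intercept estimator above is the standard regression-through-the-origin result, so no statistics package is required once the data is assembled.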

Summary: Progress-Based EACs


This method is a wholly data-based method of EAC projection that relies upon progress and man-hour data alone. The model is:
1. Able to project EACs for all unit types at the facility to within about 2% - 5% after about the 20% progress point
2. Able to work incrementally, projecting the work remaining given the man-hours expended
3. Able to include uncertainty with the estimate, because it is statistically based
4. Unbiased: the error is symmetric and, specifically, does not result in a tail-chase

Outside the scope of this paper, but just as important: because the model can be used to predict future ACWPs, it can also be used to monitor productivity and measure the cost of events that cause drops in productivity. In the case of short-term events, the model, because it is progress based, is able to separate out specific effects, such as additional costs due to a fire or other exogenous event, for units that were at least 20% complete before the event. This effect cost is obtained by subtracting the would-have-been cost from the actual final cost of the unit. In the case of long-term effects, because of its incremental ability, the model is able to add actuals up to an event and, since it can predict the ETC after any post-event increment of about 20% of progress has occurred, can predict ETCs after the event.

This analysis proved nothing short of revolutionary for the client, who had programs that had experienced multiple rebaselinings. To date, the method has correctly estimated the final cost of all 4 units to which it has been applied. In 2006, midway through the production effort for one of these units, the Progress-Based EAC method forecasted 60% cost growth for the program. This cost growth was predicted before the latest program estimate recognized a single dollar of cost risk. After significant resistance, it took a full 2 years (until 2008) before the program team recognized that 60% cost growth was even feasible. It took another 6 months (into 2009) before the program team recognized that 60% cost growth was, in fact, accurate. Following its success, the method's use was expanded. The analysis is now performed on all in-progress programs, and the results are presented to executive management regularly. The method is also used to monitor productivity on all in-progress programs.

Conclusion
Performing statistical analysis on EVM data provides an invaluable capability in that:
1. CPI forecasts can be developed, thus avoiding the tail-chasing that arises when estimates are developed using only backwards-looking equations
2. The EACs developed using statistical methods are unbiased, testable, and defensible
3. The uncertainty in the estimate, for use in risk analysis, is automatically included with statistically based EACs
4. The analysis can be incorporated into the EVM process to provide a third data point in addition to the calculated EAC and the grassroots estimate

Despite the utility of methods such as these, there are still hurdles to overcome before they can be widely implemented: EVM data from completed programs must be compiled and provided to cost estimators, and cost estimators must become more involved in the EVM process.


Bibliography
Christensen, D. (1994). Using Performance Indices to Evaluate the Estimate at Completion. Journal of Cost Analysis and Management, 17-24.

Druker, E., Coleman, R., Boyadjis, E., & Jaekle, J. (2006). Ending the EAC Tail-Chase: An Unbiased EAC Predictor Using Progress Metrics. Society of Cost Estimating/Analysis, New Orleans.


Booz Allen Contact Information


For more information please contact:

Eric Druker Senior Consultant

Booz Allen Hamilton Tel (314) 368-5850 Druker_eric@bah.com

Dan Demangos Senior Associate

Booz Allen Hamilton Tel (703) 902-6986 Demangos_dan@bah.com

