
1

This presentation deals with six topics:

1. We start with a description of today's software industry, using a metaphor.
2. We then summarize valuable empirical information that is hardly known and seldom used. I hope you are one of the users.
3. After this, a best-practice set of 16 Key Performance Indicators is introduced that we use to analyze and benchmark software capability.
4. We then show the results of 2 out of many case studies, demonstrating the power of this set, at least in the experience of our customers so far.
5. We end the presentation with conclusions.
6. Finally, we take the liberty to present a new non-profit organization to you, one that focuses on further exploring and exploiting the KPI set.

2
When I started my career in software engineering in the early eighties (yes, I am that young), I started with designing and programming as a freelancer, working for many companies in the embedded software industry. After more than 10 years, I became more and more interested in the reasons for project failures, as this was a common, recurring factor in all assignments: too late, over budget and many pre- and post-release defects. I switched to quality engineering, started to define and implement quality systems (using ISO 9001) and became involved in an SPI program within Philips, where they introduced the SW-CMM in the early nineties. The SW-CMM was the first eye-opener to me, and in my opinion much better than the ISO standards. Whereas earlier many managers explained to colleagues how good their own performance was, now everybody smiled at each other, saying: "Yes, I am at level 1 as well!". So, the SW-CMM became my thing, and after a few years, I founded a consulting and training company. We grew in 4 years to 25 consultants/trainers and served many organizations in- and outside The Netherlands.

However, in those years, we had few real successes and became frustrated seeing many software manufacturers go for level-hunting only. We observed that they didn't know their real capability other than being a typical Level 1 organization, adopting the SW-CMM as the reference model for improvement and defining Level 2 as next year's target. And we are afraid this is still true today for many organizations. I think you will agree with this, although there are good examples as well.

This presentation addresses how to measure real capability and to use the results as
a baseline for defining improvements that make sense and are measurable.

3
In our current assignments, we still see the majority of software projects being steered on the scheduled deadline, the available budget and the removal of major defects prior to release. And to put it bluntly, this is often irrespective of CMMI levels, as we shall see later.

Focusing on these metrics only is very dangerous. It ignores three important dimensions:
1. What is the scope of the product being built?
2. What is the current and future quality of the product being built?
3. How efficient is the process applied for building the product?

Let us illustrate this with a metaphor.

4
This slide as well as the next 2 slides are from one of our workshops. We start this workshop by asking people how fit this biker is. As you can see, the biker needs more than 2 hours to cover a distance of 32 km or 20 miles. This is actually my Sunday morning trip.

The typical reaction of workshop participants is that the average speed is approximately 14 km/hour or 9 miles/hour, so that I am probably not that fit. Are they correct? Well, we need some additional information to draw a conclusion here. Analyzing distance and schedule only is a bit naive, as we shall see.

5
Here you see some additional information regarding the bike tour. As you can probably see, the area where I live is not that flat. I live in a remote village, high in the Swiss Alps at the end of a valley, somewhere between Berne and Geneva. As a consequence, I am not riding a racing bicycle like Lance Armstrong; mine is a mountain bike, and part of the tour is on an unpaved road.

I have been riding this tour for years, and until 2008, I had a very simple approach. Take two bottles of water and 2 bananas, start biking, try to go as fast as possible and never stop. If I had to make a stop, being out of breath, my Sunday was ruined. If it took me more time than average, the same ... Well, sometimes I was successful, but my performance over all those years never really changed. Worse, growing older, I more and more often had a ruined Sunday ...

Let us have a further look into some additional information.

6
Early 2008, I bought a nice GPS, including a cadence and heart rate sensor, in addition to my speedometer. I started to record more data, and here we see 3 additional sources of information:
1. The elevation ranges from approximately 1,000 meters to 1,650 meters, with a total elevation gain of 1,100 meters or 3,300 feet. At certain points, the slope is between 14% and 16%.
2. Heart frequency ranges from 70 BPM to 160 BPM, going up when climbing. During the last climb, heart frequency is still under control.
3. Cadence, being the number of pedal rotations per minute, is relatively constant, from the beginning till the end of the tour.

The figures presented here are from summer 2009, and very different from when I started recording data. In 2008, heart frequency often went up to 180 BPM, in which case I had to pause. Cadence towards the end of the tour slowed down, as I was extremely tired. The only positive thing was that I burned a lot of calories ...

So, having adopted this multi-perspective view has brought me a lot. When I am riding now, I monitor my heart frequency to stay below 160 BPM, and that makes the big difference. I normally end the tour without stopping, and in addition: I can better predict the schedule, I go faster, I burn fewer calories and I arrive relatively fresh.

So, let us revisit the original question: am I fit? Yes, I am very fit, although this cold winter has prevented me from undertaking tours since December ... We'll see in April, when the snow is gone ...

7
Unfortunately, many managers today feel naked when it comes to decision-making, as they lack useful information. Examples of typical questions faced:
1. Is this project plan realistic? Answer: well, this new project manager seems very experienced ...
2. Are we still on track? Answer: well, we entered the coding phase according to schedule, so probably yes ...
3. When can we stop testing? Answer: well, let us remove the most critical defects found and hope for the best ...
4. How much effort should be reserved for maintenance support? Answer: well, let us take last year's budget, multiplied by 2 ...

And we base our hopes on adopting standards and models that everyone else seems to be adopting as well.

The biking metaphor clearly demonstrates that improving performance needs measurement. I hope you agree with this.

8
So, maturing the software discipline into an engineering discipline is not about satisfying models and/or standards. Improving without baselining your current performance and setting realistic and measurable targets makes very limited sense. You want to become more predictable in your performance, faster by working more efficiently, more effective by reducing the scope of your effort to what is needed, and to deliver better output in terms of current and future quality.

Well, you might say, interesting, but as a level 1 organization I first have to define and implement processes before I can collect and analyze any useful data. In our experience, this is not true. First, the software industry has already collected and analyzed a lot of empirical data for you. Second, it is our experience that in every organization, irrespective of maturity levels, many sources with useful data can be found.

9
Those software manufacturing organizations that are successful have some characteristics in common:
1. They know what their capability is, not only in terms of compliance with standards like ISO and/or models like CMMI.
2. Estimates are underpinned, and any changes to the assumptions made are analyzed, including the impact on the estimates.
3. Projects are actively tracked, hereby understanding that estimates are not predictions. Not only schedule and effort are monitored, but also the distribution of effort across activities, the scope of the project assignment and the intermediate product quality.
4. New initiatives are not automatically adopted. Instead, their expected effects on known capability are assessed first and carefully piloted.

And in all these areas, quantitative data is transformed into information, enabling organizations to make informed decisions.

Let us now discuss some examples of empirical insights and data.

10
I would like to share with you some empirical models that we often use in our assignments, some empirical laws that everyone should know and some benchmarking data, hereby referring to the work of Capers Jones.

It is our experience that what we shall summarize here is hardly known and seldom used. This is not to criticize; it is unfortunately a fact.

11
The two most widely adopted empirical models are probably COCOMO II from Barry Boehm (University of Southern California) and the Putnam SLIM model from Larry Putnam (QSM). For both models, supporting tooling is available on the market.

These models allow organizations to make estimates for schedule and effort, and take estimated feature size and factors that influence capability or productivity as inputs. Using such models requires experience and calibration. Experience is gained over time, by using such models, which can sometimes be accelerated by hiring external consultants. Calibration is best performed by using the models the other way around. From past projects, schedule, effort and size can be collected, which allows for determination of capability/productivity factors.

Other models are available as well, and yes, using models in parallel will not always bring converging estimates. On the other hand, experience and calibration make these models very powerful, and we recommend that our customers learn about them and use at least one to underpin project estimates.
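To make this concrete, here is a minimal sketch of the COCOMO II post-architecture effort and schedule equations, using the published nominal constants (A = 2.94, B = 0.91, C = 3.67, D = 0.28). The project size, scale-factor ratings and effort multipliers below are illustrative assumptions; in practice they should come from calibration against your own past projects, as described above.

```python
import math

# Minimal sketch of the COCOMO II post-architecture model.
# Constants are the published nominal values; the sample inputs
# below are illustrative assumptions, not calibrated data.
A, B = 2.94, 0.91   # effort equation constants
C, D = 3.67, 0.28   # schedule equation constants

def cocomo2_estimate(ksloc, scale_factors, effort_multipliers):
    """Return (effort in person-months, schedule in calendar months)."""
    E = B + 0.01 * sum(scale_factors)                    # scale exponent
    effort = A * ksloc ** E * math.prod(effort_multipliers)
    F = D + 0.2 * (E - B)                                # schedule exponent
    schedule = C * effort ** F
    return effort, schedule

effort, schedule = cocomo2_estimate(
    ksloc=50,                                      # assumed project size
    scale_factors=[3.72, 3.04, 4.24, 3.29, 4.68],  # nominal SF ratings
    effort_multipliers=[1.0] * 17,                 # all EMs nominal
)
print(f"Effort: {effort:.0f} person-months, schedule: {schedule:.1f} months")
```

Run in reverse, the same equations support calibration: given actual size, effort and schedule from past projects, one can solve for the organization-specific factors.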

12
Nearly all available models have in common that they allow for a trade-off between schedule and effort. In software engineering, there is a non-linear relationship between the two, and accelerating the schedule has a high penalty on effort, as we see here. One of the main reasons is that schedule acceleration requires more people working in parallel, which requires more communication and leads to more miscommunication. Vice versa, reducing effort will have a penalty on schedule.

Further, most models distinguish an impossible region and an impractical region. The impossible region is the region that everybody should stay away from. It is the region where the schedule is accelerated approximately 25% or more compared to the nominal schedule, based on current capability, and history has nearly never revealed projects capable of doing that. The impractical region is the region in which the limited project crew will normally not be capable of dealing with all issues.

The relationships depicted here between fastest schedule, nominal schedule/effort and lowest effort are general approximations and will differ in individual organizations. The main message is that there is a trade-off, and especially an impossible region that one should stay away from.

13
The second eye-opener in my career was when I read the book "Measures for Excellence" by Putnam and Myers. This book or related books should be on the shelf of every software manager. It introduces two equations:
1. Putnam used some empirical observations about productivity levels to derive the so-called software equation from a Rayleigh distribution. The software equation includes a fourth power and therefore has strong implications for resource allocation on large projects. Relatively small extensions in delivery date can result in substantial reductions in effort. We see this equation depicted here, its distance to the origin determined by size and productivity.
2. To allow effort estimation, Putnam introduced the so-called manpower-buildup equation, which is also depicted. Simplified, this equation tells us how fast people can be effectively allocated to a project team. We call this the staffing rate and use it later. Regard it as a measure for the slope in an earned value chart.

Using nominal values for a project and its environmental characteristics, the nominal development time is where the two equations intersect. It now becomes obvious that accelerating the schedule requires developing less size and/or becoming more productive and/or increasing the staffing rate, as illustrated in this graph. Lowering size and increasing productivity only will allow for effort reduction and schedule acceleration.

These are very powerful instruments to reason with software managers about capability and trade-offs.
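For reference, a minimal sketch of the two equations as given by Putnam and Myers. Solving the software equation for effort makes the fourth-power trade-off visible: effort grows with the inverse fourth power of development time, so compressing the schedule by 25% roughly triples the effort. The size and productivity values below are illustrative assumptions only.

```python
# Minimal sketch of Putnam's software equation and manpower-buildup
# equation ("Measures for Excellence"). Sample values are illustrative.

def effort_from_software_equation(size, productivity, dev_time_years):
    """Solve Size = C * K^(1/3) * td^(4/3) for K (effort, person-years)."""
    return (size / (productivity * dev_time_years ** (4.0 / 3.0))) ** 3

def manpower_buildup(effort_person_years, dev_time_years):
    """Manpower-buildup index D = K / td^3 (the 'staffing rate')."""
    return effort_person_years / dev_time_years ** 3

size = 100_000           # delivered size, e.g. in LOC (assumed)
productivity = 10_000    # process productivity parameter C (assumed)
for td in (2.0, 1.5):    # nominal vs. accelerated schedule, in years
    k = effort_from_software_equation(size, productivity, td)
    print(f"td={td} years -> effort={k:.1f} person-years, "
          f"buildup index={manpower_buildup(k, td):.1f}")
```

In this example, shortening the schedule from 2.0 to 1.5 years raises the effort from roughly 62 to roughly 198 person-years, exactly the (2.0/1.5)^4 factor.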

14
Another well-respected person in the software industry is without doubt Capers Jones. He is a leading author and speaker on software productivity and measurement. Although retired, he is still very active and at the same time supportive in helping other people. Many of his books can be recommended, one of the latest ones being "Applied Software Measurement". In this book, benchmarking data can be found for different metrics, dependent on industry segment and product size expressed in function points.

A summary is presented here. We see for instance in the upper left corner that the average schedule for a Management Information System (MIS) of small size (100 function points) is 6 months, whereas the best-in-class reported schedule is 4.5 months. Other metrics are effort, productivity, removal efficiency and defect density.

Such benchmarking data are considered very powerful, as they help to baseline an organization's capability against external industry data. Unfortunately, most managers have no clue how they perform in relation to industry. Even worse, the question "How much does one function point for new applications cost you?" remains unanswered.

15
Available benchmarking data can also be used for something else, as we shall see later. First, let us examine a Cost-of-Quality model as displayed here. A distinction is made between four categories:
1. Core. Costs in this category are essential and bring direct value to a customer by changing the product in some way: requirements, architecture, coding.
2. Support. Costs in this category are essential but do not bring direct value to a customer: project management, configuration management, administrative support.
3. Prevention. These are costs incurred to prevent poor quality (keeping failure and appraisal costs to a minimum): quality planning, process improvement teams, reviews, inspections.
4. Appraisal/rework. These are costs incurred to determine the degree of conformance to quality requirements: mainly testing and defect detection/removal.

Using these definitions, it will be obvious that improving efficiency will normally mean reducing the overall costs by avoiding or reducing the appraisal and rework costs generated at the right. This can for instance be done by increasing prevention costs.

16
We use these categories to derive another reference for benchmarking. By
mapping available effort ratios for different project activities to these four
categories, we find the distribution over these four categories as displayed
here. In this example, using average project data from Capers Jones ("Applied
Software Measurement"), we see for instance that the average Prevention
ratio when developing Management Information Systems equals 2.0%.

Interestingly, the ratios per industry segment are not very different from each
other. We are still investigating the explanation for this. In our own study
environments, we complement the ratios here with our own data.

17
This Cost-of-Quality approach is very simple; however, not many organizations are aware of its existence, although it has been published for software engineering in similar forms for many decades. It is very easy to implement and to use. Take a project plan, map the effort of all scheduled activities to the four categories and you have your ratios. Again, it is a very powerful instrument that enables you for instance to validate the feasibility of a project plan. If the projected Appraisal/rework ratio is 10% or lower, there should be a good explanation for this! In addition, by analysing projects from the past, it is relatively easy to find out your own ratios as an internal reference. When we present this model during our work, managers normally become very enthusiastic and many of them start using it. Templates for project plans and reporting are adjusted accordingly and control is substantially increased, if the project managers have the discipline to update their plans regularly and accurately.
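As a minimal sketch of how little machinery this takes, the following maps planned activities to the four categories and computes the ratios. The activity names, the category mapping and the effort figures are made up for illustration; use your own plan data in practice.

```python
# Minimal sketch: derive Cost-of-Quality ratios from a project plan.
# Activity-to-category mapping and effort figures (in staff-days)
# are illustrative assumptions.

plan = {
    "requirements":        ("core", 40),
    "architecture":        ("core", 30),
    "coding":              ("core", 120),
    "project management":  ("support", 35),
    "configuration mgmt":  ("support", 10),
    "design reviews":      ("prevention", 8),
    "code inspections":    ("prevention", 7),
    "testing":             ("appraisal/rework", 90),
    "defect fixing":       ("appraisal/rework", 60),
}

totals = {}
for category, effort in plan.values():
    totals[category] = totals.get(category, 0) + effort

grand_total = sum(totals.values())
for category, effort in sorted(totals.items()):
    print(f"{category:18s} {100 * effort / grand_total:5.1f}%")
```

With these sample figures, the Appraisal/rework ratio comes out at 37.5% and Prevention at 3.75%, which is exactly the kind of imbalance the feasibility check above is meant to flag.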

18
So much for available empirical information; we shall now continue with defining a set of best practice Key Performance Indicators that help organizations to measure the immeasurable: their own capability.

Worth mentioning here is that measuring is of course not new to software engineering. Many metrics have been proposed and many experimented with. However, these metrics are often at engineering level and can often not be transferred to meaningful management information: it is simply the wrong data, and either too much or too little.

When someone proposes a metric, always ask the questions "Why this metric, for what purpose?" and "Suppose its value is X, would this be good or bad? And compared to what?"

19
For the remainder of this presentation we shall make a distinction between Key Performance Indicators (KPIs), related to meaningful management information, on the one hand, and metrics, related to process/product attribute data at a lower detail level, on the other hand. A set of KPIs will normally be derived from business objectives; metrics will in turn be derived from those KPIs. The Goal-Question-Metric approach (from Victor Basili) might be used here, or the refined Goal-Question-Indicator-Metric approach from the Software Engineering Institute.

In summary, KPIs are to inform management; related metrics reside at a lower detail level.

20
We shall now turn in this presentation to a set of what we consider best practice KPIs, i.e., KPIs that can often be found in software manufacturing organizations. However, the set presented is not just a collection of what is being used; the KPIs have been carefully selected.

Different selection criteria were applied, as shown here. Ideally, each KPI should be of value to project management, software management, business unit management and corporate management. In other words, they should support short-term development and maintenance efforts as well as mid-term and long-term trends. Is this possible? Let us first see what is needed at project level.

21
Typical questions to answer during product development and maintenance are shown here. A distinction is made between four categories:
1. Project performance: What is the prediction of the performance of the project? Classical questions in this category are: will the project meet its schedule and budget?
2. Process efficiency: How efficient is the development process? Remember the earlier discussed Cost-of-Quality model: how are project activities divided among categories?
3. Product scope: How large and stable is the scope of the planned effort in terms of features and implementation size? Increased feature size may indicate scope creep, whereas an increase in implementation size might have hardware penalties.
4. Product quality: What is the expected quality of the resulting product? Although many quality attributes can be important, the dominating common ones are reliability as an indicator for current product quality and maintainability as an indicator for future product quality.

22
Here we see a further decomposition of these categories; for each of them, 4 KPIs are listed.
1. Project performance: besides schedule and budget (or cost), we also define staffing rate (remember the manpower-buildup equation of Putnam) and productivity. Note that we prefer to express productivity in function points per time unit, not in LOC per time unit.
2. Process efficiency: How efficient is the development process? Yes, we use the four categories as KPIs here, as defined in our earlier discussed Cost-of-Quality model.
3. Product scope: How large and stable is the scope of the planned effort in terms of features and implementation size? In addition to feature size, we also define deferral rate (the ratio of features delayed to future releases) and re-use as KPIs. Deferral rate is an important indicator of how well we can meet the agreed scope, whereas re-use tells us how much available code can be used in the new product.
4. Product quality: What is the expected quality of the resulting product? Here, we define four KPIs that are related to reliability and maintainability. High complexity at architectural level and/or code level is bad for maintainability. It also negatively influences the other 3 KPIs (test coverage, removal efficiency and defect density), which are related to reliability.

23
In summary, we now have four KPI categories with four KPIs each. Normally, everybody agrees that these KPIs make sense at project management level as well as management level. They support project management in analysing, planning and monitoring projects. At the same time, they inform management where a project stands and in what direction it is heading.

But do they also support business units in measuring their capability improvement over time and organizations in comparing/benchmarking business units? Let us see.

24
Here, the typical effects on all sixteen KPIs are shown as maturity or capability increases.
1. Project performance. Schedule and budget (cost, effort) will be reduced, whereas staffing rate and productivity will increase.
2. Process efficiency. By investing in prevention, appraisal/rework will decrease dramatically, having a positive effect on the ratios for core and support activities.
3. Product scope. Feature size will be brought back to what is really needed, and fewer features will be deferred to next releases. As a consequence, implementation size will drop. Normally, the re-use level will go up.
4. Product quality. By reducing complexity at architectural and/or code level, significant gains will be obtained: through higher test coverage, the removal efficiency will increase and therefore defect density will decrease.

25
So what do we have so far?
1. We have demonstrated that assessing real capability requires a multi-dimensional view. Remember the biking metaphor! Only by measuring more than just a few things do you get a basis for real and focused improvements.
2. There is nothing wrong with standards and models; they will help as well. However, compliance with them is no guarantee for capability improvements.
3. The sixteen KPIs not only help project managers and their direct superiors, but also allow for measuring trends over time.

After this long introduction (I am sorry for this), we can now have a look at some results from case studies we performed.

26
We shall present here the case study results of two different organizations, obtained through the application of the discussed best practice KPI set.

Both organizations develop embedded systems and deliver them to other businesses. Although the systems are not consumer products, volumes are not low and vary from hundreds to many thousands. One system is considered safety-critical; the other system is complicated due to regulations in the area of security of information, which may not be manipulated in any way.

In early discussions with both organizations, it was revealed that process improvement has been institutionalized for many years and that CMMI L3 compliance is within reach.

27
In both cases, we knew we had to deal with some common issues:
1. So far, we have not identified strong benchmarking data for deferral rates and re-use levels. We are building our own database for this, feeding it with our own study data.
2. The same holds for benchmarking data regarding test coverage and complexity. Although we had a good idea ourselves of what should be strived for, solid reference data has not been identified so far.
3. Finally, a common issue in embedded systems is the way feature size is calculated. Although we prefer function points, the only data available was lines of code. We used the backfiring technique to convert lines of code to function points. In both cases the programming language used was C, and the resulting number of function points was close to 1,000 in both cases (a sketch of this conversion follows below).

Having said this, we can now have a look at a summary of the results.

28
Here we see how both case studies compare to benchmarking data regarding schedule in calendar months. Both are a bit higher than industry average, which might be a consequence of the safety requirements for case study A and the security requirements for case study B.

From the graph, we conclude that there are no big surprises here.

29
Here we see how both case studies compare to benchmarking data regarding
effort in staff months. We see that both cases reveal much higher effort than
industry average, which might again be a consequence of safety requirements
for case study A and security requirements for case study B.

From this graph, we conclude that there are still no big surprises here, although we already know that this will also result in lower productivity levels.

30
Here we see how both case studies compare to benchmarking data regarding
staffing rate. Again, staffing rate is a measure for how fast people can
effectively be allocated to a project team. Regard it as a measure for the slope
of your earned value or S-curve. We will refrain from explaining the formulae used here.

From the graph, we conclude that staffing rate is only a bit higher than industry
average, so the staffing profiles reveal nothing special.

31
Here we see how both case studies compare to benchmarking data regarding
productivity in function points per staff month. We see that both cases reveal a
much lower productivity level than industry average, as we already concluded earlier when discussing effort levels.

32
Now things are getting more interesting. Here we see how both case studies compare to the average effort distribution, using the Cost-of-Quality model. We see in both cases a ratio for Appraisal/rework of approximately 50%, which is very high. One might argue that this again is a consequence of safety requirements for case study A and security requirements for case study B. However, in that case we would have expected much higher ratios for Prevention.

These results were received with a shock in both organizations. Remember how easily these ratios can be calculated; also remember that these organizations have a rich history in process improvement. In both cases it was a major eye-opener with consequences.

33
Finally, we present some benchmarking results regarding defect density and removal efficiency. To explain how they are calculated, look at the example presented here.

The removal efficiency can only be calculated post-release, based on the defects discovered during development and during a certain period after the software has been released. It is the percentage of defects found before the software is released to its intended users. Defect density is calculated as the number of defects found post-release relative to the feature size (preferably in function points). As said, in embedded system environments feature size is often backfired from the implementation size in lines of code, which is not the best approach.
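A minimal sketch of both calculations; the defect counts, the observation window and the size are illustrative assumptions.

```python
# Minimal sketch: removal efficiency and post-release defect density.
# Defect counts and size are illustrative assumptions.

pre_release_defects = 900    # found during development
post_release_defects = 100   # found in, say, the first year after release
size_fp = 1_000              # feature size in function points

removal_efficiency = pre_release_defects / (
    pre_release_defects + post_release_defects
)
defect_density = post_release_defects / size_fp

print(f"Removal efficiency: {removal_efficiency:.0%}")      # 90%
print(f"Defect density: {defect_density:.2f} defects/FP")    # 0.10
```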

34
Now things are getting even more interesting. Here we see how both case studies compare to benchmarking data regarding defect density in defects per 1,000 lines of code. For software with safety requirements and security requirements one might expect better figures than industry average; the contrary is the case here.

The response of management in both cases was that after releasing the software, many additional tests took place and delivery was to a limited number of users only. In other words, the defect density that finally reached the end-user was lower. On the other hand, they acknowledged that post-release maintenance and support costs were extremely high and should be reduced.

35
And, as we would have expected, the high defect density finds its origin in the low removal efficiency: too many defects remain undetected during development and are only detected post-release.

36
What we have not discussed so far is the quality of the software itself. During our studies, we normally perform an in-depth analysis of the source code to assess the quality of the architecture and the source code. The important KPI we focus on is complexity. This is an indicator of how well a project team can perform its development and maintenance tasks. As such, it also influences the quality of the software product as experienced by the end-user. Higher complexity will normally lead to lower test coverage, higher defect densities and thus higher running costs.

But how can complexity be measured? Is it the famous cyclomatic complexity measure of McCabe or is it more? Let us see ...

37
In our definition of complexity, we take a much broader view than cyclomatic complexity at source code level. We use an aggregated measure that combines complexity seen from three different perspectives (a simple aggregation is sketched below):
1. Understandability. Here we look at dependencies at different levels and compute the fan-out and cyclomatic complexity as an indicator.
2. Modifiability. Here we look at the number of places that require attention upon modification and compute both change propagation and code duplication as an indicator.
3. Verifiability. Here we look at the number of required tests and the required test time and compute, with automated tooling, the fan-out and cyclomatic complexity as an indicator.
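To make the idea of an aggregated measure tangible, here is a minimal sketch combining the three perspectives into one indicator. The per-module metrics, the weighting and the aggregation formula are illustrative assumptions, not the actual method used in the studies.

```python
# Minimal sketch of an aggregated complexity indicator combining the
# three perspectives above. Metrics and weights are illustrative.

from dataclasses import dataclass

@dataclass
class ModuleMetrics:
    name: str
    cyclomatic: float          # average cyclomatic complexity
    fan_out: float             # average outgoing dependencies
    change_propagation: float  # average files touched per change
    duplication: float         # fraction of duplicated code (0..1)

def complexity_score(m: ModuleMetrics) -> float:
    """Aggregate the three perspectives into one value (higher is worse)."""
    understandability = m.cyclomatic + m.fan_out
    modifiability = m.change_propagation + 10 * m.duplication
    verifiability = m.cyclomatic + m.fan_out  # test effort grows with paths/deps
    return (understandability + modifiability + verifiability) / 3

modules = [
    ModuleMetrics("parser", cyclomatic=12, fan_out=8,
                  change_propagation=6, duplication=0.17),
    ModuleMetrics("ui", cyclomatic=5, fan_out=3,
                  change_propagation=2, duplication=0.04),
]
for m in sorted(modules, key=complexity_score, reverse=True):
    print(f"{m.name}: {complexity_score(m):.1f}")
```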

Let us look at some examples.

38
Here, we see the results of an automated analysis at source code level.

The source code is divided into rectangles, each representing part of the source code. The size of each rectangle corresponds to the number of files. Each rectangle is sorted from the upper left corner to the lower right corner, based on change frequency. A black area indicates no changes; a blue area indicates changes over the time frame selected. Within each blue sub-area, towards the lower right corner, the files with high change frequency and high cyclomatic complexity are listed. This helps teams to quickly identify not only which parts of their code are complex, but also which areas are frequently changed. These areas are bad for understandability and verifiability.

Refer to www.solidsourceit.com for the tools used here.

39
Here, we see the growth of a software application over time, each file represented by a horizontal line. In addition, it has been derived when changes were made to the files (red spots). We see here that in several situations, when new functionality was implemented (growth), many changes propagated over many files. This indicates a low level of functional cohesion, which is bad for modifiability.

Refer to www.solidsourceit.com for the tools used here.

40
Finally, we show an example of code duplication, which is also bad for modifiability. In the top right corner, you see duplication between modules or packages. If there were no duplication, the inner circle would be empty. In this case, there is approximately 17% code duplication, which we often found in our studies. This is actually very bad, as it undermines quality and productivity.

Note that the tooling used here allows you to select an area of code duplication in the top right and see the duplication itself, at code level, in the bottom left window showing the two related files.

Refer to www.solidsourceit.com for the tools used here.

All analyses shown here are included in our studies and the results are included in the debriefing sessions.

41
Okay, back to our case studies.

In both cases, it was very clear to all stakeholders that there are two main weak areas:
1. The effort distribution revealed a very inefficient process, with a high ratio for Appraisal/rework. If post-release efforts for fixing defects were included, the ratio would be even higher!
2. The code quality in both cases was low: many areas with high cyclomatic complexity were found, as well as a high level of code duplication and a high level of change propagation. As a result, problems arise regarding understandability, modifiability and verifiability.

These two areas were considered the primary causes for low overall capability and were used as the basis to define improvements, as we shall present now.

42
No surprise of course ... Appraisal and rework efforts can potentially be dramatically reduced by smartly investing in prevention. In this case, it was recommended to introduce architecture evaluations as well as active design reviews. Secondly, it was decided to reinforce inspections for critical modules.

The expected effects are clear. Fewer defects entering coding and testing will positively influence removal efficiency and defect density, lead to lower appraisal and rework costs and finally lead to a faster schedule, lower effort and higher productivity.

43
No surprise here either, of course ... Problems in the areas of understandability, modifiability and verifiability are a direct result of low code quality. In this case, tooling was connected to the configuration management system to support engineers and managers in monitoring the code quality and implementing improvements for identified critical code parts.

The expected effects are clear. By reducing, eliminating and even avoiding such problems, test coverage will for instance increase. Further, we see the same pattern. Higher test coverage will positively influence removal efficiency and defect density, lead to lower appraisal and rework costs and finally lead to a faster schedule, lower effort and higher productivity.

The good thing is that after such a study, everybody agrees on what is weak and in which areas improvements should be sought. In addition, the availability of quantitative data allows for developing a solid business case. That is different from stating "We want to be at level X by the end of next year"!

44
The conclusion that investing in process improvement is a waste of money would be wrong. Process improvement makes sense, as it standardizes development processes. This creates transparency with respect to roles, responsibilities, activities and deliverables. However, standardization is no guarantee for real performance improvements. That is why aiming at higher CMMI levels only is a wrong approach. The recommendation is to focus on a multi-dimensional assessment of the performance of an organization and derive improvements that make sense. A first step will be baselining the current performance using the best practice KPI set. In case measurements are not in place, the first improvement actions are identified. As a second step, realistic target values must be determined for a given period of time. The gap between target and actual values is the basis for deriving improvement steps. It is hereby essential that the expected direction of each indicator is known. If changes in the right directions can be achieved, then a performance improvement is guaranteed. In addition, the positive effects are likely to show up in each category. This reduces or even eliminates the chances of sub-optimization.

45
Process improvement and standardization should not be a goal in itself. Real performance improvement is achieved by taking a quantitative view on processes and products, and setting realistic and quantified improvement targets.

This brings us to an important overall conclusion. The interesting fact is that improvements can only be achieved by changing the way of working. And of course, a validated improved way of working should be shared: considering standardization across projects becomes a natural process. Therefore, improving capability along a quantitative axis can be perfectly combined with other improvement initiatives like CMMI. Educating an organization in collecting, analyzing and using data will automatically set requirements for the processes to be used. The difference is: now we know why, and what we hope to get as a result.

46
Time for another maturity model ...? No ...

This model only shows what we should expect in the area of measurement when maturing through the commonly known CMMI stages. From a situation where measurements do not exist, or exist only on an ad hoc basis, we should strive for a level where performance or capability is further optimized using KPIs. Whether these KPIs match the best practice KPI set as presented here is not important. You may start with this set and find out along the way that you need others as well. Feel free!

47
Finally, I would like to take the liberty to briefly introduce to you a new non-profit organization that was founded last year. The benchmarking studies as described here, related workshops and supporting tooling of which you have seen some examples were brought under one umbrella in the Software Benchmarking Organization. The resulting portfolio is brought to the market through a network of accredited partners.

You can read more about this organization on the website, for which the link is shown here.

48
Founding partners of SBO are SE-CURE AG, the company currently
presenting to you, and SolidSource BV, from which you have seen some
product examples.

49
There is probably no need to discuss in detail what the objectives of SBO are. In summary, we want to repeat many studies like the ones we presented to you today. And yes, we consider ourselves a non-profit organization. Any profits remaining from our activities will be re-invested in maturing and extending the portfolio. There is still a lot of work to do!

50
You could help us, not only by contacting us to conduct a benchmarking study or to organize a workshop. If you are offering services yourself, we could discuss the possibility of involving you as a partner. It won't make you rich, but the work is exciting and very useful (at least, we think ...).

Finally, I would like to share with you the comments of two of our clients. Neither client was involved in the case studies presented here.

51
Let this slide speak for itself.

Note: this reference was not one of the case studies discussed here.

52
Let this slide speak for itself.

Note: this reference was not one of the case studies discussed here.

53
As said, the Software Benchmarking Organization has been founded to support the industry, not for us to get rich easily. We are available to talk to you and explore how we can help the software industry to the next level, a bit closer to a real engineering discipline. Please feel free to contact us.

This brings me to the end of this presentation. Let me thank you for your time and patience!

Dr. Hans Sassenburg

