
FROM THE TRENCHES: CASE STUDY
Wolfgang B. Strigel, editor (wstrigel@spc.ca)

Making Sense of Measurement for Small Organizations

Karlheinz Kautz, Copenhagen Business School

The key to successful measurement programs is to make the metrics meaningful and tailor them to the organization, however small it might be. Here the author explains how he helped three small companies reduce the time spent handling change requests, correcting errors, and generating system versions and production releases. Both the developers and customers appreciated the improved practices.

Measurement and metrics programs are fundamental to software process improvement; without them, it is difficult to know how improvement efforts are actually affecting our process. Although large and resourceful organizations report their measurement experiences in journals and magazines, we know little about software process improvement in general, and less about how metrics are used in small organizations. Given that in many countries smaller organizations constitute the majority of software enterprises, this knowledge is not only needed, but vital to the field. In this article, I describe my experiences working with project leaders and staff to develop and use small-scale metrics programs in three small software companies. In each company, we implemented metrics programs to evaluate how new practices and tools for configuration and change management were affecting the software process. These practices were accompanied by new procedures for product and process documentation.

Originally, the software developers greeted our activities with a great deal of skepticism, but they eventually accepted them because we kept the measurements simple, tailored them to each organization, and ensured that they produced valuable information. In the end, the programs provided a foundation for taking care of customers and for planning and carrying out future work.

PROJECT BACKGROUND

The metrics programs came about as part of a process improvement experiment sponsored by the EU's European Systems and Software Initiative (ESSI) program. The aim of ESSI is to motivate European software-producing organizations to test and deploy software best practices. Under this initiative, organizations perform their experiment in a real-life, commercial project over a period of up to 18 months. They can employ external consultants or researchers for support. Our project involved three small companies that received ESSI support to perform a process improvement experiment focused on controlling source code versions. ESSI asks that the companies use metrics to verify and validate the effect of their improvement actions. I was part of a team of researchers called in to consult and mentor the project teams to this end.

All three companies were less than five years old and had five or fewer employees. Each based its business on developing one main product, which was either administrative or technical software. Everyone on staff, including the company founders, was working as a system developer. Staff members' experience ranged from a few months to 10 years, but few had a formal background as software or business professionals; most were trained in engineering or other natural sciences. In each company, a senior developer acted as local project leader and principal project member. Their responsibilities were to develop and introduce the configuration and change management routines and to establish the metrics program that would monitor and control the routines' impact. The project leaders were also responsible for collecting the measurement data. Other developers in each company provided feedback about the feasibility of the practices and the measurement data. They also participated in regular project meetings. The overall project was coordinated in meetings of researchers and the local project leaders, wherein we discussed formal and informal matters and exchanged ideas.


DEVELOPING METRICS PROGRAMS


During the first six months, we defined the metrics scheme in parallel with the development of the configuration and change management routines. The original project plan included a formulation of measurable outcomes, but early discussions made it clear that neither the project leaders nor the other developers were convinced of the benefits of an overall metrics program. The prevailing mood among developers was resistance and doubt.




They doubted that software work was measurable at all, questioned the usefulness of data collection, and feared that bureaucracy would overwhelm their small enterprises. They also expressed concern about the extra workload and anxiety that the measures would be used to control the employees. Thus, instead of developing a comprehensive metrics program in which little of the collected data would be fruitful, we followed the recommendations of Shari Lawrence Pfleeger and her colleagues1 and worked with the practitioners to develop simple, quantitative, small-scale metrics programs for each company. In discussions informed by actual work with the new configuration and change request management routines, the developers came to a shared understanding of metrics and decided what would be interesting to measure. Distrust gradually disappeared, and the developers began to see that the measures were taken to control the processes, not the people. They were then able to decide which measures were most critical for them.

Each company had different key objectives. Company A wanted to let all developers work on all parts of their product, thus reducing reliance on the chief developer. Company B wanted to increase the number of accepted change requests handled in a given time frame. Company C wanted to reduce the time it took to handle customer change requests and finalize releases for shipping.


Data for each project was collected during two periods: Period 1, which lasted two months for Company C and four months for companies A and B, and Period 2, which lasted six months for all three companies. The figures were aggregated from existing data, which was easily gathered from the configuration item libraries and the change request databases. Using existing data was a compromise; it eased the developers' residual doubt about measurement benefits and freed them from the workload increase that daily data collection demands. During Period 1, companies A and B collected data when the routines were not fully in use; only a first, rudimentary version of the source code control routines was operational. Company C was not working on a new release of its product during Period 1 and thus decided not to collect any data; instead, it reconstructed historical data from an earlier release covering a two-month period. This reconstruction was partly based on rough estimates, particularly of working hours. Gerald Weinberg observed that faking time reports is part of the universal culture of software development and that programmers tend to tell the lies their managers want to hear.2

Given this, the quality of the collected metrics data might be in doubt. However, the ESSI program considers projects successful even when routines and tools do not produce the expected results, as long as the outcome is derived from quantitative and qualitative measurements. Thus, there was no reason to falsify data in this experiment. The project leaders confirmed that both their own data and the data accumulated by other developers were collected with as much rigor and care as possible to secure credibility and precision. In addition, at the project's end we backed up the quantitative data with qualitative data gathered through assessment interviews with the project leaders, other developers, and customers.
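As a sketch of how lightweight such aggregation can be, the following example tallies hours per measurement period from a hypothetical CSV export of a change-request database. The file name and column names are assumptions made for illustration; they do not describe the companies' actual tools.

    import csv
    from collections import defaultdict

    def hours_per_period(path):
        """Sum logged hours per measurement period from a change-request export.

        Assumes a CSV with 'period' and 'hours' columns; a real export would
        differ, but the aggregation itself stays this simple.
        """
        totals = defaultdict(float)
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                totals[row["period"]] += float(row["hours"])
        return dict(totals)

    # Hypothetical usage:
    # print(hours_per_period("change_requests.csv"))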

MEASUREMENT IN COMPANY A
Company A collected effort data from their change-request database, which also included timesheet data recorded during the project. They compared the ratio of new program statements to the time spent producing them in periods 1 and 2 to assess whether they were meeting their aim: to reduce the workload of, and dependence on, the chief developer. To reduce extra work, the number of new statements was computed automatically by comparing the program statements of different versions of the library items and the applications.




Because this comparison involved adding, changing, and removing statements, negative numbers were possible. Period 1 figures are based on data prior to the improvements; Period 2 shows figures after configuration and change management had been implemented. Table 1 shows the results from the two measurement periods.

Table 1. Workload of the Chief Developer and Developer B

    Period of Time             Development Area   New Program Statements   Chief Developer (hours)   Developer B (hours)
    Period 1 (January-April)   Library            13,972                   480                       40
    Period 1 (January-April)   Application 1      200                      0                         107
    Period 1 (January-April)   Application 2      817                      10                        92
    Period 2 (May-October)     Library            8,919                    339                       53
    Period 2 (May-October)     Application 1      524                      4.5                       204
    Period 2 (May-October)     Application 3      40                       5                         22.75

Relative to the chief developer, the hours developer B spent on library development increased from 8 to 16 percent. This allowed for more flexibility in organizing the work. The chief developer spent fewer hours on library development in the second period; the time saved was used for business administration, a positive by-product of the experiment. During the two periods, the number of new statements per work hour was about the same, indicating that efficiency did not decrease despite the decrease in the chief developer's programming time in Period 2.

Company A also tested whether improved documentation procedures would decrease the chief developer's tutoring time, thus enabling all developers to work in all development areas and facilitating future expansion of the development team. Period 1 measured effort prior to the improved documentation procedures; Period 2 measured effort after both the configuration and change management routines and the documentation procedures were in place. Potential benefits were measured by how many hours the chief developer spent tutoring and assisting developer B, and how many hours developer B spent on the tasks.

Table 2. Tutoring Time by the Chief Developer and Effort by Developer B

    Period of Time   Development Area     Tutoring Hours (Chief Developer)   Hours Spent on Task (Developer B)
    Period 1         Library Function 1   5                                  30
    Period 1         Library Function 2   5                                  17
    Period 1         Library Function 3   2                                  4
    Period 1         Application 1        0                                  107
    Period 1         Application 2        0                                  92
    Period 2         Library Function 1   5                                  24
    Period 2         Library Function 2   4                                  11.75
    Period 2         Application 1        4                                  204
    Period 2         Application 3        0.75                               22.75

As Table 2 shows, the relative number of hours spent tutoring and carrying out a task had not changed; the tutoring hours remained constant despite improved documentation. Thus, the company's aims were not supported in this case. However, the results can be used to more accurately plan time and resources for future projects.
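To make the statement-count metric concrete, here is a minimal sketch of how such a figure could be derived by diffing two stored versions of a source file. It illustrates the idea (added minus removed lines as a proxy for statements, which is why the result can be negative); it is not Company A's actual tooling, and the file names in the usage comment are hypothetical.

    import difflib

    def net_new_statements(old_lines, new_lines):
        """Return added-minus-removed line count between two versions.

        A crude proxy for 'new program statements': because removals count
        negatively, the result can be negative, as noted in the article.
        """
        added = removed = 0
        for line in difflib.unified_diff(old_lines, new_lines, lineterm=""):
            if line.startswith("+") and not line.startswith("+++"):
                added += 1
            elif line.startswith("-") and not line.startswith("---"):
                removed += 1
        return added - removed

    # Hypothetical usage with two archived versions of a library module:
    # old = open("library_v1.c").read().splitlines()
    # new = open("library_v2.c").read().splitlines()
    # print(net_new_statements(old, new))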

MEASUREMENT IN COMPANY B
Company B was interested in tracking the extent to which they delivered accepted change requests on time; that is, whether they delivered on the estimated date and whether the delivery rate improved as a result of the improvement project. Period 1 figures are based on data prior to the improvements; Period 2 figures were collected after the new configuration and change management routines were in place. Table 3 shows change-request data that includes all categories of change requests but consists mainly of error reports from internal testing and from customers.

Table 3. Change Requests Received and Delivered

    Period of Time             Status   Requests   Delivered on Time   Delivered Late   Still within Delivery Date
    Period 1 (January-April)   Fixed    11         5                   6                -
    Period 1 (January-April)   Open     9          -                   6                3
    Period 2 (May-October)     Fixed    65         50                  15               -
    Period 2 (May-October)     Open     6          -                   3                3

In Period 1, an average of five change requests were registered each month; in Period 2 the average per month was nearly 12. The company said the increase was due to improved registration routines, an increase in customers, an increase in the customers' use of the software, and more systematic internal testing. As the table shows, despite the increase in requests, the proportion of fixed change requests delivered on time increased from 45 to 77 percent.

Company B also measured the time spent merging the code to produce a consistent version after different parts of the product were changed. Merging time was reduced to less than a third, from 90 to 20 minutes on average. A similar reduction was observed for the weekly review meeting of the test runs, from 120 to 30 minutes on average. This allowed for extra testing without spending more time. Although we have no hard evidence regarding effort and error rates, in their final assessment employees said that working conditions had improved and that software correctness and integrity had increased. This was supported by a decrease in customer error reports after delivery and thus higher customer satisfaction. Interviews with customers further verified this claim. Developers attributed the decrease in errors to improved error prevention: they said they made fewer mistakes, particularly in the merging process. They also said that correcting errors took less time because the stored test results let them easily locate erroneous code. This may explain the improved on-time delivery rate; however, substantiating this would require a further extension of the metrics program.
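A minimal sketch of how such an on-time percentage might be computed from change-request records follows; the record fields here are assumptions for illustration, not Company B's actual database schema.

    from datetime import date

    def on_time_rate(requests):
        """Share of fixed change requests delivered on or before the estimated date.

        Each request is a dict with 'estimated' and 'delivered' dates;
        undelivered (open) requests are skipped.
        """
        fixed = [r for r in requests if r.get("delivered") is not None]
        if not fixed:
            return 0.0
        on_time = sum(1 for r in fixed if r["delivered"] <= r["estimated"])
        return 100.0 * on_time / len(fixed)

    # Illustrative data shaped like Table 3's Period 1 (5 of 11 fixed requests on time):
    demo = ([{"estimated": date(1998, 2, 1), "delivered": date(1998, 1, 30)}] * 5
            + [{"estimated": date(1998, 2, 1), "delivered": date(1998, 2, 10)}] * 6)
    print(round(on_time_rate(demo)))  # -> 45, matching the 45 percent reported for Period 1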



MEASUREMENT IN COMPANY C
Company C's interest was centered on planning and estimation. They emphasized time spent handling customer change requests, whether for extensions, which were classified as errors and immediately addressed, or for new functionality, which was not.

During Period 1, which relied on historically reconstructed data, only eight of 40 requests were classified as errors and subsequently corrected. The company attributed this to a lack of information about the requests, which prevented thorough analysis. In Period 2, analysis and testing had been restructured: earlier releases were reconstructed and simplified, and errors were systematically logged and stored in the change request database. The developers said that this helped prevent the loss of information and error reports. The Period 2 data supports this assertion: out of 67 requests, 55 were accepted as errors and reprogrammed.

Table 4. Hours Spent on Error Correction

    Period     Developer A   Developer B   Developer C   Total   Average per Error Correction
    Period 1   72.0          13.0          9.5           94.5    2.4
    Period 2   192.8         85.2          47.8          325.8   4.9

As Table 4 shows, the more precise examination and the additional coding doubled the average time spent on a request in Period 2. This figure was not surprising; the company considered it more realistic and could thus better plan and inform the customer about the delivery of error corrections. My discussions with customers confirmed that, despite some reservations, they were more content with the company's service and software because of both more accurate error correction estimates and an increased number of corrected errors.

Finally, Company C measured the time spent on the final preparation of a release before shipping. Table 5 shows the results.

Table 5. Hours Spent Preparing Releases

    Period     Description                      Developer A   Developer B   Total
    Period 1   Preparing release X              44.5          26.0          70.5
    Period 2   Preparing short-notice release   20.2          8.0           28.2
    Period 2   Preparing release Y              1.0           -             1.0

During Period 1, before the introduction of the new practices, final preparation of releases included manual integration and testing of source code interfaces. This task was typically not performed as changes occurred, and often did not occur at all until the last weeks before the actual release. Developers viewed it as cumbersome and time-consuming. The new practices allowed a continuous and controlled integration of all changes during the regular enhancement and development processes. As Table 5 shows, the time spent on final preparation of a release was drastically reduced.

Even when only parts of the routines were in place and a release had to be prepared quickly, the new practices led to a reduction of more than half the necessary effort. After the full deployment of the new practices, the activity became routine and took about an hour. In the final assessment, employees said that the real advantage lay in the fact that compiling a release package during development was less likely to produce errors, though there is no hard evidence for this. They also said they found the work less tedious and thus more professional and motivating. Again, further measurements are required to confirm this impression.

DISCUSSION
All project participants accepted the metrics and measurements and consider them useful. In the future, they plan to use them on a regular basis and to collect and evaluate data more often. What are the reasons for this success? There are several.

Recommended guidelines
This project proceeded according to guidelines provided by the current literature. Barrie Dale and his colleagues offer a list of key elements for success in quality management in general,3 and my work with Tom McMaster offers key factors of successful methods and techniques for system development.4 Both lists largely overlap and include several decisive factors for successfully introducing new technologies or techniques: management support and commitment; project planning and organization; staff involvement and teamwork; education and training; and assessment, monitoring, and evaluation of the new work practices and their technical support. A final decisive and common factor was that the changes introduced must be usable and valid, and that the organization must understand the impact of these changes on its culture. Everett Rogers contributes as well with a broader theory of the diffusion of innovations by stressing the role of communication, which influences the perception of the innovation itself.5 He emphasizes the need for appropriate knowledge, the importance of suitable communication channels, and the significant role of external change agencies and agents, and he advises that resource-poor organizations be organized into mentor-supported networks and granted economic compensation.

Context is key
In this project, we took all of the preceding issues into account. However, the fundamental factor in our project's success was that we adjusted each key element to suit the company. It is not simply a matter of following predefined checklists and guidelines. Debate is ongoing as to how and when to apply metrics, and which metrics to apply for evaluation.1,6,7 Our project was guided by the insight that technology transfer is not a context-free technical matter, but a process taking place in a social environment.8 Thus, each subproject developed its own way of defining and using metrics. We chose an approach inspired by the GQM paradigm9 and the ami method,10 but used only those steps suitable for the project.
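For readers unfamiliar with GQM, the following sketch shows the shape of a goal-question-metric breakdown, using Company B's stated objective as the goal. It is an illustrative reconstruction, not the plan the companies actually wrote down.

    # Illustrative GQM breakdown (goal -> questions -> metrics), not the project's own plan.
    gqm_company_b = {
        "goal": "Increase the number of accepted change requests handled on time",
        "questions": {
            "How many change requests are received and fixed per period?": [
                "count of requests received",
                "count of requests fixed",
            ],
            "How often do we deliver by the estimated date?": [
                "percentage of fixed requests delivered on time",
                "count of requests delivered late",
            ],
        },
    }

    for question, metrics in gqm_company_b["questions"].items():
        print(question, "->", ", ".join(metrics))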

Knowledge of large-scale improvement approaches was helpful, but the project showed that it is not necessary to follow them completely to be successful. For smaller companies, where most work is coordinated through direct supervision and mutual adjustment,11 it is important to find a balance between these mechanisms and formal, defined, and closely documented procedures to facilitate organizational learning.12 Even very small companies with five or fewer developers need some basic formal routines. We achieved this equilibrium by selecting metrics programs that had a dedicated purpose rather than those that were more abstract and likely useless.

Implementing the metrics programs was demanding. We had to overcome resistance and find appropriate metrics. However, the project demonstrated that developers will accept metrics when they prove useful and are integrated with other development-practice modifications. Our approach, which introduced changes in a step-wise manner, met this goal and was judged cost-effective by the organizations, which spent $60,000 and approximately 200 person-days each on the experiment.

Besides fewer errors reported by customers, the companies measured several other project benefits. Improved procedures reduced the time developers spent on handling change requests, correcting errors, and generating system versions and production releases. Improved documentation procedures indicated the possibility of more flexible work organization and of extending the development team. Improved documentation also helped developers to better understand, plan, and structure their work, and it supported the customers' need for accurate information.

However, these measures could arguably be attributed to factors other than the introduction of the routines and tools. Historically reconstructing data, computing lines of code, and assessing average time spent over long intervals can introduce imprecision and inaccuracy. Given these potential gray areas, it is all the more important that we use metrics with care. Nonetheless, the companies involved in this project were both comfortable with the measures and well aware of the danger of over-interpreting the data. Supported by the qualitative measures, the participants remained confident that these metrics indicate interesting tendencies, and they will continue to use and improve their metrics programs. For them, the project was a first step in quality management. Further metrics are needed to control the development process, and collecting and evaluating the data more often are the next steps.

From a research viewpoint, we need further study to understand work practice in small organizations. We must also investigate the role of software process assessment and improvement approaches, and especially metrics, for their ability to enhance practice and quality in small organizations. Once this is done, we can give small companies more qualified advice on how to start software process improvement programs and on how to establish metric schemes.

ACKNOWLEDGMENTS
This project was supported by the European Union under ESSI. The Norwegian Computing Center provided the environment for the original research. Thanks to all participants from each of the project companies and to Even Åby Larsen and Kari Thoresen for their continued cooperation. The IEEE referees and the members of the Process Improvement project at the University of Aalborg, Denmark, contributed valuable comments.

REFERENCES

1. S.L. Pfleeger et al., "Status Report on Software Measurement," IEEE Software, Mar./Apr. 1997, pp. 33-43.
2. G.M. Weinberg, Quality Software Management Vol. 4: Anticipating Change, Dorset House Publishing, New York, 1997.
3. B.G. Dale et al., "Total Quality Management: An Overview," Managing Quality, Prentice Hall, New York, 1994, pp. 3-40.
4. K. Kautz and T. McMaster, "Introducing Structured Methods: An Undelivered Promise? A Case Study," Scandinavian J. Information Systems, Vol. 6, No. 2, 1994, pp. 59-80.
5. E.M. Rogers, Diffusion of Innovations, third ed., The Free Press, New York, 1983.

6. J.D. Brodman and D.L. Johnson, "Return on Investment from Software Improvement as Measured by US Industry," Software Process: Improvement and Practice, Pilot Issue, Aug. 1995, pp. 35-47.
7. T. Hall and N. Fenton, "Implementing Effective Software Metrics Programs," IEEE Software, Mar./Apr. 1997, pp. 55-64.
8. R. Hirschheim et al., "A Social Perspective of Information Systems Development," Proc. Eighth Int'l Conf. Information Systems, ICIS, Pittsburgh, 1987, pp. 45-56.
9. V.R. Basili and H.D. Rombach, "The TAME Project: Towards Improvement-Oriented Software Environments," IEEE Trans. Software Eng., Vol. 14, No. 6, 1988, pp. 758-773.
10. K. Pulford et al., A Quantitative Approach to Software Management: The ami Handbook, Addison Wesley Longman, Reading, Mass., 1996.
11. H. Mintzberg, Structures in Fives: Designing Effective Organizations, Prentice Hall, Englewood Cliffs, N.J., 1983.
12. I. Nonaka, "A Dynamic Theory of Organizational Knowledge Creation," Organization Science, Vol. 5, No. 1, 1994, pp. 14-37.

About the Author


Karlheinz Kautz is an associate professor at the Institute for Informatics at Copenhagen Business School. Previously, he was senior researcher at the Norwegian Computing Center and a lecturer at universities in Germany, England, and Norway. He has served as project manager for the Norwegian activities of the European Software Process Improvement Training Initiative (ESPITI) and has been involved in several software process improvement experiments and dissemination actions as part of ESSI. His research interests are in technology transfer, software quality and process improvement, evolutionary systems development, and the organizational impact of IT. Kautz received an MSc in computer science from the Technical University of Berlin (West) and a PhD in systems development from the University of Oslo, Norway. He is a member of IEEE and ACM.

Address questions about this article to Kautz at Copenhagen Business School, Department of Informatics, Howitzvej 60, DK-2000 Frederiksberg, Denmark; Karl.Kautz@cbs.dk.

