
Measuring job performance: why, how, and whether

Paul F. Ross
Puget Sound Chapter, American Statistical Association 4 February 2003

Measuring job performance: why, how, and whether Copyright 2003 Paul F. Ross

page 1 of 109 pages All rights reserved.

Thank you for the invitation to present to this audience, the Puget Sound Chapter of the American Statistical Association, on a topic of my choice. Being detected by some of you as having an underlying quantitative orientation and being invited to present to an audience of your background is a special pleasure.


Overview

Why
How
Whether


"Why measure job performance?" is the question I want to address first. The answer is found in some fundamentals guiding human behavior. That answered, the next question is "How does one measure job performance?" In answering, I describe what has been tried, the problems met, and the solutions found. Solutions have been found. Job performance in all its complexity can be measured, although it isn't being measured well in any organization known to me. Thus the third matter is whether state-of-the-art job performance measurement will be adopted in our society, our culture.


The reasons why job performance measurement will deliver benefits are found in some fundamentals of human behavior that are widely accepted in psychology and, I suspect, among scientists in general, including biologists and medical scientists. Those fundamentals are that ...


Why The psychological framework for behavior

People learn. People change behavior as a function of practice.


Why . . . The psychological framework for behavior

Feedback shapes what is learned. Future behavior is shaped by feedback given now.


Why The psychological framework for behavior

What is learned is not carried into the next generation through genes. Each generation must learn what is good and useful.


Why The psychological framework for behavior

Both individuals and societies learn ... such as about what is successful, what is valued.


Why The psychological framework for behavior

Work life is a very large part of our lives, making feedback on the job very important.


Why The psychological framework for behavior

Society attends to (sometimes measures) what it values.


I was prompted to think about what we measure, and what we don't measure, when reading a statement by the noted biologist and conservationist, Edward O. Wilson. Wilson writes:


Why The psychological framework for behavior

"The wealth of the world, if measured by domestic product and per-capita consumption, is rising. But if calculated from the condition of the biosphere [such as by counting species], it is falling."
Edward O. Wilson, The future of life, 2002, Knopf, New York, NY, p. 42


We measure per capita income and set our personal and social goals based on what we learn about per capita income. We are still in the process of learning to measure Earth's resources and learning to live within the available resources. Why measure job performance?


Why Why measure job performance?

... to signal an organization's and society's objectives and values
... to identify individuals with demonstrated competencies (at achieving these objectives and honoring these values) and give them leadership roles
... to guide distribution of rewards
... to assess group and organization competencies
... to guide individual and organizational development

Turning now to the matter of how job performance measurement is done and can be done, some background information and history are relevant. Psychologists (along with many other scientists) insist that their measures be both reliable and valid.


How Standards for measurement

Reliable
  similar procedures get similar readings

Valid
  the procedure forecasts some future outcome at a useful accuracy
  the procedure agrees with other reasonable measures of the phenomenon


In measuring job performance, we nearly always are going to rely on an informed person to tell us about the job performance that is the target of our interest. It is true that for a salesman we can observe sales. And for the fishing crew's performance, we can weigh the fish caught. But these outcomes are influenced by many variables beyond simply the individual's or the crew's performance. Further, most jobs, such as the president's job, have no convenient measure of output. As many of you know, management by objectives is appealing in principle but difficult to get right with respect to measurement.

As we look back over the work of the last century, we see five methods that have been used for measuring job performance. They are:


How Methods for measuring job performance

essay
describe instances of effective performance
graphic rating
ranking
paired comparison

When using an essay, we give an informed person a blank piece of paper and ask that the target performance be described in the observer's own words. When collecting information about instances representing job performance, we ask our observer, in his or her own words, to describe a moment when observed performance was particularly helpful, or particularly bad, and what made that instance good or bad. When using graphic ratings, we describe a performance element and ask the observer to rate the target performance on a rating scale that is usually anchored with words and numbers. When using ranking, we ask the observer to rank a group of people whose performance is known to the observer, ranking them based on a characteristic such as "overall value to the company" or "contributions to our team's effectiveness." When using paired comparison ratings, we give the observer two phrases describing performance and ask the observer to say which one of the two phrases best describes the person observed. Typically, many pairs of phrases are used in a questionnaire.


Of these five methods, only the essay and graphic rating have seen substantial use. Their history over the last century has been


How History of current practice

circa 1900   essays and graphic ratings were in use
circa 1940   both were known to be inadequate
circa 1950   paired comparison method was first used in job performance measurement
circa 2000   essays and graphic ratings still monopolize current practice

The paired-comparison method was first used in measuring job performance by the U.S. Army about 1946-1947 (see Sisson, E. D. Forced choice: the new Army rating. Personnel Psychology, 1948, 1, 365-381).

So that I can be sure you know what I mean by a graphic rating, notice (but don't use!) this example:


How Example of an anchored graphic rating scale [Don't use it!]

Based on job performance during the past year, the career potential this person has in our company at this time is [circle one]:
  7  president or chief executive officer
  6  vice president or general manager
  5  middle manager
  4  first line supervisor or team leader
  3  highest level, complex, non-supervisory job
  2  middle level, non-supervisory job
  1  entry level, non-supervisory job

I have seen this item in use in a large corporation, but it should not be used. This item makes three serious errors. First, it assumes the rater, no matter who he or she may be, understands the demands made upon the persons holding all these positions. That assumption is always wrong. Second, the item assumes all jobs have a place on this scale. Where on this scale do you place the job of the person who is adviser to the chief executive officer? How do you rate the career potential of the possible replacement of this adviser using this scale? Third, the item assumes that these jobs differ so that a "4" is one point more important to the organization and to society than a "3," and so on. That assumption, too, is wrong. What CEO do you know whose compensation is a mere seven times the compensation for an entry level job?

Graphic ratings and essays are the only procedures currently in widespread use for collecting judgments about job performance. We know their flaws:


How Problems with essays and graphic ratings

essays
  deficient in content
  costly to process
  unreliable
  no validity for any purpose

graphic ratings
  intercorrelate about 0.7 when from one rater [the rater's self-consistency is called "halo"]
  intercorrelate about 0.0 when from several raters
  no validity for any purpose

Personnel Psychology (Winter issue, 2002) reports a study by Atkins and Wood, Australian academicians. They studied 63 team leaders in an Australian corporation, a service industry. The teams repair equipment. Based on the fact that the team leaders were in training and the employer had its own assessment center where the team leaders could be assessed, the employer probably was a large corporation. I'll guess the employer was a telephone company or an electric utility.

The employer was measuring job performance of the team leaders using a questionnaire with 46 different graphic rating scales. Forty-six different graphic ratings! A team leader's overall rating was determined by adding an individual rater's ratings across 46 graphic rating items. Ratings were gathered from the team leader (self), from his or her supervisor, from peers selected by the team leader, and from subordinates selected by the team leader. Atkins and Wood asked a very important question. Are these ratings valid? [Valid for what use was not considered.]


                            1     2     3     4     5     6     7     8     9    10    11
 1. Self                    x
 2. Supervisor            .15     x
 3. Peers                 -03   .30*    x
 4. Subordinates          -08   .02   -05     x
 5. All others            -09   .16   .53*  .80*    x
 6. Team meeting          -22   .27*  .06   .23   .36*    x
 7. Customer meeting      -22   .14   -06   .13   .15   .39*    x
 8. In-tray               -20   .18   .15   .13   .31*  .31*  .09     x
 9. Behavior interview    -07   .09   -09   -17   -17   -03   .12   -15     x
10. Video                 .07   .18   .20   -11   .03   .19   .26*  .13   .04     x
11. Overall assessment    -24*  .29*  .15   .20   .39*  .79*  .59*  .66*  .06   .41*    x

N = 63
Table 4, Atkins and Wood, Personnel Psychology, 2002, 55, 884


[All coefficients in this table are correlations. When the correlation is negative, I omit the decimal point.]

Before we answer the Atkins and Wood question about validity, let's examine the degree of agreement among the ratings from self, supervisor, peers, and subordinates. Only one of the correlations is large enough to be statistically significant, that is, large enough to cause the null hypothesis to be rejected. In effect, these four sources of evaluations of the team leader's job performance could not agree among themselves. [The authors report rater self-consistency reliabilities of 0.95, an inferred correlation. Interpreting both findings, the halo (the high self-consistency) with which one rater viewed the ratee had no relationship to the next rater's halo about the ratee.] The interrater reliability was zero.

Having no shared views with respect to how the ratee performs on the job, these ratings can have no validity; they can have no power to forecast a future outcome [unless the supervisor has important power and the supervisor's rating becomes a self-fulfilling prophecy with respect to promotions to higher responsibilities, or being fired]. Having multiple, informed views of an individual's job performance (360-degree views) clearly is necessary, but the method of graphic ratings is not adequate to the task.
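For readers who want to check the agreement among the four rating sources, the sketch below screens the six correlations among self, supervisor, peers, and subordinates, taken from the Atkins and Wood table, with N = 63. It is only an illustration: the Fisher z approximation, the 0.05 level, and the function name approx_two_sided_p are assumptions made here, not the procedure the authors used.

```python
import math
from statistics import NormalDist


def approx_two_sided_p(r, n):
    """Approximate two-sided p-value for the null hypothesis of zero correlation.

    Uses the Fisher z-transformation (an assumption of this sketch):
    atanh(r) * sqrt(n - 3) is treated as a standard normal deviate.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


N = 63  # number of team leaders in the study

# Correlations among the four rating sources, read from the table.
rater_correlations = {
    ("self", "supervisor"): 0.15,
    ("self", "peers"): -0.03,
    ("supervisor", "peers"): 0.30,
    ("self", "subordinates"): -0.08,
    ("supervisor", "subordinates"): 0.02,
    ("peers", "subordinates"): -0.05,
}

# Keep the pairs whose correlation clears the assumed 0.05 level.
significant = [
    pair
    for pair, r in rater_correlations.items()
    if approx_two_sided_p(r, N) < 0.05
]
print(significant)
```

Under these assumptions only the supervisor-peers correlation of .30 clears the threshold, which matches the single asterisk among these six entries in the table.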


Atkins and Wood also obtained five performance measures for each team leader from a series of highly structured activities (job performance tests, as it were) done in an assessment center. Notice that these scores from the several job performance exercises, too, don't agree with each other. With neither set of measures of job performance showing within-set agreement, it is no surprise that the graphic ratings from four different competent observers don't agree with the scores from the performance tests done in the assessment center.

The authors note that the graphic ratings added together from sources-other-than-self (Variable 5) correlate 0.39 with the scores added together from the assessment center tasks (Variable 11). However, I regard about 15 percent (0.39 squared x 100) of variance shared as falling far short of the 35 percent to 50 percent that I think is achievable and necessary to win organizational use of the measurements. I also see the assessment center measures as having low face validity when the social and economic need is to forecast career performance. In short, I find zero validity for the 360-degree graphic job performance ratings analyzed by Atkins and Wood because graphic ratings were used.

Let's review the rater's task for job performance rating.
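The shared-variance arithmetic can be checked in a few lines. The sketch below squares the reported correlation of 0.39 and, going the other way, computes the correlation that would be needed to share 35 or 50 percent of variance. The function names are hypothetical and chosen only for this illustration.

```python
import math


def shared_variance_percent(r):
    """Percent of variance two measures share, given their correlation r."""
    return 100.0 * r * r


def correlation_needed(percent_shared):
    """Correlation required for two measures to share the given percent of variance."""
    return math.sqrt(percent_shared / 100.0)


observed = shared_variance_percent(0.39)   # Variable 5 with Variable 11
low_target = correlation_needed(35.0)      # lower end of the 35-50 percent range
high_target = correlation_needed(50.0)     # upper end of the 35-50 percent range

print(round(observed, 1), round(low_target, 2), round(high_target, 2))
```

A correlation of 0.39 corresponds to roughly 15 percent of variance shared; sharing 35 to 50 percent would take correlations of roughly 0.59 to 0.71.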


How The job performance rating task

Rater and ratee work together on the job
Rater is asked to rate
Rater recalls shared experiences
Rater describes (or evaluates) those experiences

See Wherry, R. J. Control of bias in rating. Subproject 9, A theory of rating. Personnel Research Branch Report No. 922, Adjutant General's Office, Department of the Army, 1952.

Let's compare and contrast the graphic rating item and the paired comparison rating item as methods for prompting the recall and reporting of job performance.


How Graphic and paired comparison methods compared

Graphic rating item
  presents one idea
  asks rater to make mark on a scale (sometimes anchored)

Paired comparison rating item
  presents two ideas
  asks rater to mark the phrase that best describes the ratee


By using the paired comparison method for rating, it is reasonable to think that we can get much more highly differentiated information about the ratee's performance from competent observers and so get much higher interrater reliability as well as useful validities in forecasting various kinds of future performance. So that you can experience the paired comparison method, I turn now to a demonstration in which you will be ...


How Experiencing the paired comparison method

Recording what you remember using the method of paired comparisons


[thus experiencing the job performance rating task in a simulation]


How Experiencing the paired comparison method

In this demonstration,
  you will see two objects on a table
  you will see them for 20 seconds
  please study both objects
  the image then will be taken away
  you will receive a questionnaire
  using the questionnaire, you will be asked to describe one of the two objects
  please remain silent from now until after the questionnaires have been collected

[[ Show the visual image ]]

[The participants saw an image in black and white that presents two child's blocks on a table. The blocks are cubes and both are large with respect to the table, the one on the right sitting on the table and the much larger one on the left standing on its corner on the table with no visible means of support. The block on the left has rounded, rag doll figures on the visible faces of the block that are clear against a striped background. The block on the right has geometrical figures on the visible faces of the block that are crosshatched against a clear background.]

[[ Remove image; distribute questionnaires and pencils ]]

[The questionnaire received by participants has 24 items, each item consisting of two drawings. The first item has a large square and a small square, both clear. The respondent is asked to circle the drawing that is most like the object being described. Half the questionnaires are distributed with the instruction to "Describe A" (the object on the left), the other half with the instruction to "Describe B" (the object on the right). If describing A, the respondent should circle the large square in Item 1. If describing B, the respondent should circle the small square in Item 1.]


How Experiencing the paired comparison method

Your questionnaire tells you which object to describe. Circle the one drawing in each pair that is most like the object you are describing. Mark your choice for each of 24 pairs. Please remain silent until the questionnaires are collected.


How Experiencing the paired comparison method

Using the questionnaire, were you able to describe what you saw?


Respondents on 4 February 2003 answered this question, "Well, sort of." After a few moments of thought and silence, someone said, "I was able to describe some aspects of the object." There was no forthright, quickly expressed confidence that the questionnaire allowed a good description of what had been seen.


How Experiencing the paired comparison method

The questionnaire's items ask you to describe seven aspects of what you saw
  size of cube (large, small)
  stability of cube (stable, unstable)
  figure's shape (rounded, geometrical)
  figure's shading (clear, cross-hatched)
  background's shading (striped, clear)
  shading's pattern (striped, cross-hatched)
  area shaded (figure, background)


These seven aspects of the two figures (the figures the respondent viewed for 20 seconds) are measured by this 24-item questionnaire. Each item in the questionnaire is designed so that only one aspect of what is seen, such as size or background shading, is different in the two drawings in an item. That difference represented in the item should guide the response if the respondent (a) has observed accurately, (b) can remember accurately, and (c) is cooperating in reporting accurately.

The respondents' answers on 4 February 2003 were quite accurate (even though they did not feel confidence about the accuracy of their descriptions). Sixty-one percent (61%) of the scores on the seven subtests were error-free descriptions of what the respondent had seen and another seventeen percent (17%) were within one point of an error-free score. A random response will achieve a score that is correct fifty percent (50%) of the time. [Readers who would like to do research with the demonstration materials should contact the author by email at pfrswr@worldnet.att.net .]

While this demonstration is a miniature of the real thing, the author hopes it helps the participant understand that, by asking the right questions, multiple characteristics of the object (person whose performance is) being described can be described accurately using the paired comparison item format.
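One way to picture the scoring is sketched below. Each item is keyed to a single aspect, a response is correct when it matches the key, and a subtest score is the count of correct responses for that aspect. The six-item key, the aspect labels assigned to items, and the function names are hypothetical; the actual 24-item key is not given in these notes.

```python
def score_subtests(responses, key, aspect_of_item):
    """Count correct choices per aspect.

    responses, key: one 'L' or 'R' per item (left or right drawing circled).
    aspect_of_item: the single aspect each item is designed to measure.
    """
    scores = {}
    for chosen, correct, aspect in zip(responses, key, aspect_of_item):
        scores.setdefault(aspect, 0)
        if chosen == correct:
            scores[aspect] += 1
    return scores


def chance_of_error_free(n_items):
    """Probability that random marking gets every one of n_items right."""
    return 0.5 ** n_items


# Hypothetical six-item excerpt: three items on size, three on stability.
aspect_of_item = ["size", "size", "size", "stability", "stability", "stability"]
key = ["L", "R", "L", "R", "R", "L"]
responses = ["L", "R", "L", "R", "L", "L"]  # one error on a stability item

scores = score_subtests(responses, key, aspect_of_item)
print(scores)
print(chance_of_error_free(3))
```

With random marking correct half the time on each item, an error-free score on a three-item subtest would occur by chance one time in eight, which gives some context for the error-free rates the respondents achieved.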


Now let's review data from a more realistic job performance rating situation. Two years ago I stumbled into an opportunity to demonstrate the paired comparison method in a job performance rating situation in a classroom at Bellevue Community College, Bellevue, Washington, where the college routinely asks students to evaluate each course they are taking, using graphic rating scales, of course.


How The paired comparison method in course evaluation

An opportunity to demonstrate paired comparisons
  January 2001
  Bellevue Community College
  General Business 101


A friend who teaches General Business 101 at Bellevue Community College knew he would not be able to meet his classes for the opening days of the Winter quarter in January 2001. He asked me to meet the students and lead the class. I accepted with the understanding that I be allowed to deliver one of his guest lectures on the topic of measuring job performance. For me, this was an opportunity to see how the method of paired comparisons works in course evaluation at the college level.

I met the students for seven classroom sessions during which we focused on the text's content following the course syllabus. Then, on the eighth day, we did the work that I am reporting here (Ross, Paul F., Course evaluation: an experiment in the technology for prompting judgments about job performance, 2001, http://home.att.net/~pfrswr/ ).

In preparation for the course evaluation, I cobbled together a 60-item questionnaire for use by the students. These are the steps I should have taken, the steps in brackets being the ones I omitted:


How

To use paired comparison in course evaluation
  list course objectives
  assemble phrases describing objectives
  [ask users to use phrases in graphic rating form]
  [calculate mean rating for each phrase]
  build pairs of phrases [matched for mean rating]
  use the questionnaire
  analyze responses
  [iterate (rebuild, use again, reanalyze)]
  [institutionalize questionnaire use]
  [validate scores]

The single most important task in measuring job performance is determining the characteristics by which the work's value will be assessed. To describe course objectives, I wrote as many phrases describing course outcomes as I could imagine, being careful to write them so that they could apply to other courses as well as to our course. I wanted the questionnaire to be useful in many courses throughout the College. With phrases in hand, I sorted them into piles so that the phrases in one pile were very similar in meaning, and those in different piles were as different in meaning from phrases in other piles as was possible. When a pile had too few phrases, I wrote more phrases that were similar in meaning to the phrases already found in that pile. Altogether I generated about 130 phrases. The piles of phrases described these eleven course outcomes:


How The paired comparison method in course evaluation

Describe educational goals [Key task!]
  excellence
  concept rich
  utility
  role model
  exposition
  uniqueness
  skill building
  prompts new learning
  interpersonal
  challenge
  quantitative-observable-definitive


I then began to build pairs of phrases following these rules


How The paired comparison method in course evaluation

Each phrase in a pair to
  be equally desirable
  be about same length
  contain only one idea
  represent one objective
Each pair to
  present two different objectives
List of pairs to be
  more than 100 pairs
  balanced with respect to objectives
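Two of these rules, matching on desirability and pairing different objectives, can be sketched as a simple greedy procedure. This is an illustration only, not the matching method studied in the references or used for the course questionnaire. The function name build_matched_pairs, the max_gap threshold, the mean ratings, and the assignment of the sample phrases to objectives are all hypothetical.

```python
from itertools import combinations


def build_matched_pairs(phrases, max_gap=0.2):
    """Greedily pair phrases that differ in objective but are close in mean rating.

    phrases: list of (text, objective, mean_rating) tuples.
    max_gap: largest allowed difference in mean rating within a pair.
    Each phrase is used at most once; the closest matches are taken first.
    """
    candidates = [
        (abs(a[2] - b[2]), i, j)
        for (i, a), (j, b) in combinations(enumerate(phrases), 2)
        if a[1] != b[1] and abs(a[2] - b[2]) <= max_gap
    ]
    candidates.sort()

    used = set()
    pairs = []
    for _gap, i, j in candidates:
        if i not in used and j not in used:
            pairs.append((phrases[i][0], phrases[j][0]))
            used.update((i, j))
    return pairs


# Phrases from the course questionnaire; objectives and ratings are made up here.
phrases = [
    ("among the best courses I could expect in college", "excellence", 4.5),
    ("sensitizes my sense of right and wrong", "role model", 4.4),
    ("get good practice in oral expression", "skill building", 3.9),
    ("more interesting than most courses", "excellence", 4.0),
]

pairs = build_matched_pairs(phrases)
for first, second in pairs:
    print(first, "|", second)
```

A real questionnaire would also need to balance the objectives across the full list of pairs and check phrase length, which this sketch leaves out.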

For work exploring the methods for constructing paired comparison items, see:
  Berkshire, J. R. and Highland, R. W. Forced-choice performance rating: a methodological study. Personnel Psychology, 1953, 6, 355-378.
  Ross, Paul F. A comparison of two methods of matching in forced-choice rating. Ph.D. dissertation, 1955, The Ohio State University, Columbus, OH.
  Wherry, R. J. Control of bias in rating. Subproject 3, Factor analysis of rating item indices. Personnel Research Section Report No. 914, Adjutant General's Office, Department of the Army, 1950.

Notice that I am calling the method "paired comparison" rather than "forced choice" as it was called when invented. No one can force a respondent to make a choice on a questionnaire item. No marketing-sensitive person would name a procedure "forced choice," especially when "paired comparison" is more accurate.

In order to understand what the students were saying about our Business 101 course, I needed to know what the students would say about other courses.


How The paired comparison method in course evaluation

To compare our course with other courses
  half the students described our course
  half the students described another course


How The paired comparison method in course evaluation

Asked to describe a current or recent course other than our course, students described courses in
  accounting
  anthropology
  botany
  communications
  business
  international relations
  mathematics
  music theory
  physics
  political science
  sociology
  speech

How The paired comparison method in course evaluation

On the day of course evaluation
  15 min lecture on job performance measurement
  students answered the questionnaire in 15 min
  questionnaire = 59 paired comparison, 1 graphic
  students discussed the experience
Five days after evaluating the course
  students received written report summarizing their evaluations


Having seen in the Atkins and Wood study that halo is reliably reported by a rater but agrees with no one else's view of halo, you can now easily understand that the purpose of a paired comparison item is to get the rater to tell you something beyond their opinion about the ratee's halo. The paired comparison item is saying "Yes, I know you like him, but tell me this ..." or "Yes, I know you don't like him, but tell me this ..."

Thus the key design rules for a paired comparison item are (1) that the phrases be equally attractive things to say about the ratee, that is, that the two phrases be equally loaded on halo, and (2) that the phrases differ with respect to content (organizational-societal goal) or validity (relationship to a desired outcome). Failing equality in the match on attractiveness, most raters choose the nicer thing to say. When choosing a phrase from a pair that is unmatched in attractiveness (acceptability), raters report their judgment about the ratee's halo, choosing the nicer of the two phrases if the rater admires the ratee. To get the rater to tell you something in addition to his or her judgment about halo, the statements must be matched with respect to their acceptability.

Item 12 appeared in the questionnaire the students used to evaluate their course.


How The paired comparison method in course evaluation


Item 12

among the best courses I could expect in college
sensitizes my sense of right and wrong
N = 37, [18, 19]


Presented with Item 12, students were asked to mark the one phrase in the pair that best describes the course they were evaluating. Notice that, of the 37 students responding, 18 used the top phrase and 19 used the bottom phrase. For these respondents, the two phrases in the pair were equally attractive. Notice that these phrases are nice things to say about the course. One can build pairs describing bad things about the course, but I suspect raters will object even more to saying bad things about a course they like than they object to saying good things about a course they want to criticize. Those who want to criticize should remember that, in selecting something nice to say, they are overlooking something else that is nice to say. In marking an item, the respondent is making both positive (it is more like this) and negative (it is less like that) judgments about the course (about the ratee). Following is Item 43 from the course evaluation questionnaire.


How The paired comparison method in course evaluation


Item 43

get good practice in oral expression
more interesting than most courses

N = 37, [20, 17]
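
As a rough check on the "equally attractive" design rule, the two splits shown on the item slides (18 versus 19 for Item 12, 20 versus 17 for Item 43) can be compared with an even split. The following is a minimal sketch, not an analysis reported in this presentation. It assumes each student's choice is independent and that a perfectly matched pair would be chosen 50/50; the function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k 'top phrase' choices out of n under a 50/50 split."""
    # Probability of every possible count if both phrases were equally attractive.
    probs = [comb(n, i) / 2 ** n for i in range(n + 1)]
    observed = probs[k]
    # Add up all outcomes that are no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


# Counts taken from the item slides: (top phrase, bottom phrase).
splits = {"Item 12": (18, 19), "Item 43": (20, 17)}

for name, (top, bottom) in splits.items():
    n = top + bottom
    p = binomial_two_sided_p(top, n)
    print(f"{name}: {top} of {n} chose the top phrase, two-sided p = {p:.2f}")
```

A large p-value here only says the split is compatible with matched attractiveness for this small group; it does not prove the phrases are matched for other raters.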


Of the 37 respondents to Item 43, 20 chose the top phrase and 17 chose the bottom phrase. These phrases, too, meet the requirement that they be equally attractive. Not all the paired comparison items in my cobbled-together questionnaire were as well matched for halo (being a nice thing to say) as were these two items. The course evaluation questionnaire had 59 paired comparison items and one graphic rating scale. The graphic rating scale, Item 60, and the students' responses to it are shown here.


How A graphic rating describing course evaluation


Item 60

Is this course meeting your expectations?

                     Business  Benchmark
very satisfied           2         7
satisfied                9         9
dissatisfied             7         2
very dissatisfied        1         0

Notice that, of the students describing our business course, 11 were satisfied or very satisfied, whereas 16 of the students describing benchmark courses were. Of the business students, 8 were dissatisfied or very dissatisfied, whereas only 2 of the students describing a benchmark course were. Were the students truly dissatisfied with Business 101? The students describing a benchmark course had been asked to select a current or recent course as a benchmark. I think those students added for themselves the instruction "and choose the course you like best among your current courses." If the entire College were using the questionnaire, if every student were describing the course that they had just completed, and if we had responses for all courses to combine as the benchmark responses, then we would be better able to interpret the responses from the students in this business course. With the whole College represented in the benchmark group, the potential bias of describing only the best courses as benchmark courses would be removed. These data do not provide a definitive answer to the question "Were the students satisfied or dissatisfied with Business 101?" Let's look again at Item 12.


How The paired comparison method in course evaluation


Item 12

                                                    Business  Benchmark
among the best courses I could expect in college        4        14
sensitizes my sense of right and wrong                 15         4

Phi (absolute value) = 0.57



Answering Item 12, the business students said their course "sensitizes my sense of right and wrong." In connection with a chapter on business ethics, I had reported a survey describing the rate at which personal computer users in various countries around the world were using stolen software (U.S. and U.K. about 25%, China and Russia about 95%, others between those extremes), then asked the students to list the reasons why software should be stolen and the reasons why it should be purchased. Students everywhere have experience stealing software or, if not software, then music. The students were better at listing reasons for stealing than reasons for buying. I supplied statements like "revenues attract people to a profession, and without that revenue, as in Russia, you don't have a very strong programming profession" and "revenues support time and talent for doing discipline-leading work" as arguments for why software writing should be given a revenue stream. You can find the students' reasons for stealing and for buying software, with my reasons for buying also in the mix, on my website. The responses to Item 12 suggest these students got their only first-seven-days-of-the-quarter classroom discussion of right versus wrong in our classroom. Let's look again at the students' responses to Item 43.


How The paired comparison method in course evaluation


Item 43

                                        Business  Benchmark
get good practice in oral expression       13         7
more interesting than most courses          6        11

Phi (absolute value) = 0.30



These teen-and-twenty-something students gave cues that they are accustomed to having the professor lecture. Instead, I asked them to read the textbook's chapter under examination before coming to class and to answer a quiz about the chapter before any part of it had been discussed; then I put a chart or table on the screen and asked questions about it. Classroom time was think-and-talk time for the students. That experience shows up in their responses to Item 43, in which the business students say they "get good practice in oral expression" in the business course as compared with the benchmark courses. Let's now look at Item 32.


How The paired comparison method in course evaluation


Item 32

                                                               Business  Benchmark
increases my awareness of direction of change in a quantity       15         5
easy to understand what the instructor says                        4        13

Phi (absolute value) = 0.51
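
The phi values on the three item slides can be reproduced from the four cell counts of each table. The sketch below uses the standard phi coefficient for a 2x2 table; the function name `phi_coefficient` is hypothetical. It assumes the counts on each slide read as business (top, bottom) followed by benchmark (top, bottom), which matches the 19 business and 18 benchmark respondents on Item 60 and the top/bottom totals reported earlier. The absolute value of phi is the same if rows and columns are swapped.

```python
from math import sqrt


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    # Product of the two row totals and the two column totals.
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


# Cells: (business top, benchmark top, business bottom, benchmark bottom).
# Assumed reading of the slide counts, as described above.
items = {
    "Item 12": (4, 14, 15, 4),
    "Item 43": (13, 7, 6, 11),
    "Item 32": (15, 5, 4, 13),
}

for name, cells in items.items():
    print(f"{name}: |phi| = {abs(phi_coefficient(*cells)):.2f}")
```

The sign of phi depends only on which phrase is listed first, which is presumably why the slides report the absolute value.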



I think that business must give a lot of attention to things quantitative, that one must think quantitatively in order to be successful in business. Thus in discussion in our classroom, we focused on the meaning of numbers. The students' responses to Item 32 show the business students saying that our course "increases my awareness of direction of change in a quantity," whereas they don't say that about other courses. I think I succeeded in raising student sensitivity to things quantitative, at least when compared with their experiences in other courses. However, having heard me present this evening, maybe you'll look at those responses and say the business students were avoiding saying "easy to understand what the instructor says." I hope you have enough evidence to convince you that paired comparison items, properly constructed, evoke similar judgments from multiple informed observers (the interrater reliability gives signs of being respectable) and that the items can elicit reports about diverse elements of the observed experience, not just a report about halo.

Let's turn to whether our society will measure job performance and measure it well. We have seen that essays and graphic ratings have dominated practice for a century, although we have known for at least 60 years that those methods don't work at all (zero interrater reliability, zero validity except in self-fulfilling prophecies for ratings made by those with the power to influence subsequent events).


Whether The core questions

Do we want a society that is headed for widely-shared goals and led by high-performers as assessed by those goals?


My answer is "Yes, I want a society like that!" Further, I think we need leaders who lead in shared directions. Still further, I think valid, repeated job performance measurement is the means by which people of top performance heading toward shared goals can be identified and given increasing responsibilities over the course of their careers. Believing that about job performance measurement, how does one cause society to adopt state-of-the-art job performance measurement and put it to work for society? Let's look at what happened at Bellevue Community College, where I did several things intended to stimulate adoption of the course evaluation questionnaire I developed. "Show people how it works," I told myself, "and they will see how much better paired comparison is than the graphic ratings in use. They will want to use paired comparisons." This is what I did and what happened.


Whether To enhance the probability of adoption at BCC, I ...

Invited student participation in the experimental course evaluation
    90% attended classes daily; 60% attended the demo
Demonstrated paired comparison method
    technically successful
    one student: "expected dullsville; this was fascinating"
Invited faculty attendance at demonstration
    1 of 15 attended
Hoped for faculty-student initiatives supporting use of method
    has not happened
Institutionalized form's use at site of the demonstration
    not done

Whether To trigger adoption at other sites, when meeting professors I ...

Inquire about current practices
    asked about six in two years
    all universities have student course evaluations
    all use graphic rating scales
Invite visits to my website to see the form and the research
    no new visits following my conversations
    11 visits in 22 months, 3 or 4 being my own visits
Will encourage form's adoption at other sites
    no invitations to do that (so far)


Whether

Can competent measurement of job performance highlight the goals of our society and guide the distribution of responsibilities and rewards?

It can, and to our enormous benefit.


Whether

Does our society want to distribute leadership responsibilities and rewards based on demonstrated competence on the job?

I don't know.


In a recent issue of the Harvard Business Review, Jeffrey Sonnenfeld wrote:


Whether

"I can't think of a single work group whose performance gets assessed less rigorously than corporate boards."

Jeffrey Sonnenfeld, Yale School of Management, Harvard Business Review, Sept 2002


Conclusion

To cause society's goals to be the focus of our attention, we must measure our performance with respect to those goals. To identify leaders who can lead toward those goals, we must identify people whose achievements at every step in their careers demonstrate that they see the goals and find ways to get there. They are the ones we want as leaders. We have the knowledge to support that kind of practice.
Will we do it? I hope we do, but I don't know whether we will.

To see this presentation, results from this evening's demonstration, the course evaluation questionnaire, results from the course evaluation,
[and, incidentally, arguments for and against stealing software]

go to http://home.att.net/~pfrswr/ and look under technical papers.


PFR, 4 February 2003


Following is a list of questions asked by those who viewed this presentation. The sections that follow repeat the questions and present the author's answers:

What benefits follow from measuring job performance well?
Should the way leaders are selected be changed?
Why is course evaluation like job performance rating?
What are the problems with using paired comparison ratings?
Should graphic rating scales be abandoned? Essays?
Where have paired comparison job performance ratings been used?
Is there an almost-as-good alternative to paired comparison rating?
Can a paired comparison questionnaire be made to measure the performance of a member of the board of directors or the CEO?
How can paired comparison be used in salary administration?
How are paired comparison questionnaires scored?
Why does the difficult question of "whether" prompt no discussion?
References


A listener asked: What benefits follow from measuring job performance well?

When job performance is well measured,

widely shared economic, social, environmental, and cultural goals can be out front where we can see them, guiding our behavior and the distribution of rewards. Understanding our shared goals will have a very large effect on who we are and how we behave. Examples of widely shared goals are
    education to the limit of one's ability is every human's responsibility
    innovation is to be nourished
    freedom and justice for all
    practices free of biases due to customers are valued
    truth is required
    full effort is expected
    religious freedom is valued
    freedom from advertising push is valued
    freedom of speech is valued
    organizations' revenues must equal or exceed expenses
    organizations are to be managed for long term success
    trade in legal goods is to be unrestricted (tariff free)
    theft is forbidden
    adultery is not advised
    compassion is valued
    each person's legitimate needs are respected
    health care is every human's right
    a clean environment is valued
    conserving resources is valued
    life of every kind is valued
    every human life is to be respected and preserved
    government based on consent of the governed is valued
    law is to be respected

productivity can at least double, maybe even increase tenfold, raising the standard of living for everyone. We can avoid wasting that productivity in greater consumption of resources or a larger human population or in damage to our environment. Rather, increases can be spent to enhance the quality of life for everyone and to make our compassion for those unable to care for themselves more concrete and more generous.

fairness can be much enhanced, with a resulting decline in resentment and in the frustration that drives aggressive behavior.

conflict can be decreased because there is greater confidence that rewards arrive fairly and with certainty to those who earn them. We may eventually understand the profound and extended costs of war and crime, replacing our genuflecting to what we perceive as the cosmos-decreed "survival of the fittest" and "I am stronger than you" models for human society with a deep respect for each other and for the biosphere into which we have been born.


A listener asked: Should the way leaders are selected be changed?

Emphatically, yes. In about 1960, we at Exxon (Standard Oil Company (NJ)), I among them, announced the first results of the Early Identification of Management Potential (EIMP) study. Predicting a criterion of salary corrected for age for managers and executives who had been in the oil business for a decade or longer, sometimes three decades, we found that biographical materials (answers to a biographical questionnaire) predicted career success with a correlation of about 0.6. Many research studies by Exxon in the next several decades, some using translations of the biographical questionnaire (Spanish, Portuguese, German, French, Swedish, Norwegian), confirmed the original results in other cultures. Longitudinal studies at Exxon (USA) (see C. P. Sparks in C. E. Clark and M. B. Clark, Measures of leadership, 1990, Center for Creative Leadership, Greensboro NC) further confirmed the efficacy of these biodata items in forecasting career success over ten to twenty years, as did use of the biodata questionnaire in other industries (Carlson, K. D., Scullen, S. E., Schmidt, F. L., Rothstein, H., and Erwin, F., "Generalizable biographical data validity can be achieved without multi-organizational development and keying," Personnel Psychology, 1999, 52, 731-755). For those who understand what they are seeing, these results are beyond belief! They exceed the accuracy of any other valid forecast of long term human behavior. We can identify remarkably accurately those who will make good executives a decade or two from now.

Exxon chose not to use this method for identifying prospective leaders, despite its high validity. However, the (EIMP-based) questionnaire used by Carlson and others was available to any qualified user through Richardson, Bellows, Henry and Company in New York as recently as 1999. RB&H was purchased by another consulting firm, that firm by a third, and the latest owner of the EIMP technology has since gone out of business. I don't know whether the valuable EIMP work announced in 1960 is available anywhere at this time. I hope it is.

However, there are important problems with identifying an anointed group of executives-to-be early in their careers. One problem, of course, is that the forecasting device is not perfect in its forecasts (even though it is better than any other method available). Another is that, when the anointed discover their privileged status, they may lose their incentive to learn and compete. (I think the EIMP questionnaire still has important, justifiable, early-career uses in organizations, such as tracking the quality of the organization's intake of leadership talent.)


Identifying leaders based on their performance in their jobs, in accomplishing goals important to the entire organization at every step in their careers, appeals to me as the approach that is the most accurate, the most effective in motivating effort, and the fairest. Adequate job performance measurement throughout the organization is necessary and is the wisest pathway to finding and utilizing world-class leadership. Leaders should be chosen based on their performance in their jobs, as that performance is validly measured against widely shared goals, with performance being measured at every step throughout their careers.


A listener asked: Why is course evaluation like job performance rating?

Because the measurement challenges are identical: one must provide a means for the knowledgeable participant/observer, the rater, to report what he/she has observed.

Because the underlying psychological processes are identical: the rater experiences a series of events, the rater describes from memory what has been experienced, and the rater's description must be multi-dimensional (reporting accurately about different aspects of the experience) if it is to be useful.

Because the purposes are identical: to keep educational (organizational) goals ever before the participants, to distribute rewards based on performance, to guide future decisions about the course and its leader, and to provide feedback useful in guiding the course's and the course leader's development.


A listener asked: What are the problems with using paired comparison ratings?

1. Paired comparison questionnaires are expensive to build and maintain.

Research must be done to build them and to validate them. One must employ someone who knows how to do that. The way around this hurdle is to share the costs with other organizations, for example by buying services from a competent R&D organization specializing in measurement of this kind.

2. Few people know how to build these forms.

High-quality psychometric skills embedded in leaders with maturity are hard to find. The developer must influence what the organization is doing. Botching a job is a way to advertise failure and reduce the likelihood that others will want to try this methodology.

3. Raters don't like the forms.

Executives who used paired comparison forms were asked why they did not want to use them. "Because I can't control the outcome," was the answer. The issue raised is whether one is going to manage the organization using information from a process demonstrated to be valid (developed from validated job performance rating processes) or using judgments based on halo (which are known to have no interrater reliability, no interrater agreement, no validity). The demonstration exhibited in this presentation seeks to persuade participants, based on experiencing the demonstration, that the rater's task is describing performance, not evaluating it, not determining outcomes. Whether our culture can learn that describing job performance is what is needed, that one must build a process that will produce valid descriptions from which valid forecasts can be made, is an issue yet to be settled. It is yet to be discovered whether those with power can be persuaded that exercising their power over the fate of individuals is a poor choice (since too many mistakes are made that way) whereas inventing and improving valid processes is the way to enhance organizational success. Getting universities to teach their students these facts is yet to be accomplished.

Raters will need training before they use the forms, primarily persuasion that the process works. The questionnaires themselves have built-in instructions. The rater needs to understand that, when considering two nice things to mark, the choice of one means that particular nice thing has been "upped" and the one not marked has been "downed." Important, useful information is contained in the judgment as recorded by the mark.


4. One must persuade people to be guided by valid information.

People who lead organizations already have a career of success behind them. They believe they know how to make decisions, especially about people. They like to lead and exercise power. Sometimes, when experienced enough, they realize they have made at least a few mistakes. They understand that listening to others is still in order. Those wiser persons are the most ready to hear that science can provide information that improves the odds of making the right choices, that science can generate instrumentation measuring job performance, and that those measurements provide valid forecasts about future individual performance on the job. "Sadder but wiser" individuals may be the ones who lead innovation in obtaining and using job performance measurements.


A listener asked: Should graphic rating scales be abandoned? Essays?

For job performance measurement, emphatically, yes. Ask any psychologist grounded in psychometrics and statistics to explain to you what it means to have zero correlations among graphic ratings produced by people close to the ratee (self, supervisor, peers, subordinates), people who are well acquainted with the ratee's job performance (see Atkins and Wood, Personnel Psychology, 2002, as one example), and you will have additional expert testimony that those ratings are worthless for all uses. Get contrary advice from your expert, and it will be the expert whose expertise is in question, not the advice inherent in the Atkins and Wood findings along with many other research findings investigating the validity of graphic ratings.

Both psychologists and educational researchers have known for seventy years that essays cannot be reliably judged by experts of any kind for any purpose. (Yes, Educational Testing Service and other test builders at this time, the beginning of the twenty-first century, use essay writing in testing school children and do their very best to enhance the reliability and validity of the measures derived from those essays. ETS does that not because it thinks that is the least expensive and most valid way to test essay writing skill but because teachers and parents insist that this testing be done that way. The public attitude is rather like insisting in the year 2003 that spindles for a wooden chair be produced on a lathe driven by foot treadle and carved by a human operator.)

There are circumstances in which graphic ratings can appropriately be used in job performance descriptions, and attitude measurement may still be an appropriate venue for their use. There are research circumstances in which it is productive to collect essays. But both are useless for operational purposes in measuring job performance.

University faculties around the world insist that their young colleagues stack reprints of their publications on the table, along with other essay-like documentation, when seeking tenure in the department and the university. The Swedish Academy of Sciences uses two- or three-person committees to read and discuss nominations for recipients of the Nobel prizes in chemistry and physics (Friedman, The politics of excellence, 2001, Henry Holt, New York NY). The fact that these stacks of essays are in prominent use by highly educated people, even in departments of psychology, does not mean that the decisions reached on the basis of these essays are reproducible (reliable) or have much to do with the value to humanity (validity) of the work and the future job performance being assessed. These processes in widespread use are a mark of our broadly shared unwillingness to learn what has been known for nearly a century. (See Ross, Paul F., "Identifying excellence in human performance: A review of Friedman's The politics of excellence," 2002, http://home.att.net/~pfrswr/ .)

Graphic ratings and essays should be dropped from the process of measuring job performance.


A listener asked: Where have paired comparison job performance ratings been used?

Having done no search of the literature on this matter, my knowledge is incomplete.

Paired comparison job performance rating was first used by the U.S. military in the late 1940s. (See Sisson, E. D., "Forced choice: the new Army rating," 1948, Personnel Psychology, 1, 365-381.) It was abandoned because the raters did not like to use the forms, feeling they had no control over the final result.

Paired comparison job performance rating was used briefly at Standard Oil Company (NJ) (later called Exxon) for executives about 1957 or 1958 under the leadership of Ed Henry. It was abandoned because the raters did not like to use the forms, feeling they had no control over the final result. (Interpreting their meaning, the raters were saying "I want to report halo, and this form won't let me do that.")

During the period 1985-1998, I gathered information about, but did not view in depth, rating systems for engineering staff, managers, and executives that were in use at Digital Equipment Corporation, Polaroid Corporation, IBM, Texas Instruments, and John Hancock Life Insurance. Essays and graphic ratings were the methods in use, typically under the general conceptual framework of management by objectives. Paired comparison questionnaires were not in use.

The method of paired comparison has been in use in the Kuder Preference Record, an interest inventory, since its inception in about the 1940s. That inventory has proven to be very effective in differentiating interests and at least somewhat useful in identifying those who will succeed in a profession.


A listener asked: Is there an almost-as-good alternative to paired comparison rating?

As in the first race for the America's Cup in 1851, there is no second to a well conceived, carefully constructed, carefully used, carefully validated paired comparison job performance rating questionnaire. Of course, no one yet has built that ideal paired comparison job performance rating questionnaire for their organization!

Asking a rater to rank a list of names (each of whose job performance is known to the ranker), ranking on overall value to the company, is an approach used regularly in research projects and, occasionally, in operational tasks (like salary administration). The difficulty with ranking always is that the ranker's knowledge in depth about job performance is limited to a small number of individuals. There is no person who knows the details about job performance with respect to everyone in the company. The result is that it is difficult to know how to put the short ranked lists collected from many rankers together into one massive, overall list. I worked at that problem in a research and development organization (Ross, Paul F., "Reference groups in man-to-man job performance ranking," 1966, Personnel Psychology, 19, 115-142) and proposed a solution for the purpose of salary administration, but my solution was not adopted for salary administration purposes. The problem always is that the ranker knows well the job performance of too few people both within and outside his/her own organization.

The other major problem with ranking is that the judgments for most people being ranked come from one person only, the judgments for just a few people coming from two or three people at most. The process incurs the same risks as do graphic ratings, with their demonstrated problem that within-observer halo, itself, is correlated with no other observer's judgments about the ratee's job performance (see Atkins and Wood, Personnel Psychology, 2002, 55, 871-904, discussed above in this presentation).

To measure job performance well enough to accomplish individuals', organizations', and society's ends, we must use processes of greater validity than essays, graphic ratings, and ranking provide. Paired comparison job performance description is the method that promises that greater validity.
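
The difficulty of merging short ranked lists, described in this answer, can be made concrete with a small sketch. It is not the solution proposed in the 1966 paper cited above; it only counts which pairs of people were ever ranked by the same ranker. The ranker labels, the names, and the helper functions `pairwise_wins` and `uncompared_pairs` are all hypothetical.

```python
from itertools import combinations

# Hypothetical short rankings, best first. Names and rankers are illustrative only.
rankings = {
    "ranker_1": ["Avery", "Blake", "Casey"],
    "ranker_2": ["Blake", "Dana"],
    "ranker_3": ["Emery", "Finley"],
}


def pairwise_wins(rankings: dict) -> dict:
    """Count, for each ordered pair (higher, lower), how many rankers placed them that way."""
    wins = {}
    for ordered in rankings.values():
        # combinations() preserves list order, so the first name is ranked higher.
        for higher, lower in combinations(ordered, 2):
            wins[(higher, lower)] = wins.get((higher, lower), 0) + 1
    return wins


def uncompared_pairs(rankings: dict) -> list:
    """List the pairs of people that no single ranker ever ranked together."""
    people = sorted({name for ordered in rankings.values() for name in ordered})
    wins = pairwise_wins(rankings)
    return [
        (x, y)
        for x, y in combinations(people, 2)
        if (x, y) not in wins and (y, x) not in wins
    ]


missing = uncompared_pairs(rankings)
print(f"{len(missing)} pairs were never compared directly, for example {missing[0]}")
```

In this toy case most pairs have no direct comparison at all, and every pair that was compared was judged by a single ranker, which mirrors both problems with ranking named above.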


A listener asked: Can a paired-comparison questionnaire be made to measure the performance of a member of the board of directors or the CEO? Yes. Begin with a list of tasks that these leaders are expected to undertake at Board level, and a list of the kinds of individual performance capabilities, styles, and values that are expected in the organizations top leadership, like: Task list compares current-year revenues with prior-year revenues by revenue stream reviews appropriateness of employee incentives compares current-year costs with prior-year costs by category compares current-year profit (revenue-cost difference) with plan compares managements performance with prior-year performance by element reviews leadership replacement plans for top ten percent of organization compares current-year stock performance with prior-years stock performance compares current stock ownership patterns with prior-years patterns reviews new product development

compares current R&D activities with prior year's R&D activities
compares current employee capabilities with prior year's capabilities
compares current employee capabilities with organization's next-decade needs
compares current risk management with prior year's risk management
identifies ten most important organizational challenges
compares organization's competitive advantages with competitors' advantages
compares organization's leadership strengths with organization's challenges
compares recent product time-to-market with benchmarks
forecasts revenue streams by stream for a decade
compares prior revenue streams' forecasts with actual revenues
reviews tasks undertaken in prior year immediately below Board level
reviews product sunset plans
compares prior-year employee turnover by stream with benchmarks
compares prior-year employee intake by stream with benchmarks
compares prior-year manufacturing costs by product with benchmarks
compares employee development by stream with benchmarks
compares prior-year employee in-organization leadership acts with benchmarks
compares current organizational image with benchmarks
compares current organizational climate measures with benchmarks
compares current environmental performance measures with benchmarks

...
...

Performance expectations

is admired by employees
is admired by peers
is persuasive with peers
reasons well
is relaxed in the job
asks relevant questions
convincingly challenges a conclusion
absorbs what has been presented
is thoroughly prepared
knows the important questions to ask
puts others at ease
is dedicated to the organization's success
is everyone's confidant
knows our business in depth
is broadly informed

can be trusted
is well liked
respects community values
is sensitive to social values
brings vital knowledge to our team
...
...

Under the guidance of a consultant highly qualified in psychometrics, extend, edit, and clarify the phrases; collect the responses to the individual phrases that are needed to scale them; then build pairs, observing the rules for scaling phrases and pair building (for an introduction, see Ross, Paul F. "A comparison of two methods of matching in forced-choice rating," 1955, PhD dissertation, Ohio State University, Columbus OH), creating pairs such as . . .

reviews new product development
reviews product sunset plans

identifies ten most important organizational challenges
compares prior revenue streams' forecasts with actual revenues

is relaxed in the job
absorbs what has been presented

convincingly challenges a conclusion
knows the important questions to ask

With a questionnaire presenting 100 or more pairs of this kind, develop scoring keys for eight or ten dimensions of performance (financial health, new product development, human resources management, leadership development, public image, poise, values, business insight, ...). Ask each Board member to mark the phrase in each pair in the questionnaire that best describes the CEO's performance during the past year. Repeat the request to obtain peer performance reports about each person around the Boardroom table. Ask your consultant to score the responses, develop analyses including calculation of interrater reliability, and display findings for Board discussion. Develop an action plan for the organization based on the information derived from the performance reports.
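The presentation does not say which interrater reliability statistic the consultant would calculate. As one rough illustration only, the Python sketch below assumes the simplest possible index: the share of pairs on which two Board members marked the same phrase, averaged over all pairs of raters. The rater names and responses are invented, and raw agreement is not corrected for chance, so a psychometrician would likely prefer a more refined coefficient.

```python
from itertools import combinations


def pairwise_agreement(a, b):
    """Share of questionnaire pairs on which two raters marked the same phrase."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must answer the same non-empty questionnaire")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_agreement(reports):
    """Average the agreement over every pair of raters.

    reports maps a rater's name to that rater's list of choices, where each
    choice is 0 or 1: the position of the phrase marked in that pair.
    """
    rater_pairs = list(combinations(reports.values(), 2))
    if not rater_pairs:
        raise ValueError("need reports from at least two raters")
    return sum(pairwise_agreement(a, b) for a, b in rater_pairs) / len(rater_pairs)


# Invented reports from three Board members about the same CEO (four pairs).
REPORTS = {
    "member_a": [0, 1, 0, 0],
    "member_b": [0, 1, 1, 0],
    "member_c": [0, 1, 0, 0],
}

print(round(mean_pairwise_agreement(REPORTS), 3))
```

With a real questionnaire of 100 or more pairs, the same calculation could be run separately on the items keyed to each performance dimension.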

Then, with respect to the method for evaluating CEO and Board performance, ask your consultant to design and do research that discovers the accuracy and meaning of performance scores for individuals and for organizations, work that will require inputs from econometrics and the cooperation of an adequate sample of boards. While the needed research steps are well known, knowing how to measure the performance of the Board and its members must be learned over a period of years. It can be learned fastest and least expensively (as a culture, as a society) by doing the research in just two or three competing research organizations, each with multiple boards (organizations) participating in and supporting its research program.

Almost no consulting organization will have the sophistication in psychometrics that is needed to undertake this task. However, research teams with the needed skills can be assembled to reside in two or three different management consulting firms, an important environment for the work. It is vital that the organizations housing the work understand the meaning of research, and few management consulting organizations have that breadth of view. Management consulting firms like to peddle expertise and broad experience, not necessarily research skill and outstanding performance in research tasks.

That the research evaluating boards' performance be published, and that two or three research organizations compete in working toward the same goals, are both essential conditions to accomplish rapid, lowest-cost progress up the learning curve for measuring CEO and board leadership performance and its influence on organizational outcomes. The research in which sophisticated and trusted interviewers from proper schools of business come round, visit with all the important people, ask questions, listen to wisdom, go think, and formulate what they have learned for publication six months later simply cannot contribute to knowledge of the kind that is needed to avoid the mistakes of Adelphia Communications, Arthur Andersen, Enron, Kmart, Tyco, Warnaco, and WorldCom, not to mention Iraq and al Qaida.

Sonnenfeld ("What makes great boards great," September 2002, Harvard Business Review) writes a perceptive and sensible summary of current knowledge about the distinguishing characteristics of outstanding board performance, work based on more than meets the eye in this article, more than a series of interviews. However, we can and must go far beyond the level of understanding represented in Sonnenfeld's work if we want to substantially reduce the losses and pain following from inadequate leadership and, more important, realize the gains and rewards that will follow from world-class leadership and all-around outstanding performance at all levels in our organizations. Achieving world-class performance begins with learning how to recognize outstanding individual and team performance.

A listener asked: How can paired comparison be used in salary administration?

Perhaps the organization can rank the several objectives measured in the paired comparison questionnaire according to each objective's importance to the organization; then an individual's performance profile can be compared with the organization's expectations and rewards assigned based on the goodness of fit.

However, individuals and organizations have multiple objectives, all of them important, and to rank objectives (as suggested in the paragraph above), indicating a differential in the importance of the organization's goals, is inconsistent with the underlying idea that all the objectives are important.

The question referencing salary administration has as its foundation the idea that salary received is important (it is) and that recognition through merit increases in salary is important (it is). However, a look at a sample of careers will show that the visible, noticeable changes in compensation occur with changes in job assignment, not through increments gained by merit salary increases. Large increments in earnings come not from merit increases but from the assumption of new tasks (and, sometimes, because of the good or bad luck of one's investments). The paired comparison methodology is well suited to assessing the profile of performance strengths, and that profile, itself, is likely to be well suited to guiding assignment to new responsibilities.

Further, important as salary is as an attractor of talent, it is work content and work challenge that are the ultimate motivators. Consider a sample of CEOs. Look at their vastly different salaries. Can you tell that one works harder or smarter than another as a function of the monetary compensation received? It is the nature of the work itself that drives most human performance.

If overall value to the organization is needed to guide merit salary increases, the alternation ranking methodology and its variations (variations required in order to reduce the problems associated with the ranking methodology) may be needed (see Ross, Paul F. "Reference groups in man-to-man job performance ranking," 1966, Personnel Psychology, 19, 115-142).

A listener asked: How are paired comparison questionnaires scored?

The easiest way is to count the number of times the phrases describing Objective A have been chosen over some competing phrase in order to get the score for Objective A. Notice that in order to indicate a high performance with respect to Objective A, one automatically indicates a lower score with respect to other objectives. The sum of scores on all objectives is a constant that equals the total number of paired comparison items.

Notice, too, a subtle point sometimes difficult to describe. Each phrase, in this scoring scheme, is assumed to describe one and only one objective. Usually one finds that phrases are complex in meaning (although simple in grammatical structure and vocabulary) and are loaded on two and sometimes more factors. Thus adding one point to the score for Factor I when Phrase A is marked overlooks the fact that one also should be adding one point to Factor J and perhaps a negative point to Factor K. The way item respondents use the phrases they read is often not easy to decipher. For example, "always chooses action that leads to high profit" certainly describes the ratee's attention to financial success. It may also describe irresponsibility with respect to customer complaints. Current profits +1, responsibility to customers -1, future profits -2.
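The two scoring schemes described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's scoring procedure: the four-item questionnaire, the objective names attached to each pair, and the function names are all invented here, and only the three weights for the high-profit phrase come from the text.

```python
# Hypothetical four-item questionnaire. Each item pairs two phrases, and each
# phrase is represented only by the single objective it is keyed to.
ITEMS = [
    ("financial health", "leadership development"),
    ("financial health", "poise"),
    ("leadership development", "poise"),
    ("financial health", "leadership development"),
]


def simple_scores(items, choices):
    """Count how often each objective's phrase was marked.

    choices[i] is 0 or 1: the position of the phrase marked in pair i.
    """
    if len(items) != len(choices):
        raise ValueError("need exactly one choice per pair")
    scores = {objective: 0 for pair in items for objective in pair}
    for pair, pick in zip(items, choices):
        scores[pair[pick]] += 1
    return scores


# Hypothetical multi-factor key for one phrase, using the weights in the text.
PHRASE_WEIGHTS = {
    "always chooses action that leads to high profit": {
        "current profits": 1,
        "responsibility to customers": -1,
        "future profits": -2,
    },
}


def weighted_scores(phrase_weights, marked_phrases):
    """Add every factor weight carried by each marked phrase."""
    totals = {}
    for phrase in marked_phrases:
        for factor, weight in phrase_weights[phrase].items():
            totals[factor] = totals.get(factor, 0) + weight
    return totals


scores = simple_scores(ITEMS, [0, 1, 0, 0])
print(scores)
# The simple scores always sum to the number of pairs in the questionnaire.
print(sum(scores.values()) == len(ITEMS))
```

Under simple counting, raising one objective's score necessarily lowers another's because every pair awards exactly one point. The weighted version drops that constraint, which is why the weights themselves would have to come from scaling research rather than from judgment alone.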

The author asks: Why does the difficult question of "whether" prompt no discussion?

In discussing the paper days after its presentation, and on other occasions, experienced, thoughtful, helpful middle managers, including those who heard the presentation, have said in the author's hearing: I struggled with writing the essays I needed to write about an individual's job performance. I wrestled with what rating to assign to each individual. I could not focus enough to do the work at the office and always did it at home. The work took me many hours to complete. I held the required one-on-one discussions with the ratee. We discussed the individual's objectives set (three months, six months, a year) earlier. New priorities often had intervened. The ratee was presenting to me his/her own performance. It was done according to specification. But no one liked the process, neither the ratee nor the rater. Benefits from the process are never mentioned.

Thoughtful, experienced people treat the topic of job performance measurement as if a skunk has just arrived at a fine lawn party. Folks frequently are speechless. If measuring job performance is a taboo topic, the painful, unwanted topic can receive no constructive discussion, no sharing of information that offers solutions. Practically no one tells stories about the mistakes we made in assigning someone to a task that could not be well handled by that person. No one asks: Is there any science that can help us improve the odds of making a good-for-the-long-run assignment decision?

Sonnenfeld ("What makes great boards great," September 2002, Harvard Business Review) understands correctly that changes in corporate governance or changes in accounting rules or changes in how government leaders are elected are not going to eliminate the problems experienced with inadequate leadership. He discusses matters about board performance that do need attention.

If it is our cultural view that job performance cannot be measured, then we have much to learn about what, indeed, can be measured. We have known for more than half a century how better to do job performance measurement. The author hopes this paper helps promote the dissemination of what is already known and the application of that knowledge supported by further research.

References

Atkins, Paul W. B., and Wood, Robert E. Self versus others' ratings as predictors of assessment center ratings: Validation evidence for 360-degree feedback programs, 2002, Personnel Psychology, 55, 871-904

Berkshire, J. R., and Highland, R. W. Forced-choice performance rating: a methodological study, 1953, Personnel Psychology, 6, 355-378

Carlson, K. D., Scullen, S. E., Schmidt, F. L., Rothstein, H., and Erwin, F. Generalizable biographical data validity can be achieved without multi-organizational development and keying, 1999, Personnel Psychology, 52, 731-755

Friedman, Robert Marc. The politics of excellence: Behind the Nobel prize in science, 2001, Henry Holt and Company, New York NY

Ross, Paul F. A comparison of two methods of matching in forced-choice rating, 1955, Ph.D. dissertation, The Ohio State University, Columbus OH

Ross, Paul F. Reference groups in man-to-man job performance ranking, 1966, Personnel Psychology, 19, 115-142

Ross, Paul F. Course evaluation: an experiment in the technology for prompting judgments about job performance, 2001, http://home.att.net/~pfrswr/

Ross, Paul F. Identifying excellence in human performance: A review of Friedman's The politics of excellence, 2002, http://home.att.net/~pfrswr/

Sisson, E. D. Forced choice: the new Army rating, 1948, Personnel Psychology, 1, 365-381

Sonnenfeld, Jeffrey A. What makes great boards great, 2002, Harvard Business Review, September, Vol. 80, No. 9, 106-113

Sparks, C. Paul. Testing for management potential, p 103-111, in C. E. Clark and M. B. Clark, Measures of leadership, 1990, Center for Creative Leadership, Greensboro NC

Wherry, R. J. Control of bias in rating. Subproject 3, Factor analysis of rating item indices, 1950, Personnel Research Section Report No. 914, Adjutant General's Office, Department of the Army

Wherry, R. J. Control of bias in rating. Subproject 9, A theory of rating, 1952, Personnel Research Branch Report No. 922, Adjutant General's Office, Department of the Army

Wilson, Edward O. The future of life, 2002, Knopf, New York NY
