1 - The MACBETH Approach Basic Ideas, Software, and An Application

ADVANCES IN DECISION ANALYSIS
Edited by N. Meskens Department of Applied Mathematics FUCAM University MONS Belgium M. Roubens Institute of Mathematics University of Liège LIEGE Belgium
THE MACBETH APPROACH: BASIC IDEAS, SOFTWARE, AND AN APPLICATION CARLOS A. BANA E COSTA Technical University of Lisbon, IST Av. Rovisco Pais - 1000 Lisbon, PORTUGAL cbana@alfa.ist.utl.pt JEAN-CLAUDE VANSNICK University of Mons-Hainaut, F.W.S.E. Place du Parc, 20 - 7000 Mons, BELGIUM Jean-Claude.Vansnick@umh.ac.be
Abstract - MACBETH (Measuring Attractiveness by a Categorical Based Evaluation Technique) is an interactive approach to guide the construction, on a set S of stimuli, of an interval scale which quantifies the attractiveness of the elements of S in the opinion of an evaluator. The aim of this paper is to present the main ideas on which this new decision-aid approach is based, and its software. MACBETH has already been applied in several complex cases. One such case was the first application of multicriteria analysis to the evaluation of a European structural programme, the Hainaut case, which is used to illustrate the presentation in this paper.

1. Introduction

Based on semantic judgements about the attractiveness of several stimuli, MACBETH (Measuring Attractiveness by a Categorical Based Evaluation Technique) is an interactive approach that helps the person making the judgements to quantify the attractiveness of each stimulus in such a way that the measurement scale constructed is an interval scale. The aim of this paper is to present the main ideas on which MACBETH is based, and the software developed to use this approach in practice. The MACBETH approach has already been applied in many real-world applications of multicriteria decision-aid - see (Bana e Costa and Vansnick, 1997) and (Bana e Costa et al., in press). Section 2 briefly presents one of these applications, the Hainaut case; it will serve
Multi-Attribute Selection
throughout this article to illustrate the presentation of MACBETH and the use of its software to construct value functions and to assess the scaling constants of an additive model. The first part of Section 3 is devoted to the questioning mode used in MACBETH to obtain preference information richer than simple ordinal judgements. Next, we present the measurement rules used to quantify that information, and the MACBETH procedure to derive a first numerical scale on the set of stimuli. Based on the MACBETH scale, an interval scale can then be interactively constructed using the software (Section 4). The question of the incompatibility of judgements with a cardinal representation is the subject of Section 5. Section 6 deals with the important problem of establishing numerical values for the scaling constants of an additive aggregation model. Some comments on the Hainaut case are presented in Section 7, and the paper ends with a brief conclusion in Section 8.

2. The Hainaut case

In 1995, in the framework of MEANS [1], a multicriteria methodology was proposed for the evaluation of the European Union's structural programmes, and a pilot evaluation using the MACBETH approach was carried out in the Hainaut province of Belgium - cf. (C3E, 1995a and b) [2]. The Hainaut programme is a typical structural programme that involves 1.5 billion ECU of public spending over 6 years. It is made up of 45 measures that aim at relaunching socio-economic development [3]. When a European programme is underway, its Monitoring Committee has to allocate additional funds or transfer existing funds from one measure to another, based on intermediate (on-going) evaluations of the success of each measure. It was precisely for this purpose that MACBETH was applied.
[1] MEANS is a programme of the European Commission. It aims at improving Methods for Evaluating Actions of Structural Nature (C3E, 1995b).
[2] The multicriteria methodology was proposed by Carlos Bana e Costa and Philippe Vincke, and adapted by the Centre for European Evaluation Expertise, C3E (Lyon) to the Hainaut case. The evaluation itself was carried out by the SEMA Group (Brussels) and RIDER (Louvain-la-Neuve).
[3] "{...} the measure is the basic unit of the programme management and monitoring system. Financial information is structured per measure. The Monitoring Committee decides on budgetary allocations with reference to the measure." (C3E, 1995b, p.19). Examples of those measures are investment aid for setting up of enterprises, support for farming and forestry investments, development of top-quality sectors in R&D and Technologies, cleaning-up of derelict industrial and urban sites, better integration of people excluded from the job market by long-term unemployment and/or social handicaps, etc.
Index
As stated in (C3E, 1995b, p.18), the Hainaut pilot evaluation was conducted by the Commission under the MEANS programme with the collaboration of the Walloon Government. The experiment fitted into a genuine process of evaluation, thereby presenting all the advantages of a true-to-life test. The success of a measure can be evaluated from different points of view. In the Hainaut pilot evaluation, eight evaluation criteria were constructed with reference to the intermediate objectives of the programme (see Table I). After the selection of the criteria, the structuring of the problem evolved towards the construction of a qualitative descriptor for each criterion, that is, a list of sentences describing various plausible levels of impact a measure may have in terms of the specific criterion.

TABLE I. Evaluation criteria selected (source: C3E, 1995b, Table 10).

Intermediate objective proposed as the criterion | Meaning of the term "success of a measure" in relation to this criterion
1 - Sectorial diversification | The measure redirects the economic activity of the province towards promising markets
2 - More SMIs | The measure increases the share of small and/or medium enterprises in the economic life of the province
3 - More opening up to outside markets | The measure increases exports beyond the province
4 - More services | The measure allows more enterprises to find services in the province
5 - Better environmental integration | The measure reduces the number of provincial enterprises whose development or survival is handicapped by environmental problems
6 - Better territorial distribution | The measure redirects the economic activity of the province towards the most deprived areas
7 - Greater enterprise viability | The measure improves the economic viability of the province
8 - Improved employability | The measure improves the employability of the people living in the province who are working or looking for a job
Table II shows the descriptor for the criterion of Sectorial diversification, formed by six impact levels ranked in decreasing order of attractiveness from A (best level) to F (worst level). The descriptor aims to facilitate the appraisal of the degree to which each measure redirects the economic activity of the province towards promising markets. From among the impact levels of each descriptor, two reference levels, "good" and "neutral", were identified to enable an intrinsic evaluation of each measure. Later on, a level of each descriptor was associated with each measure, thus defining its profile of impacts on the eight criteria. The problem was structured by mutual agreement of the programme managers. During the evaluation phase, in the framework of an additive value model, MACBETH was applied to construct a value function for each criterion and to assess scaling constants (relative weights of the criteria) from judgemental information provided by seven judges separately, in personal interviews.

TABLE II. Descriptor for Sectorial diversification (source: C3E, 1995a, p.17).

Level | Impact description | Reference
A | The measure has for most of the enterprises and/or individual beneficiaries involved a change of activity from a declining sector to a fast-growing sector. [4] |
B | The measure has for the majority of the enterprises and/or individual beneficiaries involved a change of activity from a declining sector to a fast-growing sector. | good
C | The measure is only beneficial to fast-growing sectors. |
D | The measure is mostly beneficial to fast-growing sectors. |
E | The measure is in part beneficial to fast-growing sectors. |
F | The measure is only beneficial to declining sectors. | neutral
As stated in (C3E, 1995b, p.27), "the judges were asked to express a differentiated judgement so that the spirit of partnership evaluation might be respected. {...} The seven judges were divided between two governmental levels - the European Commission and the Walloon Region. All were involved in the management of the programme as a whole and did not represent sectorial interests {...}". In Sections 3 to 5, the presentation of MACBETH will be illustrated with the construction of a value function upon the descriptor of Table II, that is, an interval scale on the set X of levels {A, B, C, D, E, F} measuring the attractiveness of its six elements, and Section 6 shows how MACBETH was used to assess scaling constants. To preserve the confidentiality of the actual preference data of the Hainaut case, the judgemental information included in the next sections is hypothetical, although realistic. Moreover, it is worth remarking that several features of the MACBETH software presented in this paper were not yet available in the preliminary version used in the Hainaut evaluation. Some comments on the case are presented in Section 7.
[4] A sector is growing if it is probable that the share of the market for Hainaut is increasing. It is in decline in the opposite case. A measure can be of benefit to those activities which are currently growing within a sector which is globally in decline, and vice versa.
3. Deriving a first numerical scale

Let S be a finite set of stimuli and P a strict weak order [5] modelling the relative attractiveness of the elements of S for a judge J, in the sense that, $\forall x, y \in S$, $xPy$ if and only if J judges x more attractive than y. In the example in Table II, S is the set X of levels {A, B, C, D, E, F} and the relation P models the ranking of the six elements of S in decreasing order of their attractiveness (A is the most attractive element, followed by B, then C, D, E, and finally F, which is the least attractive element). Assessing such ordinal preference information is not too difficult but, unfortunately, this information is not enough in most practical applications (such as the Hainaut case), in which interval scales are necessary to ensure the meaningfulness of the results; that is, one needs to know not only that x is more attractive than y, but also by how much. Our opinion is that most people do not have an interval scale in mind but do have some feelings about differences of attractiveness and that, therefore, it is possible, through an interactive learning process, to help them construct such a strong scale. In order to obtain reliable information about differences of attractiveness, the basic idea of MACBETH is to ask many concrete questions, thereby enabling the testing of the consistency of the answers of the respondent with regard to the type of scale one wants to build. Nevertheless, the MACBETH questions are simple and natural ones, for they only involve two stimuli at a time.
The MACBETH questioning procedure consists in asking J to verbally judge the difference of attractiveness between each two stimuli x and y of S (with x more attractive than y) choosing one of the following semantic categories:

C1: very weak difference of attractiveness
C2: weak difference of attractiveness
C3: moderate difference of attractiveness
C4: strong difference of attractiveness
C5: very strong difference of attractiveness
C6: extreme difference of attractiveness.
During this questioning process, a matrix can be filled with the categorical judgements of J, as exemplified in Figure 1 for the set of levels presented in Table II (note that when there is no difference of attractiveness between two stimuli x and y, "no" is inserted at the intersection of row x and column y, and vice versa, and the ordered pairs (x, y) and (y, x) are said to belong to category C0). Although each question only involves two stimuli, it is easy to derive, from the set of absolute judgements given by J, information concerning the relative differences of attractiveness between two pairs of stimuli. For instance, from the matrix of judgements in Figure 1, one can conclude that the difference of attractiveness between D and E is greater than the difference of attractiveness between B and C.

[5] Strict weak order: an asymmetric and negatively transitive relation.
  | A | B | C | D | E | F
A | no | weak | moderate | moderate | very strong | extreme
B | | no | weak | weak | very strong | extreme
C | | | no | very weak | strong | very strong
D | | | | no | strong | very strong
E | | | | | no | moderate
F | | | | | | no
Figure 1. Matrix of judgements of difference of attractiveness for judge J
This is essentially the kind of (indirect) information that MACBETH exploits. For the example above, MACBETH will constrain the difference between the number (score) associated with D and the number associated with E to be greater than the difference between the number associated with B and the number associated with C. Indeed, given the relation P and a matrix of categorical judgements, MACBETH verifies whether there exists a numerical scale $\mu$ on S that satisfies the following two conditions (measurement rules):

Condition 1 (ordinal condition): $\forall x, y \in S$: $\mu(x) > \mu(y) \Leftrightarrow x$ is more attractive than $y$.

Condition 2 (semantic condition): $\forall k, k' \in \{1, 2, 3, 4, 5, 6\}$, $\forall x, y, w, z \in S$ with $(x, y) \in C_k$ and $(w, z) \in C_{k'}$: $k \geq k' + 1 \Rightarrow \mu(x) - \mu(y) > \mu(w) - \mu(z)$.
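The two measurement rules can be checked mechanically for any candidate scale. The Python sketch below is illustrative only: the function name `satisfies_conditions`, the dictionary encoding of the Figure 1 judgements (categories coded 1 for very weak up to 6 for extreme) and the two candidate score sets are assumptions introduced here, not part of the MACBETH software. It also assumes a strict ranking without ties.

```python
from itertools import product

# Judgements of Figure 1: (more attractive, less attractive) -> category 1..6.
FIGURE_1 = {
    ("A", "B"): 2, ("A", "C"): 3, ("A", "D"): 3, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 2, ("B", "D"): 2, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 1, ("C", "E"): 4, ("C", "F"): 5,
    ("D", "E"): 4, ("D", "F"): 5,
    ("E", "F"): 3,
}


def satisfies_conditions(scale, judgements):
    """Return True if `scale` respects conditions 1 and 2 for `judgements`."""
    # Condition 1: the more attractive stimulus of each pair scores higher.
    if any(scale[x] <= scale[y] for (x, y) in judgements):
        return False
    # Condition 2: a pair in a higher category has a strictly larger difference.
    for ((x, y), k), ((w, z), k2) in product(judgements.items(), repeat=2):
        if k >= k2 + 1 and not scale[x] - scale[y] > scale[w] - scale[z]:
            return False
    return True


# Hypothetical candidate scores, chosen only to exercise the check.
evenly_spaced = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "F": 0}
stretched = {"A": 15, "B": 13, "C": 11, "D": 10, "E": 4, "F": 0}

print(satisfies_conditions(evenly_spaced, FIGURE_1))  # False
print(satisfies_conditions(stretched, FIGURE_1))      # True
```

The evenly spaced candidate fails because, for instance, the "very weak" pair (C, D) and the "weak" pair (A, B) receive the same difference of 1.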
If it is not possible to satisfy these two conditions, then no interval scale can represent the judgements expressed; that is, the matrix of judgements is incompatible with an interval scale representation. We will see in Section 5 how MACBETH deals with incompatibility. Conversely, if conditions 1 and 2 can be satisfied (the matrix of judgements is then called "consistent"), the MACBETH software determines, from all the possible scales, a particular scale on S (the basic MACBETH scale) by a procedure which consists essentially of solving the following linear programme:
Set of stimuli: $S = \{s_1, s_2, \ldots, s_n\}$

Relation P on S such that
$s_1$ is at least as attractive as $s_2, s_3, s_4, \ldots, s_{n-1}$ and is more attractive than $s_n$
$s_2$ is at least as attractive as $s_3, s_4, \ldots, s_n$
$s_3$ is at least as attractive as $s_4, \ldots, s_n$
...
$s_{n-1}$ is at least as attractive as $s_n$

Positive variables: $\mu(s_i)$, $i \in \{1, 2, \ldots, n\}$

Objective function: $\min \mu(s_1)$

Constraints:
(1) $\mu(s_n) = 0$
(2) $\forall x, y \in S$ with $(x, y) \in C_0$: $\mu(x) = \mu(y)$
(3) $\forall k, k' \in \{0, 1, 2, 3, 4, 5, 6\}$ with $k > k'$, $\forall (x, y) \in C_k$ and $\forall (w, z) \in C_{k'}$: $\mu(x) - \mu(y) \geq \mu(w) - \mu(z) + k - k'$.

Figure 2 is a snapshot of the main screen of the scoring module of the software. The upper matrix in the left window represents the matrix of judgements in Figure 1. For example, the 4 in the intersection of the row of C with the column of E means that $(C, E) \in C_4$, that is, the difference of attractiveness between the elements C and E was judged strong [6].
[6] Very important note: the numbers in the upper matrix that represent the judgements of J do not have a cardinal meaning; they have only an ordinal meaning.
Figure 2. Consistent case
Running MACBETH for the matrix of judgements in Figure 2, one obtains the basic MACBETH scale on S shown in the left window (both under and over the label "Current scale") and also graphically in the right window. Moreover, the bottom matrix in the left window shows the resulting differences of values corresponding to the semantic judgements above. Observe in Figure 1 that the differences of attractiveness between A and B, B and C, and B and D were all judged weak, that is, the three pairs of stimuli (A, B), (B, C) and (B, D) were assigned to category C2 - the 2s in the matrix of judgements in Figure 2. The corresponding differences of MACBETH values are $\mu(A) - \mu(B) = 2$, $\mu(B) - \mu(C) = 2$ and $\mu(B) - \mu(D) = 3$, respectively. Note that these differences are not equal, although they all numerically represent "weak" judgements. This is not at all surprising, inasmuch as it confirms the character of the term "weak". This is why no constraint has been introduced in the linear programme regarding pairs belonging to the same category $C_k$ (for $k \geq 1$). The MACBETH scale associates an interval of values with each category $C_k$ ($k \geq 1$), some of them possibly reduced to a single point. For the example in Figure 2, the resulting intervals are shown graphically in Figure 3. As can be seen, the interval [2, 3] corresponds to category C2.
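The interval attached to each category can be read off a scale by taking the smallest and largest difference of values among the pairs assigned to that category. The following Python sketch illustrates this under stated assumptions: the function name `category_intervals` is hypothetical, and the scores used are A = 15, B = 13, D = 10 and F = 0 as quoted in the text, with C = 11 and E = 4 inferred here from the differences and ratios the text reports.

```python
# Judgements of Figure 1: (more attractive, less attractive) -> category 1..6.
FIGURE_1 = {
    ("A", "B"): 2, ("A", "C"): 3, ("A", "D"): 3, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 2, ("B", "D"): 2, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 1, ("C", "E"): 4, ("C", "F"): 5,
    ("D", "E"): 4, ("D", "F"): 5,
    ("E", "F"): 3,
}

# Assumed scores: C and E are inferred, not quoted directly in the text.
SCALE = {"A": 15, "B": 13, "C": 11, "D": 10, "E": 4, "F": 0}


def category_intervals(scale, judgements):
    """Map each category to the (min, max) difference of values among its pairs."""
    intervals = {}
    for (x, y), k in judgements.items():
        diff = scale[x] - scale[y]
        low, high = intervals.get(k, (diff, diff))
        intervals[k] = (min(low, diff), max(high, diff))
    return dict(sorted(intervals.items()))


intervals = category_intervals(SCALE, FIGURE_1)
print(intervals)
# {1: (1, 1), 2: (2, 3), 3: (4, 5), 4: (6, 7), 5: (9, 11), 6: (13, 15)}
```

Under these assumed scores, category C2 maps to [2, 3], C1 collapses to a single point, and consecutive intervals are at least 1 apart.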
Note also that, given constraint (1) of the programme, the upper limit of the interval associated with the highest category used in the matrix of judgements (in this case, C6) necessarily equals the MACBETH value assigned to the most attractive stimulus (the element A, in this case). One can now understand why the objective function of the linear programme above consists in minimizing the number attached to the most attractive stimulus: the aim is to reduce as much as possible the lengths of the intervals associated with the categories and the distances between them, these distances being at least equal to 1 in order to respect constraints (3).
Figure 3. Intervals associated with the categories
In these constraints, the difference $k - k'$ is introduced in order to take into account the possibility that some categories are not used by J: it forces the minimum distance between two categories $C_k$ and $C_{k'}$ (with $k, k' \in \{0, 1, 2, 3, 4, 5, 6\}$ and $k > k'$) to be at least equal to $k - k'$. Finally, let us mention that, although $\mu(s_1)$ is uniquely determined by the linear programme above, this is not necessarily the case for $\mu(s_2), \mu(s_3), \ldots, \mu(s_{n-1})$ (by constraint (1), $\mu(s_n) = 0$). This is the reason why the procedure for determining the MACBETH scale also contains some technical rules for ensuring the uniqueness of $\mu$.
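To see the effect of the objective function and of constraints (1) and (3) on the Figure 1 judgements, one can enumerate small integer scales instead of solving the linear programme. The Python sketch below is only a stand-in for the software's procedure, not a reproduction of it. It assumes a strict ranking without ties (so constraint (2) never applies), restricts the search to integer scores, and treats the diagonal "no" entries as category C0 pairs with a difference of 0. All function names are hypothetical.

```python
from itertools import combinations

ORDER = ["A", "B", "C", "D", "E", "F"]  # decreasing attractiveness

# Judgements of Figure 1: (more attractive, less attractive) -> category 1..6.
FIGURE_1 = {
    ("A", "B"): 2, ("A", "C"): 3, ("A", "D"): 3, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 2, ("B", "D"): 2, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 1, ("C", "E"): 4, ("C", "F"): 5,
    ("D", "E"): 4, ("D", "F"): 5,
    ("E", "F"): 3,
}


def respects_constraint_3(scale, judgements):
    """Differences in C_k must exceed those in C_k' by at least k - k'."""
    pairs = list(judgements.items())
    # Assumption: a diagonal "no" entry acts as a C_0 pair with difference 0.
    some_stimulus = next(iter(scale))
    pairs.append(((some_stimulus, some_stimulus), 0))
    for (x, y), k in pairs:
        for (w, z), k2 in pairs:
            if k > k2 and scale[x] - scale[y] < scale[w] - scale[z] + (k - k2):
                return False
    return True


def smallest_integer_scale(order, judgements, limit=50):
    """Try integer scales with the last stimulus at 0, smallest top score first."""
    for top in range(len(order) - 1, limit + 1):
        # Strictly decreasing intermediate scores between top and 0.
        for middle in combinations(range(top - 1, 0, -1), len(order) - 2):
            scale = dict(zip(order, (top, *middle, 0)))
            if respects_constraint_3(scale, judgements):
                return scale
    return None


basic = smallest_integer_scale(ORDER, FIGURE_1)
print(basic)
# {'A': 15, 'B': 13, 'C': 11, 'D': 10, 'E': 4, 'F': 0}
```

Under these assumptions the enumeration agrees with the values quoted in the text for A, B, D and F. A real implementation would use a linear programming solver, since the optimum need not be integer in general.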
4. From the MACBETH scale $\mu$ to an interval scale v
Once judgements of difference of attractiveness have been expressed following the MACBETH questioning procedure, one starts to move from the ordinal to the cardinal judgemental domain. But, to reach a cardinal scale, it is still necessary to go further and reason about proportions between differences of attractiveness. Obviously, this subject concerns very difficult ratio questions that, from our viewpoint, should not be asked directly. Instead, the problem should be discussed in an interactive learning process developed in a very friendly way.
In this sense, we do not see the MACBETH scale as an end in itself, but precisely as a sound and practical means to launch a discussion with J in order to enter the cardinal domain. For instance, from the MACBETH scale one can compute ratios like $\frac{\mu(x) - \mu(y)}{\mu(w) - \mu(z)}$ (where x, y, w and z are four stimuli such that, for J, x is more attractive than y and w is more attractive than z) and then confront J with those ratios, asking him or her whether they reflect, in each case, the proportion between the differences of attractiveness that he or she feels exist between x and y on the one hand and between w and z on the other. For example, from the MACBETH scale in Figure 2 and the respective differences of values, one can see that $\frac{\mu(D) - \mu(E)}{\mu(B) - \mu(D)} = 2$ and then ask J if he or she feels that the difference of attractiveness between D and E is twice the difference of attractiveness between B and D. One can also see that $\frac{\mu(A) - \mu(C)}{\mu(E) - \mu(F)} = 1$ and then ask J if he or she considers that the difference of attractiveness between A and C is the same as that between E and F. And so on.

One can greatly facilitate this interaction with J by using the visual support of the graphic display in the right window of the software (see Figure 2). The visual comparison of intervals makes the comparison of proportions less abstract and consequently less difficult. Indeed, a small drawing is often better than a long dissertation! When J considers that the distances between stimuli shown on the screen do not adequately represent the respective differences of attractiveness, one can easily change these distances by dragging a stimulus (using the left mouse button). As Figure 4 shows, once a stimulus is selected (by clicking on it with the left mouse button), it appears bounded by two lines defining the range within which J can freely move the selected stimulus without violating the constraints, derived from conditions 1 and 2, corresponding to J's judgements.
Figure 4. Limits of variation of a value
For example, the lower bound 3.01 for the value of E is due to the weak and moderate judgements of difference of attractiveness between B and D and between E and F, respectively: from condition 2, $\mu(E) - \mu(F) > \mu(B) - \mu(D)$, and so, with $\mu(F) = 0$, $\mu(B) = 13$ and $\mu(D) = 10$, one must have $\mu(E) > 3$. The upper bound 4.99 is due to the strong and moderate judgements between D and E and between A and D, respectively: $\mu(D) - \mu(E) > \mu(A) - \mu(D)$, $\mu(A) = 15$ and $\mu(D) = 10$ imply $\mu(E) < 5$. The software updates, in the bottom matrix of the left window, the value differences that are affected when the value of a certain stimulus is changed within its range (by dragging it with the left mouse button, or by typing a value directly). Moreover, the software offers the option of seeing which relationship(s) between semantic judgements risk(s) being violated each time a bound is attained. This is exemplified in Figure 5, where the value of E in the right window is at its lower limit (3.01). One can see in the matrices of the left window small lines above the judgement for B and D (weak, 2) and the respective difference of values (3.00), and other small lines below the judgement for E and F (moderate, 3) and the respective difference (3.01). These lines indicate that decreasing the value of E below 3.01 would reverse the order of the two judgements involved. This feature can be very useful for J to reason and learn about his or her preferences. For example, based on such information, J can decide to revise any one of the initial judgements and run the programme again.
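The range shown in Figure 4 can be approximated by collecting, for the selected stimulus, every inequality from conditions 1 and 2 in which its value appears while all other values are held fixed. The Python sketch below does this for E. The function name `admissible_range` is hypothetical, and the scores for C and E are inferred from the text rather than quoted. The bounds returned are the open limits 3 and 5, which the software displays as 3.01 and 4.99.

```python
# Judgements of Figure 1: (more attractive, less attractive) -> category 1..6.
FIGURE_1 = {
    ("A", "B"): 2, ("A", "C"): 3, ("A", "D"): 3, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 2, ("B", "D"): 2, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 1, ("C", "E"): 4, ("C", "F"): 5,
    ("D", "E"): 4, ("D", "F"): 5,
    ("E", "F"): 3,
}

# Assumed scores: C and E are inferred, not quoted directly in the text.
SCALE = {"A": 15, "B": 13, "C": 11, "D": 10, "E": 4, "F": 0}


def admissible_range(stimulus, scale, judgements):
    """Open interval for `stimulus`, other scores fixed, under conditions 1 and 2."""
    lower, upper = float("-inf"), float("inf")

    def tighten(terms):
        # `terms` encodes sum(sign * mu(s)) > 0 as a list of (sign, s) pairs.
        nonlocal lower, upper
        coeff = sum(sign for sign, s in terms if s == stimulus)
        rest = sum(sign * scale[s] for sign, s in terms if s != stimulus)
        if coeff > 0:
            lower = max(lower, -rest / coeff)
        elif coeff < 0:
            upper = min(upper, rest / -coeff)

    for (x, y), k in judgements.items():
        tighten([(1, x), (-1, y)])  # condition 1
        for (w, z), k2 in judgements.items():
            if k > k2:              # condition 2
                tighten([(1, x), (-1, y), (-1, w), (1, z)])
    return lower, upper


bounds_e = admissible_range("E", SCALE, FIGURE_1)
print(bounds_e)  # (3.0, 5.0)
```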
Figure 5. Judgements defining a bound for a value
Of course, the discussion can be based on a transformed MACBETH scale, that is, a scale obtained by a positive linear transformation $\alpha \mu + \beta$ (with $\alpha > 0$) of the basic MACBETH scale. For instance, in the Hainaut case, the scoring system for each criterion was such that the values of the reference levels "good" and "neutral" were always fixed at 100 and 0, respectively (see Figure 6). Another interesting feature of the software is the possibility of testing directly whether a specific set of scores is compatible with the semantic judgements given by J (see Figure 7).
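Fixing the two reference levels at 100 and 0 amounts to a simple rescaling that leaves ratios of differences unchanged. The Python sketch below illustrates this with B as "good" and F as "neutral", as in the caption of Figure 6. The function name `rescale` is hypothetical, the scores for C and E are inferred from the text, and the printed values are not claimed to be those displayed in Figure 6.

```python
# Assumed scores: C and E are inferred, not quoted directly in the text.
SCALE = {"A": 15, "B": 13, "C": 11, "D": 10, "E": 4, "F": 0}


def rescale(scale, good, neutral):
    """Positive linear transformation sending `neutral` to 0 and `good` to 100."""
    span = scale[good] - scale[neutral]
    return {s: 100 * (value - scale[neutral]) / span for s, value in scale.items()}


v = rescale(SCALE, good="B", neutral="F")
print({s: round(value, 1) for s, value in v.items()})
# {'A': 115.4, 'B': 100.0, 'C': 84.6, 'D': 76.9, 'E': 30.8, 'F': 0.0}

# Ratios of differences are preserved by the transformation.
ratio_before = (SCALE["D"] - SCALE["E"]) / (SCALE["B"] - SCALE["D"])
ratio_after = (v["D"] - v["E"]) / (v["B"] - v["D"])
print(ratio_before, round(ratio_after, 6))
```

A level more attractive than "good", such as A here, receives a score above 100.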
Figure 6. Working with a fixed scale (B - good; F - neutral)
Figure 7. Testing a numerical scale directly
5. Dealing with incompatibility
As soon as the comparative judgements of J concerning the relative attractiveness of the elements of S lead to a ranking of these elements (as supposed above), there exists a numerical scale on S satisfying the ordinal condition (condition 1 in Section 3). However, it can happen that the semantic judgements (absolute judgements of difference of attractiveness) of J are such that there does not exist a scale satisfying both the ordinal and semantic conditions (conditions 1 and 2), that is, the judgements of J are incompatible with the construction of a cardinal scale on S. We present below two possible sources of incompatibility and how MACBETH deals with them.

5.1. CONFLICT BETWEEN COMPARATIVE AND SEMANTIC JUDGEMENTS (SITUATION OF INCOHERENCE)

Suppose that the matrix in Figure 8 represents the semantic judgements (of a certain judge) between the impact levels of the descriptor in Table II.
  | A | B | C | D | E | F
A | no | very weak | weak | moderate | very strong | extreme
B | | no | weak | strong | very strong | extreme
C | | | no | weak | moderate | very strong
D | | | | no | very weak | strong
E | | | | | no | moderate
F | | | | | | no
Figure 8. Incoherent matrix of judgements
Observing the column of D in Figure 8, it is easy to understand why, in this case, a numerical scale on S that satisfies conditions 1 and 2 does not exist. As a matter of fact, A is more attractive than B, implying (given condition 1) $\mu(A) > \mu(B)$; on the other hand, the difference of attractiveness between B and D (strong) is higher than the difference of attractiveness between A and D (moderate), implying (given condition 2) $\mu(B) - \mu(D) > \mu(A) - \mu(D)$, that is, $\mu(B) > \mu(A)$. This is a case of conflict between comparative and semantic judgements. In general, four different cases of conflict between comparative and semantic judgements can occur:

1) there exist three stimuli x, y and z such that x and y are both more attractive than z, x is more attractive than y (xPyPz) and the difference of attractiveness between y and z is higher than the difference of attractiveness between x and z (as in Figure 8 for A, B, and D),
2) there exist three stimuli x, y and z such that x is more attractive than both y and z, y is more attractive than z (xPyPz) and the difference of attractiveness between x and y is higher than the difference of attractiveness between x and z,
3) there exist three stimuli x, y and z such that x and y are both more attractive than z, there is no difference of attractiveness between x and y, and the difference of attractiveness between x and z is different from the difference of attractiveness between y and z,
4) there exist three stimuli x, y and z such that z is more attractive than both x and y, there is no difference of attractiveness between x and y, and the difference of attractiveness between z and x is different from the difference of attractiveness between z and y.

In the MACBETH approach, the matrix of judgements is called incoherent in case of conflict between comparative and semantic judgements. When running the MACBETH software in such a case, the cells of the matrix of judgements corresponding to the differences of attractiveness causing the incoherence problem are shaded, and a message invites the user to revise the judgements (see Figure 9).
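The four cases above only involve triples of stimuli, so they can be screened by direct enumeration. The Python sketch below applies such a screening to the judgements of Figure 8. The function name `find_incoherences` and the encoding (category 0 for a tie, stored once in either order) are assumptions made for illustration, and nothing is implied about how the MACBETH software performs this test.

```python
from itertools import permutations

# Judgements of Figure 8: (more attractive, less attractive) -> category 1..6.
# A tie would be stored once, in either order, with category 0.
FIGURE_8 = {
    ("A", "B"): 1, ("A", "C"): 2, ("A", "D"): 3, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 2, ("B", "D"): 4, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 2, ("C", "E"): 3, ("C", "F"): 5,
    ("D", "E"): 1, ("D", "F"): 4,
    ("E", "F"): 3,
}


def find_incoherences(judgements):
    """Return (case, x, y, z) for every triple matching conflict cases 1 to 4."""
    stimuli = sorted({s for pair in judgements for s in pair})

    def signed(a, b):
        # Positive if a is more attractive than b, negative if less, 0 if tied.
        if (a, b) in judgements:
            return judgements[(a, b)]
        return -judgements[(b, a)]

    found = []
    for x, y, z in permutations(stimuli, 3):
        xy, xz, yz = signed(x, y), signed(x, z), signed(y, z)
        if xy > 0 and yz > 0 and xz > 0:        # x P y P z
            if yz > xz:
                found.append((1, x, y, z))      # case 1
            if xy > xz:
                found.append((2, x, y, z))      # case 2
        elif xy == 0 and x < y and xz != yz:    # x and y tied, unequal differences
            if xz > 0 and yz > 0:
                found.append((3, x, y, z))      # both more attractive than z
            elif xz < 0 and yz < 0:
                found.append((4, x, y, z))      # z more attractive than both
    return found


print(find_incoherences(FIGURE_8))  # [(1, 'A', 'B', 'D')]
```

The only conflict reported is the case-1 triple (A, B, D) discussed in the text.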
Figure 9. Case of incoherence
5.2. CONFLICT BETWEEN SEMANTIC JUDGEMENTS (SITUATION OF SEMANTIC INCONSISTENCY)
Even when there is no conflict between comparative and semantic judgements, a numerical scale on S satisfying conditions 1 and 2 may still not exist, because some semantic judgements conflict with each other. In MACBETH, this is called a case of semantic inconsistency. For example, the matrix of judgements in Figure 10 is semantically inconsistent. This situation can be detected by linear programming, but the conflict between the semantic judgements in Figure 10 is easy to see graphically, as shown in Figure 11: indeed, the length of path B-D-E should be longer than the length of path B-C-E, while both paths should have the same length as B-E. This corresponds to the impossibility of satisfying the semantic condition (condition 2), as follows: $\mu(B) - \mu(D) > \mu(C) - \mu(E)$ and $\mu(D) - \mu(E) > \mu(B) - \mu(C)$ lead, by summation, to $\mu(B) - \mu(E) > \mu(B) - \mu(E)$.
  | A | B | C | D | E | F
A | no | very weak | weak | strong | very strong | extreme
B | | no | very weak | strong | very strong | extreme
C | | | no | weak | moderate | very strong
D | | | | no | weak | strong
E | | | | | no | moderate
F | | | | | | no
Figure 10. Semantically inconsistent matrix of judgements
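[Graph with nodes B, C, D and E: path B-D-E with edges B-D (strong) and D-E (weak); path B-C-E with edges B-C (very weak) and C-E (moderate)]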
Figure 11. Graph of the situation of semantic inconsistency
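The two-path argument illustrated by Figure 11 can be turned into a simple screening test. The Python sketch below looks for four stimuli b, d, c, e such that the judgements force path b-d-e to be strictly longer than path b-c-e. This is only a sufficient condition for semantic inconsistency, offered as an illustration: the general test described in the text relies on linear programming, and the function name `two_path_conflicts` is hypothetical.

```python
from itertools import permutations

# Judgements of Figure 10: (more attractive, less attractive) -> category 1..6.
FIGURE_10 = {
    ("A", "B"): 1, ("A", "C"): 2, ("A", "D"): 4, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 1, ("B", "D"): 4, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 2, ("C", "E"): 3, ("C", "F"): 5,
    ("D", "E"): 2, ("D", "F"): 4,
    ("E", "F"): 3,
}


def two_path_conflicts(judgements):
    """Find (b, d, c, e) where paths b-d-e and b-c-e cannot have equal length."""
    stimuli = sorted({s for pair in judgements for s in pair})
    conflicts = []
    for b, d, c, e in permutations(stimuli, 4):
        legs = [(b, d), (d, e), (b, c), (c, e)]
        # Every leg must run from a more attractive to a less attractive stimulus.
        if not all(leg in judgements for leg in legs):
            continue
        bd, de, bc, ce = (judgements[leg] for leg in legs)
        # Each leg of path b-d-e is in a higher category than a distinct leg
        # of path b-c-e, so condition 2 makes b-d-e strictly longer.
        if bd > ce and de > bc:
            conflicts.append((b, d, c, e))
    return conflicts


print(two_path_conflicts(FIGURE_10))  # [('B', 'D', 'C', 'E')]
```

An empty result does not prove consistency, since conflicts involving longer chains of judgements are not examined.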
Faced with a semantically inconsistent matrix of judgements, the software recognizes the problem and can, if asked, identify the minimum number of judgements that must be modified to arrive at consistency and give suggestions for such modifications. As shown in Figure 12, for the case of the matrix in Figure 10 consistency can be achieved by (only) one of the following four different possibilities - to move (B, C) or (C, E) up one category, or (B, D) or (D, E) down one category - indicated in the matrix of judgements with up and down arrows. (Note in Figure 11 that the paths B-D-E and B-C-E can have the same length for any one of these four modifications, because judgements of the same category can be represented by numerical differences that are not necessarily equal.)
Figure 12. MACBETH suggestions to bypass inconsistency
When facing a situation of semantic inconsistency, the evaluator can opt to revise his or her judgements as described above or, alternatively, to enter directly into a graphical discussion of a numerical scale proposed by the MACBETH software (see Figure 13) [7]. Obviously, this scale (still called the MACBETH scale) does not respect conditions 1 and 2 together; however, it is determined (by linear programming) in such a way that it always respects the ordinal condition and the weakest possible weakening of the semantic condition from among the following alternatives:

Weakening 1 (condition 2W1): $\forall k, k' \in \{1, 2, 3, 4, 5, 6\}$, $\forall x, y, w, z \in S$ with $(x, y) \in C_k$ and $(w, z) \in C_{k'}$:
$k = k' + 1 \Rightarrow \mu(x) - \mu(y) \geq \mu(w) - \mu(z)$
$k \geq k' + 2 \Rightarrow \mu(x) - \mu(y) > \mu(w) - \mu(z)$.

Weakening 2 (condition 2W2): $\forall k, k' \in \{1, 2, 3, 4, 5, 6\}$, $\forall x, y, w, z \in S$ with $(x, y) \in C_k$ and $(w, z) \in C_{k'}$:
$k \geq k' + 2 \Rightarrow \mu(x) - \mu(y) > \mu(w) - \mu(z)$.

Weakening 3 (condition 2W3): $\forall k, k' \in \{1, 2, 3, 4, 5, 6\}$, $\forall x, y, w, z \in S$ with $(x, y) \in C_k$ and $(w, z) \in C_{k'}$:
$k = k' + 2 \Rightarrow \mu(x) - \mu(y) \geq \mu(w) - \mu(z)$
$k \geq k' + 3 \Rightarrow \mu(x) - \mu(y) > \mu(w) - \mu(z)$.

etc.

For the semantically inconsistent matrix in Figure 10, the scale proposed by MACBETH (see Figure 13) is such that $\mu(B) - \mu(D) = \mu(C) - \mu(E)$ and $\mu(D) - \mu(E) = \mu(B) - \mu(C)$, making the upper limit of "very weak" equal to the lower limit of "weak", and the upper limit of "moderate" equal to the lower limit of "strong". Thus, in this particular example, the MACBETH scale satisfies condition 1 and condition 2W1 (the weakest weakening of condition 2). However, in other cases, it might be necessary to overlap some categories in order to reconcile semantically inconsistent judgements.

[7] Although, in general, we strongly recommend the first option, for it fits better with a learning perspective in decision-aiding.
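The successive weakenings can be checked with a single parametrised test. The Python sketch below is a hedged illustration. The function name `satisfies_weakening` and the level numbering (0 for condition 2, then 1, 2, 3 for 2W1, 2W2, 2W3) are assumptions, and the way the pattern is continued beyond the three weakenings listed is an extrapolation of the "etc." in the text. The scale used is hypothetical: it was built here to reproduce the two equalities quoted for Figure 10 and is not the scale shown in Figure 13.

```python
# Judgements of Figure 10: (more attractive, less attractive) -> category 1..6.
FIGURE_10 = {
    ("A", "B"): 1, ("A", "C"): 2, ("A", "D"): 4, ("A", "E"): 5, ("A", "F"): 6,
    ("B", "C"): 1, ("B", "D"): 4, ("B", "E"): 5, ("B", "F"): 6,
    ("C", "D"): 2, ("C", "E"): 3, ("C", "F"): 5,
    ("D", "E"): 2, ("D", "F"): 4,
    ("E", "F"): 3,
}

# Hypothetical scale with mu(B)-mu(D) = mu(C)-mu(E) and mu(D)-mu(E) = mu(B)-mu(C).
CANDIDATE = {"A": 8, "B": 7, "C": 6, "D": 4, "E": 3, "F": 0}


def satisfies_weakening(scale, judgements, level):
    """Level 0 is condition 2; levels 1, 2, 3 are conditions 2W1, 2W2, 2W3."""
    strict_gap = 1 + (level + 1) // 2                     # k - k' from which ">" applies
    weak_gap = strict_gap - 1 if level % 2 else None      # k - k' at which ">=" applies
    for (x, y), k in judgements.items():
        for (w, z), k2 in judgements.items():
            upper, lower = scale[x] - scale[y], scale[w] - scale[z]
            if k - k2 >= strict_gap and not upper > lower:
                return False
            if weak_gap is not None and k - k2 == weak_gap and not upper >= lower:
                return False
    return True


print(satisfies_weakening(CANDIDATE, FIGURE_10, level=0))  # False
print(satisfies_weakening(CANDIDATE, FIGURE_10, level=1))  # True
```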
Figure 13. MACBETH scale in case of semantic inconsistency
6. Assessing scaling constants with MACBETH
In the first part of the Hainaut pilot evaluation, MACBETH was used to construct an interval scale $v_i$ over the descriptor $X_i$ of each criterion i ($i \in \{1, 2, \ldots, 8\}$), quantifying the attractiveness of the elements of $X_i$ for each judge, such that $v_i(\mathrm{neutral}_i) = 0$ and $v_i(\mathrm{good}_i) = 100$, where $\mathrm{neutral}_i$ and $\mathrm{good}_i$ are the neutral and good reference levels of $X_i$. Thus, $v_i(x_i)$ measures the absolute attractiveness (according to criterion i) of each element $x_i$ of $X_i$. Next, to measure the global attractiveness (that is, according to all eight evaluation criteria together) of each element $(x_1, x_2, \ldots, x_8)$ of $X_1 \times X_2 \times \cdots \times X_8$ for each judge, the following additive aggregation procedure was adopted:

$V(x_1, x_2, \ldots, x_8) = k_1 v_1(x_1) + k_2 v_2(x_2) + \cdots + k_8 v_8(x_8)$, with $k_1, k_2, \ldots, k_8 \geq 0$ and $k_1 + k_2 + \cdots + k_8 = 1$. [8]

The numerical values for the scaling constants $k_1, k_2, \ldots, k_8$ were determined by MACBETH from qualitative intercriteria preference information provided by each judge individually. Each judge was asked to consider a fictitious programme measure $a_0 = (\mathrm{neutral}_1, \mathrm{neutral}_2, \ldots, \mathrm{neutral}_8)$ whose impacts are neutral in all the criteria, and the following eight fictitious programme measures whose impacts are good in one criterion and neutral in all the other criteria:

$a_1 = (\mathrm{good}_1, \mathrm{neutral}_2, \mathrm{neutral}_3, \ldots, \mathrm{neutral}_8)$
$a_i = (\mathrm{neutral}_1, \ldots, \mathrm{neutral}_{i-1}, \mathrm{good}_i, \mathrm{neutral}_{i+1}, \ldots, \mathrm{neutral}_8)$, $i \in \{2, 3, 4, 5, 6, 7\}$
$a_8 = (\mathrm{neutral}_1, \ldots, \mathrm{neutral}_6, \mathrm{neutral}_7, \mathrm{good}_8)$. [9]

Applying the aggregation model, one has $V(a_0) = 0$ and $V(a_i) = 100 k_i$ ($i \in \{1, 2, \ldots, 8\}$). Consequently, ranking the fictitious measures in decreasing order of relative global attractiveness will lead to a ranking of the scaling constants in order of their relative magnitude.

[8] Note that the number $V(x_1, x_2, \ldots, x_8)$ measures the absolute global attractiveness of $(x_1, x_2, \ldots, x_8)$ since, by definition, $V(\mathrm{neutral}_1, \mathrm{neutral}_2, \ldots, \mathrm{neutral}_8) = 0$ and $V(\mathrm{good}_1, \mathrm{good}_2, \ldots, \mathrm{good}_8) = 100$.
[9] For the application of MACBETH, the set S of stimuli is now $\{a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8\}$.
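The link between the fictitious measures and the scaling constants can be verified numerically. The Python sketch below uses purely hypothetical scaling constants (the Hainaut values are confidential and are not reproduced in the text), and the helper names `global_value` and `fictitious_measure` are introduced only for this illustration. Exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical scaling constants (not the Hainaut values); they sum to 1.
WEIGHTS = {i: Fraction(w, 40)
           for i, w in {1: 2, 2: 8, 3: 6, 4: 4, 5: 6, 6: 1, 7: 10, 8: 3}.items()}


def global_value(profile, weights):
    """Additive model: V = sum over criteria of k_i * v_i(x_i)."""
    return sum(weights[i] * profile[i] for i in weights)


def fictitious_measure(i, n=8):
    """Scores of a_i: 100 (good) on criterion i, 0 (neutral) elsewhere; a_0 is all neutral."""
    return {j: 100 if j == i else 0 for j in range(1, n + 1)}


measures = {i: fictitious_measure(i) for i in range(0, 9)}
values = {i: global_value(measures[i], WEIGHTS) for i in measures}

# Ranking the fictitious measures a_1..a_8 by global value ranks the constants.
ranking = sorted(range(1, 9), key=lambda i: values[i], reverse=True)
print(ranking)
```

With these assumed constants, V(a_0) is 0, each V(a_i) equals 100 k_i, and a measure that is good on all eight criteria scores exactly 100.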
Suppose that judge J decided that the criterion of "Greater enterprise viability" ($i = 7$, see Table I) is the one that he or she would most like to swing from neutral to good. Thus, $a_7$ is the most attractive fictitious measure and $k_7$ is the largest scaling constant. Moreover, J judged that the swing of impact from neutral to good in "Greater enterprise viability" would increase the global attractiveness of the fictitious measure $a_0 = (\mathrm{neutral}_1, \mathrm{neutral}_2, \ldots, \mathrm{neutral}_8)$ extremely, that is, the relative difference of global attractiveness between $a_7$ and $a_0$ would be extreme.

The criterion of "More SMIs" ($i = 2$) was the one that J would next prefer to move to its good level; thus $a_2$ is the second most attractive fictitious measure and $k_2$ is the second largest scaling constant. Moreover, J judged that the swing of impact from neutral to good in "More SMIs" would very strongly increase the global attractiveness of $a_0$, that is, the relative difference of global attractiveness between $a_2$ and $a_0$ would be very strong.

Repeating this procedure led to the following ranking of the scaling constants for J: $k_7 > k_2 > k_3 = k_5 > k_4 > k_8 > k_1 > k_6$. The judgements between each $a_i$ ($i \in \{1, 2, \ldots, 8\}$) and $a_0$ are shown in the last column of the matrix in Figure 15.

Each judge J was then asked to choose his preference from among the criteria (C3E, 1995b, p.28), following an adequate questioning procedure. For example, for the criterion of "Greater enterprise viability" and that of "More SMIs", J was asked (see Figure 14):
1. to consider the two fictitious measures:
   - $a_7$, whose impact is good for the criterion of "Greater enterprise viability" and neutral for the others,
   - $a_2$, whose impact is good for the criterion of "More SMIs" and neutral for the others;
2. to confirm that he or she would prefer $a_7$ to $a_2$ and, in case of preference, say whether, in his or her opinion, the difference between the global attractiveness of these two fictitious measures would be very weak, weak, moderate, strong, very strong or extreme.
Figure 14. Example of comparison of two fictitious measures
Figure 15 summarizes all the two-by-two comparisons of fictitious measures (including $a_0$) for the hypothetical judge J.
    a7  a2        a3      a5      a4           a8           a1           a6           a0
a7  no  moderate  strong  strong  very strong  very strong  extreme      extreme      extreme
a2      no        weak    weak    strong       strong       strong       very strong  very strong
a3                no      no      moderate     strong       strong       strong       very strong
a5                no      no      moderate     strong       strong       strong       very strong
a4                                no           weak         weak         moderate     strong
a8                                             no           very weak    weak         moderate
a1                                                          no           very weak    moderate
a6                                                                       no           weak
a0                                                                                    no
Figure 15. Judgements of difference of global attractiveness
Figure 16. Weighting module of the MACBETH software
Figure 16 is a snapshot of the main screen of the weighting module of the MACBETH software. Its left window is similar to that of the scoring module (see Sections 3 to 5): the upper matrix reproduces the matrix of judgements of (global) difference of attractiveness in Figure 15, for which the MACBETH scale (a linear transformation of the basic MACBETH scale such that the value of $a_0$ is 0 and the sum of all values is 100, which ensures that $k_1 + k_2 + \cdots + k_8 = 1$) is the one shown under and over the label "Current scale"; the bottom matrix shows the resulting value differences corresponding to the semantic judgements above. In the right window, a bar chart represents this transformed MACBETH scale.

In order to reach an interval scale $V$ on $\{a_0, a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8\}$, the judge can modify any value (except the value of $a_0$, which is fixed at zero) by changing the height of the respective bar, dragging it with the left mouse button within the acceptable range indicated under the bar. As exemplified in Figure 17, it is also possible to enter a numerical value directly in the "modify weight" window that appears when the right mouse button is clicked over the bar. In the figure, note that, once the value of $a_2$ has been modified to 19, all the other values also changed in such a way that their proportions are kept constant and the sum is still 100. Nonetheless, if desired, one can keep one value (or more) unchanged by first fixing the height of the respective bar, clicking the left mouse button over it.

When no more modifications are desired, the numerical values of the scaling constants can be calculated from $V(a_i) = 100\,k_i$ ($i \in \{1, 2, \ldots, 8\}$). Note that reasoning about direct proportions of values is now meaningful since, with $V(a_0) = 0$ by definition, for any two fictitious measures $a_i$ and $a_j$ ($i, j \in \{1, 2, \ldots, 8\}$) one has:
$$\frac{V(a_i) - V(a_0)}{V(a_j) - V(a_0)} = \frac{V(a_i)}{V(a_j)} = \frac{100\,k_i}{100\,k_j} = \frac{k_i}{k_j}.$$

That is, $V$ being an interval scale, the numerical values of the scaling constants are expressed in a ratio scale. In the Hainaut multicriteria pilot evaluation, a weighting scale was assessed for each judge.
Figure 17. Changing a value
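The proportional readjustment described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm of the MACBETH software: the starting values are hypothetical, the function name `set_value` is invented for this sketch, and the "acceptable range" check that the software applies to each bar is not modelled.

```python
def set_value(scale, name, new_value, fixed=()):
    """Return a copy of `scale` with scale[name] = new_value.

    The other values that are not listed in `fixed` are multiplied by a
    common factor, so their mutual proportions are kept and the total
    remains 100.
    """
    locked = {k for k in fixed if k != name}
    free = [k for k in scale if k != name and k not in locked]
    remaining = 100.0 - new_value - sum(scale[k] for k in locked)
    free_total = sum(scale[k] for k in free)
    if remaining < 0 or free_total <= 0:
        raise ValueError("the requested value leaves no room for the other bars")
    factor = remaining / free_total
    updated = dict(scale)
    for k in free:
        updated[k] = scale[k] * factor
    updated[name] = new_value
    return updated


# Hypothetical current scale (a0 is omitted: its value is fixed at 0).
current = {"a7": 25.0, "a2": 20.0, "a3": 14.0, "a5": 14.0,
           "a4": 10.0, "a8": 8.0, "a1": 6.0, "a6": 3.0}

# Change the value of a2 to 19; all other values are rescaled.
modified = set_value(current, "a2", 19.0)

# Same change, but with the bar of a7 fixed beforehand.
with_a7_fixed = set_value(current, "a2", 19.0, fixed=("a7",))

print(modified)
print(with_a7_fixed)
```

Dividing each resulting value by 100 gives the corresponding scaling constant, in line with V(a_i) = 100 k_i.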
"Note that the seven judges did not share the same preferences. During this two-by-two comparison, each judge clarified his own system of values in relation to the criteria used; this explains why people with the same assessment of the impacts may reach different values. Remember too that the judges only expressed qualitative preferences between fictitious measures which gave priority to this or that criterion." (C3E, 1995b, p.29).
7. Comments on the Hainaut case

Once the scoring and weighting scales were defined, the global values of the programme measures under evaluation were calculated for each judge using the additive model. For each measure, the final results were presented to the programme managers in the form of a multi-judge profile. As explained in (C3E, 1995b, p.32), these profiles can be used in the following way:
- if a measure is generally agreed to be good, it can be recommended for positive consideration in the next budgetary reallocations,
- if a measure is generally agreed to be neutral, no recommendation follows,
- if a measure is highly rated by some judges and not by the others, the measure can be recommended for further evaluation in order to shed more light on the issue.
We see how the method can be used both to redirect the programme and improve the quality of the evaluation.

The main objective of the pilot evaluation carried out in 1995 in the Hainaut province of Belgium was to make a real-life test of the multicriteria methodology proposed for the evaluation of the European Union's Structural Programmes. From (C3E, 1995a and b), one can conclude that the objective was successfully attained. From among the many very interesting conclusions included in these two documents, we quote the comments most directly related to the MACBETH approach:

"During the discussions with the judges, the evaluator asks them to choose between concrete elements as demonstrated by the [following] example {...}. {...}

Example of the concrete expression of preferences
Which of these two sentences do you prefer?
- The measure has for most of the enterprises and/or individual beneficiaries involved a change of activity from a declining sector to fast-growing one
- The measure is in part beneficial to the fast-growing sectors
Is your preference very weak, weak, [moderate], strong, very strong or extreme?

The interlocutors are not asked to give their preferences directly in the forms of scores or weights. They are only asked to compare two sentences as concrete and practical as possible; a software programme processes the answers into scores and weights. Other multicriteria evaluation methods ask the interlocutors to give their preferences in scores and weights. {...} The concrete expression of preferences was chosen in Hainaut because it was a way of keeping in closer touch with reality. The difference between the two methods is well illustrated by the following two questions:
- Compare these two photos and say whether you think this person is more beautiful than the other person? (concrete approach, indirect scoring)
- Look at this photo and say if you can give the person's beauty a score between 0 and 10? (abstract approach, direct scoring).
One of the great advantages of the concrete approach is that it can be understood by all the evaluation partners for almost the entire procedure." [10]

"The judges accepted to give their preferences between the impact levels and between the criteria. The use of the MACBETH questioning procedure in the framework of Structural Funds did not pose any problem. {...} Note that the three consultants that conducted the interviews did not have previous knowledge of the MACBETH method. Despite this fact, they succeeded in carrying out the interviews in a satisfactory way." [11]
[10] In (C3E, 1995b, pages 40 to 42).
[11] Translated by the authors from the French original (C3E, 1995a, pages 5 and 6).
8. Conclusion
When studying the elements of a set S of stimuli, it may be of interest to elicit from a person J cardinal information concerning the degrees to which the elements of S possess, in J's opinion, a given (objective or subjective) property. The goal of the MACBETH approach is to aid J in assigning to each element of S a number quantifying the degree to which it possesses that property (in J's opinion), in such a way that these numbers can be considered to form a cardinal scale, that is, a measurement scale on which more mathematical operations than the min and the max can be meaningfully performed. In this paper, MACBETH was presented in a multicriteria decision-aid framework in which the property under consideration is the attractiveness (or desirability) of stimuli. Nevertheless, it is clear that MACBETH has a much broader domain of applicability: for instance, S could be a set of offences and the property their gravity, or S could be a set of people and the property their intelligence, or S could be a set of seats and the property their comfort, etc.

The basic ideas of the MACBETH approach are:
- to adopt an initial questioning procedure that is simple and easy to understand, involving only two elements of S in each question and allowing a verbal (qualitative) judgemental answer, but to ask for a significant number of such questions in order to make possible the testing of their compatibility with the construction of a cardinal scale on S in a subsequent phase;
- in case of incompatibility, to search for its sources in order to support a reflection with J about his or her initial judgements and eventually revise some of them;
- in case of compatibility, to propose to J a first numerical scale on S representing his or her verbal judgements (the MACBETH scale), determined by a straightforward procedure based on evident conditions (measurement rules) and a simple principle;
- to facilitate, with the support of a visual interactive software, the perception of this first numerical scale, the reflection on it, and the evolution towards its progressive transformation into a cardinal scale.
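To make the idea of compatibility between verbal judgements and a numerical scale more tangible, here is a minimal Python sketch that tests a candidate scale against a set of judgements. It assumes that the measurement rules are the two conditions usually stated for MACBETH: an ordinal condition (a strictly preferred element gets a strictly larger number, and "no" difference means equal numbers) and a semantic condition (a pair judged in a stronger category must have a strictly larger value difference than any pair judged in a weaker category). The stimuli A, B, C, the judgements and both scales are hypothetical, and the sketch only checks a given scale; it does not reproduce the linear programmes that the software uses to build one.

```python
from itertools import combinations

# Semantic categories, ordered from no difference to extreme difference.
CATEGORIES = ["no", "very weak", "weak", "moderate",
              "strong", "very strong", "extreme"]
RANK = {name: index for index, name in enumerate(CATEGORIES)}


def violations(scale, judgements):
    """List the judgements that a numerical scale fails to respect.

    `judgements` maps a pair (x, y), with x at least as attractive as y,
    to the category of their difference of attractiveness.
    """
    problems = []
    # Ordinal condition: sign of the value difference.
    for (x, y), category in judgements.items():
        diff = scale[x] - scale[y]
        if category == "no" and diff != 0:
            problems.append(("ordinal", (x, y)))
        if category != "no" and diff <= 0:
            problems.append(("ordinal", (x, y)))
    # Semantic condition: stronger category => strictly larger difference.
    strict = [(pair, RANK[c]) for pair, c in judgements.items() if c != "no"]
    for (p, kp), (q, kq) in combinations(strict, 2):
        dp = scale[p[0]] - scale[p[1]]
        dq = scale[q[0]] - scale[q[1]]
        if kp > kq and dp <= dq:
            problems.append(("semantic", p, q))
        if kq > kp and dq <= dp:
            problems.append(("semantic", q, p))
    return problems


# Hypothetical judgements on three stimuli, A preferred to B preferred to C.
judgements = {("A", "B"): "weak", ("B", "C"): "strong", ("A", "C"): "very strong"}

scale_ok = {"A": 100.0, "B": 70.0, "C": 0.0}
scale_bad = {"A": 100.0, "B": 40.0, "C": 0.0}

print(violations(scale_ok, judgements))
print(violations(scale_bad, judgements))
```

In the second scale, the "weak" difference between A and B (60) exceeds the "strong" difference between B and C (40), so the check reports the pair of judgements that would have to be discussed with J or repaired by moving B.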
Since the beginning of the research with MACBETH, in 1992, these basic ideas have proven to be fruitful and, therefore, they remain unchanged. The technical components and the software, on the other hand, have evolved both in terms of the linear programmes used and of the options available. This evolution has been the consequence of numerous real-world applications and of the effective work of Jean-Marie De Corte, who has been in charge of the development of the MACBETH software for Windows since 1996. Several new features and improvements are currently under development, such as the possibility of dealing with hesitation in expressing judgements of difference of attractiveness and the extension to group decision support. A trial version of the software can be freely downloaded from the website MACBETH SUPPORT SYSTEM. This version already makes it possible to work with judgemental hesitation (for example, "the difference of attractiveness between x and y is weak to moderate") and with incomplete matrices of judgements.
9. References
Bana e Costa, C.A., Vansnick, J.C. (1997), Applications of the MACBETH approach in the framework of an additive aggregation model, Journal of Multi-Criteria Decision Analysis 6:2, 107-114.
Bana e Costa, C.A., Ensslin, L., Corrêa, E.C., Vansnick, J.C. (in press), Decision Support Systems in action: integrated application in a multicriteria decision aid process, European Journal of Operational Research.
C3E (1995a), Evaluation Pilote Multicritère du Hainaut - Rapport d'Expérience, European Commission, DG XVI/02, Brussels.
C3E (1995b), MEANS Handbook N° 4: Applying the Multi-criteria Method to the Evaluation of Structural Programmes, European Commission, DG XVI/02, Brussels.
