Paired Comparisons and Designed Experiments

Food Quality and Preference 11 (2000) 55-61
www.elsevier.com/locate/foodqual
Paired comparisons and designed experiments

Gorm Gabrielsen*
Department of Management Science and Statistics, Copenhagen Business School, Julius Thomsens Plads 10, DK-1925 Frederiksberg C, Denmark
Received 7 September 1998; received in revised form 14 July 1999; accepted 31 July 1999
Abstract
Paired comparisons can be a very effective way of measuring preferences. The statistical analysis can be carried out as an ANOVA although the explanatory variables are not categorical variables. Explanatory variables of subjects and objects can be included in the ANOVA if the experiment is carefully designed. It is often feasible or necessary to reduce the number of comparisons to be performed by subjects; such structure can still be included in the ANOVA if the comparisons to be performed are carefully selected. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Analysis of variance; Designed experiments; Individual preferences; Paired comparisons; Sensory analysis; Scale differences
1. Introduction
The origin of paired comparisons is Thurstone's law of comparative judgement, applied to items which have no physical properties, for example statements (Thurstone, 1927). The basic data collection for scaling items in the Thurstone tradition is that of paired comparisons, while the most common design for measurements requires subjects to respond to items according to some positive-negative dichotomy or scale, such as ``agree'' or ``disagree''. The latter is often referred to as a direct response design. The model for paired comparisons is numerically equivalent to the Bradley-Terry-Luce model and the simple logistic model (Bradley & Terry, 1952; Luce, 1959). In all of these models the response is binary.
The method of paired comparisons can also be developed for continuous responses. However, beyond the paper of Scheffé (1952), very little work seems to have been done in this area, and the method has not yet been applied very extensively.
In general the method of paired comparisons is expected to be more sensitive to differences between objects than direct response designs. If a judge in a direct response design gives two objects the same score, he might still find a slight preference for one. This might be an advantage but it may also distort the hypothesis of subtraction discussed later.
* Tel.: +45-3815-3515; fax: +45-3815-3500.
Also the paired comparisons may be sensitive to the context in which the comparisons are performed. This sensitivity may at the same time be turned to account, since the experiment can be repeated several times under different contexts. The variation of context may be incorporated into the model as an explanatory variable.
The aim of the present paper is to demonstrate the abilities of a paired comparison design. The approach is illustrated by an example, from which hypotheses and analyses are discussed. The experiment was performed as a pilot study to evaluate the practicability of the experiment, the reasonableness of the statistical analysis and the interpretation of the resulting findings. As the number of subjects amounts to only eight, the focus of the paper will be on the method rather than the results.
2. The experiment
The objects consisted of four different kinds of beer chosen from a range of standard Danish beers (no special or fancy beers). Eight consumers of beer were recruited (friends of the author). They were not paid, but they were promised that they could join in the emptying of beer bottles at the end of the study. It was decided that each subject should perform all possible (different) paired comparisons of the four beers. If the comparison of beer i against beer j is taken as a different comparison than beer j against beer i, the total number of comparisons to be performed by each subject is 12. In general this
0950-3293/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved.
PII: S0950-3293(99)00064-6
would be too many comparisons for a subject to perform; however, as the study was a pilot study and the experiment was carried out under relaxed conditions, it was accepted as feasible.
The experiment took place in the afternoon in the garden of the author, given the context of ``having a relaxed late afternoon beer among friends''. Each subject was placed at a table with two glasses of beer, one to the left and one to the right of the subject, and a piece of paper between the glasses. The glasses were marked ``left'' and ``right'', respectively; otherwise they were identical. Subjects were instructed to taste the left beer first but were otherwise free to retaste any of the beers if desired. Subjects rated their judgement on a 22 cm linear scale labelled at the left anchor ``I prefer left beer very much'' and at the other anchor ``I prefer right beer very much''. Two new glasses of beer and a new piece of paper with a rating scale were then placed on the table, and so on until 12 comparisons were performed.
At the end of the experiment ratings were converted into numbers. The distance from the middle of the scale to the noted mark was measured, positively to the right and negatively to the left. The score of subject p with beer i to the left and beer j to the right was denoted $y_{ijp}$. In any event it is assumed that the numerical score increases with the strength of the preference for j over i, and that equal but opposite preferences (j over i, and i over j) correspond to equal but opposite scores. Thus for subject p

$y_{ijp}$ is the assessed preference of beer j over beer i
and
$y_{jip}$ is the assessed preference of beer i over beer j.

If e.g. $y_{ijp}$ is positive, j is preferred over i; if $y_{ijp}$ is negative, i is preferred over j.
3. The model
If a pair, say i and j, is presented to a judge in the order (i, j), the order will usually mean the temporal order in which the objects are tried, for example, the order in which the beers are tasted. In a hand lotion test, where both brands are used simultaneously, one on the right hand and the other on the left, the order (i,j) could mean that i is on the left hand and j on the right, while (j,i) means the reverse. To allow for a possible order effect the mean-values can be specified as

$\mu_{ijp} = \tau_{ijp} + \delta_p$. (1)

Here $\tau_{ijp}$ is the ``true'' preference of beer j over beer i and it is assumed that the preferences are antisymmetric ($\tau_{ijp} = -\tau_{jip}$); $\delta_p$ is the order effect. If $\delta_p$ is positive, subject p has a bias towards a preference for the beer to the right, and conversely if $\delta_p$ is negative. From the point of view of ANOVA the order effects, $\delta_p$, are subject main effects; in the present context, however, the interpretation of $\delta_p$ is a systematic bias of subject p measured as a deviation from zero.
In the present specification of the mean-value the order effect, $\delta_p$, depends on the subject but not on the pair of objects to be compared; more general specifications are possible.
The underlying assumptions of the mathematical model are that all the $y_{ijp}$ are independent, normally distributed random variables with mean value $E(y_{ijp}) = \mu_{ijp}$ and the same variance:

$Y_{ijp} \sim N(\mu_{ijp}, \sigma^2)$, $i, j = 1, 2, 3, 4$; $i \neq j$; $p = 1, 2, \ldots, 8$.

Model (1) and all subsequently considered models fall under the general theory of least squares and linear hypotheses and may be written in the usual way as

$y = X\beta + \varepsilon$

for an appropriately chosen design matrix, X.
In general, models of this kind are analysed by regression, i.e. the mean-value is explained by explanatory variables, or by factorial designs. The advantage of using a factorial design is that by ``balancing'' the design an orthogonal decomposition is obtained, meaning that specific effects can be estimated independently of other effects, i.e. it is possible to construct a unique analysis of variance table. Although the present design of paired comparisons is neither a regression analysis nor a factorial design, the concept of orthogonal decomposition, and thereby the construction of analysis of variance tables, can still be supported if the experiment is carefully designed.
4. Statistical analysis
Model (1) is a linear model and can be fitted applying usual computer software, see Appendix. With eight subjects each having six preferences plus one order effect, a total of 56 parameters are estimated, and the analysis of variance table becomes Table 1. It should be noted that the measures are preferences or distances and, therefore, the hypothesis that all mean-values are zero has a specific interpretation, namely that no preference exists. The constant is, therefore, included in the total variation, which is calculated as $\sum y_{ijp}^2$ with 96 degrees of freedom. From Table 1 it is seen that the model has explanatory power (p<0.001); however, this does not tell which effects are significant. The aim of the decomposition is to identify significant effects. For the sake of simplicity the residual variance, m.s.=0.9159 with 40 degrees of freedom, is applied in this and all subsequent F-tests.
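The fit of Model (1) and the leading sums of squares can be sketched in a few lines of Python. This is an illustration, not the GENSTAT computation of the Appendix: it relies on the balanced design, in which the least-squares estimates reduce to the subject mean (order effect) and half the difference between the two orders of a pair (preference). The scores are transcribed from Table 10 under the assumption that the reversed comparisons carry negative signs; the names PAIRS, SCORES and fit_model_1 are hypothetical.

```python
from itertools import combinations

# Ordered pairs (left beer i, right beer j), in the row order of Table 10.
PAIRS = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3),
         (4, 1), (1, 4), (3, 1), (1, 3), (2, 4), (4, 2)]

# One list per subject. ASSUMPTION: negative signs for the reversed
# comparisons are restored here; magnitudes are those of Table 10.
SCORES = [
    [2.6, -1.3, 2.0, -0.8, 3.8, -2.4, -5.5, 6.6, -3.3, 4.5, 4.6, -3.5],
    [3.8, -2.3, 2.4, -1.3, 3.2, -1.8, -6.3, 7.5, -3.8, 1.7, 4.6, -3.5],
    [2.6, -1.4, 2.8, -1.3, 3.6, -2.3, -5.8, 6.8, -4.2, 4.6, 5.3, -3.8],
    [3.3, -1.7, 2.6, -1.8, 3.9, -2.4, -7.2, 7.5, -4.0, 1.2, 5.2, -3.4],
    [2.4, -1.3, 6.9, -5.7, 4.6, -3.8, -10.2, 7.8, -6.4, 7.4, 9.3, -8.3],
    [0.7, -3.1, 5.7, -4.7, 4.6, -3.4, -9.8, 7.7, -7.1, 7.9, 9.1, -8.1],
    [2.6, -1.5, 7.6, -5.7, 4.6, -3.7, -9.3, 7.2, -6.1, 7.4, 8.4, -7.3],
    [0.8, -2.7, 5.4, -4.4, 4.6, -3.4, -9.3, 6.9, -6.5, 7.8, 7.9, -7.3],
]


def fit_model_1(scores):
    """Least-squares fit of mu_ij = tau_ij + delta for one subject.

    Returns (delta, tau, residual_ss); tau is keyed by unordered pair i < j.
    """
    y = dict(zip(PAIRS, scores))
    delta = sum(scores) / len(scores)            # order effect = subject mean
    tau = {(i, j): (y[(i, j)] - y[(j, i)]) / 2   # antisymmetric preference
           for i, j in combinations(range(1, 5), 2)}
    rss = sum((y[(i, j)] - tau[(i, j)] - delta) ** 2
              + (y[(j, i)] + tau[(i, j)] - delta) ** 2
              for (i, j) in tau)
    return delta, tau, rss


fits = [fit_model_1(s) for s in SCORES]
n_obs = sum(len(s) for s in SCORES)              # 96 observations
model_df = len(SCORES) * (6 + 1)                 # 56 parameters
residual_df = n_obs - model_df                   # 40

total_ss = sum(v * v for s in SCORES for v in s)
residual_ss = sum(rss for _, _, rss in fits)
common_delta = sum(sum(s) for s in SCORES) / n_obs
ss_common = n_obs * common_delta ** 2
ss_different = sum(12 * (d - common_delta) ** 2 for d, _, _ in fits)
ss_preferences = sum(2 * t * t for _, tau, _ in fits for t in tau.values())

print(round(total_ss, 2), round(residual_ss, 2), residual_df)
print(round(ss_common, 2), round(ss_different, 2), round(ss_preferences, 2))
```

Under the stated sign assumption the sketch gives a total s.s. of 2749.00 and a residual s.s. of about 36.64 on 40 degrees of freedom, in line with Table 1.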
The effect of the total model can be decomposed into an effect of the true preferences ($\tau_{ijp}$) and an effect of the order effects ($\delta_p$), which again can be decomposed into a common order effect and an effect of deviations from a common order effect, Table 2. It is seen that most of the explanatory power of the model originates from the preferences (m.s.=56.24). The order effects can be assumed equal (p=0.672); however, the common order effect is significantly different from zero (p=0.005). The common order effect is estimated to $\delta$=0.29, meaning that all subjects have a common bias towards the glass of beer to the right. As the individual order effects are insignificant they can be removed from the model; however, as the decomposition is orthogonal they can also be kept in the model, since the effects of the preferences are unaffected by inclusion or not of the order effects.

Table 1
Analysis of variance, model (1)
Effect      df   s.s.      m.s.    F      p
Model       56   2712.36   48.44   52.9   <0.001
Residual    40   36.64     0.92
Total       96   2749.00   28.64

Table 2
Analysis of variance. Order effects and preferences
Effect                    df   s.s.      m.s.    F      p
Common order effect       1    8.05      8.05    8.8    0.005
Different order effects   7    4.78      0.68    0.7    0.672
Preferences               48   2699.53   56.24   61.4   <0.001
Residual                  40   36.64     0.92
Total                     96   2749.00   28.64

4.1. Hypothesis of scales
An important hypothesis states that for each subject the assessed preference for any pair of objects can be attributed to the difference of the two objects on a continuous scale, i.e. the objects can be arranged into a rating scale. The usual way of expressing this hypothesis is that for each person there exist $\alpha$'s corresponding to the objects so that

H1: $\tau_{ijp} = \alpha_{jp} - \alpha_{ip}$. (3)

As only the differences of the $\alpha$'s are of interest one can add the restriction that for each person the $\alpha$'s should sum to zero. In economic literature such a rating scale is called a utility function; furthermore, it is cardinal since it is metric. If H1 can be supported, an interesting question would be whether the rating scales are the same for all subjects. These hypotheses correspond to an orthogonal decomposition such that the analysis of variance table becomes Table 3. The s.s.'s of rows 3, 4 and 5 correspond to the total variation of preferences ($\tau$'s), which is decomposed into deviance from rating scales ($\tau_{ijp} = \alpha_{jp} - \alpha_{ip}$), different rating scales ($\alpha_{ip} = \alpha_i$) and degeneration of a common rating scale ($\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$). It is found that most of the variation in preferences is due to the common rating scale (m.s.=823.06), which is significant (p<0.001); however, there is also variation due to differences in rating scales (m.s.=9.15) and the differences are found to be significant (p<0.001). On the other hand the variation due to deviance from rating scales is insignificant (p=0.067). Thus, one can conclude that for each subject there exists a rating scale, but the rating scales are different. The estimated individual rating scales and the common rating scale are shown in Table 4.
The hypothesis of subjects having rating scales (H1) is of special importance. If the hypothesis is rejected several explanations are possible. A frequent explanation is that more than one dimension is involved in the comparisons. In the present case several aspects of taste may be involved. When comparing objects i and j the focus may be on ``sweetness'', while when comparing j and k the focus might have changed to ``bitterness'' or even to differences in the colours of the objects. A related situation arises if one object is very different from the others. This might leave subjects uncertain about the context of the comparisons to be performed.
A simple, but not very satisfying, way of handling such a problem is, if possible, to omit the deviating object from the analyses. Another way is to extend the experiment to a multivariate comparison, i.e. letting subjects express their preferences on more than one scale. This can be arranged by having for example four visual scales on the paper between the beers, and requesting subjects to express their preferences with respect to e.g. colour, bitterness, sweetness and ability to slake the thirst. An analysis of multivariate comparisons can be performed but will not be discussed in this paper. Finally, several alternative hypotheses to H1 can be incorporated in the analysis. It is beyond the scope of this paper to discuss these possibilities. It should, however, be emphasized that if H1 is rejected care should be taken.
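As a hedged illustration of H1, the rating scale of a single subject can be estimated in closed form. In the complete design with n = 4 objects and the sum-to-zero restriction, the least-squares estimate works out to $\hat\alpha_k = \sum_{i \neq k} (y_{ik} - y_{ki}) / (2n)$. The sketch below applies this to the scores of subject 1, transcribed from Table 10 with negative signs assumed for the reversed comparisons; the function name rating_scale is hypothetical.

```python
OBJECTS = [1, 2, 3, 4]

# Ordered pairs (left beer i, right beer j), in the row order of Table 10.
PAIRS = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3),
         (4, 1), (1, 4), (3, 1), (1, 3), (2, 4), (4, 2)]

# Subject 1. ASSUMPTION: signs of the reversed comparisons are restored.
SUBJECT_1 = [2.6, -1.3, 2.0, -0.8, 3.8, -2.4, -5.5, 6.6, -3.3, 4.5, 4.6, -3.5]


def rating_scale(y):
    """Least-squares scale values under H1, constrained to sum to zero."""
    n = len(OBJECTS)
    return {k: sum(y[(i, k)] - y[(k, i)] for i in OBJECTS if i != k) / (2 * n)
            for k in OBJECTS}


y = dict(zip(PAIRS, SUBJECT_1))
alpha = rating_scale(y)

# Deviance from the rating scale: each unordered pair contributes two
# observations, hence the factor 2.
deviance_ss = sum(
    2 * ((y[(i, j)] - y[(j, i)]) / 2 - (alpha[j] - alpha[i])) ** 2
    for i in OBJECTS for j in OBJECTS if i < j
)

for k in OBJECTS:
    print(k, alpha[k])
print("deviance s.s. for this subject:", deviance_ss)
```

For subject 1 the estimates come out as -2.975, -0.875, 0.55 and 3.3, which agree after rounding with the first row of Table 4 when the scale is read as running from the least to the most preferred beer.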
Table 3
Analysis of variance. Rating scales
Effect                         df   s.s.      m.s.     F       p
Common order effect            1    8.05      8.05     8.8     0.005
Different order effects        7    4.78      0.68     0.7     0.672
One rating scale               3    2469.17   823.06   898.6   <0.001
Different rating scales        21   192.10    9.15     10.0    <0.001
Deviances from rating scales   24   38.26     1.59     1.7     0.067
Residual                       40   36.64     0.92
Total                          96   2749.00   28.64

Table 4
Estimated rating scales
                      Object
Subject         1      2      3      4
1            -3.0   -0.9    0.6    3.3
2            -3.2   -0.7    0.5    3.4
3            -3.2   -1.1    0.8    3.5
4            -3.1   -1.0    0.4    3.7
5            -4.4   -3.3    2.2    5.5
6            -4.5   -3.0    2.2    5.3
7            -4.3   -3.1    2.3    5.1
8            -4.2   -2.7    2.0    4.9
Common scale -3.7   -2.0    1.4    4.3
Lifestyle 1  -3.1   -0.9    0.6    3.4
Lifestyle 2  -4.4   -3.0    2.2    5.2

4.2. Structure of subjects
If, as in the present case, H1 is accepted but the hypothesis that subjects have the same rating scale is rejected, the question arises whether there exists a grouping of subjects, say by gender, such that all men have the same rating scale and all women have the same rating scale, but possibly the rating scales for men and women are different. Furthermore, if there exists an additional grouping of subjects, e.g. by lifestyle, it is possible to combine two (or more) categorical variables and also include possible effects of interactions between categorical explanatory variables. The hypothesis becomes

H2: $\alpha_{ip} = \alpha'_{i,\mathrm{lifestyle}(p)} + \alpha''_{i,\mathrm{gender}(p)} + \alpha'''_{i,\mathrm{lifestyle}(p) \times \mathrm{gender}(p)}$ (4)

where lifestyle(p) is the lifestyle of subject p.
One can think of the structure as a two-way layout in subjects with each subject having a rating scale ``as response''. If the two-way layout in subjects is balanced in the usual sense, an analysis of variance can be performed in complete analogy with the usual two-way analysis of variance.
In the present experiment subjects were men and women recruited among academics and non-academics (which was intended to be interpreted as two different lifestyles) such that subjects formed a balanced two by two layout, Table 5. This implies that the decomposition of the variation of rating scales (s.s.=2661.27 (=2469.17+192.10), df=24) becomes orthogonal, Table 6. The first s.s. corresponds to degeneration of a common rating scale ($\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$), the second s.s. corresponds to differences in rating scales between men and women, the third s.s. corresponds to differences in rating scales between academic and non-academic lifestyle, the fourth s.s. corresponds to the effect of the interaction between lifestyle and gender, and finally the last s.s. corresponds to the within cell variation of rating scales.

Table 5
Two by two layout of subjects
Subject       1   2   3   4   5   6   7   8
Gender a      1   1   2   2   1   1   2   2
Lifestyle b   1   1   1   1   2   2   2   2
a Gender: 1=men; 2=women.
b Lifestyle: 1=academic; 2=non-academic.

Firstly, the within cell variation is found insignificant (p=0.986), meaning that the between subject variation of rating scales can be explained by the combination of lifestyle and gender, i.e. only four different rating scales are necessary. Furthermore, the interaction is not significant (p=0.322). This means that it is possible to interpret a rating scale for women and a rating scale for men and, correspondingly, a rating scale for each of the two lifestyles. The rating scale for e.g. an academic man becomes in this case the sum of the rating scale for men and the rating scale for academics.
In the present experiment neither the interaction (p=0.322) nor gender (p=0.960) was significant, meaning that grouping of subjects by lifestyle was sufficient to explain the between subject variation of rating scales. The two estimated scales are shown in Table 4. The variation of scales becomes s.s.=2654.39 (=2469.17+185.22) with 6 degrees of freedom.
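The F-ratios behind the decomposition by gender and lifestyle can be checked with simple arithmetic. The sketch below takes the degrees of freedom and sums of squares quoted in Table 6 and divides each mean square by the residual mean square of 0.9159 that the paper applies in all F-tests. The dictionary name TABLE_6 is hypothetical, and p-values are not computed because that would require the F distribution.

```python
RESIDUAL_MS = 0.9159  # residual mean square on 40 df, as quoted in Section 4

# effect: (degrees of freedom, sum of squares), values as quoted in Table 6
TABLE_6 = {
    "One rating scale": (3, 2469.17),
    "Gender": (3, 0.25),
    "Lifestyle": (3, 185.22),
    "Interaction": (3, 3.35),
    "Within cell": (12, 3.28),
}

f_ratios = {}
for effect, (df, ss) in TABLE_6.items():
    ms = ss / df                      # mean square of the effect
    f_ratios[effect] = ms / RESIDUAL_MS
    print(f"{effect:18s} df={df:2d}  m.s.={ms:8.2f}  F={f_ratios[effect]:7.1f}")

# The five orthogonal parts add up to the variation of scales.
total_df = sum(df for df, _ in TABLE_6.values())
total_ss = sum(ss for _, ss in TABLE_6.values())
print(total_df, round(total_ss, 2))
```

The parts add up to 24 degrees of freedom and s.s. = 2661.27, the variation of rating scales that is being decomposed.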
4.3. Structure of objects
Although two different scales are sufficient to explain the variation of preferences, this does not imply that all of the objects are different. In some situations it might be appropriate to test whether there is any difference in preferences between e.g. objects i and j ($i \neq j$), H: $\alpha_i = \alpha_j$. However, if this kind of hypothesis is relevant it is also possible to include structure of objects in the analysis of variance. In the present experiment the beers were chosen as two beers of standard price and two beers of low price. This was intended to be interpreted as high and low quality, respectively. Furthermore, two beers had a standard content of alcohol (approx. 3.8%) and two beers were strong beers (approx. 6.0%). The beers were chosen such that they formed a balanced two by two layout, Table 7.

Table 6
Analysis of variance. Gender and lifestyles
Effect                df   s.s.      m.s.     F       p
One rating scale      3    2469.17   823.06   898.6   <0.001
Gender                3    0.25      0.08     0.1     0.960
Lifestyle             3    185.22    61.74    67.4    <0.001
Interaction           3    3.35      1.12     1.2     0.322
Within cell           12   3.28      0.27     0.3     0.986
Variation of scales   24   2661.27   110.89

Table 7
Two by two layout of objects
Objects      1   2   3   4
Quality a    0   0   1   1
Strength b   0   1   0   1
a Quality: 0=low, 1=high.
b Strength: 0=3.85%, 1=6%.

It was earlier found that two scales, one for each lifestyle, were sufficient to describe the between scale variation. There is, therefore, a hypothesis of additivity for each of these scales

H3: $\alpha_{\mathrm{lifestyle}(p),\,\mathrm{strength} \times \mathrm{quality}(i)} = \alpha'_{\mathrm{lifestyle}(p),\,\mathrm{strength}(i)} + \alpha''_{\mathrm{lifestyle}(p),\,\mathrm{quality}(i)}$ (5)

Due to the fact that quality and lifestyle are balanced, the variation of scales (s.s.=2654.39, df=6) can be orthogonally decomposed into 6 effects, Table 8. The first three parts express the main effects of quality and strength and the interaction of quality and strength; the next three parts, as there are two scales to decompose, express the interaction of lifestyle and quality, the interaction of lifestyle and strength and the ``three variable interaction'' of lifestyle, quality and strength. First of all it is seen that the dominant effect is quality (m.s.=2093.05, p<0.001), but strength also has some effect and the interaction between quality and strength is also significant (p<0.001). Furthermore, the interaction between quality and lifestyle is significant (p<0.001), meaning that the effect of quality is not of equal size for the two lifestyles. The interaction between strength and lifestyle is not significant, so the effect of strength is the same for both lifestyles. Finally, the interaction between quality, strength and lifestyle is significant (p=0.047); however, as the significance is not very strong and the numerical effect is very small, we shall for the present leave this effect out to make the interpretation of results easier.

Table 8
Analysis of variance. Structure of objects and lifestyle
Effect                       df   s.s.      m.s.      F        p
Quality                      1    2093.05   2093.05   2285.2   <0.001
Strength                     1    353.91    353.91    386.4    <0.001
Quality×strength             1    22.21     22.21     24.2     <0.001
Quality×lifestyle            1    179.58    179.58    196.1    <0.001
Strength×lifestyle           1    1.79      1.79      2.0      0.165
Quality×strength×lifestyle   1    3.85      3.85      4.2      0.047
Variation of scales          6    2654.39   442.40

Recalling that preferences were judged on a 22 cm visual analogue scale, the conclusion of the analysis can be interpreted in the following way: The preference towards strong beer over standard beer is independent of lifestyle and is estimated to be 2.2 cm. The preference towards high quality beer over low quality beer is connected to lifestyle and to some extent to the alcohol content of the beer. For standard beer (3.8%) the preference was estimated to be 4 cm for academics and 7.4 cm for non-academics. For strong beer (6%) the corresponding preferences were estimated to be 4.8 and 8.2 cm, respectively. It is seen that the preference for high quality beer was slightly more pronounced for strong beer.
5. Reducing the number of comparisons
In the present experiment each subject had to perform 12 comparisons. In general this is too many, and if the number of objects increases the number of possible comparisons increases very rapidly. Therefore, an important question is whether it is possible to reduce the number of comparisons to be performed by each subject and still maintain the structure of orthogonal decompositions.
The comparisons to be performed by a subject can be described as a set of ordered pairs in which the first and second coordinates represent the first and second object, respectively. For the comparisons (1,2), (2,3), (3,4) and (4,1) the effects of preferences are orthogonal to the order effects; however, for the comparisons (1,2), (2,3), (3,4), (4,1), (1,3), (2,4) the corresponding effects are not orthogonal.
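The orthogonality claim for reduced sets of comparisons can be checked mechanically. The sketch below assumes that the preferences are expressed through rating scales, i.e. through the a-variables of the Appendix (a_k = -1 if object k is the first object, +1 if it is the second, 0 otherwise), and that the order effect enters as a constant column. Orthogonality then means that every a-variable sums to zero over the chosen comparisons. The function names are hypothetical.

```python
def scale_columns(pairs, n_objects=4):
    """a-variables for a list of ordered pairs (i, j)."""
    # (j == k) - (i == k) evaluates to +1, -1 or 0 as an integer.
    return {k: [(j == k) - (i == k) for i, j in pairs]
            for k in range(1, n_objects + 1)}


def orthogonal_to_order_effect(pairs):
    # The order effect is a constant column of ones, so its inner product
    # with an a-variable is just the sum of that a-variable.
    return all(sum(col) == 0 for col in scale_columns(pairs).values())


cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
extended = cycle + [(1, 3), (2, 4)]
full = [(i, j) for i in range(1, 5) for j in range(1, 5) if i != j]

print(orthogonal_to_order_effect(cycle))     # True
print(orthogonal_to_order_effect(extended))  # False
print(orthogonal_to_order_effect(full))      # True
```

In the four-comparison cycle every object appears once as first and once as second object, which is what balances the design; adding (1,3) and (2,4) breaks that balance.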
Table 9
Variables created to perform regression analysis
Comparison   t12  t23  t34  t41  t31  t24   a1   a2   a3   a4    q    s   qs
(1,2)          1    0    0    0    0    0   -1    1    0    0    0    1    1
(2,1)         -1    0    0    0    0    0    1   -1    0    0    0   -1   -1
(2,3)          0    1    0    0    0    0    0   -1    1    0    1   -1    0
(3,2)          0   -1    0    0    0    0    0    1   -1    0   -1    1    0
(3,4)          0    0    1    0    0    0    0    0   -1    1    0    1   -1
(4,3)          0    0   -1    0    0    0    0    0    1   -1    0   -1    1
(4,1)          0    0    0    1    0    0    1    0    0   -1   -1   -1    0
(1,4)          0    0    0   -1    0    0   -1    0    0    1    1    1    0
(3,1)          0    0    0    0    1    0    1    0   -1    0   -1    0   -1
(1,3)          0    0    0    0   -1    0   -1    0    1    0    1    0    1
(2,4)          0    0    0    0    0    1    0   -1    0    1    1    0   -1
(4,2)          0    0    0    0    0   -1    0    1    0   -1   -1    0    1
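The variables of Table 9 follow mechanically from the definitions in the Appendix, so they can be generated rather than typed in. The sketch below is one possible construction and rests on several assumptions: the quality and strength codes are read from Table 7 as (0,0,1,1) and (0,1,0,1) for objects 1 to 4, the auxiliary row r is taken as (0,1,1,0), and differences are formed as second object minus first object. All names are hypothetical.

```python
# Ordered pairs (i, j) in the row order of Table 9.
PAIRS = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3),
         (4, 1), (1, 4), (3, 1), (1, 3), (2, 4), (4, 2)]

# Pairs that define the preference variables t12, t23, t34, t41, t31, t24.
T_PAIRS = [(1, 2), (2, 3), (3, 4), (4, 1), (3, 1), (2, 4)]

# ASSUMED object codes (see Table 7) and interaction row r.
QUALITY = {1: 0, 2: 0, 3: 1, 4: 1}
STRENGTH = {1: 0, 2: 1, 3: 0, 4: 1}
R = {1: 0, 2: 1, 3: 1, 4: 0}


def design_row(i, j):
    """Regression variables for the comparison with i left and j right."""
    row = {}
    for a, b in T_PAIRS:
        # +1 for the defining order, -1 for the reversed order, else 0.
        row[f"t{a}{b}"] = 1 if (i, j) == (a, b) else -1 if (i, j) == (b, a) else 0
    for k in range(1, 5):
        row[f"a{k}"] = (j == k) - (i == k)
    row["q"] = QUALITY[j] - QUALITY[i]
    row["s"] = STRENGTH[j] - STRENGTH[i]
    row["qs"] = R[j] - R[i]
    return row


TABLE = {pair: design_row(*pair) for pair in PAIRS}

for pair, row in TABLE.items():
    print(pair, list(row.values()))
```

With these codes q equals a3 + a4 and s equals a2 + a4 in every row, which is the reparametrization that links the scale variables to the structure of objects.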
Table 10
Original observations measured on a (-11,11) scale
                               Subject
Comparison      1      2      3      4      5      6      7      8
(1,2)         2.6    3.8    2.6    3.3    2.4    0.7    2.6    0.8
(2,1)        -1.3   -2.3   -1.4   -1.7   -1.3   -3.1   -1.5   -2.7
(2,3)         2.0    2.4    2.8    2.6    6.9    5.7    7.6    5.4
(3,2)        -0.8   -1.3   -1.3   -1.8   -5.7   -4.7   -5.7   -4.4
(3,4)         3.8    3.2    3.6    3.9    4.6    4.6    4.6    4.6
(4,3)        -2.4   -1.8   -2.3   -2.4   -3.8   -3.4   -3.7   -3.4
(4,1)        -5.5   -6.3   -5.8   -7.2  -10.2   -9.8   -9.3   -9.3
(1,4)         6.6    7.5    6.8    7.5    7.8    7.7    7.2    6.9
(3,1)        -3.3   -3.8   -4.2   -4.0   -6.4   -7.1   -6.1   -6.5
(1,3)         4.5    1.7    4.6    1.2    7.4    7.9    7.4    7.8
(2,4)         4.6    4.6    5.3    5.2    9.3    9.1    8.4    7.9
(4,2)        -3.5   -3.5   -3.8   -3.4   -8.3   -8.1   -7.3   -7.3
In the practical planning of an experiment there are many considerations to be taken into account. Each comparison may be very time consuming, so that the number of comparisons to be performed by each subject becomes very limited; each comparison may be expensive, so that the total number of comparisons must be kept small due to a budget restriction; for some reason it might be appropriate that an object appears at most once before each subject, etc. A detailed discussion concerning the existence of plans of comparisons fulfilling certain requirements must take into account the specific conditions given by the practical experiment; it is therefore rather extensive and beyond the limits of this paper.
6. Conclusions
The method of paired comparisons can in many situations be an attractive alternative to a direct response design. A careful design of the experiment implies that an analytical approach to the problems in question can be applied (ANOVA), in contrast to the sample survey approach, which is more descriptive. Furthermore, the order effect can be estimated and/or eliminated; it is not necessary for each subject to perform all possible comparisons; if no rating scales exist it is possible to investigate the reasons for this, e.g. by putting up alternative hypotheses; if rating scales exist it is possible to estimate the influence of different attributes of subjects on the rating scales; and finally structure of objects can be included in the analysis.
Appendix
Computations
The models considered are linear models and can therefore be fitted by regression analyses provided by usual computer software. In the following we use the model formula from GENSTAT, which is very similar to the notation used by SAS in the GLM procedure.
Note that we have 96 observations obtained by stringing out the measurements given in Table 10. Each observation is an individual comparison and is indexed by subject and the ordered pair of objects to be compared, i.e. [subject, (i,j)].
We specify the following categorical variables: SUBJECT, GENDER and STYLE, identifying the subjects plus gender and lifestyle of subjects taken from Table 5. Furthermore, we specify the product factor GENDERSTYLE of all combinations of GENDER and STYLE.
Corresponding to $\tau_{12}$ we construct a variable t12 by t12=1 if (i,j)=(1,2), t12=-1 if (i,j)=(2,1) and t12=0 otherwise. In a similar way we construct variables t23, t34, t41, t31 and t24, Table 9. The model formula for Model (1) will be

SUBJECT + SUBJECT.(t12 + t23 + t34 + t41 + t31 + t24) (6)

The ``.'' is the interaction syntax of GENSTAT (corresponding to ``*'' in other software, like SAS), and one can think of it as multiplication: the parentheses mean that the model includes all interaction terms between SUBJECT and each of the terms inside the parentheses.
This and the following model formulas have too many parameters, which means that some parameters must be set equal to zero. This does not affect the analysis of variance; however, the estimated parameters must be interpreted accordingly.
Tables 1 and 2 are obtained by fitting appropriate sub-models of (6), i.e. by leaving out parts of (6). The 8 parameters of SUBJECT are the order effects. If SUBJECT is left out the constant will be the common order effect.
To estimate the scales we specify variables a1, a2, a3 and a4 by
a1=-1 if i=1, a1=1 if j=1 and a1=0 otherwise,
and similarly for a2, a3 and a4. The model formula for Model (3) becomes

SUBJECT + SUBJECT.(a1 + a2 + a3 + a4) (7)

Table 3 is obtained by fitting appropriate sub-models of (7). In a similar way Table 6 is obtained using the model formula

SUBJECT + (a1 + a2 + a3 + a4) + (GENDER + STYLE + GENDERSTYLE).(a1 + a2 + a3 + a4) (8)

and appropriate sub-models.
To compute Table 8 we note that Model (5) states that $\alpha_i = \alpha''_{\mathrm{quality}(i)} + \alpha'_{\mathrm{strength}(i)}$, i=1,2,3,4, and thereby
$\alpha_j - \alpha_i = [\alpha''_{\mathrm{quality}(j)} - \alpha''_{\mathrm{quality}(i)}] + [\alpha'_{\mathrm{strength}(j)} - \alpha'_{\mathrm{strength}(i)}]$.
We construct variables q and s by
q(i,j) = quality(j) - quality(i)
and
s(i,j) = strength(j) - strength(i)
where quality and strength of object i are 0 or 1 according to Table 7.
Adding a row, r, to Table 7 by r(i)=0 if i=1 or 4 and r(i)=1 if i=2 or 3, we can define a variable qs as the interaction between strength and quality by qs(i,j) = r(j) - r(i). From the statistical analysis it was found that a scale for each lifestyle was required. The model formula for fitting these scales is

SUBJECT + (a1 + a2 + a3 + a4) + STYLE.(a1 + a2 + a3 + a4) (9)

The same model, however with another parametrization, can be specified by

SUBJECT + (q + s + qs) + STYLE.(q + s + qs) (10)

Table 8 is obtained from (10) and appropriate sub-models.
Data table
The scores were measured on a line segment represented by the interval (-11,11). The original values are given in Table 10.
References
Bradley, R. A., & Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika, 39, 342-345.
Luce, R. D. (1959). Individual choice behaviour. New York: Wiley.
Scheffé, H. (1952). An analysis of variance for paired comparisons. American Statistical Association Journal, September, 381-400.
Thurstone, L. L. (1927). A law of comparative judgement. Psychological Review, 34, 278-286.
