
Discovering Cognitive Architecture by

Selectively Influencing Mental Processes



ADVANCED SERIES ON MATHEMATICAL PSYCHOLOGY

Series Editors: H. Colonius (University of Oldenburg, Germany)
                E. N. Dzhafarov (Purdue University, USA)

Vol. 1: The Global Structure of Visual Space
        by T. Indow

Vol. 2: Theories of Probability: An Examination of Logical and
        Qualitative Foundations
        by L. Narens

Vol. 3: Descriptive and Normative Approaches to Human Behavior
        edited by E. Dzhafarov & L. Perry

Vol. 4: Discovering Cognitive Architecture by Selectively Influencing
        Mental Processes
        by R. Schweickert, D. L. Fisher & K. Sung



Advanced Series on Mathematical Psychology Vol. 4

Discovering Cognitive Architecture by
Selectively Influencing Mental Processes

Richard Schweickert
Purdue University, USA

Donald L. Fisher
University of Massachusetts Amherst, USA

Kyongje Sung
Johns Hopkins University
School of Medicine, USA

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI



Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data


Schweickert, Richard.
Discovering cognitive architecture by selectively influencing mental processes / by Richard
Schweickert, Donald L. Fisher & Kyongje Sung.
p. cm. -- (Advanced series on mathematical psychology ; v. 4)
Includes bibliographical references and index.
ISBN 978-981-4277-45-7 (hardcover : alk. paper)
ISBN 981-4277-45-2 (hardcover : alk. paper)
1. Psychology--Mathematical models. 2. Psychometrics. I. Fisher, Donald L. II. Sung, Kyongje.
III. Title.
BF39.S345 2012
150.1'5195--dc23
2012002594

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 2012 by World Scientific Publishing Co. Pte. Ltd.


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.

Printed in Singapore.



To Carolyn, Patrick and Ken,

R. S.

To Susan Duffy and Susan Haas,


Without whom I would never have realized what little I have been able to
give to my daughters, my family and my friends.

D. L. F.

To Sue, Siwon, Jiwon, and Sihyun,


whose influences on my life are not selective at all.

K. S.
Preface

This is a book about a technique used in cognitive psychology to
learn how mental processes are organized. Many things are studied
scientifically by removing parts to observe the effects, or by cutting a
part in two to see which subpart is crucial, techniques obviously not
suited to studying the working human brain. The technique of selectively
influencing processes is more subtle. To study a task, an experimenter
finds a manipulation expected to change the difficulty of a single mental
process, leaving everything else invariant. An experiment is done,
measuring, say, reaction time or accuracy. Effects of making two
manipulations at the same time are compared with the effects of making
them one at a time. Results test whether each manipulation indeed
selectively influenced a single process. If the test is passed, something is
revealed about how the two processes are arranged, for example, whether
they are in series or in parallel. This book reviews several ways of doing
this.
The technique became immediately popular when Saul Sternberg
introduced it for reaction times in 1969. Its few simple assumptions lead
to a few simple predictions. Success encouraged experimenters to apply
it with dependent variables other than reaction times and with more and
more complicated process arrangements. It has evolved into a technique
with technicalities, and controversies about interpretations have arisen.
Currently, disagreements about process arrangement are entangled with
disagreements about the applicability of the technique. Despite the
problems, selective influence is a tool yielding insights in many
applications. This book is a survey and a guide. It is about what is
settled; rapidly changing frontiers are better tracked in journals than
books.
Most mathematical notions are introduced here from scratch.
However, the theory has advanced to the point where it is not feasible to
include all material needed to make the book self-contained. The reader
is assumed to be familiar with probability and statistics, as found in a
book such as Hays (1994). Later chapters use calculus.
We thank many colleagues who have generously contributed to this
work. We have benefited from communication with Donald Bamber,
Hye Joo Han, James Nairne, Ian Neath, Marie Poirier, Saul Sternberg,
Gerald Tehan, James T. Townsend and Zhuangzhuang Xi. We thank
Harold Pashler and Gerald Tehan for providing data and Seth Roberts
and Saul Sternberg for providing simulations. We are grateful to our
editors, Hans Colonius and Ehtibar N. Dzhafarov, for helpful
discussions. Portions of this work were supported by NIMH grants
MH38675 and MH41452, and AFOSR grant FA9550-06-0383 to
Schweickert, and AFOSR grant FA9550-09-0252 to Schweickert and
Dzhafarov. Also portions were supported by the Therapeutic Cognitive
Neuroscience Fund and by the Benjamin and Adith Miller Family
Endowment on Aging, Alzheimer’s Disease, and Autism to Barry
Gordon through Kyongje Sung. Errors are, of course, the responsibility
of the authors.
Contents

Preface vii
Chapter 1: Introduction to Techniques 1
Chapter 2: Introduction to Process Schedules 7
Chapter 3: Selectively Influencing Processes in Task Networks 20
Chapter 4: Theoretical Basis for Properties of Means
and Interaction Contrasts 64
Chapter 5: Critical Path Models of Dual Tasks
and Locus of Slack Analysis 93
Chapter 6: Effects of Factors on Distribution Functions
and Consideration of Process Dependence 152
Chapter 7: Visual and Memory Search, Time Reproduction,
Perceptual Classification, and Face Perception 223
Chapter 8: Modeling with Order of Processing Diagrams 256
Chapter 9: Selective Influence with Accuracy, Rate,
and Physiological Measures 291
Chapter 10: Selective Influence of Interdependent Random
Variables 359
References 383
Author Index 403
Subject Index 409

Chapter 1

Introduction to Techniques

A person performing a task such as searching a screen for a target
executes mental processes such as perceiving, recognizing, selecting a
response and so on. In the early days of experimental psychology Wundt
tried to directly find the duration of a single process, apperception, by
asking an observer to directly insert this process into a task or remove it.
For Wundt, perception denoted “the appearance of a content in
consciousness” after a stimulus is presented, and apperception denoted a
deeper process, “its reception into the state of attention” (Külpe, 1895, p.
426). Investigators in Wundt’s lab presented stimuli to observers trained
in introspection. In one condition the observer was instructed to respond
when the stimulus was perceived, and in another condition the observer
was instructed to respond when the stimulus was apperceived. By
subtracting the reaction time for the perception condition from the
reaction time for the apperception condition, the time for apperception
itself could be found. It would have been fortunate if this naive
procedure had worked, but even at the time it was unconvincing. Cattell
(1893) said that “the great variation ... of the measurements bears witness
to the lack of an objective criterion.”
The approach of Donders (1868) to inserting processes was more
objective. The assumption was that processes required for performing a
task are executed one after the other, in series, and the reaction time is
the sum of the times required for the individual processes. The aim was
to insert or remove processes by changing the task to be done. For
example, in one experiment, the stimulus was a vowel sound and the
observer’s response was to repeat the vowel. In the first condition (a),
the simple condition, the observer knew which vowel was to be
presented and only had to repeat it. In a second condition (b) the
observer did not know which of five vowels was to be presented, and had
to repeat whichever was presented. Condition (b) requires two processes
in addition to those required in condition (a), namely, (1) the stimulus
vowel must be discriminated and (2) the response vowel must be chosen.
Thus, subtracting reaction time for (a) from reaction time for (b) gives
the time required for discrimination plus the time required for choice.
In a third condition (c) the observer did not know which of five
vowels was to be presented, but only had to respond to one, say, i, by
repeating it when it was presented. No choice was required for the
response, although discrimination of the stimulus vowel was required.
Thus, subtracting response time for (c) from response time for (b) gives
the time required for choice. Subtracting response time for (a) from
response time for (c) gives the time required for discrimination.
Donders reported that the mean reaction time for condition (a) was
201 msec, for (b) 284 msec, and for (c) 237 msec. Then the time for
discrimination is c − a = 36 msec, and the time for choice is b − c = 47
msec.
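Donders' arithmetic can be written out explicitly; a minimal sketch using the mean reaction times reported above:

```python
# Sketch of Donders' subtractive logic using the mean reaction times
# reported in the text (values in msec).
a = 201  # condition (a): known vowel, repeat it
b = 284  # condition (b): discriminate among five vowels, choose response
c = 237  # condition (c): discriminate, but respond to only one vowel

discrimination_time = c - a  # condition (c) adds discrimination to (a)
choice_time = b - c          # condition (b) adds response choice to (c)

print(discrimination_time)  # 36
print(choice_time)          # 47
```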
An important feature of this Subtractive Method of Donders is that
the experimenter can determine, based on the observer's responses,
whether the intended task was performed. Nonetheless, the experimenter
does not know how the task was performed. A criticism at the time,
based on introspection, was that changing from one condition to another
changes the nature of the processing, even of processing that precedes
the process allegedly inserted (Külpe, 1895, p. 414).
In principle, the results of Donders’ subtractive method can be
checked. To take a simple case, suppose the experimenter can insert and
remove two processes, with durations x and y. By producing three tasks
intended to have reaction times, respectively, of

x
y
x+y

the experimenter can check whether the duration of the third task is
indeed the sum of the durations of the first two tasks. This example is an
oversimplification, but with three or more processes, if the experimenter
inserts and deletes processes in the right combinations, the number of
observed response times can be made large enough so that values of
unknown process durations can be solved for in more than one way,
allowing a check. Curiously, this does not seem to have been done at the
time.
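Such a check can be illustrated with a hypothetical insertion design. Everything in this sketch (the six tasks, the design matrix, and the process durations) is invented for illustration; the point is that with more observed times than unknown durations, the durations are overdetermined and internal consistency can be tested:

```python
import numpy as np

# Hypothetical check of the Subtractive Method with three insertable
# processes (durations x, y, z) and six tasks built from them.
# Each row says which processes a task includes.
design = np.array([
    [1, 0, 0],   # task 1: process x alone
    [0, 1, 0],   # task 2: process y alone
    [0, 0, 1],   # task 3: process z alone
    [1, 1, 0],   # task 4: x and y in series
    [0, 1, 1],   # task 5: y and z in series
    [1, 1, 1],   # task 6: all three in series
])
true_durations = np.array([100.0, 150.0, 80.0])
observed = design @ true_durations  # perfectly additive data (no noise)

# Solve the overdetermined system; the worst-case misfit is the check.
estimated, *_ = np.linalg.lstsq(design, observed, rcond=None)
consistency = np.abs(design @ estimated - observed).max()  # 0 if additive
print(estimated)    # recovers [100., 150., 80.]
print(consistency)
```

With real data the observed times would contain noise, and the misfit would be judged against its sampling variability rather than against zero.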

Stretching Processes Rather Than Inserting Them

In Sternberg’s (1969) elegant approach, instead of trying to insert a
process, the experimenter tries to change the task slightly, in order to
make an existing process take longer, without changing anything else.
Such a manipulation is called a factor, and is said to selectively influence
the process.
The two major assumptions of the theory are (1) processes are
executed in series, so the reaction time is the sum of the durations of the
individual processes, and (2) each of two experimental factors prolongs a
different process. There are also secondary assumptions about the
measurement of time and so on. These are numerous and so minor we
can safely assume they are met. Then the theory predicts the combined
effect of prolonging two processes will be the sum of the effects of
prolonging them individually.
This prediction can be tested with an Analysis of Variance
(ANOVA). Factors having additive effects on response time are called
additive factors. Two factors that are not additive are said to interact. It
is common to call processes executed one after the other stages. An
experiment with additive factors supports the theory. If two factors
interact, at least one of the major assumptions is wrong. Sternberg
(1969) proposed that if two factors interact, it is likely that assumption
(2) is violated and the two factors influence the same stage.
The technique of selective influence for a series of stages is called the
Additive Factor Method. With it, the experimenter obtains an immediate
check on the assumptions through the test of interaction in the Analysis
of Variance. Its applications have been numerous; see, e.g., Sanders
(1990) and Sternberg (1998).
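The additivity prediction reduces to a contrast on the four cell means of a 2 × 2 design. A sketch with made-up cell means (in a real experiment, the ANOVA tests whether this contrast differs from zero given trial-to-trial noise):

```python
# Interaction contrast from the four cell means of a 2 x 2 design.
# Under the Additive Factor Method, two factors that selectively
# influence different serial stages predict a zero contrast.
# Cell means below (msec) are invented and built to be additive:
# effect of Factor A = 40 msec, effect of Factor B = 25 msec.
t11, t12 = 400.0, 425.0   # Factor A at level 1; Factor B at levels 1, 2
t21, t22 = 440.0, 465.0   # Factor A at level 2; Factor B at levels 1, 2

interaction_contrast = t22 - t21 - t12 + t11
print(interaction_contrast)  # 0.0 -> consistent with additive factors
```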
As a well-known example, in experiments of Sternberg (1966, 1967),
the subject was given a set of digits to memorize, the positive set. On
each trial a digit was presented, and the task was to respond whether the
presented digit is in the positive set or not. The task is now called the
Sternberg memory scanning task, or the memory search task. Two
factors discussed by Sternberg (1969) are (1) a change in stimulus
quality produced by superimposing a checkerboard pattern on the
stimulus digit and (2) a change in the size of the positive set from 1 to 2
to 4 digits. The two factors had additive effects on reaction time
(Sternberg, 1966, 1967). The interpretation is that the first factor
selectively influences one stage, stimulus encoding, and the second
factor selectively influences another stage, memory comparison; further,
the two stages are arranged in series. Other stages in the series were
selectively influenced by other factors; see Sternberg (1969).
When the combined effect of two factors is the sum of their separate
effects, we say the composition rule is addition. In that case, there is a
model of the situation in which two processes are in series, with each
factor selectively influencing a different process.
However, additivity of the factors does not imply that two processes
in series exist in reality, because other process arrangements could yield
an additive composition rule. Tension between what can be observed
and what can be inferred has been part of cognitive psychology since its
inception, because the subject matter, cognition, is only partly
observable. Consider the case of two factors, Factor A and Factor B,
having additive effects on reaction time. Suppose each factor has two
levels. Let τ11 be the reaction time when both factors are at level 1, let τ12
be the reaction time when Factor A is at level 1 and Factor B is at level
2, and so on. The two factors have additive effects if changing the level
of Factor A from 1 to 2 has the same effect at each level of Factor B.
That is,

τ21 − τ11 = τ22 − τ12.


To construct a model with two processes in series, let process A have
duration

a1 = .5τ11                 when Factor A is at level 1

a2 = .5τ11 + (τ21 − τ11)   when Factor A is at level 2.

Let process B have duration

b1 = .5τ11                 when Factor B is at level 1

b2 = .5τ11 + (τ12 − τ11)   when Factor B is at level 2.

Finally, suppose the response time for a level of Factor A combined with
a level of Factor B is the sum of the corresponding durations of process A
and process B.
It is easy to check that when both factors are at level 1, the reaction
time is τ11, when Factor A is at level 1 and Factor B is at level 2, the
reaction time is τ12, and so on. Each factor changes the duration of only
one process. Hence, the data can be represented by two processes in
series, with durations as above, and with each factor selectively
influencing a different process. (See Dzhafarov and Schweickert, 1995,
for a representation in which the reaction times and process durations are
random variables, rather than fixed constants.)
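The construction above can be carried out numerically. The four cell times below are invented (and additive); the sketch verifies that the resulting series model reproduces every cell while each factor changes only one process duration:

```python
# Construct the two-process series representation from four (made-up,
# additive) cell reaction times tau[(i, j)], where i is the level of
# Factor A and j the level of Factor B.
tau = {(1, 1): 400.0, (1, 2): 425.0, (2, 1): 440.0, (2, 2): 465.0}

a = {1: 0.5 * tau[(1, 1)],
     2: 0.5 * tau[(1, 1)] + tau[(2, 1)] - tau[(1, 1)]}  # process A durations
b = {1: 0.5 * tau[(1, 1)],
     2: 0.5 * tau[(1, 1)] + tau[(1, 2)] - tau[(1, 1)]}  # process B durations

# The series model reproduces every cell time.
for (i, j), t in tau.items():
    assert abs(a[i] + b[j] - t) < 1e-9
print(a[1], a[2], b[1], b[2])  # 200.0 240.0 200.0 225.0
```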
Clearly, it is arbitrary to use .5τ11 in the durations above, so this is not
the only way to represent the data with two processes in series. More
troubling is that a quite different process arrangement can also represent
the data. We have implicitly assumed that a process begins processing at
a starting point and stops processing at a finishing point, and if a second
process follows, its starting point is the finishing point of the first
process. McClelland (1979) and Townsend and Ashby (1983) showed
that factors can have additive effects on reaction time in a different kind
of model, where as soon as a process begins, it starts sending output to its
successor. McClelland’s model is called the cascade model, and Eriksen
and Schultz (1979) call such models continuous flow models. An
analysis of the cascade model by Roberts and Sternberg (1993) showed
that it failed to account for aspects of their data. But it often happens that
two process arrangements account for the known data equally well. In
the end, the choice between them can only be based on nonempirical
considerations such as simplicity, plausibility and taste.
If two factors do not have additive effects on reaction time, it is
possible that each factor prolongs a different process, so assumption (2)
above is satisfied, but the processes are not in series, so assumption (1) is
violated. Sternberg (1969) pointed out that if processes are in parallel,
the effect of prolonging two of them would be the maximum of the
effects of prolonging them separately.
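With fixed (deterministic) durations the two composition rules can be compared directly; all numbers below are illustrative:

```python
# Compare composition rules for two selectively influenced processes.
# Illustrative durations (msec); dA and dB are the prolongations
# produced by the two factors.
a, b = 120.0, 150.0
dA, dB = 30.0, 40.0

# Serial: RT = a + b, so joint effect is the sum of the separate effects.
serial_joint = (a + dA) + (b + dB) - (a + b)   # 70.0 = dA + dB

# Parallel (both must finish): RT = max(a, b), so for deterministic
# durations the joint effect is the maximum of the separate effects.
par_base = max(a, b)
par_A = max(a + dA, b) - par_base              # 0.0
par_B = max(a, b + dB) - par_base              # 40.0
par_joint = max(a + dA, b + dB) - par_base     # 40.0 = max(par_A, par_B)

print(serial_joint, par_A, par_B, par_joint)
```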
In some situations, the processes are not all in series and they are not
all in parallel. Evidence comes from dual tasks, in which two stimuli are
presented, and a response is made to each (Telford, 1931). Consider the
case of two stimuli presented at the same time. When results are
compared with the corresponding single tasks in which each reaction is
made separately, the following outcomes are typical (e.g., Schvaneveldt,
1969), although not always found. In the dual task, the subject is not
carrying out all the single task processing for the first response, followed
by all the single task processing for the second response, because the
time required to do the dual task is less than the sum of the times
required to respond to each stimulus separately. On the other hand, in
the dual task the subject is not carrying out all the single task processing
for the first response simultaneously with all the single task processing
for the second response, because the time to do the dual task is longer
than the maximum of the times required to respond to each stimulus
separately. Something more general than pure serial or pure parallel
processing is needed.
Chapter 2

Introduction to Process Schedules

The main reason for selectively influencing processes is to learn about
the arrangement of the processes in a structure containing them. It is
clear that there may not be a single structure used for all tasks. Meyer
and Kieras (1997a, 1997b) emphasize that a system with flexible
strategies will operate in a variety of ways. This chapter introduces two
structures, task networks and trees, which are plausible, tractable and
testable. The former are often used for modeling reaction times, the
latter for response probabilities. Other structures will be introduced in
later chapters.

Gantt Charts and Directed Acyclic Task Networks

Bar charts are a natural way to represent the mental processes required
for a task; they are especially useful when intuition about process
durations is important. Bar charts are also called Gantt charts. Figure
2.1 gives an example for processes in a dual task. Stimulus s1 is
presented, followed after a stimulus onset asynchrony (SOA) by stimulus
s2. Responses r1 and r2 are made to stimuli s1 and s2 respectively.
There are three sequential processes for each stimulus, a perceptual
process, A, a central process, B, and a motor preparation process, C. (A
motor movement follows motor preparation, but because reaction time is
ordinarily measured at response onset, the motor movement that follows
is ordinarily not illustrated.) In this model, the perceptual processes for
the two stimuli, A1 and A2, are executed concurrently. However, the
response to the second stimulus is delayed because, in accord with
Welford’s (1952, 1967) single channel theory, the central processing, B2,
for the second stimulus cannot begin until the central processing, B1, for
the first stimulus is finished. The central processes B1 and B2 are
executed sequentially. The first use of this model that we are aware of is
by Davis (1957); it was popularized by Pashler and Johnston (1989). For
more discussion, see Pashler (1994).

Fig. 2.1. Gantt chart for a dual task.
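The timing implied by the Gantt chart can be sketched as a small function. Durations and the SOA below are illustrative, and the function assumes the single-channel rule that B2 waits for B1:

```python
# Sketch of the single-channel model of Fig. 2.1. Each stimulus needs
# a perceptual process A, a central process B, and motor preparation C;
# the central process B2 cannot start until B1 has finished.
def dual_task_rts(soa, a1, b1, c1, a2, b2, c2):
    b1_end = a1 + b1
    rt1 = b1_end + c1                  # RT to s1, from s1 onset
    a2_end = soa + a2
    b2_start = max(a2_end, b1_end)     # central bottleneck
    rt2 = b2_start + b2 + c2 - soa     # RT to s2, from s2 onset
    return rt1, rt2

# At a short SOA, r2 is delayed because B2 must wait for B1.
rt1, rt2 = dual_task_rts(soa=50, a1=100, b1=150, c1=80, a2=100, b2=150, c2=80)
print(rt1, rt2)  # 330 430
```

At a long SOA (here, 500 msec or more) B1 is finished before A2 ends, and the second response shows no bottleneck delay.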

When intuition about relationships among processes is important, a
Gantt chart is often replaced with an equivalent directed acyclic task
network. Figure 2.2 shows a directed acyclic task network corresponding
to the Gantt chart in Figure 2.1. The network is directed because each
arc has a direction, and it is acyclic because no process precedes itself;
that is, one cannot go from the head of an arc to its tail by following a
sequence of arcs, each from tail to head. A wide variety of models are
explicitly or implicitly in the form of Gantt charts or directed acyclic task
networks. These include serial models (Donders, 1868; Sternberg,
1969), parallel models (Townsend, 1972), and the dual task model
already mentioned (Davis, 1957; Pashler & Johnston, 1989). They also
include models of de Jong (1993); Ehrenstein, Schweickert, Choi and
Proctor (1997); Fisher and Glaser (1996); Johnston, McCann and
Remington (1995); Osman and Moore (1993); Pashler (1984); Ruthruff,
Miller, and Lachman (1995); Van Selst and Jolicoeur (1994); and
Welford (1952). The various models make different predictions about
details, but because they all can be represented as Gantt charts (or,
equivalently, as directed acyclic task networks), there are certain general
predictions they all make. If one of the general predictions fails for an
experiment, there is no possible directed acyclic task network in which
the experimental factors selectively influence different processes.

Fig. 2.2. Directed acyclic network equivalent to Gantt chart for dual task.

If the processes in a task cannot be represented in an acyclic task
network, they can sometimes be represented in a more general structure,
an OP (Order-of-Processing) diagram. These were introduced by Fisher
and Goldstein (Fisher and Goldstein, 1983; Goldstein and Fisher, 1991,
1992). They were first used to derive moments of response time
distributions for task networks and other models. Later, the availability
of expressions for the moments led Fisher (1985) to propose the use of
OP diagram representations for many different cognitive networks, such
as queuing networks and Petri nets. These will be discussed in later
chapters. For more background on the use of response times to analyze
mental processes, the reader is referred to the excellent surveys by Luce
(1986) and Townsend and Ashby (1983). For networks of queues, see
Liu (1996), Miller (1993), and Wu and Liu (2008).

Directed Acyclic Task Networks

The directed acyclic task network in Figure 2.2 is made of vertices joined
by arcs. Processing begins with the presentation of a stimulus at the
starting vertex of the network. A mental process is represented by an arc
directed from one vertex to another. The starting vertex of an arc, at the
tail, represents the starting point of the process. The ending vertex of the
arc, at the head, represents the finishing point of the process. Responses
are made at the ending vertices of the network.
Sometimes an arc does not represent a mental process, but merely
indicates that one process precedes another. For example, a stimulus
onset asynchrony is represented by an arc directed from the onset of one
stimulus, a vertex, to the onset of another stimulus, another vertex. This
SOA arc does not represent a mental process. As another example,
suppose a process stops using a certain resource at some point,
represented by a vertex, and at another point, represented by another
vertex, a second process starts to use the resource. An arc from the first
vertex to the second vertex can be used to represent the fact that the
resource must be released by the first process before it can be used by the
second. If the resource is available the instant it is released, the duration
of the arc is 0. For convenience, we will often refer to arcs as processes,
even when there is no processing going on. An arc with duration 0
representing precedence is called a dummy process.
By starting at a vertex and moving along arcs in the direction of their
arrows until another vertex is reached, one traces a path. More precisely,
a path from a vertex u to a vertex z consists of the vertex u, followed by
an arc directed from u to a vertex v, followed by an arc directed from v to
a vertex w, and so on, with the last arc having ending vertex z. A single
vertex is considered a path. To indicate that one process immediately
precedes another, the head of the arc representing the first process is
incident with the tail of the arc representing the second. If one process
precedes another (not necessarily immediately), there is a path from the
head of the arc representing the first process to the tail of the arc
representing the second; the path will go along arcs in the direction
indicated by the arrows.
We say a vertex precedes another vertex if there is a path having at
least one arc from the former vertex to the latter vertex. A process
preceding a process, a vertex preceding a process, and so on are defined
similarly. A path that goes from a vertex u to the same vertex u, and that
has at least one arc, is called a cycle. An acyclic network has no cycles,
so a vertex or process does not precede itself. We assume precedence is
transitive, that is, if process x precedes process y, and process y precedes
process z, then x precedes z.
Two processes are sequential or ordered if one precedes the other;
otherwise they are concurrent or unordered. We use the term
“concurrent” as in the operations research literature to mean “potentially
concurrent.” When we say two processes are concurrent, we mean there
is no requirement for one of them to finish before the other can start.
Typically, portions of their execution will overlap in time, but the
processes might not literally be executed simultaneously and it is
possible that one process would be completed before the other one starts.
Some processes begin execution as soon as the first stimulus is
presented. These have their starting vertex at the starting vertex of the
network. We assume every other vertex in the network represents an
AND gate or an OR gate. A process whose starting vertex is at an AND
gate begins execution as soon as all processes immediately preceding it
finish. A process whose starting vertex is at an OR gate begins execution
as soon as any process immediately preceding it finishes. Some
processes have their ending vertex at a response. The response is made
as soon as all immediately preceding processes are finished if the
response is at an AND gate, and as soon as any is finished if the response
is at an OR gate.
In the networks considered here, except for the starting vertex, every
vertex is an AND gate or every vertex is an OR gate. In the former case
the network is called an AND network and the latter case an OR network.
AND networks are often called PERT (Program Evaluation and Review
Technique) networks or critical path networks (Kelley & Walker, 1959;
Malcolm, Roseboom, Clark & Fazar, 1959; Elmaghraby, 1977). For
short, we will use the term task network to refer to an AND network or
an OR network. Networks having both AND and OR gates, or other
kinds of gates, are possible of course, but beyond the scope of this work.
Sometimes a task might appear to require both AND gates and OR
gates, but closer analysis shows it does not. Consider a visual search
task with a process working on each item on the screen, these processes
being concurrent. Suppose on a target absent trial the response is made
as soon as all of these processes finish, each with the answer “nontarget.”
Then the response “absent” is made at an AND gate. Suppose on a target
present trial, several targets are present on the screen. Suppose the
response “present” is made as soon as any process finishes with the
answer “target.” The processes working on the nontarget items can be
ignored, because they will not trigger a response. Then the response
“present” is made at an OR gate. At first it might seem that a single
network with both an AND gate and an OR gate is required. However,
the trials can be separated into target present trials and target absent
trials, with a different network for each type. The network for the
“present” response has an OR gate, the network for the “absent” response
has an AND gate. The task is represented by one OR network and one
AND network. More information on representing tasks with networks is
given in a later chapter.
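The separation into one AND network and one OR network can be sketched directly; the item processing times below are made up:

```python
# Visual search separated into one AND network and one OR network.
# Item processing times (msec) are illustrative.
def absent_rt(item_times):
    # "absent": every concurrent item process must answer "nontarget"
    return max(item_times)          # response at an AND gate

def present_rt(target_item_times):
    # "present": the first target process to finish triggers the
    # response; nontarget processes are ignored
    return min(target_item_times)   # response at an OR gate

print(absent_rt([210, 180, 250, 190]))  # 250
print(present_rt([230, 205]))           # 205
```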
The duration of an arc x is a nonnegative random variable, D(x). On
a particular trial, each arc is assumed to take on a particular value from
its probability distribution. The duration of an arc representing a process
is the duration of the process. The duration of an arc representing an
instantaneous action, such as a resource becoming available, is zero on
every trial.
The duration of a path is the sum of the durations of all the arcs on it.
A path can consist of a single vertex; in that case, the path duration is 0.
Since arc durations are random variables, the duration of a path is a
random variable also. To be specific, suppose a vertex u precedes a
vertex v on a particular path. The durations of the arcs on this path will
vary from trial to trial, so the duration of the path will vary also.
If there is more than one path from u to v, and we are interested in the
longest path from u to v, the path with the longest duration may not be
the same path on each trial. Despite this complication, we can speak of
the duration of the longest path from u to v; it is a random variable whose
value on a particular trial is the sum of the arc duration values on that
path which happens to be the longest for that trial. (On a given trial,
there may be several paths tied as longest or shortest from one vertex to
another; this turns out to not affect our conclusions.)
The time elapsing between the occurrence of vertex u and the
occurrence of vertex v is denoted D(u,v). If all vertices are AND gates,
D(u,v) is the duration of the longest path between vertices u and v. On a
particular trial, the longest path from the starting vertex of the network,
o, to the ending vertex, r, is called the critical path; in an AND network,
the duration of the critical path is the response time for the trial. If all
vertices are OR gates, D(u,v) is the duration of the shortest path between
vertices u and v. The shortest path from vertex u to vertex v is called a
geodesic. In an OR network, the duration of the shortest path from the
stimulus to the response on a trial is the response time for the trial.
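For a single trial, D(o, r) can be computed by a sweep over vertices in topological order. The network and arc durations below are illustrative:

```python
# One trial's arc durations (msec, illustrative) for a small task
# network with starting vertex "o", ending vertex "r", and a dummy arc.
arcs = {
    ("o", "u"): 100, ("o", "v"): 120,
    ("u", "v"): 0,                      # dummy arc: precedence only
    ("u", "r"): 180, ("v", "r"): 150,
}
order = ["o", "u", "v", "r"]            # vertices in topological order

def network_rt(arcs, order, longest=True):
    # Sweep vertices in topological order; best[v] is the duration of
    # the longest (AND network) or shortest (OR network) path from
    # the starting vertex to v.
    pick = max if longest else min
    best = {"o": 0}
    for head in order[1:]:
        best[head] = pick(best[tail] + d
                          for (tail, h), d in arcs.items() if h == head)
    return best[order[-1]]

print(network_rt(arcs, order, longest=True))   # 280: critical path o-u-r
print(network_rt(arcs, order, longest=False))  # 250: geodesic o-u-v-r
```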
If more than one response is made, there will be a response time for
each. If one is interested in a particular response, arcs not preceding that
response can be ignored because they have no influence on the time at
which that response is made. If more than one stimulus is presented, the
response time for the subtask associated with a stimulus is the time
elapsing from the onset of the stimulus to the response for that stimulus.
When two stimuli are presented, they are typically presented in the same
order on every trial, separated by a stimulus onset asynchrony. It is
sometimes of interest to know the time at which a particular response is
made using the time at which the first stimulus was presented as the
reference point.
In the next chapter, we turn to Task Network Inference, the
construction of a directed acyclic task network from observed effects of
factors selectively influencing processes in it.

Acyclic Task Networks in Human Factors

A major use of task networks is in Human Factors. A network is often
drawn to represent operations of machines in a workplace and it is
natural to extend the network to include the cognitive operations of
workers interacting with the machines. Large portions of such cognitive
task networks can be constructed by observing workers and reasoning
about necessary information processing, a procedure called cognitive
task analysis. One of the best examples of a successful application is
Project Ernestine (Gray, John, & Atwood, 1993). In a now well known
story, while new workstations were under development for telephone
operators, analysts observed videotapes of operators using the old
workstations for various tasks. A critical path network was drawn for
each task, using the cognitive task analysis method CPM-GOMS. (CPM
14 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

stands for Cognition, Perception, Movement, and also for Critical Path
Method. GOMS stands for Goals, Operators, Methods and Selection
Rules.) Estimates of the durations of component processes such as
typing and speaking were obtained from the videotapes and from the
human factors literature. For each task, a network was drawn for use of
the old workstation and another network was drawn for use of the new
workstation. With the networks and estimated durations, the time to
complete each task could be predicted for the new workstations.
Specifications indicated that several processes would be faster with
the new workstations. Surprisingly, predicted times to complete tasks
were longer. In the networks, some of the faster processes were not on
the critical path, so their shorter durations did not shorten completion
time. However, several other processes were inserted into the critical
path, thus increasing completion time. When the new workstations were
tested, completion times were indeed longer.
GOMS was developed by Card, Moran and Newell (1983), and
CPM-GOMS by John (1990). Cognitive task analysis is discussed in
Schweickert, Fisher and Proctor (2003).

Systems Not Easily Represented in Acyclic Task Networks

Systems that cannot be formulated as acyclic AND or OR task networks
usually have one of the following features: (1) the absence of discrete
events, (2) the presence of feedback, or (3) the wrong kind of gates. For
issue (1), if there are no discrete events, then the system can be
represented as a directed acyclic task network only in an unenlightening
way, as a single arc directed from the stimulus onset to the response.
Systems with no discrete events are plausible, but beyond the scope of
this work; a special issue of Acta Psychologica (1995) has relevant
papers. For issue (2) some forms of feedback cycles can easily be
reformulated as part of an acyclic task network. For example, if a
process is simply repeated a random number of times, and no output is
sent to other processes until the last repetition, this entire action can be
represented in a network as a single arc with a random duration.
Feedback causes a problem for our analysis when a process producing
feedback activates processes following it at the same time as it
reactivates itself or earlier processes. The problem is that processes
cannot then be readily classified as sequential or concurrent.
For issue (3), production systems (e.g., Anderson & Bower, 1974;
Meyer & Kieras, 1997a, b) are important examples of systems which
often have the wrong kinds of gates for our decomposition. A particular
production system might easily be representable as an AND or OR task
network. But in most production systems an action starts when a
compound proposition becomes true. A problem arises when its truth
value depends on an event such as the presence of a goal instead of the
event that a process has finished. For gates not using standard Boolean
logical operations, decomposition with selective influence may be
difficult. An example of a non-Boolean gate is a gate releasing a process
when the total activation into it exceeds a threshold, the activation being
a continuous quantity. The difficulty does not arise in representing the
task as a network of some kind, perhaps as an OP diagram. The
difficulty is that there is little hope of finding factors which selectively
influence processes when a gate blends outputs of several processes. The
hard problem is finding a robust alternative to selective influence. One
of our major points is that data can easily lead to rejection of the
assumption that a directed acyclic network exists, in which experimental
factors selectively influence processes. The price for a class that can be
rejected is an inability to model everything.

Processing Trees

Responses can be classified in various ways, as, say, correct or incorrect,
and we turn now from the time required to respond to the type of
response that is made. One of the most widely used structures for
modeling accuracy is a processing tree; uses range from perception (e.g.,
Ashby, Prinzmetal, Ivry & Maddox, 1996; Prinzmetal, Ivry, Beck &
Shimizu, 2002) to social cognition (e.g., Klauer & Wegener, 1998).
Batchelder and Riefer (1999) provide an excellent review.
In a processing tree, when a process finishes, it produces an outcome
with a certain probability, and the next process is selected depending on
which outcome occurred. Some outcomes of processes are responses,
and these fall into various classes. A processing tree is used to predict
the probabilities of the response classes from the probabilities of the
various process outcomes. An important kind of processing tree is a
multinomial processing tree, in which the parameters satisfy a certain
constraint ensuring that parameter estimates have a simple form; for
details see Hu and Batchelder (1994).
A widely used processing tree model was proposed by Jacoby (1991)
in his process dissociation procedure, see Figure 2.3. Two groups of
subjects studied the same two lists of words. After study, they were
presented with test words which were from List 1 or List 2 or neither.
Subjects were asked to say for each word whether it was old or new. For
the inclusion group, a word was considered old if it was in either List 1
or in List 2. For the exclusion group, a word was considered old only if
it was in List 2.

Fig. 2.3. Processing trees for inclusion and exclusion conditions. Arcs are directed from
top to bottom.

According to the model, when a subject sees a word at test, he
attempts to consciously recollect it. For a word studied on either list, this
recollection is successful with probability R, and yields the information
that the word was studied and which list it was in. For a studied word, if
the word is not recollected, with probability F it is judged familiar.
Consider a word in List 1. A subject in the inclusion group will say
the word is old if it is recollected, or if it is not recollected, but judged
familiar. That is, for a word in List 1, the subject will say old (correctly)
with probability

pinclusion = R + (1 − R)F.

For a word in List 1, a subject in the exclusion group will not say the
word is old if it is recollected. However, a familiar word is more likely
to be from List 2 (presented recently) than from List 1 (presented earlier),
or new (not presented). So if a word is not recollected, but is judged
familiar, the subject will say the word is old. That is, for a word in List
1, the subject will say old (incorrectly) with probability

pexclusion = (1 − R)F.

The two equations above can be solved for the two unknowns, R and F.
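Subtracting the second equation from the first gives R, since the (1 − R)F terms cancel, and substituting R back into the second equation gives F. The following sketch carries out the algebra; the observed proportions 0.80 and 0.30 are made up for illustration, not data from Jacoby (1991).

```python
def solve_process_dissociation(p_inclusion, p_exclusion):
    """Recover recollection R and familiarity F from the two observed
    probabilities, using p_inclusion = R + (1 - R)F and
    p_exclusion = (1 - R)F."""
    R = p_inclusion - p_exclusion      # the (1 - R)F terms cancel
    F = p_exclusion / (1 - R)          # requires R < 1
    return R, F

# Illustrative (made-up) observed proportions of "old" responses:
R, F = solve_process_dissociation(0.80, 0.30)

# The recovered parameters reproduce the observed probabilities:
p_inc = R + (1 - R) * F
p_exc = (1 - R) * F
```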
To test the model, experimental factors expected to selectively
influence recollection or familiarity are manipulated. For example,
Jacoby (1991) proposed that a secondary task carried out during testing
would not change the familiarity of items, because familiarity was
established during study. However, the secondary task would harm
recollection, because recollection occurs during testing. Hence, the
secondary task is expected to decrease R leaving F invariant, as was
found. An example of a factor that does not selectively influence a
parameter is the presentation of words as anagrams instead of in the
usual way. This manipulation changes both R and F (Jacoby, 1991;
Jacoby, Toth & Yonelinas, 1993).
In a processing tree, at each vertex a process is executed. (In a task
network, processes were represented by arcs.) The first process to be
executed is represented by a special vertex, the root (at the top in our
illustrations). When a process is executed, it produces one of several
possible outcomes. These outcomes are represented by arcs leaving the
vertex representing the process. (Because the direction of all arcs is from
top to bottom, arrows can be omitted.) Such arcs are called the children
of the vertex. An arc is directed from its starting vertex to its ending
vertex. Each child of a vertex has a probability associated with it; this is
the probability the corresponding output is produced, given that the
process represented by the vertex is executed. The sum of the
probabilities associated with the children of a vertex is 1. When an
output is produced by a process, the arc corresponding to it is said to be
traversed, the ending vertex of the arc is said to be reached, and the
process represented by this vertex begins execution. This procedure
continues until a vertex with no children is reached. Such a vertex is
called a terminal vertex, and it produces a response. The responses fall
into mutually exclusive classes. Responses made at a particular terminal
vertex fall into one such class.
As in a task network, a path from a vertex u to a vertex z consists of
the vertex u, followed by an arc directed from u to another vertex v,
followed by an arc directed from v to another vertex w, and so on, with
the last arc having ending vertex z. A single vertex is considered a path.
A simple path is a path in which no vertex is repeated. We say a network
is connected if for any two vertices u and z there is a path from u to z, or
a path from z to u. A tree is a network in which for every pair of vertices u
and z, there is exactly one simple path from u to z or exactly one simple
path from z to u, but not both. With our definition, a tree is connected.
Further, no vertex precedes itself on a path, so a tree is a directed acyclic
network. (With a task network, there may be more than one simple path
from one vertex to another, but this cannot happen in a tree.)
The probability of a path is the product of the probabilities associated
with the arcs on the path. The probability of a path consisting of a single
vertex is 1. Given that processing started at the root, the probability a
response is made at a particular terminal vertex is the product of the
probabilities on the path from the root to that terminal vertex (there is
exactly one such path). The probability a response in a particular class is
made is the sum of the probabilities that responses are made at terminal
vertices associated with that class. For short, we will sometimes say tree
to refer to a processing tree.
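The path-probability rule can be written directly as a recursion over a tree. In this sketch, the inclusion tree of Figure 2.3 is encoded with internal vertices as lists of (probability, child) pairs and terminal vertices as response-class labels; the parameter values are invented for illustration.

```python
R, F = 0.5, 0.6   # illustrative parameter values, not estimates

# Inclusion tree of Figure 2.3: recollect with probability R (respond
# "old"); otherwise judge familiarity: familiar with probability F
# (respond "old"), else respond "new".
inclusion_tree = [
    (R, "old"),
    (1 - R, [
        (F, "old"),
        (1 - F, "new"),
    ]),
]

def class_probabilities(tree, path_prob=1.0, totals=None):
    """Sum the path probabilities (products of arc probabilities from the
    root) over the terminal vertices in each response class."""
    if totals is None:
        totals = {}
    for prob, child in tree:
        if isinstance(child, str):                    # terminal vertex
            totals[child] = totals.get(child, 0.0) + path_prob * prob
        else:                                         # internal vertex
            class_probabilities(child, path_prob * prob, totals)
    return totals

probs = class_probabilities(inclusion_tree)
```

The probability of responding "old" comes out as R + (1 − R)F, the inclusion equation above, and the class probabilities sum to 1 because the children of every vertex have probabilities summing to 1.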
Because multinomial processing trees are so widely used, their
statistical analysis is well developed. See, for example, Batchelder and
Knapp (2004), Batchelder and Riefer (1986, 1990), Chechile and Meyer
(1976), Hu and Batchelder (1994), and Riefer and Batchelder (1988).
Software is well developed also. See, for example, Dodson, Prinzmetal
and Shimamura (1998); Hu (1999); Rothkegel (1999); and Stahl and
Klauer (2007).

Systems Not Easily Represented As Processing Trees

A tree is a special form of directed acyclic network, so difficulties that
arise for networks also arise for trees. As with directed acyclic task
networks, the following are common impediments to forming a
processing tree model. (1) Continuous output, rather than discrete
output, is not easily represented in trees (Kinchla, 1994; Slotnick, Klein,
Dodson & Shimamura, 2000). (2) Many forms of feedback, such as
error correcting procedures, cannot readily be represented without cycles.
(3) In a tree, the gate for releasing a process is special, because a process
has at most a single predecessor. Although this limitation can sometimes
be overcome by placing copies of a process at several places in a tree, a
factor selectively influencing this copied process may not be well
behaved.

Analyzing Both Reaction Time and Accuracy

It is natural to attempt to combine a processing tree with a task network,
to obtain a model for both reaction time and accuracy. A start has been
made for processes in series by Hu (2001) and Schweickert (1985). The
difficulty is not so much in finding a common structure, but in deriving
predictions for factors selectively influencing processes. A simple
example illustrates the problem. Suppose process A requires time D(A)
to produce a correct output, and does so with probability p(A). Over all
trials, the expected value of the contribution of process A to the reaction
time for a correct response is p(A)E[D(A)]. A factor selectively
influencing A, making it more difficult, has two effects. It will decrease
the probability A produces a correct output and it will increase the
duration of A. Such opposing effects are hard to work with.
Chapter 3

Selectively Influencing Processes in Task Networks

Although Sternberg (1969) focused on serial processes, he noted that the
combined effect of two factors selectively influencing two parallel
processes would be the maximum of their individual effects. Effects of
factors prolonging processes that are not in series have been studied for a
long time (Karlin & Kestenbaum, 1968; Welford, 1952). We know about
these effects in more detail now. When factors selectively influence
processes in an AND or OR task network, systematic patterns occur in
the mean response times. This chapter gives an overview. At the end of
the chapter we discuss how a process can be part of a larger superprocess
or have constituent subprocesses. For analysis of response times in
general, see Van Zandt (2002).

Effects of Selectively Influencing Processes in Task Networks

Figure 3.1 illustrates a model for a dual task in which a subject produces
a time interval and, part way through the time interval, searches a screen
for a target (Schweickert, Fortin & Sung, 2007). Each trial had two
components. In the first component, a tone was presented. The subject
encoded its duration, to be used in the second component of the trial as
the duration goal of a time interval the subject would produce. When
ready for the second component, the subject pressed a button (noted as
event o1 in Figure 3.1). The button press blanked the screen and started
the time interval the subject was producing. After an interval (the
stimulus onset asynchrony, SOA), a display was presented (noted as
event o2 in Figure 3.1). The subject was to search through the display
and decide whether a target (a circle) was present among the distractors
(circles with a vertical line stem). The subject was to respond only after
he or she believed both that a time interval had elapsed whose duration
was the goal duration and also that a decision was made as to whether the
target was present in the display or not. The subject made a single
response by pressing a button. One button was pressed to indicate that
the target was present, another button was pressed to indicate that the
target was absent. The model is a simple AND network.

Fig. 3.1. Processes in a dual time production and visual search task. If the produced
interval is short, effects of prolonging SOA and visual search will be additive.

Three processes are illustrated: the produced time interval, the SOA,
and the visual search. (The visual search could be divided into
subprocesses, but the details are not relevant here.) The SOA and the
visual search are sequential. The time interval is concurrent with the
SOA and concurrent with the visual search.
For simplicity, consider trials on which the target is absent. On such
trials, the subject must search all the items in the display to correctly
decide the target is absent. Consider effects of manipulating three
factors. We can increase the time required for the search by increasing
the number of items in the display. We can increase the duration of the
SOA directly. Finally, we can increase the duration of the time interval
produced by the subject by giving the subject a longer duration goal.
In the initial condition illustrated in Figure 3.1, the goal duration has
elapsed (i.e., the produced time interval is over) before the search subtask
is completed. The response is made at time 600. If the SOA is increased
by 100, the response time is increased by the same amount, 100. If the
SOA is returned to its original value and the search is increased by 200,
the response time is increased by 200. Finally, if the SOA is increased
by 100 and the search is increased by 200, the response time is increased
by 300. The combined effect of both factors is the sum of the effects of
each of them separately. The factors are additive and one can conclude
that there exists a task network in which there is a pair of sequential
processes, and each factor selectively influences a different process in
the pair.
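With the durations read off Figure 3.1 (SOA 200, search 400, produced interval 500), the response time of this AND network is simply the maximum of the two path durations, and the additivity can be checked directly. The function below is our own sketch of the figure, not the authors' code.

```python
def response_time(interval, soa, search):
    """AND network of Figure 3.1: the response waits both for the produced
    interval and for the sequential SOA-then-search path."""
    return max(interval, soa + search)

base = response_time(500, 200, 400)                  # response at 600
effect_soa = response_time(500, 300, 400) - base     # SOA prolonged by 100
effect_search = response_time(500, 200, 600) - base  # search prolonged by 200
effect_both = response_time(500, 300, 600) - base    # both prolonged together
```

The combined effect equals the sum of the separate effects, 100 + 200 = 300, as stated above.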

Fig. 3.2. The combined effect of prolonging two concurrent processes will be less than
additive.

Figure 3.2 illustrates the effect of selectively influencing both the
search and the produced interval. The initial condition is the same as
before, with the response made at time 600. As before, when the search
is increased by 200, the time at which the response is made increases by
the same amount, 200. When the produced interval is increased by 500,
the time at which the response is made increases by 400. Finally, when
the search is increased by 200 and the produced interval is increased by
500, the time at which the response is made still increases by 400. The
combined effect of both factors is smaller than the sum of their separate
effects. The two factors interact, and we will see later that from the form
of the interaction one can conclude that the task can be represented with
an AND network in which there are two concurrent processes, and each
of the two factors, produced-interval-goal and display size, selectively
influences a different one of the two concurrent processes.
It is straightforward to check that the factors of produced-interval-
goal and SOA also interact. The combined effect of both these factors is
smaller than the sum of their separate effects. As we will see, from the
form of the interaction one can conclude that the task can be represented
with an AND network in which the factors of produced-interval-goal and
SOA selectively influence two concurrent processes.

Fig. 3.3. If search is prolonged by 200, reaction time increases by 100. If the produced
interval is long, effects of prolonging SOA and visual search will be greater than additive.

With the initial condition in Figure 3.1, SOA and display size have
additive effects on reaction time. But with a different initial condition,
illustrated in Figure 3.3, these factors could interact. In Figure 3.3 the
produced interval is 700, longer than the sum of the SOA and the search
duration. With this initial condition, if the SOA duration is increased by
200, the increase in the reaction time is only 100. Likewise, if the search
duration is increased by 200, the reaction time increases by 100. Finally,
if the SOA duration is increased by 200 and the search duration by 200,
the reaction time increases by 300. The combined effect, 300, of
prolonging the SOA and the search is greater than the sum of their
separate effects (100 + 100). The factors of SOA and display size
interact.
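Both interaction patterns can be reproduced with the same max-of-paths sketch used for Figure 3.1, with durations taken from the figures as described in the text; the function itself is our illustration, not the authors' code.

```python
def response_time(interval, soa, search):
    # AND network of Figures 3.1-3.3: response at max(interval, soa + search).
    return max(interval, soa + search)

# Figure 3.2: search (+200) and the produced interval (+500) are concurrent.
base = response_time(500, 200, 400)                  # 600
d_search = response_time(500, 200, 600) - base       # +200
d_interval = response_time(1000, 200, 400) - base    # +400
d_both = response_time(1000, 200, 600) - base        # +400: underadditive

# Figure 3.3: with a long produced interval (700), SOA and search interact.
base3 = response_time(700, 200, 400)                 # 700
d_soa3 = response_time(700, 400, 400) - base3        # +100
d_search3 = response_time(700, 200, 600) - base3     # +100
d_both3 = response_time(700, 400, 600) - base3       # +300: overadditive
```

The first pair of factors combines to less than the sum of their separate effects (400 < 200 + 400), and the second pair to more (300 > 100 + 100), matching Figures 3.2 and 3.3.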
In the task network, each of the three factors selectively influences a
different process. It is awkward to conclude in this situation that an
interaction between two factors indicates that the two factors influence
the same process, because the interaction comes and goes depending on
the duration of the time interval. However, the interactions are
systematic, as we will see when we examine the details of prolonging
processes.

Slack

The behavior of AND networks and OR networks is similar, so it will
suffice to focus on AND networks. If all the processes were in series,
and an amount u were added to the duration of a process A, then the
response time would increase by u. But suppose the processes are in an
AND network, and the process A is not on the longest path through the
network. The response time is determined by processes which bypass A,
so incrementing the duration of A by a small amount would have little or
no effect on the response time. For example, in the AND network in
Figure 3.1, the response time is the duration of the longest path, 200 +
400 = 600. The duration of the produced interval is only 500. If the
produced interval is increased by 50, the response time would not change
because the longest path did not change. (An analogous situation would
arise in an OR network, if a prolonged process is not on the shortest path
through the network.)
On a particular trial, if a process A is not on the longest path through
an AND network, we say there is slack for the process A on that trial.
Suppose we knew the durations of all the processes on that particular
trial. And suppose we could rerun the trial with the same process
durations, except that the duration of A is prolonged. Then, the longest
time by which A could be prolonged without delaying the response r is
the slack from A to r, sometimes called the total slack for A. It is
denoted s(A, r). In this notation, the first argument, A, is an arc and the
second argument, r, is a vertex, the vertex at which response onset
occurs. In Figure 3.1, the total slack for the produced interval is 100.
If all the process durations were known, s(A, r) could be determined,
that is, s(A, r) is a function of the process durations. The intuition is that
the slack from process A to r is the difference between (1) the duration of
the longest path from the start of the network to r and (2) the duration of
the longest path that goes from the start of the network to r and that also
goes through arc A. Let o denote the starting vertex of the network, and
let A' and A" denote the starting and ending vertices of arc A. For two
vertices, say o and A', with the first preceding the second, let d(o, A')
denote the duration of the longest path between them. Let d(A) denote
the duration of process A. Then the total slack for A is

s(A, r) = d(o, r) − d(o, A') − d(A) − d(A", r).

For more detail, see Schweickert (1978). With the formula, one can
see that on a particular trial a process is on the longest path from o to r,
that is, on the critical path, if and only if its total slack is 0.
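The formula can be computed mechanically from longest-path durations. Here is a sketch for the Figure 3.1 network; the vertex labels (o, m, r) and the arc dictionary are our own encoding of the figure.

```python
# Figure 3.1 as a network: o --SOA(200)--> m --search(400)--> r,
# with the produced interval as a concurrent arc o --interval(500)--> r.
arcs = {("o", "m"): 200, ("m", "r"): 400, ("o", "r"): 500}
order = ["o", "m", "r"]          # a topological order of the vertices

def d(u, v):
    """Duration of the longest path from u to v (0 if u == v)."""
    best = {u: 0}
    for w in order:                        # relax arcs in topological order
        if w not in best:
            continue
        for (x, y), dur in arcs.items():
            if x == w:
                best[y] = max(best.get(y, float("-inf")), best[w] + dur)
    return best[v]

def total_slack(arc):
    a_start, a_end = arc
    return d("o", "r") - d("o", a_start) - arcs[arc] - d(a_end, "r")

s_interval = total_slack(("o", "r"))   # 600 - 0 - 500 - 0 = 100
s_search = total_slack(("m", "r"))     # 600 - 200 - 400 - 0 = 0
```

The produced interval has total slack 100, matching the value given above, and the search has total slack 0, confirming that it lies on the critical path.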
Two related quantities are also used in the analysis of sequential
processes. Consider an AND network in which process A precedes
process B. The largest amount of time by which A can be prolonged
without delaying the start of B is the slack from A to B. Its value can be
found in the following way. Remove from the network all processes that
do not precede B', the starting vertex of B. (This includes removing
process B itself, but leaving vertex B'.) In the remaining network, B' can
be considered the terminal vertex. Then, by analogy with finding the
total slack for A, the slack from A to B is

s(A, B') = d(o, B') − d(o, A') − d(A) − d(A", B').

Now, restore the removed processes to the network, and suppose A is
prolonged by an amount just long enough to make B start late, that is, A
is prolonged by exactly s(A, B'). How much of the total slack for A
remains? This quantity is the coupled slack from A to B,

k(A, B) = s(A, r) − s(A, B')
        = d(o, r) − d(o, B') − d(A", r) + d(A", B'). (3.1)

This quantity can be positive, zero, or perhaps contrary to intuition,
negative. We will see that its value determines the form of the
interaction between factors selectively influencing processes A and B.
Here are examples of coupled slack values in later figures. In Figure 3.4,
s(A, r) = 225 and s(A, B) = 440, so k(A, B) = 225 − 440 = −215. In
Figure 3.7, the slack from B to C is 125, the same as the total slack for B.
Hence, k(B, C) = 0. However, if the duration of C were 100 (instead of
375 as in the figure), the total slack for B would be 175. The slack from
B to C would still be 125. Hence, the coupled slack for B and C would
be k(B, C) = s(B, r) − s(B, C) = 175 − 125 = 50.
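For the Figure 3.1 network the longest-path durations d(u, v) can be read off directly, and the two expressions for the coupled slack can be checked against each other. The vertex labels below (o, m, r) are our own encoding of that figure; the Figure 3.4 and 3.7 values quoted above are not recomputed here.

```python
# Longest-path durations d(u, v) in the Figure 3.1 network
# (o --SOA(200)--> m --search(400)--> r, plus o --interval(500)--> r):
d = {("o", "o"): 0, ("o", "m"): 200, ("o", "r"): 600,
     ("m", "m"): 0, ("m", "r"): 400, ("r", "r"): 0}

# Let A be the SOA arc (A' = o, A" = m) and B the search arc (B' = m).
d_A = 200
s_A_r = d[("o", "r")] - d[("o", "o")] - d_A - d[("m", "r")]  # total slack
s_A_B = d[("o", "m")] - d[("o", "o")] - d_A - d[("m", "m")]  # slack from A to B

# Coupled slack two ways: by definition, and by Eq. (3.1).
k_def = s_A_r - s_A_B
k_eq31 = d[("o", "r")] - d[("o", "m")] - d[("m", "r")] + d[("m", "m")]
```

Both expressions give k(A, B) = 0 here, consistent with the additive effects of SOA and display size found for this initial condition.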
Typically in an experiment we do not know the durations of
individual processes on a trial, so we do not know the value of the slack
for any process. Values of this unobservable quantity are assumed to
have a probability distribution over all the possible trials. The random
variable taking on these values is denoted S(A, r); it is a function of the
random variables which are the process durations.

Selective influence

There are many ways that changing the level of an experimental factor
might selectively influence the duration of a process. For example, a
factor might make the duration of a process more variable, without
changing its mean. It is reasonable to assume that if changing the level
of a factor makes a process more difficult, it increases the mean duration
of the process. Unfortunately, this simple assumption does not lead to
many useful conclusions, so stronger assumptions are needed
(Townsend, 1990). Different assumptions about selective influence are
needed for different purposes. This chapter is concerned with expected
values of reaction times, so the assumptions need not be strong. For
many conclusions about expected values, dependencies between random
variables can be ignored; for example, the expected value of X + Y is the
sum of the expected values of X and Y, whether X and Y are correlated or
not.
Consider a factor selectively influencing a process A. Let a level of
the factor be denoted i, for i = 1, 2, .... If the brightness of a stimulus is
the experimental factor, then the levels 1 and 2 might indicate bright and
dim, respectively. Higher level numbers indicate greater process
difficulty (for both AND and OR networks).
When the factor selectively influencing process A is at level 1, in the
initial condition, the duration of A is a random variable A1. (Random
variables will usually be denoted by capital letters, values they take on
by corresponding small letters.) We assume that an increase to level 2 of
the factor adds something to the duration of A (for both AND and OR
networks). That is, there is a nonnegative random variable U such that at
level 2 of the factor the duration of A is A2 = A1 + U. (The next chapter
supplies more details.) Increasing the level of the factor from 1 to 2 is
said to increment the duration of A. One immediate consequence is that
the expected value of the duration of A at level 2 is greater than or equal
to the expected value at level 1.
Sternberg (1966, 1969) gives an example of how this assumption
would be met in practice. Suppose search through a memory set is serial
and exhaustive, that is, items are processed one by one, and every item is
processed. If the memory set is {a, b} in one condition, and {a, b, c} in
another, then increasing the size of the memory set increments the
duration of the memory search.
This assumption is equivalent to another one (see Müller & Stoyan,
2002; Townsend & Schweickert, 1989). When the factor is at level 1, let
the cumulative distribution function of the duration of process A be FA1(t)
= Prob[A1 ≤ t]. Likewise, let FA2(t) be the cumulative distribution
function of the duration of process A when the factor is at level 2. Then
increasing the level of the factor from 1 to 2 increments the duration of A
if and only if FA1(t) ≥ FA2(t) for every t.
If at every t, the cumulative distribution function for one random
variable is greater than or equal to the cumulative distribution function of another,
then the former is said to be stochastically smaller than the latter. (Note
that the larger cumulative distribution function produces the smaller
mean.) When we say a factor selectively influences a process A, one
assumption we make is that when the level of the factor is increased from
i to i', the duration of A at level i is stochastically smaller than the
duration of A at level i'.
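The equivalence can be checked by simulation: adding a nonnegative increment U to every sampled duration can only shift the cumulative distribution function downward. The distributions below are invented for illustration.

```python
import random

random.seed(1)

n = 10_000
a1 = [random.expovariate(1 / 300) for _ in range(n)]   # level-1 durations
u = [random.uniform(0, 100) for _ in range(n)]         # nonnegative increments
a2 = [x + y for x, y in zip(a1, u)]                    # level 2: A2 = A1 + U

def ecdf(sample, t):
    """Empirical cumulative distribution function at t."""
    return sum(x <= t for x in sample) / len(sample)

# A1 is stochastically smaller than A2: its cdf is everywhere at least
# as large, and its mean is no larger.
ordered = all(ecdf(a1, t) >= ecdf(a2, t) for t in range(0, 2001, 100))
mean_gap = sum(a2) / n - sum(a1) / n
```

Because each a2 value is its paired a1 value plus a nonnegative increment, the ordering holds at every t, and the mean at level 2 exceeds the mean at level 1, as stated above.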
When we say each of two factors selectively influences a different
process, one assumption we make is that each factor increments the
duration of a different process; that is, the marginal cumulative
distribution functions of the two process durations are each ordered by
the levels of the factors. What about the joint distribution of the process
durations? An easy assumption to make, but a strong one, is that the
durations of all the processes are mutually stochastically independent at
every combination of factor levels (see, e.g., Schweickert & Giorgini,
1999; Schweickert, Giorgini & Dzhafarov, 2000). Weaker assumptions
about selective influence sufficient for the results presented in this
chapter are given in the next chapter. It formulates in a more precise
way assumptions originally given in Schweickert (1982), Schweickert
and Townsend (1989), Townsend and Schweickert (1989), and
Schweickert and Wang (1993). Recently, general formulations of
selective influence have been developed by Dzhafarov (1996) and Kujala
and Dzhafarov (2008). These will be discussed in a later chapter.
The next chapter deals with the following difficulty. Suppose a
subject is presented with a block of trials with a factor at level 1, and
later is presented with a block of trials with the factor at level 2. It is no
problem to subtract the mean reaction time at level 1 from the mean
reaction time at level 2. But a problem arises if we consider subtracting
individual reaction times at level 1 from individual reaction times at level
2. For a given trial with factor level 2, which trial with factor level 1 do
we subtract from it? We do not have a sample of pairs of reaction times,
with the only difference between one element of a pair and the other
being a change in the duration of process A. In particular, it is
impossible in the experiment to obtain a sample <a1, a2> of an
observation a1 of A1 paired with an observation a2 of A2. It turns out that
the assumption that each factor increments the duration of a different
process can be formulated in such a way as to imply the existence of a
common theoretical probability space for the random process durations
at all levels of the factors, whether or not we can make experimental
observations at all levels simultaneously. Details are in the next chapter.
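The common-probability-space idea can be illustrated with a small simulation (our own sketch in Python; the exponential distributions and parameter values are arbitrary assumptions, not taken from the text). The baseline durations and the increment U are generated once per trial, and the response time at both factor levels is computed from the same sample.

```python
import random

random.seed(1)

def coupled_trial():
    """One trial on a common probability space: the same baseline
    durations a, c and increment u serve both factor levels."""
    a = random.expovariate(1 / 100)  # baseline duration of process A
    c = random.expovariate(1 / 130)  # a process concurrent with A
    u = random.expovariate(1 / 50)   # increment to A at level 2
    t1 = max(a, c)       # response time with the factor at level 1
    t2 = max(a + u, c)   # response time with the factor at level 2
    return t1, t2

pairs = [coupled_trial() for _ in range(10000)]
# Because each pair shares one sample, t2 >= t1 on every single trial,
# not merely on average.
assert all(t2 >= t1 for t1, t2 in pairs)
```

On this construction the difference t2 − t1 is exactly the quantity [u − s(A, r)]+ discussed below, even though no experiment could observe both members of a pair.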

Monotonic Response Time Means

We are now in a position to explain what happens to the response times
when a factor selectively influences a process by incrementing its
duration. Consider an AND network with response made at r, and
consider a particular trial with the factor selectively influencing process
A at level 2. When a trial occurs, a sample value is taken from the
population distribution of each process duration. On
this particular trial, then, every process has a duration which is a
nonnegative number. In particular, the duration of A is d(A) + u, for
some value d(A) of the duration of A when the factor is at level 1 and
some value u of the duration of the increment. The durations of the
remaining processes C1, ..., Cp are d(C1), ..., d(Cp). The duration values
can be used to calculate the values of quantities not only for the trial at
level 2 of the factor, but for what would have happened if the trial had
been at level 1 of the factor.
On a particular trial, the slack from A to r at level 1 of the factor has a
particular numerical value, s(A, r). If the increment u is less than the
slack from A to r, there is no increase in the response time produced by
changing the factor from level 1 to level 2. If u is greater than the slack
from A to r, a portion of u would be used to overcome the slack from A
to r, and what remains of u would increase the response time. That is,
the increase in the response time would be

0 if u ≤ s(A, r)
u − s(A, r) if u > s(A, r).

It is convenient to use the notation

[x]+ = 0 if x ≤ 0
[x]+ = x if x > 0.

With this notation, the increase in response time when A is prolonged
by u is [u − s(A, r)]+.
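A small numerical sketch (our own; the network and durations are invented for illustration) shows the rule at work. The response time in an AND network is the length of the longest path, and prolonging A by u increases it by [u − s(A, r)]+:

```python
def plus(x):
    """The [x]+ notation: max(x, 0)."""
    return max(x, 0.0)

def finish_times(durations, preds, order):
    """Finish time of each process in an AND network: a process starts
    when all of its predecessors have finished."""
    finish = {}
    for p in order:
        start = max((finish[q] for q in preds[p]), default=0.0)
        finish[p] = start + durations[p]
    return finish

# A and C run concurrently; both must finish before the response r.
durations = {"A": 100.0, "C": 130.0, "r": 20.0}
preds = {"A": [], "C": [], "r": ["A", "C"]}
order = ["A", "C", "r"]

base = finish_times(durations, preds, order)["r"]   # 150.0
slack_A = 30.0   # A can be prolonged by 30 before delaying r

u = 50.0
durations["A"] += u
new = finish_times(durations, preds, order)["r"]    # 170.0
assert new - base == plus(u - slack_A)  # increase is [u - s(A, r)]+
```

Here 30 units of the prolongation are absorbed by the slack and only the remaining 20 appear in the response time.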
The process durations vary from trial to trial, so they are random
variables. In the initial condition, the slack from A to r is a function of
the random variable process durations, so it too is a random variable,
S(A, r). In this notation, we use the capital letter S to denote slack as a
random variable, and a small letter s to denote a numerical value the
random variable takes on. When the factor selectively influencing A is at
level 2, over all the trials the amount by which A is prolonged beyond its
duration at level 1 is a nonnegative random variable, U. (We are
assuming the factor selectively influencing A increments its duration.)
Over all the trials, the expected value of the response time, E[T], is
increased by a nonnegative amount E([U − S(A, r)]+). The result is that
as the factor levels increase, the mean response times increase
monotonically.
To give more detail, if we let T1 and T2 be the response times when
the factor influencing A is at levels 1 and 2, respectively, then

E[T1] ≤ E[T1] + E([U − S(A, r)]+) = E[T2]. (3.2)

The result is that when process A is prolonged, the mean response
time either increases or stays the same; i.e., it increases monotonically.
The reasoning is similar for other changes in the factor levels.

A note on SOA in dual tasks

When there are two responses, it is customary to use the time from the
onset of the second stimulus to the onset of its response as the reaction
time to the second stimulus. Then for the model in Figure 2.1 it is not
hard to show that the response time to the second stimulus decreases
monotonically as the SOA increases. This may seem at first to contradict
the statement that increasing a factor level increases the mean response
time. However, with this way of measuring the response time to the
second stimulus, the location of the event used to start the clock (the
onset of stimulus 2) changes as the SOA changes. If instead the clock is
started at the onset of stimulus 1, the mean time at which the response to
the second stimulus is made increases as the SOA increases.

A note on OR networks

In an OR network with response made at r, the greatest amount of time by
which a process A may be shortened without decreasing the response
time is called the surplus from A to r, analogous to the slack from A to r.
In an OR network, the mean response times decrease monotonically as
process durations decrease (Schweickert & Wang, 1993). This is
equivalent to saying that mean response times increase monotonically as
process durations increase, i.e., as the factor levels increase, the same
result as for AND networks. Because results about shortening can be
rephrased as results about prolonging, we speak of factors prolonging
process durations, for both OR and AND networks.

Monotonic Interaction Contrasts

Consider a factor Α selectively influencing a process A and another factor
Β selectively influencing a different process, B. Let the levels of the
factor selectively influencing process A be denoted i = 1, 2,..., and let the
levels of the factor selectively influencing process B be denoted j = 1,
2,.... In both cases, higher numbers indicate greater process durations
(for both AND and OR networks). When the first factor is at level i and
the second at level j, we denote the response time as Tij, with expected
value E[Tij]. For each combination of levels (i, j) we define an
interaction contrast

(ΑΒ)ij = E[Tij] − E[T1j] − E[Ti1] + E[T11]. (3.3)

When the processes are in series, the factors have additive effects, so
the interaction contrasts are zero for every i and j.
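Equation 3.3 is easy to compute from a table of means. A minimal sketch (our own numbers, chosen for illustration): for serial processes the effects are additive, so every contrast vanishes.

```python
def interaction_contrast(means, i, j):
    """Equation 3.3: (AB)_ij computed from a table of mean response
    times, rows indexed by i and columns by j (level 1 is index 0)."""
    return means[i][j] - means[0][j] - means[i][0] + means[0][0]

# Serial processes: E[T_ij] = base + a_i + b_j (additive effects).
a_effects = [0, 40, 90]
b_effects = [0, 25, 60]
means = [[300 + a + b for b in b_effects] for a in a_effects]

assert all(interaction_contrast(means, i, j) == 0
           for i in range(3) for j in range(3))
```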
The effects of selectively influencing two processes in a task network
depend on how the two processes are arranged. The major distinction is
between concurrent and sequential pairs of processes. (For a good
introduction, see Logan, 2002.) Sequential pairs are further
distinguished depending on whether they are in a structure called a
Wheatstone bridge. A Wheatstone bridge is illustrated in Figure 3.4.
Processes A and B are on opposite sides of the bridge. One place a
Wheatstone bridge arises is in a dual task, when subjects are instructed to
respond to the first stimulus before responding to the second stimulus.
This commonly given instruction, in effect, inserts a dummy process
between the two responses to establish their order. Figure 3.5 shows the
task network for the dual task model in Figure 2.2 drawn with the
additional constraint that response 1 precedes response 2. Figure 3.5 can
easily be redrawn as an AND network. It has the form of a Wheatstone
bridge. For analyzing times to make response 2, processes B1 and B2 are
on opposite sides of a Wheatstone bridge (as are other pairs such as A1
and C2). Other examples of models with a Wheatstone bridge are the
double bottleneck models of de Jong (1993), Ehrenstein, Schweickert,
Choi and Proctor (1997) and the stimulus-response compatibility model
of Kornblum, Hasbroucq, and Osman (1990).

Fig. 3.4. Processes A and B are on opposite sides of an incomplete Wheatstone bridge.

Fig. 3.5. Instructing the subject to make response r1 before r2 creates a Wheatstone
bridge.
Fig. 3.6. Processes A and B are on opposite sides of a complete Wheatstone bridge.

Pairs of sequential processes are subdivided into those not on
opposite sides of a Wheatstone bridge, those on opposite sides of an
incomplete Wheatstone bridge (Figure 3.4), and those on opposite sides
of a complete Wheatstone bridge (Figure 3.6).
We will first discuss interactions indicating concurrent processes and
then discuss those indicating sequential processes. Before discussing
interactions, we explain our simulations.

Calculations and simulations

A number of practical questions arise when one considers testing these
predictions in experiments. Are the effects big enough to be found?
Will a reasonable number of trials be sufficient for discerning the
patterns? To investigate the feasibility of finding these patterns in data,
we produced results for hypothetical experiments, by simulation and by
calculation. These examples refute objections that the interactions
predicted by the theory are small and easily mistaken for additivity
(Molenaar & van der Molen, 1986; Vorberg & Schwarz, 1988).
The predictions about means and interaction contrasts are distribution
free. But are the predicted patterns more conspicuous for some
distributions than others? To investigate this possibility, we used two
different distributions for process durations, the exponential and the
truncated normal. The first is highly skewed while the second is nearly
symmetrical. Little is known about the actual distributions of individual
mental processes, but normal and exponential distributions are plausible
and often assumed. A normal distribution would be expected if the
duration of a mental process were the sum of many components. There
is evidence in some experiments for exponential distributions (or sums of
these), e.g., Ashby and Townsend (1980) and Kohfeld, Santee, and
Wallace (1981), although Sternberg (1964) found evidence against them.
For more discussion of distributions, see Luce (1986).
In examples using exponential process durations, the expected values
of the response times were calculated exactly with the OP diagrams
described in a later chapter (Fisher, 1985; Fisher & Goldstein, 1983;
Goldstein & Fisher, 1991, 1992). For examples using truncated normal
distributions, no algorithm giving exact values of expected values is
known, and the results are based on simulations using MICROSAINT
(Micro Analysis and Design, 1985).
For each type of distribution in our hypothetical experiments, the
process durations were assumed to be mutually independent, that is, the
joint distribution for every subset of processes was assumed to be the
product of the corresponding marginal distributions. Independence is not
a realistic assumption, and the predictions do not require it. Little is
known about the actual correlations between durations of mental
processes, so the choice of correlation values is somewhat arbitrary.
We chose 0 (independence) because it is familiar and intuitively clear.
Later we will relax this assumption.

Interaction Contrasts: Concurrent Processes

When two factors selectively influence concurrent processes in an AND
network, the following results are predicted: (1) mean response times
will increase monotonically with increases in levels of the factors; (2)
interaction contrasts will all be less than or equal to zero; and (3)
interaction contrasts will decrease monotonically as the levels of either
factor are increased. Prediction (3) is a consequence of (2). All
interaction contrasts calculated for higher factor levels with respect to
lower factor levels are predicted to be nonpositive. These predictions are
derived in the next chapter.
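For the simplest concurrent arrangement these predictions can be checked exactly. In the reduced example below (our own, not the full network of Figure 3.7), T = max(A, B) with independent exponential durations, and E[max(A, B)] = μA + μB − μAμB/(μA + μB), because E[max] = E[A] + E[B] − E[min] and the minimum of independent exponentials is exponential with mean μAμB/(μA + μB).

```python
def e_max_expo(mu_a, mu_b):
    """E[max(A, B)] for independent exponential A, B with means
    mu_a, mu_b: E[max] = E[A] + E[B] - E[min]."""
    return mu_a + mu_b - mu_a * mu_b / (mu_a + mu_b)

levels_a = [300, 450, 500, 550]
levels_b = [25, 100, 150, 200]
means = [[e_max_expo(a, b) for b in levels_b] for a in levels_a]
contrasts = [[means[i][j] - means[0][j] - means[i][0] + means[0][0]
              for j in range(len(levels_b))]
             for i in range(len(levels_a))]

# Prediction (2): every contrast is nonpositive for concurrent processes.
assert all(c <= 0 for row in contrasts for c in row)
```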
Example 1: Exponential distributions

If two factors prolong different concurrent processes A and B, the pattern
of interactions produced on response times can be easily seen. Consider
the acyclic task network in Figure 3.7. Each process duration was
assumed to have an exponential distribution and the durations were
assumed to be mutually independent. The processes prolonged, A and B,
are concurrent. Means for A and B are given in Table 3.1, means for the
other processes are as indicated in Figure 3.7. Expected values of
response times are in Table 3.1. They were computed from the
associated OP diagrams using the algorithm we describe in a later
chapter. Note that these numbers are not the means of simulated trials;
the algorithm calculates the exact expected values. The interaction
contrasts defined in Equation 3.3 are easily calculated; for example, for
the change from level 1 to level 2 of each factor, (ΑΒ)22 = 733.340 −
642.778 − 710.901 + 616.093 = −4.246. These values are also in Table
3.1. (Note that E[T11] = 616.093, E[T22] = 733.340, E[T12] = 642.778 and
E[T21] = 710.901.)
The three patterns are immediately apparent. (1) Means are
monotonically increasing from left to right and from top to bottom, (2)
interaction contrasts are all negative, and (3) they too are monotonic.
(All interaction contrasts calculated for higher factor levels with respect
to lower factor levels are predicted to be nonpositive, not only those in
the table of interaction contrasts.)

Fig. 3.7. AND network used in simulations. Mean durations of processes not prolonged
are on arcs; mean durations of processes prolonged are in table headings.
Table 3.1
Expected Values of Reaction Times
When Factors Influence Concurrent Processes A and B in Figure 3.7
All Process Durations Exponentially Distributed

μA \ μB    25      100     150     200     250
300 616.1 642.8 671.1 704.5 741.3
450 710.9 733.3 757.8 787.1 820.0
500 746.5 767.8 791.1 819.2 850.9
550 783.6 803.8 826.1 853.1 883.6
650 861.4 879.8 900.3 925.2 953.6
700 901.8 919.4 939.0 963.1 990.5
750 943.0 959.9 978.8 1002.0 1028.5

Interaction Contrasts

μA \ μB    25     100     150     200     250
300 - - - - -
450 - -4.2 -8.1 -12.2 -16.1
500 - -5.4 -10.4 -15.6 -20.8
550 - -6.4 -12.5 -18.8 -25.1
650 - -8.3 -16.1 -24.5 -32.9
700 - -9.1 -17.7 -27.1 -36.4
750 - -9.8 -19.2 -29.4 -39.7

Example 2: Truncated normal distributions

The same patterns were found with each process duration sampled from
a truncated normal distribution, that is, a distribution whose density
function is the normal distribution restricted to nonnegative values, and
renormalized so the area under it is one. The standard deviation of each
process duration was set to one fourth of its mean. The value of one-
fourth is representative of values typically found for response times
themselves (see Luce, 1986, p. 64), so we used it for the individual
process durations. When the means for processes A and B were
increased, the standard deviations were also increased to one fourth of
the new mean to simulate the finding that response time variability
typically increases as the mean increases. The process durations were
mutually independent.
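Such a truncated normal with σ = μ/4 can be sampled by rejection, which is equivalent to restricting the normal density to nonnegative values and renormalizing (a sketch of the sampling scheme; the mean of 200 below is an arbitrary choice of ours).

```python
import random

random.seed(2)

def trunc_normal(mu):
    """Sample a normal with sigma = mu/4, rejecting negative draws;
    this matches restricting the density to [0, inf) and renormalizing."""
    sigma = mu / 4
    while True:
        x = random.gauss(mu, sigma)
        if x >= 0:
            return x

samples = [trunc_normal(200) for _ in range(2000)]
assert min(samples) >= 0
```

With μ/σ = 4 the truncation removes only a tiny fraction of the density, so the truncated mean stays close to μ.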
Simulated response times are given in Table 3.2 for the same network
(Figure 3.7) used in the preceding tables. Two thousand simulated trials
were run for each combination of means for A and B using the
MICROSAINT system for personal computers (Micro Analysis and
Design, 1985). The means and standard deviations (prior to truncation)
are the row and column labels in Table 3.2. The simulated means are in
the body of the table, with the interaction contrasts below them.
The same three patterns occur as before, although not without
exception. The means increase monotonically from left to right and from
top to bottom (for the most part), the interaction contrasts are negative
(all), and they too are monotonic (for the most part). As noted, some
small exceptions occur for the response times and interactions in the first
few rows and columns. These arise from sampling variability in the simulated reaction times. It
is clear that increasing the mean for B from 25 to 100 had little effect on
the reaction times, because the increase is not enough to overcome the
total slack for B. These exceptions would not occur in the population
values, although, of course, the effects would still be small.
One of our assumptions in deriving the three patterns is that each
factor selectively influences a process by incrementing its duration. The
reader may wonder if this form of selective influence occurs here, where
a factor increasing the mean duration of a process also increased the
variance. It is easy to verify that if two normally distributed random
variables have respective means μ1 and μ2 and standard deviations σ1 and
σ2, their cumulative distribution functions cross at t = (μ1σ2 − μ2σ1)/(σ2 −
σ1). Here, since each standard deviation equals the same fraction of the
corresponding mean (one-fourth, in this case), the value of t is 0. The
distributions were truncated at 0 to avoid negative durations, and since
the cumulative distribution functions do not cross elsewhere, they are
always ordered in the same way; that is, selective influence takes place
by incrementing the process duration (Townsend & Schweickert, 1989).
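The crossing claim is easy to check numerically (our own verification code, not from the text): with σ = μ/4 each cdf equals Φ(4t/μ − 4), so for every t > 0 the distribution with the larger mean has the smaller cdf, i.e., is stochastically larger.

```python
import math

def normal_cdf(t, mu):
    """Cdf of a normal with mean mu and sigma = mu/4."""
    sigma = mu / 4
    return 0.5 * (1 + math.erf((t - mu) / (sigma * math.sqrt(2))))

# For all t > 0 the larger-mean (level 2) distribution lies below,
# so the two durations are stochastically ordered.
ts = [10.0 * k for k in range(1, 101)]
assert all(normal_cdf(t, 300) >= normal_cdf(t, 450) for t in ts)
```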
Table 3.2
Means of Simulated Reaction Times
When Factors Influence Concurrent Processes A and B in Figure 3.7
All Process Durations Have Truncated Normal Distributions

μB 25 100 150 200 250
σB 6.25 25.0 37.5 50.0 62.5
μA σA
300 75.0 528 531 547 584 628
450 112.5 556 554 568 596 636
500 125.0 576 577 589 613 648
550 137.5 607 606 616 635 664
650 162.5 679 679 685 696 716
700 175.0 720 720 724 733 750
750 187.5 767 766 767 776 788

Interaction Contrasts

μB 25 100 150 200 250
σB 6.25 25.0 37.5 50.0 62.5
μA σA
300 75.0 - - - - -
450 112.5 - -5 -7 -16 -20
500 125.0 - -1 -5 -19 -28
550 137.5 - -3 -10 -29 -43
650 162.5 - -3 -13 -40 -62
700 175.0 - -3 -15 -43 -70
750 187.5 - -4 -19 -47 -78

OR networks

The same patterns are predicted for prolonging concurrent processes in
an OR network, except that the interaction contrasts are nonnegative
(Schweickert & Wang, 1993).
Statistical considerations

A table whose rows and columns are monotonically increasing is said to
satisfy independence. This property is of interest in conjoint
measurement, so it has been studied in some detail. Although it may
seem at first to be a weak condition, independence is quite constraining.
Suppose the cells in a table with r rows and c columns are rank ordered.
A formula for the number of such tables satisfying independence was
derived by Arbuckle and Larimer (1976); they note that the proportion of
tables satisfying independence is quite small, even for a small number of
rows and columns. Of course, one can always permute the rows and
columns until the cell means in the first row are monotonically
increasing, as well as those in the first column. McClelland (1977)
calculates that there are 3.33 × 10⁶ tables with 3 rows and 4 columns in
which the first row and first column are monotonically increasing. Of
these, only 462 have the remaining rows and columns monotonically
increasing. Independence is unlikely to occur by chance.
To reject independence, it is sufficient to reject the hypothesis that
some particular pair of cells is in the proper order. If a given pair was of
interest for some reason before the experiment was done, the hypothesis
could be tested with a simple a priori test of a contrast. If an out of order
pair was located when examining the data, the hypothesis that the
population means for those cells are out of order could be tested with an
a posteriori test (Kirk, 1982); the appropriate type of a posteriori test
would depend on the circumstances.

Interaction contrasts: Sequential processes

When two factors selectively influence two sequential processes, the
interaction contrasts defined in Equation 3.3 display simple patterns
analogous to those for concurrent processes. Once again, the only
difference between AND networks and OR networks is in the signs of
the interaction contrasts, as explained below. Details depend on the way
the sequential processes are arranged in the network, and are best
explained by examples. There are three cases to consider, depending on
whether or not the processes A and B are arranged in a Wheatstone
bridge. This structure is illustrated in Figures 3.4 and 3.6 in the
incomplete and complete form, respectively. More information about
sequential processes is in the next chapter.

Sequential processes case 1: Not in a Wheatstone bridge

We begin with the simplest case, sequential processes not on opposite
sides of a Wheatstone bridge. In Figure 3.7, processes B and C are an
example.

Example 3: Exponential distributions


Table 3.3 gives mean response times and interaction contrasts for an
AND network when processes B and C were prolonged. All process
durations were assumed to be exponentially distributed and mutually
independent. The mean for process A was 300, the means used for B and
C are in the table.
Three patterns for interaction contrasts are apparent in the table: (1)
mean response times are monotonically increasing across the rows and
down the columns, (2) interaction contrasts are all positive (or zero), and
(3) interaction contrasts are monotonically increasing across the rows
and down the columns. We do not show all possible interaction
contrasts, but all calculated for higher factor levels with respect to lower
factor levels are predicted to be positive or zero, and this implies result
(3). If all the gates were OR gates, corresponding patterns would be
predicted, the difference being that the interaction contrasts would all be
negative or zero, and hence monotonically decreasing
across the rows and down the columns (Schweickert & Wang, 1993).

Example 4: Truncated normal distributions


The same patterns would be found for any other joint density for the
process durations and prolongations when the factors selectively
influence processes that are sequential, but not on opposite sides of a
Wheatstone bridge. For example, Table 3.4 gives the results of
simulations in which the same two processes as before are prolonged, but
the durations of all processes in the network have mutually independent
truncated normal distributions. The simulations were carried out in
MICROSAINT (Micro Analysis and Design, 1985). The same three
patterns are evident in the tables. The small negative interaction
contrasts in Table 3.4 are based on sample means, and would not occur
with population means.

Table 3.3
Expected Values of Reaction Times
When Factors Influence Sequential Processes B and C in Figure 3.7
All Process Durations Have Exponential Distributions

μC \ μB    25     100    150    200    250
100 376.5 421.3 456.7 495.0 535.5
150 397.7 445.0 481.7 521.0 562.3
200 424.7 474.3 512.1 552.3 594.3
250 455.9 507.5 546.3 587.3 629.9
300 527.3 582.2 622.5 664.7 708.4
375 546.5 602.1 642.7 685.2 729.1
400 566.1 622.3 663.3 706.1 750.1

Interaction Contrasts

μC \ μB    25     100    150    200    250
100 - - - - -
150 - 2.6 3.8 4.8 5.7
200 - 4.9 7.3 9.2 10.7
250 - 6.9 10.2 12.9 15.1
300 - 10.2 15.1 19.0 22.2
375 - 10.9 16.1 20.3 23.7
400 - 11.5 17.0 21.5 25.1
Table 3.4
Means of Simulated Reaction Times
When Factors Influence Sequential Processes B and C in Figure 3.7
All Process Durations Have Truncated Normal Distributions

μB 25 100 150 200 250 500 600
σB 6.25 25.0 37.5 50.0 62.5 125.0 150.0
μC σC
100 25.0 314 316 322 339 371 602 702
150 37.5 337 336 347 372 410 651 752
200 50.0 367 370 383 414 458 702 798
250 62.5 410 411 427 459 503 747 851
300 75.0 454 456 472 509 552 803 901
375 93.8 527 530 548 584 627 873 972
400 100.0 550 553 575 608 652 899 997

Interaction Contrasts

μB 25 100 150 200 250 500 600
σB 6.25 25.0 37.5 50.0 62.5 125.0 150.0
μC σC
100 25.0 - - - - - - -
150 37.5 - -3 2 10 16 26 27
200 50.0 - 0 8 21 33 47 42
250 62.5 - -1 9 25 36 50 53
300 75.0 - 0 11 31 41 62 59
375 93.8 - 0 13 32 43 58 58
400 100.0 - 1 17 33 45 61 59

Monotonicity of the response times with the factor levels was
discussed above. Schweickert and Townsend (1989, Theorem 3) showed
that when factors Α and Β selectively influence sequential processes A
and B not in a Wheatstone bridge, the expected interaction contrast (ΑΒ)ij
is typically positive and always nonnegative. If all the gates were OR
gates, (ΑΒ)ij is typically negative and always nonpositive (Schweickert &
Wang, 1993). It follows that the expected interaction contrasts will be
monotonic with the factor levels.
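The source of the positive sign can be seen with a deliberately degenerate sketch (constant durations, our own numbers). If B and C lie on one path while a concurrent path has fixed length d, then T = max(d, b + c); each increment must first use up the shared slack before it lengthens T, so the joint effect exceeds the sum of the separate effects.

```python
def response_time(b, c, d=100):
    """Constant-duration AND network: B and C sequential on one path,
    with a concurrent path of fixed length d."""
    return max(d, b + c)

base = response_time(30, 30)                 # 100: slack absorbs both
effect_b = response_time(80, 30) - base      # 10
effect_c = response_time(30, 80) - base      # 10
joint = response_time(80, 80) - base         # 60
contrast = joint - effect_b - effect_c       # 40: a positive interaction
assert contrast > 0
```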
In the AND network examples just given, two factors prolonging
sequential processes produce positive interactions. (By a positive
interaction, we mean the combined effect of both factors is greater than
the sum of their individual effects.) Since factors prolonging concurrent
processes produce negative interactions, it might seem that the sign of
the interaction is diagnostic for concurrent and sequential processes.
However, the situation is more complicated, because factors prolonging
sequential processes in an AND network can also produce negative
interactions. This is possible only when the two sequential processes are
on opposite sides of a Wheatstone bridge (Schweickert, 1978), which we
now turn to.

Sequential processes case 2: An incomplete Wheatstone bridge

The task network illustrated in Figure 3.4 has an unusual feature. There
are three paths through the network, and only one of them contains both
A and B. If the arc from A to B has a short duration, then the path
containing them both will hardly ever be the critical path, so it will
appear as if A and B are not on a path together. In other words, although
A and B are in fact sequential, they might appear to be concurrent.
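A degenerate constant-duration sketch of this structure (our own numbers) makes the point. With three paths of lengths a + b, a + 100, and b + 100, only the first contains both A and B; the single-process paths dominate unless both prolongations are large, and the interaction contrast comes out negative even though the processes are sequential.

```python
def response_time(a, b):
    """Three paths, as in an incomplete Wheatstone bridge: one through
    both A and B, one through A only, one through B only."""
    return max(a + b, a + 100, b + 100)

base = response_time(50, 50)       # 150: the single-process paths dominate
contrast = (response_time(120, 120) - response_time(50, 120)
            - response_time(120, 50) + base)
assert contrast == -50             # negative, as with concurrent processes
```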
When factors selectively influence sequential processes on opposite
sides of an incomplete Wheatstone bridge (e.g., A and B in Figure 3.4),
the resulting patterns of mean response times can be similar to (or
identical to) the patterns observed when concurrent processes are
influenced. Fortunately, the patterns will be different provided a wide
range of levels of the factors are used, when large increments in process
durations overcome the relevant slacks. Once again, the patterns to be
expected are best illustrated by examples, which we will turn to after
explaining more about Wheatstone bridges.
The only way two factors selectively influencing sequential processes
A and B in a directed acyclic task network can produce a negative
interaction is for the network to contain a subnetwork in the shape of a
Wheatstone bridge, with A and B on opposite sides of the bridge
(Schweickert, 1978). To be more precise about what it means for one
graph to have the same shape as another, we need to explain what it
means for two graphs to be homeomorphic. Consider a graph consisting
of two arcs in series, one from a vertex u to a vertex v, and another from
vertex v to vertex w. Now consider a graph made by replacing the two
arcs of the first graph with a single arc from vertex u to vertex w (see
Figure 3.8). The second graph is obtained from the first one by
smoothing; the first graph is obtained from the second one by
subdividing. Two graphs are homeomorphic if one can be obtained from
the other by repeated smoothing and subdividing. When we say two
graphs have the same shape, we mean they are homeomorphic.

Fig. 3.8. Homeomorphic graphs.

We say processes A and B are on opposite sides of a complete
Wheatstone bridge if (1) the task network contains a subnetwork
homeomorphic to a Wheatstone bridge, with A and B on opposite sides
of the bridge, and (2) the task network has a path from the starting vertex
to the ending vertex, containing neither A nor B.

Example 5: Exponential distributions


The sequential processes A and B in Figure 3.4 are on either side of an
incomplete Wheatstone bridge. The OP diagram algorithm (Fisher &
Goldstein, 1983) described in a later chapter was used to calculate the
expected response times when the baseline process durations and
prolongations had mutually independent exponential distributions. The
means are indicated on the arcs in the figure. The numbers on the arcs
for A and B are the means for their baseline durations. When process A
was prolonged, an exponentially distributed random variable was added
to the baseline duration of A, and the total was the duration of A when
prolonged. Process B was prolonged in a similar way. As a concrete
example of a situation in which prolongations of this kind are plausible,
consider a serial search in which each item added to the display adds an
exponentially distributed time to the search. Ashby and Townsend
(1980) found evidence in a search task for just such exponential
increments (they noted that the search might be parallel).
Results are in Table 3.5. Again, three patterns are apparent. (1)
Response times are monotonically increasing across the rows and down
the columns, (2) interaction contrasts are all negative (or zero), and (3)
interaction contrasts are monotonically decreasing across the rows and
down the columns. Interaction contrasts not shown in the table,
calculated for higher factor levels with respect to lower factor levels, are
also predicted to be negative or zero, and (3) follows from this. If all
vertices in the network were OR gates, the interaction contrasts are
predicted to be positive or zero (Schweickert & Wang, 1993).

Example 6: Truncated normal distributions


The three patterns do not require process durations having exponential
distributions. With the same network as in Figure 3.4 process durations
were assumed to have mutually independent truncated normal
distributions, with means as on the arcs in the figure. As before,
simulations were carried out in MICROSAINT (Micro Analysis and
Design, 1985). The three patterns occur again (Table 3.6).
The sign of the expected interaction contrast (ΑΒ)ij was obtained in
Schweickert and Townsend (1989, Theorem 2) for AND networks, and
in Schweickert and Wang (1993) for OR networks. Monotonicity of the
expected interaction contrasts with the factor levels follows in either
case.

Sequential processes case 3: A complete Wheatstone bridge

To continue our discussion of sequential and concurrent processes, we
now discuss the remaining, most difficult case. Suppose A and B are
sequential and on opposite sides of a complete Wheatstone bridge, as in
Figure 3.6. Suppose as usual that Factor Α selectively influences process
A and Factor Β selectively influences process B. As with the other cases,
expected values of response times will increase monotonically as levels
of each factor increase, because response time is a monotonically
increasing function of the increment of each process duration (Chapter 4
gives more detail).
However, expected values of interaction contrasts need not all have the
same sign (Lin, 1999). An example is in Figure 3.9. Numbers on the
arrows are process durations. The duration of process F is 0 in the left
panel and 3 in the right; otherwise the networks are the same. Reaction
times for various durations of processes A and B for each network are in
Table 3.7. Corresponding interaction contrasts are in Table 3.8.
Interaction contrasts are all negative for the left-hand network; further,
they are monotonically decreasing with increasing values of the durations of
A and B. Results are similar for the right-hand network, except the
interaction contrasts are positive and monotonically increasing. So far, so
good. But suppose the left-hand network occurs 1/10 of the time and the
right-hand network 9/10 of the time. Resulting interaction contrasts are
in Table 3.9. They do not have the same sign, nor are they monotonic
with the durations of processes A and B.
Despite their potentially complicated behavior, processes on opposite
sides of a Wheatstone bridge (complete or incomplete) are on a path
together, and this fact can be used to distinguish them from concurrent
processes, as we now explain.

Table 3.5
Expected Values of Reaction Times: Factors Influence Sequential Processes
A and B On Opposite Sides of the Wheatstone Bridge in Figure 3.4
Baseline Process Durations and Prolongations Have Exponential Distributions

μB
25 100 150 200 250
μA
50 619 674 715 758 802
200 696 744 782 823 865
250 729 775 812 852 895
300 764 809 845 885 927
400 839 881 917 958 997
450 879 920 955 994 1035
500 920 960 995 1033 1074
Selectively Influencing Processes in Task Networks 47

[Table 3.5 continued]


Interaction Contrasts

μB
25 100 150 200 250
μA
50    -    -    -    -    -
200   -   -7  -10  -12  -14
250   -   -9  -12  -15  -18
300   -  -10  -15  -18  -21
400   -  -12  -18  -20  -25
450   -  -13  -19  -24  -27
500   -  -14  -21  -25  -29

Table 3.6
Means of Simulated Reaction Times: Factors Influence Sequential Processes
A and B On Opposite Sides of the Wheatstone Bridge in Figure 3.4
All Process Durations Have Truncated Normal Distributions

μB        25   100   150   200   250   500   600
σB       6.3  25.0  37.5  50.0  62.5 125.0 150.0
μA    σA
50   12.5   527   603   648   700   751  1001  1101
200  50.0   555   611   658   703   753  1000  1101
250 62.5 575 622 662 708 756 1002 1103
300 75.0 599 640 675 716 763 1010 1106
400 100.0 672 695 723 752 792 1031 1129
450 112.5 715 733 752 779 815 1047 1149
500 125.0 757 773 789 811 843 1072 1176
550 137.5 806 815 832 845 883 1109 1207
600 150.0 855 860 871 888 919 1147 1243
700 175.0 953 956 966 974 1005 1234 1325
800 200.0 1047 1054 1058 1068 1099 1319 1417

[Table 3.6 continued]


Interaction Contrasts

μB        25   100   150   200   250   500   600
σB       6.3  25.0  37.5  50.0  62.5 125.0 150.0
μA    σA
50   12.5    -    -    -    -    -    -    -
200  50.0    -  -19  -18  -24  -25  -29  -28
250  62.5    -  -29  -35  -40  -43  -47  -46
300  75.0    -  -35  -45  -56  -60  -64  -67
400 100.0    -  -53  -71  -93 -105 -115 -116
450 112.5    -  -58  -84 -109 -124 -142 -139
500 125.0    -  -61  -90 -119 -139 -159 -155
550 137.5    -  -67  -95 -134 -147 -171 -172
600 150.0    -  -71 -105 -140 -160 -182 -185
700 175.0    -  -72 -108 -152 -172 -193 -202
800 200.0    -  -69 -110 -152 -172 -202 -203

Fig. 3.9. When sequential processes A and B are prolonged, interaction contrasts are
monotonically decreasing for the left hand network and monotonically increasing for the
right hand network. They are neither increasing nor decreasing in a mixture of the two
networks.

Table 3.7
Reaction Times for Various Durations of A and B
In the Networks of Figure 3.9

Left Hand Network Right Hand Network


d(A) d(A)
d(B) 0.0 0.5 1.5 2.5 d(B) 0.0 0.5 1.5 2.5
0.0 1.0 1.5 2.5 3.5 0.0 3.0 3.0 3.0 3.5
0.5 1.5 1.5 2.5 3.5 0.5 3.0 3.0 3.0 3.5
1.5 2.5 2.5 3.0 3.5 1.5 3.0 3.0 3.0 4.0
2.5 3.5 3.5 4.0 5.0 2.5 3.5 3.5 4.0 5.0

Table 3.8
Interaction Contrasts for Various Durations of A and B
In the Networks of Figure 3.9

Left Hand Network Right Hand Network


d(A) d(A)
d(B) 0.0 0.5 1.5 2.5 d(B) 0.0 0.5 1.5 2.5
0.0 - - - - 0.0 - - - -
0.5 - -0.5 -0.5 -0.5 0.5 - 0 0 0
1.5 - -0.5 -1.0 -1.0 1.5 - 0 0 0.5
2.5 - -0.5 -1.0 -1.0 2.5 - 0 0.5 1.0

Table 3.9
Interaction Contrasts for Various Durations of A and B
In a Mixture of the Networks of Figure 3.9

Left (1/10) and Right (9/10)


d(A)
d(B) 0 0.5 1.5 2.5
0      -      -      -      -
0.5    -  -0.05  -0.05  -0.05
1.5    -  -0.05  -0.10   0.44
2.5    -  -0.05   0.44   0.80
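The sign reversal in the mixture can be checked directly. The following sketch (mean reaction times taken from Table 3.7, mixture weights 1/10 and 9/10 as in the text) computes interaction contrasts for each network and for the mixture; because the contrast is linear in the mean reaction times, the mixture contrast is just the weighted sum of the two networks' contrasts.

```python
def contrast(T, i, j):
    """Interaction contrast (AB)ij = T[i][j] - T[0][j] - T[i][0] + T[0][0]."""
    return T[i][j] - T[0][j] - T[i][0] + T[0][0]

# Reaction times from Table 3.7; rows index d(B) in {0, 0.5, 1.5, 2.5},
# columns index d(A) over the same values.
T_left = [[1.0, 1.5, 2.5, 3.5],
          [1.5, 1.5, 2.5, 3.5],
          [2.5, 2.5, 3.0, 3.5],
          [3.5, 3.5, 4.0, 5.0]]
T_right = [[3.0, 3.0, 3.0, 3.5],
           [3.0, 3.0, 3.0, 3.5],
           [3.0, 3.0, 3.0, 4.0],
           [3.5, 3.5, 4.0, 5.0]]

# Mean RT of the mixture: left network with probability 1/10, right with 9/10.
T_mix = [[0.1 * l + 0.9 * r for l, r in zip(row_l, row_r)]
         for row_l, row_r in zip(T_left, T_right)]

print(contrast(T_left, 3, 3))   # -1.0: negative for the left network
print(contrast(T_right, 3, 3))  # 1.0: positive for the right network
# The mixture has contrasts of both signs:
print(round(contrast(T_mix, 1, 1), 2), round(contrast(T_mix, 3, 3), 2))
```

The mixture contrasts at the smallest and largest levels, −0.05 and 0.80, agree with the corners of Table 3.9.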

Distinguishing Concurrent and Sequential Processes

If a task network contains a Wheatstone bridge, complete or incomplete,
two factors prolonging sequential processes on opposite sides of the
bridge may have interaction contrasts of the same sign as factors
prolonging concurrent processes.
The key to identifying processes A and B on opposite sides of a
Wheatstone bridge is to include in the experiment conditions in which
the durations of A and B are short and also to include conditions in which
the durations of A and B are long. Prolongations with respect to short
baseline durations will produce negative interaction contrasts.
Prolongations with respect to long baseline durations will produce
additive effects. Sequential processes not on opposite sides of a
Wheatstone bridge cannot produce the negative interaction contrasts, and
concurrent processes cannot produce the additive effects.
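A deterministic toy example illustrates the diagnostic. The parameterization below is our own assumption for illustration (the durations dA, dB, dC, dD, dF and the completion-time formula max(dA + dC, dD + dB, dA + dF + dB) model a complete bridge with A and B on opposite sides, not the book's simulations): for the bridge, the interaction contrast grows more negative at first and then stabilizes, so that beyond a point the factors' effects become additive, whereas for concurrent processes the contrast never stabilizes.

```python
def bridge_rt(u, v, dA=5.0, dB=5.0, dC=10.0, dD=10.0, dF=0.0):
    """AND completion time for a complete Wheatstone bridge (one
    plausible layout): outer paths A-C and D-B, plus the bridging
    path A-F-B that makes A and B sequential.  Here u and v are the
    prolongations of A and B."""
    a, b = dA + u, dB + v
    return max(a + dC, dD + b, a + dF + b)

def parallel_rt(u, v, dA=5.0, dB=5.0):
    """AND completion time when A and B are simply concurrent."""
    return max(dA + u, dB + v)

def contrast(rt, u, v):
    """Interaction contrast against the zero-prolongation baseline."""
    return rt(u, v) - rt(0, v) - rt(u, 0) + rt(0, 0)

# Bridge: the contrast decreases, then stabilizes at -5.  Once both
# prolongations are long, A and B share the critical path A-F-B, and
# further prolongations have purely additive effects.
print([contrast(bridge_rt, u, u) for u in (1, 2, 5, 20, 40)])
# [-1.0, -2.0, -5.0, -5.0, -5.0]

# Concurrent: the contrast never stabilizes; whichever process is
# longer hides the other, so additivity is never reached.
print([contrast(parallel_rt, u, u) for u in (1, 2, 5, 20, 40)])
# [-1.0, -2.0, -5.0, -20.0, -40.0]
```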
The point is important, so let us consider more details. Consider an
AND network with processes A and B, and perhaps others. Processes on
opposite sides of a Wheatstone bridge are sequential. Intuitively, two
sequential processes should behave differently from two concurrent
processes when their durations are long. As usual, suppose Factor Α
selectively influences process A, and Factor Β selectively influences
process B. Suppose a level of Factor Α can be found, so that at this
level process A is almost always on the critical path. Likewise, suppose
a level of Factor Β can be found, so that at this level process B is almost
always on the critical path. Then at these levels, if A and B are
sequential they are almost always on a critical path together, and with
still higher levels, the factors will each have large individual main effects
but nearly additive combined effects. On the other hand, two concurrent
processes cannot ever be on a critical path together. Even if their
durations are long, one process or the other, but not both, will be on a
critical path. Therefore, if there is a nonzero probability that the
prolongation of A exceeds the total slack for A and the prolongation of B
exceeds the total slack for B, if A and B are concurrent, the factors
influencing A and B will produce negative interactions, rather than being
additive (Schweickert & Townsend, 1989).
Unfortunately, for concurrent processes the interaction contrasts may
be close to 0, even when the factors both have large effects, as Table 3.1
makes clear. Nonetheless, with concurrent processes, a prolongation of
one process can, in principle, always be made large enough to eclipse the
effect of prolonging the other. The next section gives details about long
prolongations.
Distinguishing sequential from concurrent processes is important for
understanding the task network as a whole. It turns out that knowing
which pairs of processes are sequential and which are concurrent gives
sufficient information to construct an underlying directed acyclic graph
with a procedure called the Transitive Orientation Algorithm
(Golumbic, 1980; Schweickert, 1983b). The algorithm puts
sequential processes in order, and all possible directed acyclic networks
consistent with the classification of processes as sequential or concurrent
can be produced. An important property of this algorithm is that if a
proposed classification of pairs of processes as sequential and concurrent
is not possible with a directed acyclic graph, the procedure stops, and
indicates that the proposed classification is inconsistent with a directed
acyclic graph.
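The Transitive Orientation Algorithm itself is beyond our scope here, but its job can be mimicked on small examples by brute force (a sketch, not Golumbic's algorithm): direct each pair classified as sequential, both ways, and keep the orientations that are transitive, leaving concurrent pairs unordered. An empty result signals that the classification is inconsistent with any directed acyclic graph.

```python
from itertools import product

def orientations(sequential_pairs):
    """Brute-force stand-in for the Transitive Orientation Algorithm
    (workable only for small process sets).  An orientation is kept
    only if it is transitive: a chain a < b < c whose implied pair
    (a, c) was classified concurrent rules the orientation out."""
    valid = []
    for flips in product([False, True], repeat=len(sequential_pairs)):
        order = {(b, a) if f else (a, b)
                 for (a, b), f in zip(sequential_pairs, flips)}
        if all((a, d) in order
               for (a, b) in order for (c, d) in order if b == c):
            valid.append(order)
    return valid

# A, B, C pairwise sequential: the six linear orders all survive.
print(len(orientations([("A", "B"), ("B", "C"), ("A", "C")])))  # 6

# Sequential pairs forming a five-cycle, all other pairs concurrent:
# no orientation works, so no directed acyclic network fits.
five_cycle = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
print(len(orientations(five_cycle)))  # 0
```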

Limiting Values of Interaction Contrasts

A factor level is particularly informative if it makes a process so long
that it is almost always on the critical path in an AND network, or so
short that it is almost always on the shortest path through an OR network.
There are three reasons for interest in such extreme factor levels. (1)
They are especially useful for distinguishing concurrent and sequential
processes. (2) They are useful for analyzing sequential processes on
opposite sides of a complete Wheatstone bridge (Figure 3.6). (3)
Coupled slacks are useful for quantitative analysis of the network, and
estimates of these parameters become available with extreme factor
levels. It will be useful here to examine the simulated data to determine
how good the estimates of the coupled slack parameters are.
The problem of distinguishing concurrent and sequential processes is
known to be hard (Townsend, 1972). In this section we show that it can
be solved, in principle, for AND and OR task networks. The discussion
is, unavoidably, technical. Readers uninterested in technical details can
skip to the next section (on Additive Factors) without loss of continuity.
The next chapter explains the assumptions underlying the results.

Concurrent processes

Suppose processes A and B are concurrent in an arbitrary directed acyclic
AND or OR task network. Suppose the previous assumptions about
selective influence hold. As usual, the levels of the factor influencing A
are labeled i, i = 1, 2,... and the levels of the factor influencing B are
labeled j, j = 1, 2,... When one factor is held fixed, the level numbers of
the other factor increase as mean response times increase.
For levels of i and j greater than 1, the interaction contrast (ΑΒ)ij is
defined in Equation 3.3 as (ΑΒ)ij = E[Tij] − E[T1j] − E[Ti1] + E[T11]. Then
for any fixed j, the sequence (ΑΒ)2j, (ΑΒ)3j,... converges to a limit.
The limiting value may be different for every column j. The
reasoning is in the Appendix; a proof based on stronger assumptions is in
Schweickert and Wang (1993).
Because the interaction contrasts in a given column j have a limiting
value, they will become close to each other as one moves down the
column. This is the Cauchy criterion for convergence (Bartle, 1964, p.
115; Dzhafarov, 1992). It is useful because it can be applied even if the
value of the limit is unknown. A drawback is that it requires greater
precision in the data than the tests proposed earlier, because it is based on
(near) equality, rather than inequality. Of course, in experiments one can
only check that the criterion is met for the finite number of cases
observed, so one cannot prove conclusively by experiment that the
sequence converges.
Just as convergence is predicted for each column, it is predicted for
each row. In a given row i, the sequence (ΑΒ)i2, (ΑΒ)i3,... converges to a
limit, and this may be tested with the Cauchy criterion. The
row limit may be different for every row i.
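An informal check of the Cauchy criterion on a finite row (or column) of contrasts can be mechanized as in the sketch below (the helper name is ours; the 5 ms tolerance echoes the criterion adopted later in the chapter).

```python
def nearly_converged(seq, tol=5.0):
    """Informal Cauchy check on a finite sequence of interaction
    contrasts: trailing neighbors must differ by at most tol."""
    tail = [x for x in seq if x is not None]  # skip empty baseline cells
    if len(tail) < 2:
        return False
    last = tail[-3:]
    return all(abs(a - b) <= tol for a, b in zip(last, last[1:]))

# A row of interaction contrasts (row mu_A = 500 of Table 3.5).
row = [None, -14, -21, -25, -29]
print(nearly_converged(row))         # neighbors 4 ms apart -> True
print(nearly_converged(row, tol=2))  # stricter tolerance -> False
```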
The reasoning in the appendix of this chapter is not informative about
the value of the limit. The theoretical limiting values are known if an
additional assumption is made (Schweickert & Wang, 1993). Consider
concurrent processes A and B. Suppose that as the level of the factor
prolonging A increases, the probability approaches 1 that A is on the
critical path. Then we say the prolongation of A overcomes the total
slack for A in the limit. If this assumption holds, the limit for column j
will be −E[T1j] + E[T11]. Likewise, if the prolongation of B overcomes
the total slack for B in the limit, then the limit for row i will be
−E[Ti1] + E[T11]. Note that the limit for row i is not necessarily the
same as the limit for column j.
Although the theory predicts that the row and column limits exist, it is
hard to test this in practice, because the sizes of the prolongations needed
to overcome the slacks would not ordinarily be known. An experiment
in which limiting values are found is evidence that the limits exist. But
an experiment in which limiting values were not found does not establish
that limits do not exist. The simulated data sets in Tables 3.1-3.6 are
invaluable for assessing the situation. They show, generally, that the
limits can be reached experimentally, although the prolongations
required are rather large.
Consider the interaction contrasts in Tables 3.1 and 3.2 for concurrent
processes. The lower the row number (or column number) the sooner it
is expected to converge. By choosing a small criterion for convergence,
say 5 ms, one can determine whether the numbers in adjacent columns in
a given row are close to one another. In each table, the second row (i.e.,
the first row containing numbers) converges by this criterion. Likewise,
the columns converge, except for the last one in Table 3.2. This pattern is
expected for concurrent processes, because for a given combination of
factor levels, if A is almost always on the critical path, then B is almost
never on the critical path, and vice versa. The resulting pattern is subtle,
because a row converges if it is a low numbered row and there are many
columns, and a column converges if it is a low numbered column and
there are many rows. As convergence begins to appear in the last
column (in the lower right hand corner), it tends to disappear for the last
row, and vice versa. This continues, no matter how far the lower right
hand corner is extended.
Do the limits equal the bounds given in the Appendix (Inequality
A1)? In our results, the answer depends on the distributions of the
process durations and the prolongations. For the exponential
distributions in Table 3.1, the interaction contrasts in column 3 (for μB =
150) are converging, but they are far from the lower bound of
−55.018 = −671.111 + 616.093. On the other hand, in Table 3.2, for the truncated
normal distribution, the numbers in column 3 have converged to their
lower bound of −19 = −547 + 528. Likewise, the numbers in row 3 have
converged to their lower bound of −28 = −556 + 528. Note that the row
limit is different from the column limit.
The lower tail of the distribution of the process when prolonged is the
key to whether the bound in the Appendix is the limit. The lower tail is
thick for the exponential distributions but thin for the truncated normal
distributions. When the duration of a process A is exponentially
distributed (or is the sum of a few exponentially distributed durations)
there is a nonnegligible probability of a duration too low to put A on the
critical path, even when the mean duration of A is large. In contrast, with
the truncated normal distribution, as the mean increases the area under
the lower tail becomes negligible.

Sequential processes

Suppose processes A and B are sequential. Then the sequence of
interaction contrasts in each row converges to a limit, and the sequence
in each column also converges to a limit (see the Appendix). The limit
for the rows may be different from the limit for the columns. A stronger
result in Schweickert and Wang (1993) is based on the following
stronger assumption. Suppose A and B are sequential processes in an
AND network, with A preceding B. Suppose that as the levels of the
factors increase, the probability approaches 1 that all the following
events occur: (1) A and B are both on the critical path, (2) A is on the
longest path from the starting vertex of the network to the start of B, and
(3) B is on the longest path from the end of A to the response. Then we
say the prolongations overcome the relevant slacks in the limit.
If this condition occurs in addition to selective influence, then as the
factor levels i increase the limits R1, R2,..., Ri,... for the rows will
themselves approach a limit L. Further, as the factor levels j increase, the
limits C1, C2,..., Cj,... for the columns will approach the same limit, L.
The limit is important: its value is E[K(A, B)]. That is, the double
sequence (ΑΒij) converges to the expected value of the coupled slack
(Schweickert & Wang, 1993). This can be tested with the Cauchy
criterion for double sequences (Bartle, 1964). To be certain that a limit
exists, the Cauchy criterion must be tested for all possible factor levels.
Empirically, of course, only a finite number of levels is available for
each factor, that is, a table of interaction contrasts with only a finite
number of rows and columns can be checked. The Cauchy criterion for
such a table can be stated informally in this way: The interaction
contrasts in the last two columns of the last row of the table must be
close to each other and the interaction contrasts in the last two rows of
the last column must be close to each other. In other words, the number
in the lower right hand corner of the table will be close to the neighbor
above it and to the neighbor beside it. If the limit is indeed the expected
value of the coupled slack, then these numbers will be close to E[K(A,
B)]. If the prolongations do not overcome the relevant slacks in the limit,
it is not known whether, in theory, the row limits would approach the
column limit, and if so, what this limit would be.
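The informal corner test just described is easy to mechanize. In this sketch (function name ours; the example entries are quoted from the lower right of Table 3.6), the lower right entry of a finite contrast table is compared with its two neighbors and, if they agree within tolerance, returned as an estimate of E[K(A, B)].

```python
def corner_estimate(table, tol=5.0):
    """Informal Cauchy criterion for a double sequence: the lower
    right entry must lie within tol of the neighbor above it and the
    neighbor beside it.  If so, return it as an estimate of the
    expected coupled slack E[K(A, B)]; otherwise return None."""
    corner = table[-1][-1]
    beside = table[-1][-2]
    above = table[-2][-1]
    if abs(corner - beside) <= tol and abs(corner - above) <= tol:
        return corner
    return None

# Lower right 2 x 2 block of the interaction contrasts in Table 3.6.
tail = [[-193, -202],
        [-202, -203]]
print(corner_estimate(tail))  # -203, near the parameter E[K(A, B)] = -213
```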
The criterion for convergence used previously was 5 ms. In Table
3.3, the interaction contrasts at the end of the last row differ by no more
than 5 ms (21.462 is close to 25.085); the same is true for those at the
end of the last column (23.684 is close to 25.085). It appears that the
row limit equals the column limit. However, these values are far from
the parameter value of E[K(B, C)] = 122.967. (The value of this
parameter can be easily found by using an Order-of-Processing diagram
to calculate expected values of the relevant path durations in Equation
3.1.) By comparison, the interaction contrasts in the lower right hand
corner of Table 3.4 are not only within 5 ms of each other, but some are
within 5 ms of the parameter value of E[K(B, C)] = 64. (This value was
obtained by estimating quantities in Equation 3.1 with simulations.)
However, large values for the mean duration of process B were required
for convergence with the truncated normal distributions. For the highest
value used for the mean for B, the increase in response time produced by
prolonging B, 388, is greater than the response time itself, 314. Such
large increases in response time are sometimes reported in experiments,
but not often.
The situation is similar for the sequential processes in the Wheatstone
bridge. The interaction contrasts at the end of the last row of Table 3.5
are close to each other (−25.3 is close to −29.2). Likewise, those at the
end of the last column are close also (−27.4 is close to −29.2).
Evidently, the rows and columns have nearly converged to the same
limit. However, their values are far from the parameter value, E[K(A, B)]
= −130.0 (found with the Order-of-Processing diagram). By comparison,
the interaction contrasts in the lower right hand corner of Table 3.6 are
close together (− 202 is close to − 203, for the last row and the last
column). Further, the interaction contrast of − 203 in Table 3.6 is fairly
close to the parameter value, E[K(A, B)] = − 213. Convergence, or near
convergence, was only obtained in Table 3.6 when the processes were
prolonged to large durations. The need for longer prolongations in Table
3.6 than in Table 3.5 is due, to some extent, to the use of simulations in
the former and exact values in the latter.
To summarize, the interaction contrasts converged faster for process
durations with exponential distributions than for those with truncated
normal distributions. The latter required very large prolongations for
convergence. Part of the difference is probably due to the use of an
algorithm for the exponential distribution and simulations for the normal.
With the exponential distribution, interaction contrasts did not converge
to the bounds in the Appendix; nonetheless, with sequential processes,
the rows and columns converged to the same quantity. With truncated
normal distributions, the interaction contrasts converged slowly, but
when they did converge it was to the bounds in the Appendix. The
results demonstrate the feasibility of investigating limiting values of
interaction contrasts experimentally, but high levels of the factors
selectively influencing the processes must be available. The limiting
values may not be good estimators of the network parameters.

Building Blocks: Superprocesses and Stages in Task Networks

Just as an atom has electrons as parts and may in turn be a part of a


molecule, it is natural to think a process may have subprocesses or be a
part of a superprocess. What we have in mind are not arbitrary subsets
and supersets of processes, but sets of processes that behave as units in
some sense. In this section we consider how sets of processes can form
natural units. We first explain how a set of processes that form a
superprocess can be replaced by a single process. In fact, many of the
“processes” that we regard as elemental may be superprocesses. We then
introduce a special kind of superprocess, a stage. Two factors selectively
influencing different processes have additive effects as a consequence of
the network structure if and only if the processes are in different stages.
Thus, we come full circle, first finding that two factors selectively
influencing two different processes are typically not additive, then
finding that two factors selectively influencing two processes in two
different stages are always additive.

Superprocesses

A vertex is incident with an arc if the vertex is either the starting or
ending vertex of the arc. Suppose M is a subnetwork of the directed
acyclic network N. A vertex of attachment of M is a vertex of M which
is incident with an arc of N not in M. A starting vertex of M is a vertex s
of M such that no arc of M has s as its ending vertex. An ending vertex
of M is a vertex t of M such that no arc of M has t as its starting vertex.
A superprocess in a directed acyclic network N is a subnetwork with
exactly one starting vertex, exactly one ending vertex, which is different
from the starting vertex, and no vertex of attachment other than the
starting or ending vertex.
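This definition can be checked mechanically, as in the sketch below. Arcs are (tail, head, label) triples, and the example networks are illustrative toys of our own: the labels E and F follow Figure 3.10, but the arc G and the second network are assumptions for the example.

```python
def is_superprocess(network, sub):
    """Check the definition in the text: the subnetwork 'sub' must
    have exactly one starting vertex, exactly one ending vertex, and
    no vertex of attachment other than those two."""
    tails = {t for (t, _, _) in sub}
    heads = {h for (_, h, _) in sub}
    verts = tails | heads
    starts = tails - heads   # no arc of sub ends there
    ends = heads - tails     # no arc of sub starts there
    if len(starts) != 1 or len(ends) != 1:
        return False
    # Vertices of attachment: vertices of sub incident with an arc
    # of the network that is not in sub.
    outside = [a for a in network if a not in sub]
    attach = {v for (t, h, _) in outside for v in (t, h) if v in verts}
    return attach <= (starts | ends)

# As in Figure 3.10: E and F (in parallel from o to c), together with
# vertices o and c, form a superprocess; G runs on from c to r.
net = [("o", "c", "E"), ("o", "c", "F"), ("c", "r", "G")]
print(is_superprocess(net, net[:2]))   # True

# Here D attaches at the internal vertex a, so {A, B} is not one.
net2 = [("o", "a", "A"), ("a", "r", "B"), ("a", "r", "D")]
print(is_superprocess(net2, net2[:2]))  # False
```

Because a superprocess attaches to the rest of the network only at its starting and ending vertices, everything between them can be treated as a unit.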
A superprocess M has the following three properties (Schweickert,
1983b), which make the arcs in it behave as a unit with respect to the
arcs outside of it. (1) It is weakly connected; that is, given any two
vertices u and v of M one can go along arcs of M from u to v, ignoring
the directions of the arcs. (2) The arcs of M are convex; that is, if X and
Z are arcs of M, and there is an arc Y such that X precedes Y and Y
precedes Z, then Y is an arc of M. (3) The arcs of M are partitive; that is,
an arc outside M is sequential with an arc inside M if and only if it is
sequential with all arcs of M, and an arc outside of M is concurrent with
an arc of M if and only if it is concurrent with all the arcs of M.
The set of processes represented by arcs in a superprocess M behaves
as a unit in many ways. Consider an arc C outside the superprocess M. Then
if C precedes any arc in M, it precedes every arc in M, and if C follows
any arc in M, it follows every arc in M.
Further, suppose every arc is assigned a nonnegative real number as
duration. If C follows every arc in M, then for any two processes
represented by arcs A and B in M, s(A, C') = s(B, C'). If C precedes
every arc in M, then for any two processes represented by arcs A and B in
M, s*(A, C") = s*(B, C") (Schweickert, 1983b). Here, s*(A, C") is the
slack from A to C in the network formed from the original task network
by reversing the direction of every arc; s*(B, C") is defined similarly.
Because of these properties, a superprocess can be replaced by a single
arc in the network with no loss of information about arcs outside of it.
For more on partitive sets, see Golumbic (1980).

Additive Factors and Stages

What conclusions can be drawn from additivity? Consider the network
in Figure 3.10. The vertex c is called a cut vertex (sometimes called a
vertex of articulation). It is obvious that if one factor prolongs a single
process to the left of vertex c and another factor prolongs a single
process to the right of c, the factors will have additive effects on mean
response time. However, an observation that two factors have additive
effects on response time does not imply that the factors influence
processes on either side of a cut vertex. Consider the AND network in
Figure 3.1. Two factors selectively influencing the SOA and visual
search will have additive effects if the duration of the produced interval
is relatively small, but not if the duration of the produced interval is
relatively large. When two factors prolong two processes, let us say that
additivity follows from network structure if for all assignments of
numbers as process durations and as prolongations produced by the
factors, the two factors have additive effects.
We will see that the only way additivity of two factors selectively
influencing two different processes follows from network structure is
when there is a cut vertex between the processes. Sternberg’s (1969)
additive factor method is based on the idea that if two factors have
additive effects on mean response time they affect different stages and if
they interact they affect the same stage. We define the part of an acyclic
task network from the starting vertex to the first cut vertex as the first
stage, the part from the first cut vertex to the second as the second stage,
and so on. Note that every stage is a superprocess, but not all
superprocesses are stages. With this natural definition of stages, two
factors selectively influencing different processes have additive effects as
a consequence of the network structure if and only if the processes are in
different stages.
The following definitions and assumptions are needed. We suppose
the acyclic network for a task is weakly connected. A cut vertex of a
weakly connected directed graph is a vertex v such that if v and all arcs
incident with v (i.e., arcs with v as the starting or ending vertex) are
removed, the resulting directed graph is not weakly connected. It is easy
to see that in a directed acyclic graph with one starting vertex o and one
ending vertex r, a vertex v different from o and r is a cut vertex if and
only if every directed path from o to r contains v.
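This path characterization suggests a direct, if inefficient, test (a sketch, adequate for the small networks considered here): enumerate every directed path from o to r and intersect their vertex sets.

```python
def all_paths(arcs, o, r):
    """All directed paths from o to r, each as a list of vertices."""
    out = {}
    for t, h in arcs:
        out.setdefault(t, []).append(h)
    paths, stack = [], [[o]]
    while stack:
        p = stack.pop()
        if p[-1] == r:
            paths.append(p)
        else:
            for nxt in out.get(p[-1], []):
                stack.append(p + [nxt])
    return paths

def cut_vertices(arcs, o, r):
    """Vertices other than o and r that lie on EVERY path from o to r."""
    paths = all_paths(arcs, o, r)
    common = set(paths[0]).intersection(*map(set, paths))
    return common - {o, r}

# Two stages joined at c, as in Figure 3.10: c is a cut vertex.
print(cut_vertices([("o", "c"), ("c", "r")], "o", "r"))  # {'c'}

# A Wheatstone bridge has no cut vertex between o and r.
bridge = [("o", "u"), ("u", "r"), ("o", "v"), ("v", "r"), ("u", "v")]
print(cut_vertices(bridge, "o", "r"))  # set()
```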

Fig. 3.10. The subnetwork between o and c is a stage, as is that between c and r.
Processes E and F together with vertices o and c form a superprocess.

As before, for a vertex x that precedes a vertex y, let d(x, y) denote the
duration of the longest path between them in an AND network and of the
shortest path between them in an OR network. Let f[d(a, b),..., d(c, e)]
and g[d(u, v),..., d(w, z)] be functions of path durations. If for all possible
values of the path durations in the arguments of f and g, f[d(a, b),...,
d(c, e)] = g[d(u, v),..., d(w, z)], then we write

f[d(a, b),..., d(c, e)] ≡ g[d(u, v),..., d(w, z)].



Consider two processes A and B in an AND network. If A and B are
concurrent, there are assignments of process durations which yield a
negative interaction (Schweickert, 1978; Schweickert & Townsend,
1989, Theorem 1). Hence, for additivity to follow from network
structure, the influenced processes must be sequential. If A and B are
sequential and there exist durations for the processes in the network for
which k(A, B) ≠ 0 on a particular trial, then, with sufficiently long
prolongations of A and B, the factors selectively influencing them would
not have additive effects. Hence, factors selectively influencing A and B
have additive effects as a consequence of the network structure if and
only if k(A, B) ≡ 0; that is, the value of the coupled slack on a particular
trial is 0 no matter what the durations of the processes are for that trial.
According to the following theorem, this occurs if and only if there is a
cut vertex between A and B.

Theorem (Schweickert, Fisher & Goldstein, 2010) Let A and B be
two sequential arcs in a directed acyclic AND network or OR network.
The coupled slack k(A, B) ≡ 0 if and only if there is a cut vertex between
A and B.

In an experiment, process durations will be random variables, not
fixed real numbers. Suppose process A precedes cut vertex c and process
B follows c. Consider an AND network. Let D(o, c) denote the random
duration of the longest path from the starting vertex, o, of the network to
vertex c, and let D(c, r) denote the random duration of the longest path
from vertex c to the ending vertex, r, of the network. Then the random
completion time of the task is D(o, c) + D(c, r). With all but the most
exotic definitions of selective influence, a factor selectively influencing
A will change D(o, c) and a factor selectively influencing B will change
D(c, r), and the factors will have additive effects.
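A quick numerical check of this decomposition (the four-process network and its durations are illustrative assumptions of ours, patterned on Figure 3.10): with a cut vertex c, the completion time is D(o, c) + D(c, r), so whatever the durations, the interaction contrast is identically zero.

```python
import random

def rt_with_cut(u, v, dA, dE, dB, dG):
    """AND network with a cut vertex c: A (prolonged by u) and E run
    in parallel from o to c; B (prolonged by v) and G run in parallel
    from c to r.  Total time is D(o, c) + D(c, r)."""
    return max(dA + u, dE) + max(dB + v, dG)

random.seed(1)
for _ in range(5):
    dA, dE, dB, dG, u, v = (random.uniform(0, 50) for _ in range(6))
    c = (rt_with_cut(u, v, dA, dE, dB, dG)
         - rt_with_cut(0, v, dA, dE, dB, dG)
         - rt_with_cut(u, 0, dA, dE, dB, dG)
         + rt_with_cut(0, 0, dA, dE, dB, dG))
    print(round(c, 12))  # 0.0 every time: the factors are additive
```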
But suppose there is no cut vertex between processes A and B.
According to the theorem, there is at least one assignment of the process
durations such that k(A, B) ≠ 0. If there were only one such assignment,
one would be concerned that it occurs with probability 0, so an
interaction of factors selectively influencing processes A and B would not
be observed. However, Schweickert, Fisher and Goldstein (2010)
showed that if there exists one assignment of process durations for which
k(A, B) ≠ 0, then there is a region of positive volume in the space of
process durations and prolongations of A and B, such that for all values
in this region k(A, B), and hence the interaction, is nonzero, always with
the same sign. Then a probability distribution for the process durations
and prolongations of A and B having positive probability over this
region, and probability 0 elsewhere, will produce a nonzero interaction.
To use the theorem to infer the existence of a cut vertex between A
and B, one must know whether k(A, B) ≡ 0. In practice, one would only
know that a nonzero interaction was never found in the available data. It
is then reasonable to conclude that a cut vertex between A and B
exists if nonsignificant interactions are found in studies with high power,
using a number of levels of i and j, under a variety of circumstances
likely to affect the durations of processes other than A and B.
This chapter is about distinguishing concurrent and sequential
processes with patterns in mean reaction times and interactions. This is
usually easy, fortunately, but the case of the Wheatstone bridge is not. A
reader interested in mathematical foundations for the results of this
chapter will find them in the following chapter. A reader interested in
applications of the results to dual tasks will find them in Chapter 5.

Appendix

Limits of Interaction Contrasts

Interaction contrasts for extreme values of the factors are discussed here
for AND networks; results for OR networks are analogous. When the
factor selectively influencing process A is at level i, Ui denotes the
increase in the duration of A from its baseline D(A). When the factor
selectively influencing process B is at level j, Vj denotes the analogous
quantity for the increase in duration of B. Let the expected value of the
response time, when the factor prolonging A is at level i and the factor
prolonging B is at level j, be denoted E[Tij]. Then

ABij = E[Tij]  E[T1j]  E[Ti1] + E[T11].

Suppose processes A and B are concurrent. Then by Inequality (4.4)
in the next chapter, for every i,

ABij >  E[T1j] + E[T11]. (3.A1)

Then for any fixed j, the sequence (ABij) is monotonically decreasing in


i but is bounded below. By the monotone convergence theorem (e.g.
Bartle, 1964, p. 111) as i increases, the sequence (ABij) converges to a
limit for every j. Similarly, for fixed i, the sequence (ABij) converges to
a limit as j increases. The limit may or may not be the bound in
Inequality (3.A1).
Suppose processes A and B are sequential. If A and B are not in a
Wheatstone bridge, then the sequence (ABij) is monotonically
increasing in i for each j, and is monotonically increasing in j for each i.
Further, each sequence is bounded, because ABij < E[K(A, B)] for all i
and j (see Schweickert, 1982, Table 1 for AND networks, and
Schweickert & Wang, 1993, for OR networks). Then, by the monotone
convergence theorem, the sequence of interaction contrasts in each row
converges to a limit as the column numbers increase; the sequence in
each column converges to a limit also. The limit need not be E[K(A, B)].
The same principle applies if A and B are in an incomplete
Wheatstone bridge in which each path from o to r includes either A" or
B'. The sequence (ABij) is monotonically decreasing in j for each i, and
monotonically decreasing in i for each j. Further, each sequence is
bounded below, because E[K(A, B)] < ABij for all i and j. Then the
sequence in each row converges to a limit, as does the sequence in each
column. Neither the limit for a row, nor the limit for a column need be
E[K(A, B)]. The complete Wheatstone bridge is discussed in the body of
the chapter.
Also discussed are conditions under which the limits will reach the
bounds given here. Briefly, if processes A and B are sequential, whether
on opposite sides of a Wheatstone bridge or not, if P[Ui ≥ max{S(A, r),
S(A, B')} and Vj ≥ max{S(B, r), S*(B, A")}] → 1, then ABij → E[K(A, B)].
Here, S*(B, A") = S(B, r) − K(A, B) is the slack from B to A in the network
in which all arcs are reversed.
Chapter 4

Theoretical Basis for Properties of Means and Interaction Contrasts

In this chapter we present the mathematics underlying results presented
in earlier and later chapters. This material can be skipped without loss of
continuity.

Notation and Definitions

A vector x with n components is an ordered list of n entities, usually
written as a column,

\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.

For two vectors whose components are real numbers,

x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} and y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},

we write x ≤ y if x1 ≤ y1,..., xn ≤ yn.


Suppose f is a real valued function defined on vectors with n
components. We sometimes write f(x) as f(x1,..., xn). Function f is
monotonically increasing if x ≤ y implies f(x) ≤ f(y). A monotonically
decreasing function is defined analogously.

Probability spaces

Suppose a subject awaits a trial in an experiment. Suppose the subject is
prepared for either level 1 or level 2 of Factor A. There is a potential
sample of values of process durations at level 1 and a potential sample of
values of process durations at level 2. Now suppose the experimenter
presents the trial with Factor A at level 1. The sample of values of
process durations at level 1 is taken, and the sample of values at level 2 is
not taken. Although in the lab a sample is observed at either one level of
the factor or the other, but not both, it is convenient to assume there
exists a theoretical sample space in which process durations at both
levels are assigned on each trial. We assume the distribution of the
subset of process durations at level 1 in this theoretical sample space is
equivalent to the distribution of the process durations at level 1 in the lab,
and the analogous statement is true for the distribution of the subset of
process durations at level 2. Having a common theoretical sample space
allows us to speak of things such as adding the duration of a process at
level 1 to the duration of another process at level 2. In defining this
common theoretical sample space, we must be sure no substantial
information is introduced that could potentially conflict with what is
observable.
We now define a probability space. For more details, see Luce
(1986) and Feller (1971). In statistics, an action such as flipping a coin
or releasing a mouse into a maze is called an experiment. Outcomes are
results such as a head coming up or the mouse taking the rightmost
branch. The set of all possible outcomes of an experiment is called the
sample space, Ω. We assume it is nonempty.
We associate probabilities with certain subsets of the sample space
called events. The events are members of the set S of events. If the
sample space, Ω, is finite or countably infinite, S is the set of all subsets
of Ω. In any case, the set S of events satisfies the following assumptions:

(i) Ω ∈ S.
(ii) If S ∈ S, then ~S ∈ S. Here, ~S, the complement of S, is the
set of elements of Ω that are not elements of S.
(iii) If {S1, S2,...} is a countable set of members of S, then S1 ∪ S2 ∪ ... ∈ S.

A set of subsets of Ω satisfying (i), (ii) and (iii) is called a sigma algebra.
(Hays, 1994, considers any subset of the sample space to be an event, but
he is speaking of a finite or countably infinite sample space.)
A probability measure on S is a function P from the set of events into
the real numbers such that

(i) P(Ω) = 1;
(ii) If S ∈ S, then P(S) ≥ 0;
(iii) If {S1, S2,...} is a finite or countably infinite set of pairwise
mutually exclusive events, then

P(S1 ∪ S2 ∪ ...) = P(S1) + P(S2) + ... .

A probability space is a triple, <Ω, S, P>, where Ω is a nonempty set, S
is a set of events of Ω, and P is a probability measure on S.
A univariate random variable associates a real number with every
outcome of an experiment. For example, with a coin toss, we could set
the random variable X to 1 if a head comes up, and set X to 2 if a tail
comes up. Note that two different random variables can have the same
distribution. If for this coin toss, we set random variable Y to 2 if a head
comes up, and set Y to 1 if a tail comes up, then X and Y have the same
probability distribution, that is, P(X = 1) = P(Y = 1) = .5, and P(X = 2) =
P(Y = 2) = .5. However, X and Y are not the same random variable.
If two random variables X and Y are defined on the same probability
space, another random variable can be defined through operations on
them. In the above example, define Z = Y − X. If the coin comes up
heads, X = 1 and Y = 2, so the value of Z is 1.
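The coin-toss construction can be sketched in a few lines; the outcome labels "H" and "T" are hypothetical stand-ins for the two points of the sample space.

```python
# Sketch of the coin-toss example: X and Y are different functions on the
# same two-point sample space {"H", "T"}, yet under a fair coin they have
# the same distribution.
X = {"H": 1, "T": 2}
Y = {"H": 2, "T": 1}

def Z(outcome):
    # Z = Y - X is well defined because X and Y share a sample space.
    return Y[outcome] - X[outcome]

print(Z("H"), Z("T"))  # 1 -1
```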

For some experiments, such as rolling a red die and a green die
simultaneously, it is natural to consider a random vector that associates a
list of real numbers with every outcome of the experiment. For example,
X1 can be set to the number of dots that come up on the red die, and X2
can be set to the number of dots that come up on the green die. Then

X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}

is a random vector. Each component of a random vector is a univariate
random variable. Here, we consider only random vectors with a finite
number of components.
We use a capital letter, e.g., X, to denote a random variable, and the
corresponding small letter, x, to denote a value the random variable takes
on. Likewise, X denotes a random vector, and x denotes a vector of
values taken on by the components of X.
A random vector with n components

X = \begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}

has a cumulative distribution function FX(x) defined on every vector x
with n real valued components,

x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.

The cumulative distribution function for X is

FX(x) = P(X ≤ x) = P(X1 ≤ x1,..., Xn ≤ xn).

The cumulative distribution FX(x) is sometimes called the joint
cumulative distribution of X1,..., Xn. The cumulative distribution function
of the single random variable Xk is Fk(xk) = P(Xk ≤ xk). It is sometimes
called the marginal cumulative distribution function of Xk. (To avoid a
subscript to a subscript, k is used as subscript for F rather than Xk.)

Ordering random variables

If two random vectors X and X̂ have the same cumulative distribution
function, we write X ≈ X̂. Note that this relation requires that X and X̂
have the same number of components, but it does not require that they
are defined on the same probability space.
We now define an order on random variables. In economics, it is
called first order stochastic dominance and in operations research it is
often called “the usual stochastic order” (Shaked & Shanthikumar,
2007). We use the latter term. For material here we follow the
useful reviews by Müller and Stoyan (2002) and Shaked and
Shanthikumar (2007). See also Townsend (1990).
The definition of the usual stochastic order is intuitive for univariate
random variables. Random variable X is smaller than random variable Y
in the usual stochastic order, written X ≤st Y, if for all t, FX(t) ≥ FY(t),
where FX(t) and FY(t) are the cumulative distribution functions of X and
Y, respectively. Note that the random variable with the larger cumulative
distribution function will have the smaller expected value. A result
important for our purposes is that X ≤st Y if and only if Ef(X) ≤ Ef(Y) for
all monotonically increasing real valued functions f for which both
expectations exist (Müller & Stoyan, 2002, Theorem 1.2.8; Shaked &
Shanthikumar, 2007, p. 4).
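As an illustration not taken from the text, the defining CDF condition can be checked numerically. Here Y is built as X plus a nonnegative increment, which guarantees X ≤st Y; the distributions and the evaluation grid are arbitrary choices.

```python
import random

random.seed(0)

# Build Y = X + W with W >= 0, so X is below Y in the usual stochastic order.
xs = [random.expovariate(1.0) for _ in range(5000)]
ys = [x + random.expovariate(2.0) for x in xs]

def ecdf(sample, t):
    # Empirical cumulative distribution function evaluated at t.
    return sum(v <= t for v in sample) / len(sample)

# X <=st Y requires F_X(t) >= F_Y(t) at every t; check on a grid.
dominated = all(ecdf(xs, t) >= ecdf(ys, t) for t in [0.1 * k for k in range(40)])
print(dominated)  # True
```

Because each y equals its paired x plus a nonnegative amount, the empirical CDF of X dominates that of Y at every grid point.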
For random vectors it is convenient to generalize the consequence
and use it as the definition. Let R denote the real numbers. For random
vectors X and Y, each with n components, we say X ≤st Y if Ef(X) ≤
Ef(Y) for all monotonically increasing functions f from Rn into R, for
which both expectations exist. For random vectors the relation ≤st is
stronger than the assumption that the corresponding joint cumulative
distribution functions are ordered pointwise. That is, for random vectors X
and Y, if X ≤st Y, then FX(t) ≥ FY(t) for all t, but the converse is not true.
(As a technical detail, for certain random variables the relation ≤st is not
transitive, because the integrals required for certain expected values do
not exist. With little loss of generality, we assume for random variables
discussed here that ≤st is transitive.)
Recall that if random vector X and random vector Y have the same
joint cumulative distribution function, we write X ≈ Y. If X ≤st Y and Y
≤st X, it follows that X ≈ Y. Some authors write X =st Y to indicate that
X and Y have the same cumulative distribution function. (Another
technical detail is that X and Y can have the same joint cumulative
distribution function while having joint density functions that differ at a
countable number of isolated points. With little loss of generality, we
assume this situation does not arise for random variables here.)
The next result characterizes the usual stochastic ordering (see, e.g.,
Müller & Stoyan, 2002, Theorem 3.3.5; Shaked & Shanthikumar, 2007,
Theorem 6.B.1):
The following two statements are equivalent for two random vectors
X and Y.

(1) X ≤st Y.
(2) There exist random vectors X̂ and Ŷ on the same probability
space, with X ≈ X̂ and Y ≈ Ŷ, such that P(X̂ ≤ Ŷ) = 1.

The following fact is useful. Suppose X ≤st Y and X' is a random
vector obtained from X by omitting some components, and Y' is
obtained from Y by omitting the same components. Then X' ≤st Y'
(Müller & Stoyan, 2002, Theorem 3.3.10).
Suppose condition (2) is true. Because X̂ and Ŷ are on the same
probability space, we can take their difference Ẑ = Ŷ − X̂. Let 0
denote a vector having the same number of components as Ẑ, every
component being 0. Because P(X̂ ≤ Ŷ) = 1, P(0 ≤ Ŷ − X̂) = 1, that
is, P(0 ≤ Ẑ) = 1. On the other hand, suppose Ẑ is a random vector on
the same probability space as X̂ and Ŷ, every component of Ẑ is
nonnegative, and Ŷ = X̂ + Ẑ. Then clearly P(X̂ ≤ Ŷ) = 1. Hence,
the following condition is equivalent to condition (2), and thereby
equivalent to condition (1).

(3) There exist random vectors X̂ and Ẑ on the same probability
space, with 0 ≤st Ẑ, X ≈ X̂ and Y ≈ X̂ + Ẑ.

(As another technical detail, if 0 ≤st Ẑ, it is possible that a component
of Ẑ takes on a negative value, but this occurs with probability 0. If 0 ≤
Ẑ, it is not possible that a component of Ẑ is negative. Hence, 0 ≤ Ẑ
implies 0 ≤st Ẑ.)

Conditional expectation

For continuous random variables X and Y, the joint density is the
function f(x, y) such that for all real x and y,

P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(x, y)\,dy\,dx.

The marginal density of one of the variables, say X, is the function
fX(x) such that for all real x,

P(X \le x) = \int_{-\infty}^{x} f_X(x)\,dx.

The conditional density of X for a given value y of Y is f(x|y) =
f(x, y)/fY(y). If fY(y) = 0, then f(x|y) is undefined. In expressions such as
f(x|y) we will implicitly assume the denominator is not 0. The
conditional expectation of X for a given value y of Y is

E_X[X \mid y] = \int_{-\infty}^{\infty} x f(x \mid y)\,dx.


If we now take the expected value with respect to Y of the above
conditional expectation, we obtain simply the expected value of X. That
is,

E_Y[E_X[X \mid y]] = \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} x f(x \mid y)\,dx \right) f_Y(y)\,dy
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f(x, y)\,dx\,dy
= \int_{-\infty}^{\infty} x f_X(x)\,dx = E[X].

Analogous results hold for larger numbers of random variables.
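A small discrete check of this identity, using a hypothetical joint distribution and exact rational arithmetic:

```python
from fractions import Fraction as F

# Hypothetical joint pmf p(x, y) on four points.
p = {(0, 0): F(1, 4), (1, 0): F(1, 4), (0, 1): F(1, 8), (2, 1): F(3, 8)}

def p_Y(y):
    # Marginal probability P(Y = y).
    return sum(pr for (x, yy), pr in p.items() if yy == y)

def cond_exp_X(y):
    # E[X | Y = y] = sum_x x p(x, y) / p_Y(y).
    return sum(x * pr for (x, yy), pr in p.items() if yy == y) / p_Y(y)

# Taking the expectation of E[X | Y = y] over Y collapses to E[X].
lhs = sum(cond_exp_X(y) * p_Y(y) for y in {0, 1})
rhs = sum(x * pr for (x, y), pr in p.items())
print(lhs, rhs)  # 1 1
```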

Effects of Experimental Factors on Processes

We begin with a definition of selective influence developed by
Dzhafarov and his colleagues (Dzhafarov, 2003a; Dzhafarov &
Gluhovsky, 2006; Kujala & Dzhafarov, 2008; Dzhafarov & Kujala,
2010). We then explain a stronger assumption needed for our purposes,
selective influence by increments.

Factors selectively influencing random variables

A statement such as “perceptual duration is a random variable that
depends on stimulus intensity” does not specify a random variable until
a level of intensity is set by the experimenter, because before that no
value of perceptual duration can be obtained. We make this way of
speaking more precise by supposing that for every level i, i = 1,..., I of
intensity there is a probability space and a random variable Ai defined on
it. Then perceptual duration A is a random variable in the sense that it is
a member of the family of random variables {A1,..., Ai,..., AI}. Similarly,
to say random variables A and B depend on Factors Α and Β means that
for all levels i, i = 1,..., I of factor Α and j, j = 1,..., J of factor Β there is
a probability space with random variables Aij and Bij defined on it, and
the random vector

A = \begin{pmatrix} A \\ B \end{pmatrix}

is a member of the family of random vectors

\left\{ \begin{pmatrix} A_{11} \\ B_{11} \end{pmatrix}, \ldots, \begin{pmatrix} A_{ij} \\ B_{ij} \end{pmatrix}, \ldots, \begin{pmatrix} A_{IJ} \\ B_{IJ} \end{pmatrix} \right\}.

Under these circumstances we also say the random vector A depends on
Factors Α and Β. Recall that the component random variables of a
random vector are assumed to have a joint distribution. Generalization to
a finite number of factors and of components is straightforward.
Suppose random variables A and B depend on Factors Α and Β. With
the definition of Dzhafarov and his colleagues, random variables A and B
are selectively influenced by Factor Α and Factor Β, respectively, if there
is a random vector C defined on a probability space such that for every
level i of Factor Α there is a function f1i and for every level j of Factor Β
there is a function f2j with

\begin{pmatrix} A_{ij} \\ B_{ij} \end{pmatrix} \approx \begin{pmatrix} f_{1,i}(\mathbf{C}) \\ f_{2,j}(\mathbf{C}) \end{pmatrix}.

To indicate that a random vector has other component random variables
C1,..., Cp not influenced by Factors Α and Β one writes

\begin{pmatrix} A_{ij} \\ B_{ij} \\ C_{1ij} \\ \vdots \\ C_{pij} \end{pmatrix} \approx \begin{pmatrix} f_{1,i}(\mathbf{C}) \\ f_{2,j}(\mathbf{C}) \\ f_3(\mathbf{C}) \\ \vdots \\ f_{p+2}(\mathbf{C}) \end{pmatrix},

where functions f3,..., fp+2 do not depend on the levels i and j. Again,
generalization to a finite number of factors and of vector components is
straightforward.
Suppose random variables A and B are selectively influenced by
Factor Α and Factor Β, respectively. Note that for different pairs of
levels <i, j > and <i', j' > random variables Aij and Ai'j' may not be
defined on the same probability space. But in the definition of selective
influence the same random vector C on the same probability space is
employed for every pair of levels. Hence, through the assumption of
selective influence there are random variables f1i(C) and f1i'(C) defined
on the same probability space that have distributions equivalent to Aij and
Ai'j', respectively. Note also that the level j of Factor Β is not needed to
specify Aij, so we may write simply Ai. This manner of defining selective
influence of factors on random variables brings conceptual clarity and in
a more general form can be employed widely, not only to process
durations, but to perceptual images and mental tests. More about it is in
Chapter 10.
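The definition can be sketched in code; all functions and parameter values below are hypothetical. A single random source C is shared, f1 depends only on the level i of Factor Α, and f2 only on the level j of Factor Β.

```python
import random

random.seed(1)

# Common random source C with two coordinates, shared by all factor levels.
c = (random.gauss(0, 1), random.gauss(0, 1))

def f1(i, c):
    # Duration of A at level i of Factor Alpha; uses only c[0].
    return 200 + 50 * i + 20 * c[0]

def f2(j, c):
    # Duration of B at level j of Factor Beta; uses only c[1].
    return 150 + 30 * j + 20 * c[1]

# The pair (A_ij, B_ij) at any levels (i, j) is built from the SAME c,
# so A and B may be correlated, yet changing j cannot change A's value.
A_21, A_22 = f1(2, c), f1(2, c)
B_21, B_22 = f2(1, c), f2(2, c)
print(A_21 == A_22)  # True
```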

Factors ordering random vectors

Let Dij be a random vector whose component random variables are the
durations of the processes required for a task when Factor A is at level i
and Factor B is at level j; that is, random vector D depends on Factors Α
and Β. A pair of factor levels can be written as a vector <i, j>. These
vectors are partially ordered; if i' ≤ i and j' ≤ j, we write <i', j'> ≤ <i, j>.
Suppose that whenever <i', j'> ≤ <i, j>

Di'j' ≤st Dij.

Then whenever <i', j'> ≤ <i, j> there exist random vectors D̂i'j', D̂ij,
and Ẑ on the same probability space, with 0 ≤st Ẑ, Di'j' ≈ D̂i'j', and
Dij ≈ D̂ij = D̂i'j' + Ẑ. Under these circumstances, we say Factors A and B

order D with the usual stochastic order. This assures a common


probability space for every pair of stochastically ordered random vectors.
But it does not assure that the same probability space will suffice for all
pairs (Fill & Machida, 2001).
A stronger condition is that the random vectors are realizably
monotone (Fill & Machida, 2001). In our notation, for i = 1,..., I and j =
1,..., J the random vectors {Dij} are realizably monotone if there exist
random vectors {D̂ij}, all defined on the same probability space, such
that for every i and j, Dij ≈ D̂ij, and whenever <i', j'> ≤ <i, j>

P(D̂i'j' ≤ D̂ij) = 1.

For discussion of partial orders of factor levels (indices) for which the
usual stochastic order leads to realizable monotonicity, see Fill and
Machida (2001).

Factors selectively influencing random vectors by increments

For our purposes factors must do several things. Factors must selectively
influence process durations in the sense of Dzhafarov and his colleagues,
increasing factor levels must order process durations with some type of
stochastic ordering, and finally, increasing the level of a factor must add
something to the process duration it selectively influences. The
following assumption leads to these.
Consider the random vector

\mathbf{C} = \begin{pmatrix} \hat U_1^* \\ \vdots \\ \hat U_I^* \\ \hat V_1^* \\ \vdots \\ \hat V_J^* \\ \hat A_{base} \\ \hat B_{base} \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix} \ge \mathbf{0}.
 

Now suppose for every level i of Factor Α and every level j of Factor Β

\begin{pmatrix} A_{ij} \\ B_{ij} \\ C_{1ij} \\ \vdots \\ C_{pij} \end{pmatrix} \approx \begin{pmatrix} \hat A_{base} + \hat U_1^* + \cdots + \hat U_i^* \\ \hat B_{base} + \hat V_1^* + \cdots + \hat V_j^* \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix}.

Then random variables A and B are selectively influenced by increments
by Factor Α and Factor Β, respectively. Generalization to larger finite
numbers of factors and components is straightforward.
Suppose Factor Α and Factor Β selectively influence by increments
random variables A and B, respectively. Then clearly A and B are
selectively influenced by Factor Α and Factor Β, according to the
definition of Dzhafarov and his colleagues. Further, random vectors
{Aij} are realizably monotone with the definition of Fill and Machida
(2001), and hence ordered with the usual stochastic order.
Townsend and Schweickert (1989) showed that a factor increasing
the duration of a process A by adding a nonnegative random variable to
its duration is equivalent to the duration of A at the lower level of the
factor being smaller than the duration of A at the higher level, in the
usual stochastic order. This is also equivalent to modifying the task
network in the following way. We replace the arc in the network
representing process A with two arcs in series. We call one the base,
Abase, and the other the prolongation, U. The duration of process A is the
duration of the base plus the duration of the prolongation. When the
factor selectively influencing process A by increments is at level 1, the
duration of the prolongation U is 0, and the duration of process A is
simply the duration of the base, Abase. That is, the duration of process A is
A1 = Abase. (We use the name of a process with suitable subscripts as the
symbol for the duration of the process.) When the factor selectively
influencing process A by increments is at level i, the duration of the
prolongation U is a nonnegative random variable Ui, and the duration of
process A is the duration of the base Abase plus Ui. That is, when the
factor selectively influencing process A by increments is at level i, the
duration of process A is Ai = Abase + Ui. We do not assume that the
durations of the base and the prolongation are independent. Similarly,
we replace the arc representing process B with two arcs in series, the
base Bbase of process B and the prolongation V of process B. When the
factor selectively influencing process B by increments is at level 1, the
duration of the prolongation V is 0. When the factor selectively
influencing process B by increments is at level j, the duration of the
prolongation V is a nonnegative random variable, Vj, and the duration of
process B is the duration of Bbase plus Vj.
Now suppose the task network has processes U, V, Abase, Bbase, C1,...,
Cp. Suppose Factor A selectively influences process A by increments,
and Factor B selectively influences process B by increments. When
Factor A is at level i and Factor B is at level j, the duration of a process W
is a random variable, which we denote by Wij. (Although the subscripts
could be simplified, we use two subscripts for every process duration to
indicate the general situation.)
For every i and j, let

D_{ij} = \begin{pmatrix} U_{ij} \\ V_{ij} \\ A_{base\,ij} \\ B_{base\,ij} \\ C_{1ij} \\ \vdots \\ C_{pij} \end{pmatrix}.

We assume that there are nonnegative random variables Âbase, B̂base,
Ĉ1,..., Ĉp, Ûi*, for i = 1, 2,..., and V̂j*, for j = 1, 2,..., all defined on
a common probability space such that the following are true.
First, Û1* = V̂1* = 0; that is, the only value Û1* takes on is 0, and
the same is true for V̂1*. Second, for Factor Α at level i and Factor Β at
level j,

D_{ij} \approx \hat D_{ij} = \begin{pmatrix} \hat U_1^* + \cdots + \hat U_i^* \\ \hat V_1^* + \cdots + \hat V_j^* \\ \hat A_{base} \\ \hat B_{base} \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix}. \qquad (4.1)

In particular, for both factors at level 1,

D_{11} \approx \hat D_{11} = \begin{pmatrix} 0 \\ 0 \\ \hat A_{base} \\ \hat B_{base} \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix}.

For Factor Α at level i and Factor Β at level 1,

D_{i1} \approx \hat D_{i1} = \begin{pmatrix} \hat U_1^* + \cdots + \hat U_i^* \\ 0 \\ \hat A_{base} \\ \hat B_{base} \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix}.

For Factor Α at level 1 and Factor Β at level j,

D_{1j} \approx \hat D_{1j} = \begin{pmatrix} 0 \\ \hat V_1^* + \cdots + \hat V_j^* \\ \hat A_{base} \\ \hat B_{base} \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix}.
Now, let

Ûi = Û1* + Û2* + ⋯ + Ûi*
and
V̂j = V̂1* + V̂2* + ⋯ + V̂j*.

Because all the random variables are nonnegative, if i' ≤ i and j' ≤ j,

P(D̂i'j' ≤ D̂ij) = 1;

Di'j' ≤st Dij.

It follows further that if i' ≤ i and j' ≤ j,

\begin{pmatrix} U_{i'j'} \\ V_{i'j'} \end{pmatrix} \le_{st} \begin{pmatrix} U_{ij} \\ V_{ij} \end{pmatrix},

and for all i and j,

\begin{pmatrix} A_{base\,ij} \\ B_{base\,ij} \\ C_{1ij} \\ \vdots \\ C_{pij} \end{pmatrix} \approx \begin{pmatrix} \hat A_{base} \\ \hat B_{base} \\ \hat C_1 \\ \vdots \\ \hat C_p \end{pmatrix}.

The double subscripts indicate that the duration of a process at factor
levels i' and j' is not necessarily the same random variable as the duration
of the same process at factor levels i and j, even if neither factor
influences the process.

Monotonic reaction time means

Suppose the processes in an AND or OR task network are U, V, Abase,
Bbase, C1,..., Cp. Let a list of numerical values of the durations of these
processes be u, v, abase, bbase, c1,..., cp. The reaction time is a function of
these, t(u, v, abase, bbase, c1,..., cp). Clearly, function t is monotonically
increasing.
Suppose expected values of the response time exist at all
combinations of levels of the factors. Consider levels i' and i of Factor A
with i' ≤ i. Given a level j of Factor B, because t is monotonically
increasing and Di'j ≤st Dij, it follows immediately that E(t(Di'j)) ≤ E(t(Dij)).
That is, for a fixed level of Factor B the expected value of the response
time is monotonically increasing with the level of Factor A. Similarly,
for a fixed level of Factor A the expected value of the response time is
monotonically increasing with the level of Factor B.
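A simulation sketch of this monotonicity (all distributions and increment sizes are hypothetical): base durations are drawn once per trial on a shared sample space, and a higher factor level only adds a nonnegative increment, so the mean of the monotone function t can only grow with the level.

```python
import random

random.seed(2)

# One shared sample of base durations per trial: A_base, B_base, C.
trials = [(random.expovariate(1 / 100),
           random.expovariate(1 / 100),
           random.expovariate(1 / 50)) for _ in range(5000)]

def mean_rt(i, j):
    # Level i adds increment u to A, level j adds v to B; the reaction
    # time t = max(A, B) + C is monotone in every process duration.
    u, v = 10.0 * (i - 1), 10.0 * (j - 1)
    return sum(max(a + u, b + v) + c for a, b, c in trials) / len(trials)

increasing = mean_rt(1, 1) < mean_rt(2, 1) < mean_rt(3, 1)
print(increasing)  # True
```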

Interaction contrasts

Let us consider interaction contrasts for two factors selectively
influencing two different processes by increments in an AND network.
The case of an OR network is similar. We derive expressions for the
combined effect of simultaneously prolonging processes A and B in two
steps. We first consider the effect of prolonging A, and then consider the
further effect of prolonging B.
Suppose Factor A is at level i and Factor B is at level j. Consider a
sample

\begin{pmatrix} u \\ v \\ a_{base} \\ b_{base} \\ c_1 \\ \vdots \\ c_p \end{pmatrix},

from the common probability space, of values of the random vector

\begin{pmatrix} \hat U_{ij} \\ \hat V_{ij} \\ \hat A_{base\,ij} \\ \hat B_{base\,ij} \\ \hat C_{1ij} \\ \vdots \\ \hat C_{pij} \end{pmatrix}.

In this sample, the value for the duration of process A when Factor A
is at level i is abase + u. In the sample, a value for the duration of process
A when Factor A is at level 1 is abase. Similarly, for process B, in the
sample the value for the duration of process B when Factor B is at level j
is bbase + v. In the sample, a value for the duration of process B when
Factor B is at level 1 is bbase.
If both factors are at level 1, the total slack for process A is given by
the following equation,

s(A,r) = d(o, r) − d(o, A') − abase − d(A", r), (4.2)



Schweickert (1978). (Recall that A' and A" denote the starting and
ending vertices of process A, respectively.) All expressions in the
equation are for the sample values in the case when both factors are at
level 1. The duration of the longest path from the starting vertex, o, to
the terminal vertex, r, is d(o, r). The duration of the longest path with
process A on it, going from o to r, is d(o, A') + abase + d(A", r). If process
A is on the longest path from o to r, then the total slack for A is 0.
Otherwise, the total slack for A is the difference in the durations of these
two paths.
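Equation (4.2) can be checked on a small hypothetical network: process A runs from vertex o to vertex m, process B from m to r, and a concurrent process C from o to r; the durations are arbitrary sample values.

```python
# Hypothetical AND network: A runs o -> m, B runs m -> r, C runs o -> r.
dur = {"A": 40, "B": 30, "C": 100}
arcs = {"o": [("m", "A"), ("r", "C")], "m": [("r", "B")], "r": []}

def d(u, v):
    # Longest-path duration from vertex u to vertex v (-inf if no path).
    if u == v:
        return 0
    best = float("-inf")
    for w, proc in arcs[u]:
        tail = d(w, v)
        if tail > float("-inf"):
            best = max(best, dur[proc] + tail)
    return best

# Equation (4.2): s(A, r) = d(o, r) - d(o, A') - a - d(A'', r),
# where A' = o and A'' = m in this network.
s_A_r = d("o", "r") - d("o", "o") - dur["A"] - d("m", "r")
print(d("o", "r"), s_A_r)  # 100 30
```

The longest path is the solitary process C (100), while the path through A lasts only 70, leaving A a total slack of 30.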
When both factors are at level 1, the reaction time is t11 = d(o, r). For
a real number x, let [x]+ = max{0, x}. When Factor A is at level i and
Factor B is at level 1, the reaction time is ti1 = t11 + [u − s(A, r)]+.
Prolonging process A by u may change the total slack for process B.
Similar to Equation (4.2), when Factor A is at level i and Factor B is at
level 1, the total slack for process B is

si1(B, r) = di1(o, r) − di1(o, B') − bbase − di1(B", r). (4.3)

Here the subscript i1 indicates the values are for the case when Factor A
is at level i and Factor B is at level 1. The total slack for process B when
process A is prolonged by u depends on whether processes A and B are
concurrent or sequential.

Concurrent processes

Suppose processes A and B are concurrent. Then process A is not on the
longest path from o to the starting vertex of B, nor on the longest path
from the ending vertex of B to r. The only term in the equation above
that may change when process A is prolonged by u is di1(o, r).
Substituting di1(o, r) = ti1 = d(o, r) + [u − s(A, r)]+, we find

si1(B, r) = d(o, r) + [u − s(A, r)]+ − d(o, Bʹ) − bbase − d(B", r)


= s(B, r) + [u − s(A, r)]+.

We are now in a position to consider the effect of prolonging process
B by amount v after process A has already been prolonged by amount u.
The reaction time when both processes are prolonged is

tij = ti1 + [v − si1(B, r)]+
= t11 + [u − s(A, r)]+ + [v − s(B, r) − [u − s(A, r)]+]+
= t11 + max{[u − s(A, r)]+, [v − s(B, r)]+},

(Schweickert, 1978). It follows that when Factor Α is at level 1 and
Factor Β is at level j,

t1j = t11 + [v − s(B, r)]+.

After a little algebra, the interaction contrast is

h = tij − ti1 − t1j + t11
= −min{[u − s(A, r)]+, [v − s(B, r)]+}. (4.4)

Because s(A, r) and s(B, r) are functions of abase, bbase, c1,..., cp, we can
write h as a function h(u, v, abase, bbase, c1,..., cp).
Let us consider the effect on h of changing the prolongations of A and
B. If u = v = 0, then h has value 0. For fixed values of abase, bbase, c1,...,
and cp, function h is monotonically decreasing in u and v.
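Equation (4.4) can be verified deterministically for a toy network in which A and B are the only two processes, lying on parallel paths from o to r (sample durations hypothetical):

```python
# Two concurrent processes A and B on parallel paths from o to r, so the
# reaction time is the maximum of the two path durations.
a, b = 120, 150

def rt(u, v):
    return max(a + u, b + v)

def pos(x):
    # [x]+ = max{0, x}
    return max(0, x)

s_A = rt(0, 0) - a   # total slack for A, both factors at level 1
s_B = rt(0, 0) - b   # total slack for B

def contrast(u, v):
    # Interaction contrast h = t_ij - t_i1 - t_1j + t_11.
    return rt(u, v) - rt(u, 0) - rt(0, v) + rt(0, 0)

for u, v in [(0, 0), (10, 10), (50, 20), (80, 90)]:
    h = contrast(u, v)
    assert h == -min(pos(u - s_A), pos(v - s_B))  # Equation (4.4)
print(contrast(80, 90))  # -50
```

The contrast is 0 until both prolongations overcome their slacks, then turns increasingly negative, exactly as the minus-min form predicts.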
From Equation (4.1),

\begin{pmatrix} U_{11} \\ V_{11} \end{pmatrix} \approx \begin{pmatrix} 0 \\ 0 \end{pmatrix}.

Because

\begin{pmatrix} 0 \\ 0 \end{pmatrix} \le_{st} \begin{pmatrix} U_{ij} \\ V_{ij} \end{pmatrix},

0 = EU11,V11(h(U11, V11, Abase11, Bbase11, C111,..., Cp11)|abase, bbase, c1,..., cp)
≥ EUij,Vij(h(Uij, Vij, Abaseij, Bbaseij, C1ij,..., Cpij)|abase, bbase, c1,..., cp).

(To avoid double subscripts in the expected value symbol, U11 is written
as U11, and so on.) On each side of the inequality the random variables
conditionalized on have the same joint distribution for all levels of the
factors, namely the joint distribution of Aˆbase , Bˆ base , Cˆ 1, , Cˆ p . By taking
expected values on both sides of the inequality over the random variables
conditionalized on, it follows that 0 ≥ E(h(U, V, Abase, Bbase, C1,..., Cp)).
Hence, for all i and j, the interaction contrast h is 0 or negative (see also
Schweickert, 1978).
Factors selectively influencing two concurrent processes by
increments will ordinarily have a negative interaction. Exactly additive
effects are impossible except under extraordinary conditions, such as a
factor having no effect (see Schweickert & Townsend, 1989; Townsend
& Ashby, 1983; Townsend and Schweickert, 1989).
Further, because for i' ≤ i and j' ≤ j,

\begin{pmatrix} U_{i'j'} \\ V_{i'j'} \end{pmatrix} \le_{st} \begin{pmatrix} U_{ij} \\ V_{ij} \end{pmatrix},

reasoning similar to that above shows that

E(h(Ui'j', Vi'j', Abasei'j', Bbasei'j', C1i'j',..., Cpi'j'))
≥ E(h(Uij, Vij, Abaseij, Bbaseij, C1ij,..., Cpij)).

That is, the interaction contrasts are monotonically decreasing in i and j.
Analogous reasoning starting with Equation (4.4) shows that

(AB)ij = E(h(Uij, Vij, Abaseij, Bbaseij, C1ij,..., Cpij))
≥ −E[min{[Uij − S(A, r)]+, [Vij − S(B, r)]+}]
≥ −E[[Vij − S(B, r)]+]
= −E[T1j] + E[T11]. (4.5)

That is, for every given value of j, (AB)ij is bounded from below.
Likewise, for every given value of i, (AB)ij is bounded from below.

Sequential processes

Suppose processes A and B are sequential, and process A precedes
process B. Recall from Equation (4.3) when Factor A is at level i and
Factor B is at level 1, the total slack for process B is

si1(B, r) = di1(o, r) – di1(o, B') – bbase – di1(B", r),

where the subscript i1 on a quantity indicates the value calculated from
the sample with Factor A at level i and Factor B at level 1.
We saw earlier that di1(o, r) = ti1 = d11(o, r) + [u − s11(A, r)]+. The
value of di1(o, B') is the duration of the longest path from o to B', the
starting vertex of process B. The value can be found as follows. The
only processes relevant for paths from o to B' are processes preceding B'.
Remove from the network all processes that do not precede B'. In the
remaining network, the terminal vertex is B', so in the remaining network
when both factors are at level 1 the total slack for process A is the slack
from A to B', that is, s11(A, B'). Hence, if process A is prolonged by u,
the duration of the longest path from o to B' increases by [u − s11(A,
B')]+. It follows that di1(o, B') = d11(o, B') + [u − s11(A, B')]+. Now
return to our original network and note that because process A precedes
process B, the duration bbase does not depend on the duration of process
A, and neither does the duration of the longest path between B" and r.
Then Equation (4.3) can be written

si1(B, r)=d11(o, r)+[u–s11(A, r)]+–d11(o, B')–[u–s11(A,B')]+–bbase– d11(B", r)


86 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

= s11(B, r) + [u – s11(A, r)]+ – [u – s11(A, B')]+.

Then

tij = ti1 + [v – si1(B, r)]+


= t11 + [u − s11(A, r)]+ + [v − s11(B, r) – [u – s11(A, r)]+ + [u – s11(A, B')]+]+.

After a little algebra, the interaction contrast is

h = [v–s11(B, r) – [u–s11(A, r)]+ + [u–s11(A, B')]+]+– [v–s11(B, r)]+ (4.6)

(Schweickert, 1978).
When u = v = 0, h is 0. When u and v are each so large as to
overcome all the slacks, the interaction contrast is h = s11(A, r) − s11(A,
B'), a constant not dependent on u and v. This constant is the coupled
slack between A and B,

k(A, B) = s(A, r) − s(A, B') = d(o, r) − d(o, B') − d(A", r) + d(A", B').
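Equation (4.6) and these limiting values can be checked directly. The sketch below evaluates the contrast for arbitrary illustrative slack values; it returns 0 when u = v = 0 and settles at the coupled slack k once u and v overcome all the slacks.

```python
def pos(x):
    """The positive-part operator [x]+ = max(x, 0)."""
    return max(x, 0.0)

def h_and(u, v, s_Ar, s_Br, s_AB):
    # Equation (4.6): s_Ar = s11(A, r), s_Br = s11(B, r), s_AB = s11(A, B').
    return pos(v - s_Br - pos(u - s_Ar) + pos(u - s_AB)) - pos(v - s_Br)

# Illustrative slacks with coupled slack k = s_Ar - s_AB = 50.
s_Ar, s_Br, s_AB = 80.0, 50.0, 30.0
print(h_and(0, 0, s_Ar, s_Br, s_AB))       # 0.0: no prolongation, no contrast
print(h_and(1e6, 1e6, s_Ar, s_Br, s_AB))   # 50.0: the coupled slack k
print(h_and(60, 70, s_Ar, s_Br, s_AB))     # 30.0: an intermediate case
```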

Because the interaction contrast is complicated, its values in various
conditions are given in Table 4.1.
As in the situation when processes A and B are concurrent, because
s(A, r) and s(B, r) are functions of a, b, c1,..., cp, we can write h as a
function h(u, v, a, b, c1,..., cp).
There are three possibilities to consider for the arrangement of
sequential processes A and B. The first is that A and B are not on
opposite sides of a Wheatstone bridge. In that case, only the top half of
Table 4.1 is relevant, and for fixed values of abase, bbase, c1,..., and cp,
function h is monotonically increasing in u and v. Reasoning as in the
situation when processes A and B are concurrent, from Equation (4.1),

(U11, V11) = (0, 0).

Because

(0, 0) ≤st (Uij, Vij),

0 = EU11,V11(h(U11, V11, Abase11, Bbase11, C111,...,Cp11)|abase, bbase, c1,...,cp)


≤ EUij,Vij(h(Uij, Vij, Abaseij, Bbaseij, C1ij,...,Cpij)|abase, bbase, c1,...,cp).

Then, by taking expected values with respect to the random variables


conditionalized on,

0 ≤ E(h(U, V, Abase, Bbase, C1,..., Cp)).

That is, the expected value of the interaction contrast is nonnegative for
all levels of the factors.
Further, because for i' ≤ i and j' ≤ j,

(Ui'j', Vi'j') ≤st (Uij, Vij),

similar reasoning shows

E(h(Ui'j', Vi'j', Abasei'j', Bbasei'j', C1i'j',..., Cpi'j'))
≤ E(h(Uij, Vij, Abaseij, Bbaseij, C1ij,..., Cpij)).

That is, the expected values of the interaction contrasts are


monotonically increasing in i and j.
The second possibility for sequential processes A and B is that they
are on opposite sides of a Wheatstone bridge, but there is no path from
the starting vertex of the network, o, to the terminal vertex of the
network, r, unless the path contains either A or B. We say A and B are on
opposite sides of an incomplete Wheatstone bridge. In that case, only the
lower half of Table 4.1 is relevant, and for fixed values of abase, bbase,
c1,..., and cp, function h is monotonically decreasing in u and v. For the
second possibility, reasoning as above leads to the conclusions that the
expected values of the interaction contrasts are nonpositive, and
monotonically decreasing as the factor levels increase.
The third possibility for sequential processes A and B is that they are
on opposite sides of a Wheatstone bridge, and there is a path from o to r
with neither A nor B on the path. We say A and B are on opposite sides
of a complete Wheatstone bridge. In that case, the top and bottom of
Table 4.1 are both relevant. This case is complicated because the
function h is monotonically increasing in u and v for some values of the
other arguments and monotonically decreasing for other values of these
arguments. Consequently, the expected value of the interaction
contrast may change signs when the factor levels change, and need not
change monotonically with the factor levels. The sign of the interaction
contrast is not informative for this complete Wheatstone bridge case, but
the structure can be revealed through the use of long prolongations, as
explained in Chapter 3.

OR networks

Effects of factors selectively influencing processes by increments in OR


networks are analogous to effects in AND networks. It is convenient to
consider the factors to be at their lowest levels when the process
durations are at their shortest, and to consider an increase in a factor level
as increasing the difficulty of a process, and hence increasing the
duration of the process. With this ordering of the levels, mean reaction
time is monotonically increasing with factor levels.

Table 4.1
Effects of Prolonging Sequential Processes A and B
in a Critical Path Network

k = s(A,r) – s(A,Bʹ ) = s(B,r) – s*(B,Aʺ) ≥ 0


u v ti1 – t11 t1j – t11
u ≤ s(A,Bʹ ) ≤ s(A,r) v ≤ s*(B,Aʺ) ≤ s(B,r) 0 0
s(A,Bʹ ) ≤ u ≤ s(A,r) v ≤ s*(B,Aʺ) ≤ s(B,r) 0 0
s(A,Bʹ ) ≤ s(A,r) ≤ u v ≤ s*(B,Aʺ) ≤ s(B,r) u – s(A,r) 0
u ≤ s(A,Bʹ )≤ s(A,r) s*(B,Aʺ) ≤ v ≤ s(B,r) 0 0
s(A,Bʹ ) ≤ u ≤ s(A,r) s*(B,Aʺ) ≤ v ≤ s(B,r) 0 0
s(A,Bʹ ) ≤ s(A,r )≤ u s*(B,Aʺ) ≤ v ≤ s(B,r) u – s(A,r) 0
u ≤ s(A,Bʹ ) ≤ s(A,r) s*(B,Aʺ) ≤ s(B,r) ≤ v 0 v – s(B,r)
s(A,Bʹ ) ≤ u ≤ s(A,r) s*(B,Aʺ) ≤ s(B,r) ≤ v 0 v – s(B,r)
s(A,Bʹ ) ≤ s(A,r) ≤ u s*(B,Aʺ) ≤ s(B,r) ≤ v u – s(A,r) v – s(B,r)

k = s(A,r) – s(A,Bʹ ) = s(B,r) – s*(B,Aʺ) ≤ 0


u v ti1 – t11 t1j – t11
u ≤ s(A,r) ≤ s(A,Bʹ ) v ≤ s(B,r) ≤ s*(B,Aʺ) 0 0
s(A,r) ≤ u ≤ s(A,Bʹ ) v ≤ s(B,r) ≤ s*(B,Aʺ) u – s(A,r) 0
s(A,r) ≤ s(A,Bʹ ) ≤ u v ≤ s(B,r) ≤ s*(B,Aʺ) u – s(A,r) 0
u ≤ s(A,r) ≤ s(A,Bʹ ) s(B,r) ≤ v ≤ s*(B,Aʺ) 0 v – s(B,r)
s(A,r) ≤ u ≤ s(A,Bʹ ) s(B,r) ≤ v ≤ s*(B,Aʺ) u – s(A,r) v – s(B,r)
s(A,r) ≤ s(A,Bʹ ) ≤ u s(B,r) ≤ v ≤ s*(B,Aʺ) u – s(A,r) v – s(B,r)
u ≤ s(A,r) ≤ s(A,Bʹ ) s(B,r) ≤ s*(B,Aʺ) ≤ v 0 v – s(B,r)
s(A,r) ≤ u ≤ s(A,Bʹ ) s(B,r) ≤ s*(B,Aʺ) ≤ v u – s(A,r) v – s(B,r)
s(A,r) ≤ s(A,Bʹ ) ≤ u s(B,r) ≤ s*(B,Aʺ) ≤ v u – s(A,r) v – s(B,r)



[Table 4.1 continued]

tij – t11 tij – ti1 – t1j + t11 bias b

0 0 –k
0 0 –k
u – s(A,r) 0 –k
0 0 –k
[v – s(B,r) + u – s(A,Bʹ )]+ [v – s(B,r) + u – s(A,Bʹ )]+ max{u – s(A,r) + v – s(B,r), –k}
u – s(A,r) + v – s*(B,Aʺ) v – s*(B,Aʺ) v – s(B,r)
v – s(B,r) 0 –k
v – s(B,r) + u – s(A,Bʹ ) u – s(A,Bʹ ) u – s(A,r)
u – s(A,r) + v – s(B,r) + k k 0

tij – t11 tij – ti1 – t1j + t11 bias b


0 0 –k
u – s(A,r) 0 –k
u – s(A,r) 0 –k
v – s(B,r) 0 –k
max{u–s(A,r), v–s(B,r)} max{s(A,r)–u, s(B,r)–v} max{s(A,Bʹ )–u, s*(B,Aʺ)–v}
u – s(A,r) s(B,r) – v s*(B,Aʺ) – v
v – s(B,r) 0 –k
v – s(B,r) s(A,r) – u s(A,Bʹ ) – u
u – s(A,r) + v – s(B,r) + k k 0

Note: Process A precedes B. Reaction time is t11 when neither is prolonged, ti1 when A is
prolonged by u, t1j when B is prolonged by v, tij when both are prolonged. Coupled slack
k(A, B) is denoted k.

Slack in an OR network can be defined in terms of a more intuitive


quantity, surplus. Recall that in an OR network the reaction time equals
the sum of the durations on the shortest path from the starting vertex of
the network to its ending vertex. Durations of shortest paths (geodesics)

between vertices are of interest, and d(p, q) denotes the duration of the
shortest path from vertex p to a vertex q which follows it. Suppose there
is a path from the ending vertex of a process A to a vertex p. The surplus
from A to p is the longest amount of time by which the duration of A can
be shortened without decreasing the duration of the shortest path from o
to p, that is, without A becoming an arc on the shortest path from o to p.
The surplus from A to p is

− d(o, p) + d(o, Aʹ) + d(A) + d(A", p).

The negative value of the surplus from A to p is the slack from A to p,

s(A, p) = d(o, p) − d(o, Aʹ) − d(A) − d(A", p).

The equation for slack has exactly the same form in OR and AND
networks; the difference is that the symbol d(p, q) is interpreted as the
duration of the shortest path between p and q in an OR network and of
the longest path in an AND network. The coupled slack between process
A and a process B following it has the same form in OR and AND
networks,

k(A, B) = s(A, r) − s(A, Bʹ).

Consider an OR network with all processes at the shortest durations


used in the experiment (the baseline network). Suppose process A is
prolonged by the nonnegative quantity u. The increase in reaction time
is u minus the surplus from A to r; in terms of slack the increase in
reaction time is [u + s(A, r)]+.
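The common form of the slack equation can be made concrete on a small hypothetical network. In the sketch below, process A is the arc from o to p, and two other routes run from o to r; computing d(p, q) with max gives the AND (longest path) slack, and with min the OR (shortest path) slack. All durations are made up for illustration.

```python
# Hypothetical network: process A is the arc o -> p (duration 5); p -> r
# completes A's route (total 9).  Two other o -> r routes take 13 and 5.
edges = [('o', 'p', 5), ('p', 'r', 4),
         ('o', 'q1', 7), ('q1', 'r', 6),
         ('o', 'q2', 3), ('q2', 'r', 2)]
order = ['o', 'p', 'q1', 'q2', 'r']        # a topological order of the DAG

def d(src, dst, best):
    """Duration of the longest (best=max) or shortest (best=min) path."""
    dur = {src: 0}
    for v in order:
        incoming = [dur[u] + w for u, x, w in edges if x == v and u in dur]
        if incoming and v != src:
            dur[v] = best(incoming)
    return dur[dst]

def slack_A(best):
    # s(A, r) = d(o, r) - d(o, A') - d(A) - d(A'', r), with A' = o, A'' = p.
    return d('o', 'r', best) - 0 - 5 - d('p', 'r', best)

print(slack_A(max))   # AND network:  4 (A can be prolonged 4 before critical)
print(slack_A(min))   # OR network:  -4 (negative slack, i.e., a surplus of 4)
```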
Suppose process A precedes process B. Now suppose the duration of
process A is increased by the nonnegative quantity u and the duration of
process B is increased by the nonnegative quantity v. The interaction
contrast is

h = − [v + s11(B, r) −[u + s11(A, r)]+ + [u + s11(A, Bʹ)]+]+ + [v + s11(B, r)]+,



(see Schweickert & Wang, 1993). Comparing this equation with
Equation (4.6) for AND networks shows that, for interpreting the
interaction contrast of reaction times, the only change from AND
networks is in the sign.
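This sign relation can be confirmed numerically: flipping the signs of the slacks in Equation (4.6) and negating the result reproduces the OR formula. The slack values below are arbitrary illustrative numbers (negative for the OR network, where slack is a negative surplus).

```python
def pos(x):
    return max(x, 0.0)

def h_and(u, v, sA, sB, sAB):
    # Equation (4.6): interaction contrast in an AND network.
    return pos(v - sB - pos(u - sA) + pos(u - sAB)) - pos(v - sB)

def h_or(u, v, sA, sB, sAB):
    # OR-network contrast (Schweickert & Wang, 1993), as given above.
    return -pos(v + sB - pos(u + sA) + pos(u + sAB)) + pos(v + sB)

# Negating Equation (4.6) after flipping the slack signs gives the OR form.
grid = range(0, 101, 20)
same = all(h_or(u, v, -30, -40, -10) == -h_and(u, v, 30, 40, 10)
           for u in grid for v in grid)
print(same)   # True
```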
If two factors selectively influence two concurrent processes by
increments, the interaction contrast is nonnegative. The interaction
contrast is monotonically increasing with the factor levels. For
sequential processes not on opposite sides of a Wheatstone bridge, the
interaction contrast is nonpositive. The interaction contrast is
monotonically decreasing with the factor levels. For sequential
processes on opposite sides of an incomplete Wheatstone bridge, the
interaction contrast is nonnegative. The interaction contrast is
monotonically increasing with the factor levels. For sequential processes
on opposite sides of a complete Wheatstone bridge, the interaction
contrast may be negative, zero or positive. For sequential processes in
all arrangements, as the levels of the factors increase, the interaction
contrast approaches a constant, namely the expected value of the coupled
slack. For details, see Schweickert and Wang (1993).
Chapter 5

Critical Path Models of Dual Tasks


and Locus of Slack Analysis

In point of fact the three actions of perceiving,


determining, and responding were sequential; but so
infinitesimal were the intervals of time between them
that they seemed simultaneous. Jack London, The Call
of the Wild.

When two stimuli are presented close together in time, and each requires
a response, the response time to the second is usually longer than if it
were presented alone (Telford, 1931). The delay is called the
psychological refractory period (PRP). The underlying mental
architecture has been probed by hundreds of experiments; we see fine
detail in places; elsewhere even the broad outlines are faint. At this point,
resolution is best at the stimulus end of the system, coarser at central
processing, and roughest at the response end, where processing seems
most complex. This chapter does not have space for everything we
know; it focuses on how we know it, in particular, on how structure is
revealed by selectively influencing processes. For more on attention see
Johnson and Proctor (2004).

Critical Path Network Models of Dual Tasks

Part of a subject’s preparation for a task is setting up the processing to be


used. There is no reason to expect the setup to always be the same
(Meyer & Kieras, 1997a, 1997b). Nonetheless, many models of


processing are various forms of critical path networks. Two important


models are not. First, in the Executive Process Interactive Control
(EPIC) model of Meyer and Kieras (1997a, 1997b) processing for each
task in a dual task is sequential. But scheduling of processes from
different tasks is flexible; for example, the subject has the option of
simultaneously executing the central processes of each task, or of doing
central processing for one task before that of the other. In many
situations, processing in the EPIC model can be represented with a
mixture of critical path networks. Predictions about factors selectively
influencing processes are often straightforward. For example, if two
processes are concurrent in every critical path network in the mixture,
then two factors selectively influencing them will simply behave as two
factors selectively influencing concurrent processes.
Second, in the queuing network model of Liu (1996), processing is
done at servers, which send output to other servers for further processing.
The network is not acyclic; a server late in the system can send output to
a server earlier in the system. Each server has a queue for temporarily
storing entities waiting for processing. The model is applicable widely;
for example, the queues are a form of working memory and the
proportion of time a server is busy is a predictor of the blood oxygen
level dependent (BOLD) signal (Wu & Liu, 2008). When applied to dual
tasks (Wu & Liu, 2008), the queuing network often takes the form of a
critical path network. Time at a server can be expressed as time in queue
plus time processing. For calculating this and other quantities, a queuing
network can often be represented as an Order-of-Processing diagram; see
Chapter 8. For the relation of the Queuing Network-Model Human
Processor to other architectures, see Liu, Feyen and Tsimhoni (2006).

Central limitations

For over 50 years, the predominant, but always challenged, hypothesis


has been that the psychological refractory period is due “to some phase
of the two reactions not being able to overlap” (Welford, 1952, p. 2, his
italics). Welford surmised that the refractory delay is caused by
limitations of a central mechanism, rather than by sensory or response

limitations. The central limitations when there are two stimuli are due
either to a single central channel that can process only one stimulus at a
time (Welford, 1952, 1967), or to capacity constraints which slow their
simultaneous processing (Broadbent, 1958). In Welford’s single channel
theory, if a second signal arrives while the central mechanism is busy,
that signal must wait, causing the refractory delay.
Welford (1952) proposed an easily tested version of a single channel
model with sequential central processing of the stimuli. He proposed
estimating the central processing time of the first stimulus by its entire
reaction time. Let RT1 denote the reaction time to stimulus 1 (implicitly
in this model, this time is the same for both the single and dual task
conditions). Let RT2 (single) denote the response time to stimulus 2
when presented alone, and RT2 (dual) denote the reaction time to
stimulus 2 presented in the dual task condition. Let I denote the
interstimulus interval (now called SOA). Then

RT2(dual) = RT2(single) + RT1 − I, if RT1 > I,


RT2(dual) = RT2(single), if RT1 < I, (5.1)

see Figure 5.1. The resulting model explains many aspects of the data,
but is wrong in several details (e.g., Ollman, 1968; for discussion, see
Luce, 1986, and Schweickert & Boggs, 1984).
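Equation (5.1) reduces to a single line of code. The durations below are made up for illustration; writing the delay as max(RT1 − I, 0) covers both cases of the equation.

```python
def rt2_dual(rt2_single, rt1, soa):
    """Welford's single-channel prediction, Equation (5.1): the second
    response is delayed by however much of RT1 is still unfinished
    when stimulus 2 arrives (all times in msec)."""
    return rt2_single + max(rt1 - soa, 0)

print(rt2_dual(300, 450, 100))   # 650: stimulus 2 must wait 350 msec
print(rt2_dual(300, 450, 600))   # 300: no overlap, no refractory delay
```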

Fig. 5.1. Welford’s Single Channel Model. Time is on the horizontal axis. Physical
onsets of the first and second stimuli are denoted s1 and s2, physical onsets of responses
to them are at times r1 and r2, respectively.

Davis (1957) formulated a model in which only part of the


processing of the first stimulus contributed to the refractory delay. The
model was ahead of its time; no good way was known to test it. In it,
each stimulus requires sensory, central, and motoric processes; these are
sequential for each stimulus. The central processing of the two stimuli
must be sequential, and the central processing of a stimulus is followed
by a refractory period during which no other central processing can take
place. Other processing is concurrent, so, for example, sensory
processing of stimulus 2 can proceed concurrently with central
processing of stimulus 1. Davis presented the model as a Gantt chart, see
Figure 2.1. The corresponding critical path network, with notation of
Pashler and Johnston (1989), is in Figure 5.2. Davis worked out a
formula for the magnitude of the refractory delay, taking slack into
account. To test the model, he used estimates from the literature of the
durations of individual processes. At that time (as now), such durations
were not known very accurately, and his tests were not seen as conclusive.
Later tests using factors to selectively influence processes are far more
satisfactory.

Fig. 5.2. Dual task model of Davis (1957), with current notation. Also called Response
Selection Bottleneck model and Single Central Bottleneck model. Stimulus 1 and
stimulus 2 are presented at s1 and s2. Sensory, central and motor preparation processes
for s1 are denoted A1, B1, and C1; analogous notation for s2. Central refractory times
(currently often called switching times) occur after central processing of Task 1 and of
Task 2, denoted SWa and SWb, respectively. Responses to s1 and s2 are at r1 and r2. For
a single trial, SWb is ignored, and SWa is denoted SW. If another dual task trial follows,
SWb must finish before new central processing starts.

Response limitations

An early alternative to the central processing bottleneck hypothesis is


that the source of the refractory delay is response processing of the two
tasks. In the response interdiction model of Keele (1973), two responses
cannot be initiated at the same time. In the response conflict model
(Reynolds, 1964; Herman & Kantowitz, 1970), two responses can be
prepared simultaneously, but the greater the conflict in the prepared
movements, the slower the simultaneous preparation. These two models
differ on the same issue as the models of Welford (1952) and Broadbent
(1958), but the issue is raised this time about processing later in the
system. Is the delay caused because only one of two needed processes
can be executed at a time, or because both are executed at the same time,
but more slowly than when executed alone? If there is a choice,
ordinarily it is faster overall to devote all the capacity to one process and
then devote all capacity to the other (see Conway, Maxwell & Miller,
1967; Miller, Ulrich & Rolke, 2009; Tombu & Jolicoeur, 2003). In other
words, sequential processing may be a better scheduling strategy than
concurrent processing. The explanation is straightforward, see the
Appendix.
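The point can be illustrated with a toy capacity model (our own simplification, not a specific published model): two jobs of fixed amounts of work are processed either one at a time at full capacity, or simultaneously at half capacity each until one finishes. The last response ends at the same time either way, but the first response ends sooner under sequential scheduling, so the mean finishing time is lower.

```python
def finish_times_sequential(a, b):
    # Full capacity to the shorter job first, then to the other.
    first, second = sorted((a, b))
    return first, first + second

def finish_times_shared(a, b):
    # Equal capacity sharing (rate 1/2 each) until the shorter job ends,
    # then full capacity to the remainder of the other job.
    first = 2 * min(a, b)
    return first, a + b

a, b = 150.0, 200.0
print(finish_times_sequential(a, b))   # (150.0, 350.0)
print(finish_times_shared(a, b))       # (300.0, 350.0)
```

The last job ends at a + b under either schedule, but the first response comes 150 msec earlier when capacity is not shared.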

Both central and response limitations

The refractory delay might be caused by both central and response


limitations. Double bottleneck models were proposed by Logan and
Burkell (1986) and by de Jong (1993) for the psychological refractory
period and by Schweickert (1978) for a dual Stroop-like task of
Greenwald (1972). The model of de Jong (1993) is in Figures 5.3 and
5.4. There are two sources of the refractory delay: Only one central
process can go on at a time and only one response can start at a time.
(Figures 5.3 and 5.4 combine the end of the first central process and the
start of the first response; logically these could be separate events.)
In the response interdiction model of Keele (1973), the only
constraint is that the two responses cannot start at the same time. As de
Jong (1993) points out, if that were the case a factor selectively
influencing sensory processing for Task 1 (A1) and a factor selectively

influencing central processing for Task 1 (B1) would each interact in


exactly the same way with SOA. But typically, factors selectively
influencing A1 have negative (“overadditive”) interactions with SOA,
while factors selectively influencing B1 are additive with SOA. Details
are below, see also de Jong’s 1993 paper. Rarely, if ever, is response
interdiction the only constraint.

Fig. 5.3. de Jong double bottleneck model with Refractory Interval between start of
response 1 and start of response 2. Other notation as in Figure 5.2. (Based on de Jong,
1993, Figure 1.)

Fig. 5.4. de Jong double bottleneck model as a critical path network. (Refractory
Interval RI is not the same as response 2, r2.) For a later discussion of Karlin &
Kestenbaum, assign durations A1 = 40, B1 = 150, C1 = 200, SW = 49, RI = 200, SOA =
90 or 1150, A2 = 110, B2 = 100 or 190, C2 = 200.
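One way to make these numbers concrete is to compute RT2 under an assumed topology. The scheduling rules below are our reading of Figure 5.4, not necessarily de Jong's exact network: B2 waits for the later of A2 and the end of B1 plus SW, and C2 waits for the later of B2 and RI after response 1 starts (taken here as the end of B1).

```python
def rt2(soa, a2=110, b2=100, c2=200, a1=40, b1=150, sw=49, ri=200):
    """RT2 (msec) under one reading of de Jong's (1993) double
    bottleneck, using the durations from the Figure 5.4 caption."""
    end_b1 = a1 + b1                      # first central process done
    end_a2 = soa + a2                     # second stimulus identified
    end_b2 = max(end_b1 + sw, end_a2) + b2   # central bottleneck
    start_c2 = max(end_b2, end_b1 + ri)      # response bottleneck
    return start_c2 + c2 - soa            # measured from s2 onset

print(rt2(soa=90))     # 500: both bottlenecks delay response 2
print(rt2(soa=1150))   # 410 = A2 + B2 + C2: no dual-task slowing
```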

Selective Influence of Processes in Dual Tasks

Viewed from a distance, evidence that certain experimental factors


selectively influence processes in critical path networks is substantial and

extensive. Up close, experiments rarely satisfy predictions perfectly, so


reasonable objections arise at every turn. At this time it is not clear
which prediction failures are signals about genuine problems and which
are noise. It is worth seeing what has been constructed so far, keeping in
mind that parts will need to be replaced.
A minimally satisfactory critical path network model of a dual task
requires at least seven processes: Perceptual, central, and response
processes for each task, and the SOA. There are 21 process pairs to test
for sequential versus concurrent arrangement. Experiments have
concentrated on a little over half a dozen of these, the manipulation most
often used being of the SOA. A list of factors influencing processes is in
Table 5.1. It is quite incomplete. For some pairs of processes so many
experiments have been done that it is impossible to discuss all; other
process pairs are unexplored.
We now survey evidence that factors selectively influence processes
by increments in dual tasks. The survey is not intended to be read as a
narrative. It is a catalog of empirical findings and sections can be read
independently. Patterns predicted are that mean reaction times increase
monotonically with factor levels, interaction contrasts all have the same
sign, and are themselves monotonic with factor levels. If two factors
selectively influence two processes by increments, resulting interaction
contrasts are negative for two concurrent processes and nonnegative for
two sequential processes not on opposite sides of a Wheatstone bridge.
For two processes on opposite sides of a Wheatstone bridge interaction
contrasts can be of any sign, and will be discussed case by case as they
occasionally arise. The various predictions are presented more fully in
Chapter 3. Here when we say a factor selectively influences a process
we mean selective influence by increments; that is, levels of a factor are
ordered, and the duration of a factor at level i + 1 is its duration at level i
plus a nonnegative random variable (see Chapter 3).

Sensory and Central Processes

We begin with the early, better understood part of the system.



Table 5.1
Some Factors Influencing Processes in Dual Tasks

Process Factor Reference


B1 number of Task 1 Karlin & Kestenbaum (1968)
alternatives Smith (1969)
Greenwald (1972)
Schweickert (1983a)
discriminability Johnston & McCann (2006)
memory set size Ehrenstein, et al. (1997)
target presence/absence Ehrenstein, et al. (1997)

MT1 hand movement distance Ulrich, et al. (2006)


A2 s2 intensity or contrast Pashler (1984)
Pashler & Johnston (1989)
Jentzsch, et al. (2007)
B2 target presence/absence Pashler (1984)
s2 repetition Pashler & Johnston (1989)
number of Task 2 Karlin & Kestenbaum (1968)
alternatives de Jong (1993)
Greenwald (1972)
Schweickert (1983a)
Van Selst & Jolicoeur (1997)
mental rotation degree Ruthruff, et al. (1995)
s2 discriminability Johnston & McCann (2006)
subtraction difficulty Ehrenstein, et al. (1997)
s2-r2 compatibility Jentzsch, et al. (2007)
C2 hand vs. foot de Jong (1993)
conflict resolution Stroop conflict Greenwald (1972)
Schweickert (1983a)
Notes: For SOA, see text. Evidence in most cases supports selective influence of the
factor on the process listed, but see text for details.

Central Processing in Task 1 and SOA (B1,SOA)

In most experiments the second stimulus is presented abruptly, and quite


likely interrupts processing of the first stimulus. It may be less time
consuming to delay the start of a process than to have it interrupted, so
central processing for Task 1 may wait until stimulus 2 is presented, that
is, follow the SOA. However, data indicate that central processing of
Task 1 was concurrent with SOA in experiments by Karlin and

Kestenbaum (1968) and by M. C. Smith (1969).


Their experiments also test the expectancy hypothesis: If the first
stimulus is followed rapidly by the second, the subject is not expecting
the second, and is not adequately prepared. Karlin and Kestenbaum
(1968) and Smith (1969) reasoned that if inadequate expectation of the
second stimulus were the only cause of the refractory delay, then the
delay would be sensitive to the values of the interstimulus interval, but
not sensitive to the time for central processing of the first stimulus.
In their experiments, central processing of Task 1 was prolonged by
increasing the number of alternatives for the first stimulus, thus
increasing the amount of information in the Task 1 decision. Reaction
time to the second stimulus increased as the number of alternatives for
the first stimulus increased. This shows that expectancy is not the sole
source of the refractory delay. Further, the number of alternatives for the
first stimulus and the interstimulus interval interacted in the pattern of
factors selectively influencing concurrent processes. Here are details.

Karlin and Kestenbaum


In the influential experiment of Karlin and Kestenbaum (1968), a
warning tone was followed, after a uniformly distributed delay, by the
first imperative stimulus, a visually presented digit. The subject
identified the digit by pressing a button with fingers of the left hand. The
digit was followed, after a uniformly distributed interstimulus interval,
by the second imperative stimulus, a tone. (More recently,
“interstimulus interval” has been replaced by the more precise term
“stimulus onset asynchrony.”) The subject identified the tone by
pressing a button with fingers of the right hand.
The number of alternatives for the first stimulus was 1, 2 or 5. There
were 12 interstimulus intervals from 90 to 1150 msec. The number of
alternatives for the second stimulus was 2 in the first phase of the
experiment. In later sessions with the same subjects, conditions with 1
alternative for the second stimulus and 1 or 2 alternatives for the second
stimulus were added. Results about the number of alternatives for the
second stimulus have been much discussed, but did not replicate in
experiments by Van Selst and Jolicoeur (1997). They will be considered

in the later section on central processing for Task 2.


Reaction time to the first stimulus (RT1) increased with the number of
alternatives for the first stimulus, but did not change with the
interstimulus interval (Karlin & Kestenbaum, 1968, Figure 4). This
shows that the interstimulus interval did not precede the Task 1 response,
and hence did not precede central processing for Task 1. (This part of
the experiment by Karlin and Kestenbaum was not included in the
replication by Van Selst and Jolicoeur, 1997.)
Reaction time to the second stimulus (RT2) is the time from second
stimulus presentation to its response. Mean RT2 increased as the number
of alternatives for Task 1 increased. Mean RT2 decreased monotonically
as the interstimulus interval increased. Any critical path network model
in which there is an arc of duration SOA from stimulus 1 (s1) to stimulus
2 (s2) makes this prediction. Let r1 and r2 denote the responses to s1
and s2, respectively, regardless of the order in which the responses are
made. Let t2(SOA) be the time from s1 to r2 when the Stimulus Onset
Asynchrony is equal to SOA. Then t2(SOA) = t2(0) + [SOA − s(SOA,
r2)]+. Now let RT2(SOA) be the time from s2 to r2, i.e., the reaction time
to the second stimulus. Then RT2(SOA) = t2(SOA) − SOA = t2(0) +
[SOA − s(SOA, r2)]+ − SOA. Hence, RT2(SOA) is a monotonically
decreasing function of SOA. This derivation also shows that the time
from s1 to r2, denoted t2, increases monotonically with SOA; this
prediction is easily verified, although not directly seen, in the data of
Karlin and Kestenbaum (1968). It is sometimes claimed that the
expected value of RT2 is predicted to decline linearly as SOA increases,
with a slope of -1. This is approximately correct, but not exactly,
because the term [SOA − s(SOA, r2)]+ is nonlinear.
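The piecewise form of the prediction is easy to display numerically. Here t2(0) and the slack s(SOA, r2) are treated as fixed, made-up constants:

```python
def rt2_soa(soa, t2_zero=600, slack=250):
    # RT2(SOA) = t2(0) + [SOA - s(SOA, r2)]+ - SOA, slack held fixed.
    return t2_zero + max(soa - slack, 0) - soa

soas = [0, 100, 200, 300, 400, 500]
rts = [rt2_soa(s) for s in soas]
print(rts)                                        # [600, 500, 400, 350, 350, 350]
print(all(x >= y for x, y in zip(rts, rts[1:])))  # True: monotone decreasing
```

RT2 falls with slope −1 until the SOA exhausts the slack, then is constant, which is why the observed decline is only approximately linear with slope −1.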
Reaction times to the second stimulus give clear evidence of slack for
the interstimulus interval. Karlin and Kestenbaum (1968) found that
when the interstimulus interval increased from 90 to 190, an increase of
100 msec, the time from s1 to r2 increased by only 43 msec. The
remainder of the 100 msec, 57 msec, is slack. Although this is a rough
estimate, the small amount is evidence against Welford’s model
(Equation (5.1)) in which the slack is predicted to be equal to RT1, 234
msec.

Finally, if the number of alternatives for Task 1 selectively influences


a process concurrent with the interstimulus interval, the corresponding
interaction contrasts are predicted to be negative. This can be verified
informally; in Figure 1 of Karlin and Kestenbaum (1968), the effect of
the number of s1 alternatives decreases monotonically as SOA increases.
The negative interaction contrasts might indicate that the prolonged
processes are on opposite sides of the central arc in a Wheatstone bridge.
A Wheatstone bridge is not possible. In it, either (1) the decision about
the first stimulus precedes the ISI, or (2) the ISI precedes the decision.
Order (1) is logically impossible, because the decision about stimulus 1
cannot precede the presentation of stimulus 1, which starts the ISI. Order
(2) requires that stimulus 2 is always presented before the decision about
stimulus 1 is made. This cannot happen on all trials, because in some
conditions the interstimulus interval is longer than RT1; that is, the
second stimulus is presented after the Task 1 response is made.
Despite problems, the data clearly indicate that central processing of
stimulus 1 is concurrent with the interstimulus interval. The conclusion
is confirmed by the experiment by Smith (1969).

M. C. Smith
In Smith’s (1969) experiment, the first stimulus was a digit from 1 to 4
on either a red or green background. The subject identified the stimulus
by pressing a button; each finger indicated a different digit, fingers on
the left hand indicated the red background, those on the right indicated
the green. The second stimulus was a 1 or a 2 on a grey background; the
subject identified the digit vocally. There were two factors. The
interstimulus interval was varied as was the number of alternatives for
the first stimulus.
Let RT1 and RT2 be the reaction times to the first and second stimuli,
respectively. The data, from Figures 1 and 2 of Smith (1969), are
presented in Table 5.2. The response time to the second stimulus
increases with the number of alternatives for the first, as found by Karlin
and Kestenbaum (1968). Data again support the hypothesis that
inadequate expectancy of the second stimulus is not the only cause of the
refractory delay.
104 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

The first two rows in Table 5.2 for RT2 + ISI again provide a clear
demonstration of slack. When the interstimulus interval was increased
by 100 msec, from 50 to 150 msec, the increase in response time RT2
was much less than 100 msec. Specifically, the increase in RT2 was 15
msec when there were 2 alternatives, and only a few msec when there were 4
or 8 alternatives. Most of the 100 msec increase in the interstimulus
interval is expended in overcoming the slack.

Table 5.2
Reaction Times from Smith (1969)

RT1            Stimulus 1 Alternatives
ISI              2      4      8
 50            480    594    666
150            459    582    639
300            444    561    635
500            459    571    628

RT2            Stimulus 1 Alternatives
ISI              2      4      8
 50            615    716    785
150            530    619    686
300            450    509    589
500            413    429    463

RT2 + ISI      Stimulus 1 Alternatives
ISI              2      4      8
 50            665    766    835
150            680    769    836
300            750    809    889
500            913    929    963

Interaction Contrasts from Smith (1969)

RT2            Number of S1 Alternatives
ISI              2      4      8
 50              -      -      -
150              -    -12    -14
300              -    -42    -31
500              -    -85   -120
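
The contrasts in the table can be recomputed directly from the RT2 values above. The short Python check below (data transcribed from Table 5.2) confirms that every contrast is negative, the pattern expected when the prolonged processes are concurrent.

```python
# RT2 (msec) from Table 5.2 (Smith, 1969); keys are ISI, then the
# number of stimulus 1 alternatives.
RT2 = {
    50:  {2: 615, 4: 716, 8: 785},
    150: {2: 530, 4: 619, 8: 686},
    300: {2: 450, 4: 509, 8: 589},
    500: {2: 413, 4: 429, 8: 463},
}

def contrast(isi, alt, base_isi=50, base_alt=2):
    """Interaction contrast relative to the baseline cell (ISI 50, 2 alternatives)."""
    return (RT2[isi][alt] - RT2[isi][base_alt]
            - RT2[base_isi][alt] + RT2[base_isi][base_alt])

contrasts = {(isi, alt): contrast(isi, alt)
             for isi in (150, 300, 500) for alt in (4, 8)}
print(contrasts)  # all values negative, matching the table
```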

The observed reaction times to the first stimulus are monotonically increasing across the columns, as would be expected if increasing the
number of alternatives prolonged a process in a critical path network.
The interstimulus interval did not have a significant effect on RT1. For
RT2, the reaction times are monotonic in the rows and columns, again as
predicted if increasing the number of alternatives for s1 and increasing
the ISI were prolonging different processes in a critical path network.
Further, as predicted, the interaction contrasts are also monotonic, with
one exception involving the −31 at ISI 300 (Table 5.2). All
interaction contrasts are predicted to be negative, not just those in Table
5.2, and all are, except those involving the cell with −31 msec.
negative interactions indicate that the prolonged processes are either
concurrent or on opposite sides of a Wheatstone bridge. A Wheatstone
bridge is implausible for the data of Smith (1969), because it is
impossible for the similar experiment of Karlin and Kestenbaum (1968).
A more likely representation is in the critical path network of Figure
5.2 (Davis, 1957; Pashler & Johnston, 1989). Sensory processing A1 of
s1, and the interstimulus interval (“process” SOA), are initiated
immediately after s1 is presented. Central processing B1 of Task 1 starts
after A1, and is sensitive to the number of s1 alternatives. After process
B1 completes, both the response process C1 for stimulus 1 and the
sequencing process SW (perhaps a dummy process of duration 0) are
initiated. As soon as process C1 is completed, the response r1 to the first
stimulus can be made. After both processes SOA and SW are completed,
central processing B2 for stimulus 2 can be initiated. It is followed by
response processing C2 for Task 2. As soon as process C2 is completed,
the response r2 to the second stimulus is made. Later, in the chapter on
Order-of-Processing Diagrams, the model in Figure 5.2 is fit to the data
of Smith. The pioneering work of Karlin and Kestenbaum (1968) and
Smith (1969) establishes that the process prolonged by manipulating the
ISI and the process prolonged by increasing the number of s1 alternatives
(process B1) are concurrent.
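The scheduling logic of the network in Figure 5.2 can be sketched in a few lines of Python. The process durations below are hypothetical round numbers chosen only to illustrate slack; they are not estimates from any data set.

```python
# A minimal sketch of the critical path network of Figure 5.2.
# All durations (msec) are hypothetical illustration values.
def model_rts(soa, A1=70, B1=150, C1=60, SW=0, A2=70, B2=120, C2=60):
    end_B1 = A1 + B1                       # central Task 1 processing done
    rt1 = end_B1 + C1                      # r1 is made after response process C1
    start_B2 = max(soa + A2, end_B1 + SW)  # B2 waits for both A2 and SW
    rt2 = start_B2 + B2 + C2 - soa         # RT2 measured from s2 onset
    return rt1, rt2
```

With these durations, a 100 msec increase in SOA from 50 to 150 msec is fully absorbed by slack (RT2 falls by 100 msec while RT1 is unchanged), and once the SOA exceeds the available slack, RT2 stops changing altogether.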

Later work on B1 and SOA


Information theory gives a reason why an increase in the number of
alternatives of a stimulus would increase response time; more alternatives require processing more bits. Johnston and McCann (2006)
examined the demarcation between perceptual processes and more
abstract central processes such as processing bits. Could a difficult
perceptual manipulation influence central processing? The major
manipulation of Johnston and McCann (2006) was of stimulus
discriminability, intended to selectively influence a process they called
stimulus classification. Their Experiment 1 used two different first tasks,
one auditory, one visual. In Auditory Task 1, a reference tone was
presented, followed by a test tone to be judged as higher or lower than a
reference tone. In Visual Task 1, a reference trapezoid was followed by
a test trapezoid (the trapezoids were shaped like runways). The task was
to judge whether the second trapezoid appeared angled up or angled
down with respect to the first. In each Task 1, the judgment was made
hard or easy by manipulating discriminability. For each Task 1, the
stimulus for Task 2 was a circle with a cross inside. The task was to
judge whether the cross was to the left or right of center. For Task 2, the
judgment was made hard or easy by manipulating discriminability. A
third factor was the SOA.
Regardless of Stimulus 1 modality, the effect of Task 1
discriminability produced the same pattern. The effect of Task 1
discriminability was the same at each SOA for RT1, so processing of
Task 1 discriminability did not wait for presentation of s2. For RT2 the
effect of increasing Task 1 discriminability decreased monotonically as
SOA increased, a negative interaction with SOA (Johnston & McCann,
2006, Figure 2). The form of the interaction indicates that the SOA and
the process selectively influenced by Task 1 discriminability are
concurrent. Johnston and McCann (2006) say the classification
judgments are central processes and the Task 1 classification judgment is
the last Task 1 process to precede the central processing of Task 2.
We discuss two related data sets in later sections. Task 2
discriminability was also manipulated by Johnston and McCann (2006);
its effects are described in a later section on B2. Jentzsch, Leuthold and
Ulrich (2007) found evidence against a delay of central Task 1
processing until s2 is presented; their experiment is described in a later
section where it more naturally fits.

SOA and Task 2 Sensory Processing, <SOA, A2>

Several names have been given to the model in Figure 5.2, including the
Single Channel, the Response Selection Bottleneck, the Standard
Bottleneck, and the Central Bottleneck model. As often happens, slight
variations of a model are given the same name, and exactly the same
model is given different names. We usually call the model in Figure 5.2
the Single Central Bottleneck model, to emphasize that it has one
bottleneck, whose function, whatever it may be, is central. Pashler and
Johnston (1989) tested the model through its predictions about
interactions. Their experiment will be described, followed by the
predictions.
In Experiment 1 Pashler and Johnston (1989) presented a high or low
pitched tone, followed after a Stimulus Onset Asynchrony (SOA) by a
letter. The subject pressed a button with the left hand to identify the
tone, and a button with the right hand to identify the letter. Intensity of
the second stimulus was varied, as was the SOA. Further, the second
stimulus was either the same as on the previous trial (a repetition) or not.
Response times RT1 to the first stimulus and RT2 to the second stimulus
in the double stimulation task were recorded. In separate blocks of trials,
response time RT2 to the second stimulus was recorded in the same
paradigm, but with no response required to the first stimulus.
Pashler and Johnston (1989) represented the model as a Gantt chart,
see Figure 2.1; the corresponding critical path network is in Figure 5.2.
For each stimulus the three processes, A, B, and C are for perceptual,
central, and response preparation, respectively. Process B is sometimes
called the bottleneck process. Stimulus onset asynchrony is denoted
SOA, and SW represents the switching of attention from the central
processing of Task 1 to the central processing of Task 2. The model is
essentially that of Davis (1957); process SW was called by him central
refractoriness.
One prediction is about interactions of factors with the special factor
of single vs. dual task responding. Pashler and Johnston (1989) followed
the reasoning of Pashler (1984), that if a factor selectively influences a process during or following the central processing of s2, the resulting
increase in response time to s2 would be the same whether s2 is
presented alone, or in a dual task following s1. In other words, the factor
would have additive effects with the factor of single vs. dual task
responding. On the other hand, if a factor selectively influences process
A2, this factor would have a negative interaction with the factor of single
vs. dual task responding. The predictions were satisfied. Pashler and
Johnston (1989) concluded that intensity influenced the duration of
process A2 and repetition has an effect after A2 is completed, probably in
B2, as in Figure 5.2.

Locus of Slack Analysis


Another prediction is about interactions of factors with SOA in dual task
responding. Pashler and Johnston (1989) noted that the model predicts
SOA would interact differently with a factor selectively influencing a
process of Task 2, depending on whether the selectively influenced
process is sensory or central. If the process is sensory, interactions would
be positive. Positive interactions with SOA are sometimes called
“underadditive”. If the process is central, interactions would be 0, i.e.,
there would be additivity. See Schweickert (1978) and Chapter 3 for the
basis for these predictions.
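These two sign predictions can be illustrated with the same style of sketch used for Figure 5.2, again with hypothetical durations chosen only for illustration.

```python
# Locus-of-slack sketch: RT2 in the Figure 5.2 network, where B2 starts
# only when both sensory process A2 and Task 1 central processing (plus
# the switch SW) are finished.  All durations (msec) are hypothetical.
def rt2(soa, A2=70, B2=120, C2=60, end_B1=220, SW=0):
    start_B2 = max(soa + A2, end_B1 + SW)
    return start_B2 + B2 + C2 - soa

dA = 50  # hypothetical factor effect, prolonging one process by 50 msec

# Prolonging sensory process A2: the effect is absorbed by slack at short SOA.
effect_A2 = {soa: rt2(soa, A2=70 + dA) - rt2(soa) for soa in (50, 500)}

# Prolonging central process B2: the same effect at every SOA (additivity).
effect_B2 = {soa: rt2(soa, B2=120 + dA) - rt2(soa) for soa in (50, 500)}
print(effect_A2, effect_B2)
```

The A2 effect grows from 0 msec at SOA 50 to the full 50 msec at SOA 500, a positive (underadditive) interaction with SOA, while the B2 effect is 50 msec at both SOAs, i.e., additive with SOA.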
Table 5.3 gives mean response times along with mean error rates.
Factors are SOA, s2 intensity, and s2 repetition. No factor had a
significant effect on RT1. For RT2, SOA had little effect when response
2 was made alone and a large effect when response 2 was made in the
dual task. In the dual task, response time 2 increased monotonically with
s2 intensity and s2 repetition, as expected. In the dual task, response
time 2 decreased with increasing SOA; it is easily checked (by adding
SOA) that response time 2 measured from s1 onset increased
monotonically as SOA increases, with small exceptions. It is also easily
checked that, as predicted, interaction contrasts of SOA and s2 intensity
were all positive (underadditive). Also, as predicted, s2 intensity and
SOA each had additive effects with repetition of s2, supporting the
hypothesis that s2 repetition selectively influences the central Task 2
process. Numerically, the interaction contrasts involving repetition are positive, but small, consistent with the interpretation of Pashler and
Johnston. The later chapter on Order of Processing Diagrams shows that
gamma distributions for process durations give a good fit of the
Response Selection Bottleneck model (Figure 5.2) to the data of Pashler
and Johnston (1989).
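The underadditive pattern for SOA and s2 intensity can be checked from the dual task RT2 means in Table 5.3. The snippet below averages over the repetition factor before forming the contrasts:

```python
# Dual task RT2 means (msec) from Table 5.3, Pashler and Johnston (1989),
# averaged here over the s2 repetition factor.
RT2 = {  # RT2[soa][intensity]
    50:  {'high': (848 + 866) / 2, 'low': (852 + 871) / 2},
    100: {'high': (788 + 814) / 2, 'low': (789 + 835) / 2},
    400: {'high': (597 + 610) / 2, 'low': (625 + 651) / 2},
}

def intensity_effect(soa):
    """Effect of lowering s2 intensity at a given SOA."""
    return RT2[soa]['low'] - RT2[soa]['high']

# Interaction contrasts of SOA and s2 intensity, relative to SOA 50:
contrasts = {soa: intensity_effect(soa) - intensity_effect(50)
             for soa in (100, 400)}
print(contrasts)  # both positive: the intensity effect grows with SOA
```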

Table 5.3
Reaction Time Means (Standard Deviations) in Milliseconds
Pashler and Johnston (1989) Experiment 1

Dual Task RT1
                      s2 Intensity
SOA   s2 Repetition   High   Low
50 Yes 588 (139) 591 (152)
50 No 593 (140) 591 (145)
100 Yes 577 (125) 570 (133)
100 No 588 (123) 595 (137)
400 Yes 589 (135) 597 (140)
400 No 590 (127) 595 (128)
Dual Task RT2
s2 Intensity
SOA s2 Repetition High Low
50 Yes 848 (167) 852 (199)
50 No 866 (178) 871 (179)
100 Yes 788 (172) 789 (181)
100 No 814 (179) 835 (175)
400 Yes 597 (152) 625 (162)
400 No 610 (162) 651 (158)
Single Task RT2
s2 Intensity
SOA s2 Repetition High Low
50 Yes 487 (68) 547 (69)
50 No 524 (84) 580 (77)
100 Yes 483 (77) 538 (93)
100 No 503 (60) 557 (73)
400 Yes 479 (53) 536 (72)
400 No 495 (51) 536 (58)

[Table 5.3 continued]

Dual Task RT2 + SOA
                      s2 Intensity
SOA   s2 Repetition   High    Low
 50   Yes              898    902
 50   No               916    921
100   Yes              888    889
100   No               914    935
400   Yes              997   1025
400   No              1010   1051

Mean Percent (%) Error Rates (Standard Deviations)

Dual Task RT1
                      s2 Intensity
SOA   s2 Repetition   High   Low
50 Yes 4.78 (8.16) 4.29 (4.83)
50 No 4.90 (4.93) 2.96 (3.90)
100 Yes 3.18 (7.49) 3.45 (4.06)
100 No 2.98 (3.96) 3.84 (5.23)
400 Yes 2.44 (4.04) 2.47 (5.14)
400 No 2.28 (3.31) 2.90 (3.76)
Dual Task RT2
s2 Intensity
SOA s2 Repetition High Low
50 Yes 3.80 (6.36) 3.67 (5.06)
50 No 5.98 (4.88) 5.85 (8.55)
100 Yes 2.75 (5.51) 3.30 (4.57)
100 No 5.24 (5.83) 6.15 (6.31)
400 Yes 6.22 (6.99) 7.96 (10.09)
400 No 8.89 (9.45) 9.95 (9.00)
Single Task RT2
                      s2 Intensity
SOA   s2 Repetition   High          Low
 50   Yes             2.61 (5.43)   3.42 (9.10)
 50   No              3.12 (5.81)   4.45 (7.64)
100   Yes             1.35 (3.94)   3.59 (8.90)
100   No              3.47 (10.86)  3.58 (4.85)
400   Yes             2.89 (7.83)   5.05 (11.61)
400   No              2.92 (5.12)   3.85 (5.96)
Note: H. Pashler (personal communication, April 15, 1990).

Results of Pashler and Johnston (1989) establish that Task 2 sensory processing is sequential with the SOA, and both precede Task 2 central
processing. Suppose the Response Selection Bottleneck model (Single
Central Bottleneck model) in Figure 5.2 is true. Then for RT2 (1) a
factor having positive (underadditive) interaction contrasts with SOA
selectively influences a sensory, pre-bottleneck Task 2 process, and (2) a
factor having additive effects with SOA selectively influences a later,
post-bottleneck process. Analysis of effects of factors selectively
influencing processes in the Response Selection Bottleneck model is
called Locus of Slack Analysis by McCann and Johnston (1992).
An experiment by Jentzsch, et al. (2007) varied both SOA and
contrast of s2. The result was positive (underadditive) interaction
contrasts, as expected if SOA and s2 sensory processing are sequential
(as logically they must be). It is possible that subjects delay sensory
processing of s2 until its output is needed, but electrophysiological
recordings indicated that there is no such delay. More details are in a
later section on response processing.

SOA and Task 2 Central Processing, <SOA, B2>

Pashler & Johnston (1989) found additive effects of SOA and s2 repetition, discussed above, supporting the hypotheses that SOA is
sequential with Task 2 central processing, which is selectively influenced
by s2 repetition. Logically, the SOA must precede central Task 2
processing, and considerable evidence supports this.

Number of Task 2 alternatives


The first phase of the 1968 experiment of Karlin and Kestenbaum,
discussed earlier, used two alternatives for s2. In a later phase, for
comparison, sessions were added with only one alternative for s2. The
effects have generated much discussion, but discussion here is brief,
because different results were found in a replication by Van Selst and
Jolicoeur (1997).
In the later phase of the experiment of Karlin and Kestenbaum
(1968), the number of alternatives for s1 was either 1 or 2. Combining
results from the later and earlier phases of the experiment allows
comparison of conditions (1, 1), (1, 2), (2, 1) and (2, 2), where the
number of alternatives for s1 is listed first, followed by those for s2.
From Figure 4 of Karlin and Kestenbaum, it is clear that the
interstimulus interval had little or no effect on RT1, so the interstimulus
interval does not precede r1. It is also clear that at every interstimulus
interval, reaction time to s1 was faster with one alternative for s2. This
could indicate a violation of selective influence. At long interstimulus
intervals, the second stimulus is usually presented after the response to
the first stimulus is made, so the results could indicate that the number of
s2 alternatives has effects in two places, in processing of Task 1 and in
processing of Task 2, a violation of selective influence. But the results
could simply be due to subjects having more practice by the time they
started the later sessions with one alternative for s2.
Mean reaction times to the second stimulus are as expected if the
interstimulus interval and central Task 2 processing are sequential. First,
RT2 decreases monotonically as ISI increases. Second, if the ISI value is
added to RT2 (to obtain the time from s1 to r2), resulting times increase
monotonically with ISI. Finally, interaction contrasts are all nonnegative,
except for a few negligibly less than 0.
Clearly, there is slack for the interstimulus interval, for example,
when the ISI is increased by 100 msec, from 90 to 190 msec, RT2
increases by only 2 msec in the (1, 2) condition. There is also slack for
the number of s2 alternatives. When the ISI is 1150 msec, changing the
number of s2 alternatives from 1 to 2 increases RT2 by 90 msec, but
when the ISI is 90 msec the increase is only 39 msec. Such slack has
puzzled investigators; it requires a path from s1 presentation to r2 that
does not include process B2. One possibility is that B2 is concurrent
with B1, so that A1, B1, SW, C2 form a path from s1 to r2 (a
modification of Figure 5.2). Another possibility is a path provided by a
second bottleneck toward the response end of the system, as in de Jong’s
(1993) model (Figures 5.3, 5.4).
A numerical example may be helpful. Using the hypothetical process
durations in Figure 5.4, the effect of prolonging process B2 depends on
the SOA. Suppose SOA = 90. When B2 = 100, the critical path to r2 is
the “top” one, A1 + B1 + R1 + C2 = 590. When B2 = 190, the critical path is the “bridge” path, A1 + B1 + SW + B2 + C2 = 629. The increase
in RT2 is 39. Now suppose SOA = 1150. The critical path to r2 is
always the bottom one, SOA + A2 + B2 + C2. Increasing B2 from 100 to
190 produces an increase in RT2 of 90.
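The worked example can be reproduced with a longest-path computation. The individual durations below are hypothetical values chosen only so that the path totals match those quoted above (590 and 629); they are not necessarily the values printed in Figure 5.4.

```python
# Double-bottleneck sketch in the spirit of de Jong's (1993) Figure 5.4.
# Hypothetical durations (msec), chosen to reproduce the quoted path sums.
A1, B1, R1, C2 = 70, 200, 160, 160
SW, A2 = 9, 70

def rt2_from_s1(soa, B2):
    end_B1 = A1 + B1
    start_B2 = max(end_B1 + SW, soa + A2)       # central bottleneck
    start_C2 = max(start_B2 + B2, end_B1 + R1)  # response bottleneck
    return start_C2 + C2                        # time of r2 from s1 onset

print(rt2_from_s1(90, 100), rt2_from_s1(90, 190))       # 590, 629
print(rt2_from_s1(1150, 190) - rt2_from_s1(1150, 100))  # 90
```

At SOA 90 the 90 msec increase in B2 lengthens RT2 by only 39 msec, because the response-bottleneck path bypasses B2; at SOA 1150 the bottom path is always critical and the full 90 msec appears in RT2.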
Nevertheless, Van Selst and Jolicoeur (1997) did not replicate the
interaction found by Karlin and Kestenbaum (1968). Van Selst and
Jolicoeur (1997) followed as closely as possible the procedure of Karlin and
Kestenbaum, given that the latter did not report all details of their
method. In particular, subjects in Experiment 2 of Van Selst and
Jolicoeur had long practice, comparable to that of Karlin and
Kestenbaum. Van Selst and Jolicoeur (1997) manipulated SOA and the
number of s2 alternatives; the number of s2 alternatives alternated
between 1 and 2 from block to block. (They did not manipulate the
number of s1 alternatives, as Karlin and Kestenbaum did.) Van Selst and
Jolicoeur (1997) found additive effects of SOA and number of s2
alternatives in Experiment 2, and in Experiment 3, which was nearly
identical to Experiment 2, but with subjects coming for a single session
(see their Figures 6 and 7).
Van Selst and Jolicoeur (1997) say the interaction found by Karlin
and Kestenbaum (1968) can be explained in a few ways. One is that in
the single alternative condition, subjects were making relatively many
anticipatory responses. Another is that Karlin and Kestenbaum used a
biased procedure for eliminating outliers.
The positive interaction found by Karlin and Kestenbaum (1968) and
the additivity found by Van Selst and Jolicoeur (1997) are both
consistent with the SOA and a process selectively influenced by the
number of s2 alternatives being sequential processes. The additivity is
easy to explain with the Single Central Bottleneck model (Figure 5.2):
the SOA and central task 2 process are separated by a cut vertex. But the
positive interaction is not explained and suggests a double bottleneck.

Degree of mental rotation


With tone identification as Task 1 and mental rotation of a displayed
letter as Task 2, Ruthruff, Miller and Lachmann (1995) found additive
effects of SOA and degree of rotation. The additivity is explained by the
Single Central Bottleneck model (Figure 5.2), with the mental rotation
occurring at B2, the central task 2 process.

Stimulus 2 discriminability
Experiment 1 of Johnston and McCann (2006) was described in the
earlier section on Task 1 Central Processing and SOA. Task 2
discriminability and SOA were also investigated. Recall that Task 1 was
auditory or visual (in separate blocks). Regardless of Task 1 modality,
there was no effect of Task 2 discriminability on RT1. The effect on RT2
increased monotonically as SOA increased. That is, for RT2, Task 2
discriminability had a positive interaction with SOA. The positive
interaction indicates that Task 2 discriminability selectively influenced a
Task 2 process sequential with SOA; logically, this process, which they
call classification, follows the SOA. Details of the interaction lead
Johnston and McCann (2006) to say that ordinarily Task 2 classification
follows Task 1 central processing (i.e., is after the bottleneck), but for
some subjects on some trials it does not. Their statement is not
parsimonious, but not easily rejected because a mixture of networks that
produce additivity with other networks that produce a positive interaction
would yield a positive interaction.
In later experiments, Johnston and McCann (2006) presented a
rectangle as s2, to be judged as narrow or wide. Discriminability was easy
or hard. The effect of discriminability was additive with SOA for RT2.
Interpretation is straightforward. They concluded that discriminability
selectively influenced a Task 2 process which followed both the SOA
and central processing of Task 1. They say this classification process is
the earliest known Task 2 process to follow Task 1 central processing.

Number of Task 2 alternatives again, with response modality


The purpose of de Jong’s (1993) paper was to pursue the possibility of a
response bottleneck, suggested by Karlin and Kestenbaum’s (1968)
finding of slack for the number of s2 alternatives. de Jong compared
responses by hand and by foot. In his Experiment 2, the first task was
identification of a tone, either 1000 or 1035 Hz. Response 1 was a
button press with the left hand, one button for each tone. The stimulus
for the second task was B or D. Response 2 was a button press with the
right hand in one condition (hand-hand) and a pedal push with the left or
right foot in another condition (hand-foot). The number of response 2
alternatives was manipulated. In the one-alternative condition, for both
B and D the subject pressed the same button or pushed the same pedal.
In the two-alternative condition, each letter was assigned a different
button or pedal. The SOA was 25, 250 or 800 msec. Summary data are
in Figure 5.5 (de Jong’s, 1993, Fig. 4), and for RT2, in Table 5.4, as read
from that figure.

Fig. 5.5. Mean reaction times in Experiment 2 as a function of first-task response modality (hand-foot), second-task complexity (simple choice), and stimulus onset
asynchrony (SOA). Top panel: Reaction times in the first auditory task [RT1]. Bottom
panel: Reaction times in the second visual task [RT2]. From de Jong, R., 1993, Multiple
bottlenecks in overlapping task performance, Journal of Experimental Psychology:
Human Perception and Performance, 19, Fig. 4. Copyright 1993 American
Psychological Association. Reproduced with permission.

Table 5.4
Mean RT2 (msec) in de Jong’s (1993) Experiment 2

Task 2         Task 2       SOA (msec)
Alternatives   Responses     25    250    800
2              foot         647    520    424
2              hand         562    426    338
1              foot         520    399    271
1              hand         504    349    200

Let’s start with statistical details. For RT1, there was a significant
effect of the number of Task 2 alternatives. We will not pursue it
because it is small and does not affect our conclusions about RT2, but
perhaps the number of alternatives for Task 2 influenced processes in
Task 1 as well as in Task 2, a violation of selective influence.
For RT2, there were significant effects of SOA, number of Task 2
alternatives, and response modality. There was a significant positive
(underadditive) interaction between SOA and number of Task 2
alternatives and a significant positive interaction between number of
Task 2 alternatives and response modality. Numerically, there was a
positive (underadditive) interaction between SOA and response modality,
nearly significant (p < .06).
The three way interaction of SOA, number of Task 2 alternatives and
response modality was significant. In a separate ANOVA on only trials
with a foot response, SOA and number of Task 2 alternatives had
additive effects.
An important result for de Jong’s purpose is the significant positive
(underadditive) interaction between SOA and number of Task 2
alternatives; it supports the existence of a response bottleneck under his
experimental conditions. Compare the critical path networks in Figures
5.2 and 5.4. If as in Figure 5.2 there is a cut vertex between the SOA and
process B2 (selectively influenced by the number of Task 2 alternatives),
then the factors would always have additive effects. Instead, they have
additive effects only when Response 2 is made with the foot, when
presumably process C2 is long. Results are explained by the network in
Figure 5.4. When process C2 is short, process B2 has slack because of a
path to r2 that bypasses B2. When process C2 is long, process B2 no longer has slack. We now test these ideas quantitatively.
A reasonable hypothesis is that SOA, number of Task 2 alternatives,
and response modality selectively influence three sequential processes in
that order (namely, SOA, B2 and C2). If so, RT2 is predicted to change
monotonically with the factor levels. In Table 5.4, RT2 (1) decreases
when SOA increases, (2) increases when the number of Task 2
alternatives increases, and (3) increases when the response is by foot
rather than by hand. Further, by adding SOA to RT2 it is easy to check
that (4) time to r2 measured from the onset of s1 increases when SOA
increases (see Table 5.5). Monotonicity is satisfied.

Table 5.5
Mean RT2 + SOA (msec) in de Jong’s (1993) Experiment 2

Task 2         Task 2       SOA (msec)
Alternatives   Responses     25    250    800
2              foot         672    770   1224
2              hand         587    676   1138
1              foot         545    649   1071
1              hand         529    599   1000

Also for each pair of factors, the interaction contrasts are predicted to
be positive and monotonically increase with factor levels. We check
them with Table 5.5. The lowest level of each factor, used as baseline
for the following calculations, is SOA 25, one Task 2 alternative, and
hand response. (Rounding is done after, rather than before, calculations.)
For SOA and Number of Task 2 Alternatives, interaction contrasts
are, for SOA = 250,

676 − 599 − 587 + 529 = 19

and for SOA = 800,

1138 − 1000 − 587 + 529 = 80.


The interaction contrasts are positive and increase monotonically with SOA.
For SOA and response modality, interaction contrasts are, for SOA =
250,

649 − 599 − 545 + 529 = 33

and for SOA = 800,

1071 − 1000 − 545 + 529 = 55.

Again, the interaction contrasts are positive and increase monotonically with SOA.
Finally, for number of Task 2 alternatives and response modality, the
interaction contrast is

672 − 587 − 545 + 529 = 69.

All interactions are nonnegative, and those involving SOA increase with SOA, as predicted for factors selectively influencing sequential processes.
If we suppose each factor at its largest level has prolonged its process
sufficiently so that Equation 3.1 applies, then we have the following
estimates of coupled slack:

k(SOA, B2) = 80; k(SOA, C2) = 55; and k(B2, C2) = 69.

Then, if the processes are in the order SOA, B2, C2 we predict the
reaction time for prolonging all three processes from the effect of
prolonging each process individually and two of the three coupled slacks
(see the Appendix):

Δt(ΔSOA, ΔB2, ΔC2)
= Δt(ΔSOA, 0, 0) + k(SOA, B2) + Δt(0, ΔB2, 0) + k(B2, C2) + Δt(0, 0, ΔC2).

From Table 5.5 the left hand side is

1224 − 529 = 695.

The right hand side is

(1000 − 529) + 80 + (587 − 529) + 69 + (545 − 529) = 695.

The left and right sides are equal (within rounding). The coupled slack
not used in the above equation, k(SOA, C2), corresponds to the first and
last process in the sequence of three. No other order fits the data so well,
so we conclude B2 is the process in the middle.
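The coupled-slack bookkeeping above is easy to verify mechanically from Table 5.5:

```python
# RT2 + SOA (msec) from Table 5.5 (de Jong, 1993, Experiment 2),
# indexed by (SOA, number of Task 2 alternatives, response modality).
T = {
    (25, 1, 'hand'): 529,   (25, 1, 'foot'): 545,
    (25, 2, 'hand'): 587,   (25, 2, 'foot'): 672,
    (250, 1, 'hand'): 599,  (250, 1, 'foot'): 649,
    (250, 2, 'hand'): 676,  (250, 2, 'foot'): 770,
    (800, 1, 'hand'): 1000, (800, 1, 'foot'): 1071,
    (800, 2, 'hand'): 1138, (800, 2, 'foot'): 1224,
}
base = T[(25, 1, 'hand')]  # all factors at their lowest levels

# Coupled slacks estimated at the largest factor levels (Equation 3.1):
k_soa_b2 = T[(800, 2, 'hand')] - T[(800, 1, 'hand')] - T[(25, 2, 'hand')] + base
k_soa_c2 = T[(800, 1, 'foot')] - T[(800, 1, 'hand')] - T[(25, 1, 'foot')] + base
k_b2_c2  = T[(25, 2, 'foot')]  - T[(25, 2, 'hand')]  - T[(25, 1, 'foot')] + base

# Prediction for prolonging all three processes, in the order SOA, B2, C2:
lhs = T[(800, 2, 'foot')] - base
rhs = ((T[(800, 1, 'hand')] - base) + k_soa_b2
       + (T[(25, 2, 'hand')] - base) + k_b2_c2
       + (T[(25, 1, 'foot')] - base))
print(k_soa_b2, k_soa_c2, k_b2_c2, lhs, rhs)  # 80 55 69 695 694
```

The two sides agree within the 1 msec rounding error introduced by reading the means from de Jong's figure.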
In addition to demonstrating once again that SOA precedes central
Task 2 processing, de Jong (1993) found that the factor of responding by
hand vs. by foot selectively influences response preparation for Task 2.
He showed that central processing and response preparation for Task 2
are sequential, and showed that in some circumstances there is a path
concurrent with them. Quantitative analysis shows central processing for
Task 2 is between the SOA and response preparation for Task 2.

Sensory and central Task 2 processing, <A2, B2>

Pashler (1984) presented a light bar for Task 1, and the response was a
button press with the left hand to identify whether the bar was high or
low in position. Task 2 was a visual search for a letter. There were two
Task 2 factors. The contrast of the display of letters to be searched
through for Task 2 was high or low, and the target letter was either
present or absent. The visual search was done either alone or in a dual
task, in different blocks. In the dual task, subjects were instructed to
respond for Task 1 before responding for Task 2. The SOA was fixed at
100 msec.

Whether the visual search was done alone or as one of the dual tasks,
s2 contrast and target presence/absence had additive effects on search
task reaction time. The additivity supports the hypothesis that the two
factors selectively influenced two different processes.
Pashler (1984) compared single and dual task conditions to obtain
information about process arrangement. The effect of contrast was
greater in the single task condition than in the dual task condition. If we
assume the prolongation produced by decreasing contrast was the same
in the single and dual task conditions, then, because the effect was
smaller in the dual task condition, there was slack in the dual task
condition for the process selectively influenced by contrast. With the
notation of Figure 5.2 the slack is expected if s2 contrast selectively
influenced process A2. However, the effect of target presence/absence
was the same in the single and dual task conditions. If we again assume
the prolongation was the same in the single and dual task conditions,
there was no slack in the dual task condition for the process selectively
influenced by target presence/absence. With the notation of Figure 5.2,
the additivity is expected if target presence/absence selectively
influenced B2.
In another experiment on A2 and B2, Pashler and Johnston (1989)
found additive effects of s2 intensity and s2 repetition; this is explained if
these factors selectively influence A2 and B2; the work was discussed in
the section on SOA and B2.

Central processing of Task 1, central processing of Task 2, <B1, B2>

Central processing for the two tasks of a dual task has been investigated
in the Psychological Refractory Period (PRP) paradigm and in Stroop-
like tasks, where conflict is induced by the stimuli.

PRP: Number of alternatives


Karlin & Kestenbaum (1968) varied the number of alternatives for s1
and for s2. No statistical tests are reported, but the effects are
approximately additive. This can be seen for RT2 and RT1 in their
Figures 3 and 4, respectively, where the curves for one s1-alternative are
approximately parallel to those for two s1-alternatives. Additive effects
on RT2 are readily explained if these manipulations selectively influence
the sequential central processes in the Single Central Bottleneck model.
The number of s2 alternatives affected RT1, perhaps a violation of
selective influence; we do not pursue this here because effects of two s2-
alternatives are confounded with practice effects (details in the section
Task 1 central processing and SOA). Unfortunately, no attempt to
replicate this important part of the experiment of Karlin and Kestenbaum
has been made.

PRP: Discriminability
We introduced Johnston and McCann’s (2006) experiment in the section
on Task 1 central processing and SOA. Discriminability of stimulus 1
was either easy or hard, as was discriminability of stimulus 2. There was
no effect of Task 2 discriminability on RT1. But Task 1 discriminability
and Task 2 discriminability had additive effects on RT2, indicating that
these two factors selectively influenced sequential processes.

PRP: Central Process Order


If responses to two stimuli must be selected, and only one response can
be selected at a time, what controls their order? It is possible that the
order is not under the subject’s control. If the stimuli arrive at different
times, the central processor might schedule them first come, first served.
Or central processing for a particular modality might always go first. If
instead the order is under the subject’s control, an experimenter can
simply give instructions to schedule the processes in a certain order, and
the subject will be able to carry out the instructions.
Ehrenstein, Schweickert, Choi and Proctor (1997) tested this. One
task was memory search, the other was mental arithmetic. These tasks
are more complex than usual for the Psychological Refractory Period
paradigm. The stimulus for each task was a digit, and there were small
effects of relations between the digits, such as whether they were the
same or not. These cross-talk effects (Navon & Miller, 1987) are small
enough to be neglected in the analysis, though undesirable; for details, see
Dutta, Schweickert, Choi and Proctor (1995).
Subjects in Experiment 1 were instructed to complete the arithmetic
task before the memory-search response was made, while subjects in
Experiment 2 were instructed to use the reverse order. The order of the
two central processes was established by analyzing the effects of (1)
memory set size and (2) target presence or absence, both intended to
selectively influence the memory search, and (3) subtraction difficulty,
intended to selectively influence the subtraction. Analysis of the effects
of the factors on the difference between the reaction times for the two
tasks indicated that subjects carried out the two central processes in the
order proposed in the instructions.
Here are more details. At the start of a trial the subject was presented
with a list of either 4, 5, or 6 digits to memorize. After a short delay, two
digits were presented simultaneously, one above the other. Subjects
were to search the memory set for the upper number and to subtract
either 1 or 2 from the lower number. The amount to be subtracted was
the same throughout a block of trials. With the left hand the subject
pressed a button to indicate whether the probe was present in the memory
set or absent. With the right hand, the subject typed the answer to the
arithmetic problem on the numeric keypad.
In Experiment 1, response times for each task increased significantly
with (1) increase in memory set size, (2) target absence rather than
presence, and (3) subtraction difficulty. Therefore, each central process
preceded each response. In the Appendix a method is described for
determining process order when there are two responses, through
Equation (5A.4). When subjects who made their responses close
together in time (grouping) were eliminated, and statistical tests done on
the way the difference between the two response times changed with the
factor levels, the tests indicated that the subtraction process preceded the
memory search process.
Results of Experiment 2, in which subjects were instructed to use the
reverse order, indicated that the reverse order was indeed used; that is,
memory search preceded subtraction. Results are complicated by the
behavior of one group of subjects. Although the complication did not
invalidate the conclusion, new subjects were run in place of this group.
Results for the new group were the same as for the other three original
groups. For the three original groups combined with the new group,
increased memory set size and target absence increased the reaction time
for each task, but subtraction difficulty only increased the reaction time
for the arithmetic task. Results can be explained with a simple model in
which the two tasks were carried out one after the other, the memory
search preceding the subtraction.

Stroop tasks
In the Stroop (1935) task, the name of a color is presented in colored
print, with the color of the print to be named. Typically, responses are
slower and less accurate when the name and print color conflict than
when they agree.

Number of alternatives and Stroop conflict


In a Stroop-like task of Greenwald (1972), the word “left” or “right” was
presented through earphones, and immediately following its onset an
arrow was presented pointing left or right. The number of alternative
words was either 1 or 2, as was the number of alternative arrows. A third
factor was whether the directions indicated by the word and arrow agreed
or conflicted (Stroop conflict). Evidence discussed earlier indicates that
in Psychological Refractory Period experiments central processing of the
two stimuli takes place in two separate sequential processes. Potentially,
the conflict in Greenwald’s task could have influenced the central
processing of the word or of the arrow. But the three factors (number of
alternative words, number of alternative arrows, presence or absence of
conflict) behaved as if they selectively influenced three different
processes (decision about the word, decision about the arrow, conflict
resolution).
Here are details. In the high ideomotor compatibility condition, the
subject repeated the stimulus word and moved a joystick in the direction
of the arrow. (Ideomotor compatibility means that the feedback from the
response resembles the stimulus; reviewed in Lien & Proctor, 2002.) In
the low ideomotor compatibility condition, the subject spoke the
direction of the arrow and moved a joystick in the direction indicated by
the word. Response order was not specified in instructions.
In the high ideomotor compatibility condition, the combined effect of
increasing the number of alternative words and increasing the number of
alternative arrows was approximately equal to the maximum of the
separate effects. This held for both the manual and the verbal responses,
and for trials with conflict and trials without conflict. Quite reasonably,
Greenwald (1972) concluded that with high ideomotor compatibility the
decision about the word is concurrent with the decision about the arrow.
An alternative explanation is that the decisions are sequential but on
opposite sides of a Wheatstone bridge.
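The distinction between these two explanations can be made concrete with a longest-path computation. The network and durations below are invented for illustration: process A precedes process F through the bridge arc D, so they are sequential, yet prolonging both produces a negative interaction contrast, and their combined effect exceeds the maximum of the separate effects, which a concurrent arrangement cannot produce.

```python
# Longest-path (critical path) sketch of a Wheatstone bridge.
# A and F lie on opposite sides of the bridge arc D, so A precedes F.
# Durations (msec) are hypothetical.

def completion_time(dA, dF, dB=150, dD=30, dE=160):
    paths = (dA + dD + dF,  # through the bridge: A then F (sequential)
             dA + dE,       # path through A bypassing F
             dB + dF)       # path through F bypassing A
    return max(paths)

base   = completion_time(100, 100)
only_a = completion_time(150, 100)   # prolong A by 50 msec
only_f = completion_time(100, 150)   # prolong F by 50 msec
both   = completion_time(150, 150)

interaction = both - only_a - only_f + base
print(interaction)   # negative, although A and F are sequential
print(both - base)   # combined effect exceeds either separate effect
```

Here the interaction contrast is negative even though A and F are sequential, and the combined effect (70 msec) exceeds the larger separate effect (50 msec), the signature Greenwald's (1972) verbal-response data showed.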
The process selectively influenced by conflict was sequential with the
decision about the arrow (additive effects for both manual and verbal
response times); their order could not be established. The process
selectively influenced by conflict and the decision about the word were
either concurrent, or sequential on opposite sides of a Wheatstone bridge.
Analysis of the low ideomotor compatibility condition is rather
complex. Briefly, the factors behaved as if they selectively influence the
same three processes as those identified above in the high ideomotor
compatibility condition. Analysis suggests that in the part of the network
leading to the verbal response, the decision about the word and the
decision about the arrow are on opposite sides of a Wheatstone bridge.
(The combined effect of both manipulations exceeds the maximum of the
separate manipulations by over 100 msec, and the interaction contrast is
negative.) The part of the network leading to the manual response is not
in the form of a Wheatstone bridge. The order of two processes was
found, through Equation (5A.1) in the Appendix. The decision about the
word precedes both the decision about the arrow and the process
selectively influenced by conflict. The order of the decision about the
arrow and the process selectively influenced by conflict could not be
determined. Two reasons for caution about the interpretations are that
the conditions with a single stimulus alternative may be special, and
some interaction contrasts are small.
In the Stroop task per se, a color name is displayed in a colored ink
(Stroop, 1935), so Greenwald’s (1972) task is not the usual Stroop task.
Schweickert’s (1983a) experiment was modeled on Greenwald’s, but
with colors. A color patch and a color name were simultaneously
displayed on each trial. There were three factors, the number of
alternative hues of the patch, the number of alternative color names, and
whether the hue and color name conflicted or not. As in Greenwald’s
(1972) experiment, the conclusion was that the number of alternatives for
each stimulus selectively influenced a different process, and these two
processes were different from the process prolonged by conflict.
Details follow; see Schweickert (1983a) for more and see Chen and
Chen (2003) for further work with this approach. In the word-naming
condition, the subject spoke the word, and pressed a button to indicate
the hue. In the color-naming condition, the subject spoke the name of
the hue, and pressed a button to indicate the displayed color name. One
subject was in each condition. Subjects were instructed to make the
manual response before the verbal response.
In the word-naming condition, in each block the number of
alternative hues was 1, 2 or 4 and the number of alternative color names
was varied orthogonally, also 1, 2, or 4. In a block, each hue occurred
equally often at random, as did each color name. Ideally, whether the
hue and color name conflicted or not would vary orthogonally with the
number of alternative colors and color names, but this is not possible.
The proportion of trials with conflict depended on the number of
alternative hues and color names; for example, with 1 hue and 4 color
names the proportion of conflict trials was 1/4, while with 1 hue and 2
color names the proportion of conflict trials was 1/2. In the word-
naming condition, when there was 1 alternative for the manual response
there was no effect on reaction times for the experimental factors.
Consequentially, in the color naming condition, run later, blocks with
one alternative for the manual response were not tested.
In the word-naming condition, there was a significant interaction
between the number of alternative hues and the number of alternative
color-names. Omitting trials with one alternative for the manual
response (where factors had no effect), the interaction was negative, in
the form predicted if these two factors selectively influenced processes
that are (1) concurrent or (2) sequential on opposite sides of a
Wheatstone bridge. In the color-naming condition, the number of
alternative hues and the number of alternative color names did not
interact significantly, indicating that these factors selectively influence
sequential processes. Numerically, the interaction contrasts are negative
(mean of −11 msec). Results from both conditions can be explained by
saying the two factors selectively influence sequential processes on
opposite sides of a Wheatstone bridge.
In both the word-naming condition and the color-naming condition
there was an interaction between the number of alternative hues and
conflict. But the interaction was negative in the word-naming condition
and positive in the color-naming condition.
The positive interaction in the color-naming condition can be
explained by saying central processing of the hue and processing of
conflict are sequential. The negative interaction in the word-naming
experiment can be explained by central processing of the hue and
processing of conflict being (1) concurrent or (2) sequential, on opposite
sides of a Wheatstone bridge. Arrangement (2) is consistent with the
arrangement found for the color-naming condition.
If the three processes are sequential in both word naming and color
naming, the order that best accounted for the data is with the decision
about the manual response coming first, followed by the decision about
the verbal response, followed by conflict processing. In other words,
central processing in the two tasks was in the same order as the
responses. The reasoning is based in part on Equation (5A.1), details are
in Schweickert (1983a).
Process arrangements in this Stroop task and the Stroop-like task of
Greenwald (1972) are consistent with one another although different
details are revealed in each. The additional processing time required
when the word and hue or word and arrow conflict is one of many signs
of complications near the response end of the system.

Post-Central and Response Processes

Response time includes time for all processing except set up prior to the
stimulus and tear down after the response. A more discriminating
analysis can be made by referring to a time mark within the task. Mental
processes are not directly observable, but some are accompanied by
electrical potentials that can be measured and timed at the scalp.

Preparation of a movement by a hand produces a larger potential in
the motor cortex on the side of the brain contralateral to the movement
than on the side ipsilateral to the movement (see Coles, 1989, for an
introduction). The contralateral-ipsilateral difference can be used to
determine when movement preparation begins. During performance of a
task voltages are measured at electrodes located over the motor cortices
of the left and right sides of the brain (specifically at special sites on the
scalp denoted C3' and C4'). Essentially, if at some time t there is no
difference between the voltages at these two sites, then the subject is not
tending to move one hand more than the other. But when the difference
between voltages at these two sites becomes nonzero the subject is
preparing a movement with one of the hands. (The sign of the difference
could oscillate, perhaps indicating changes in which movement is being
prepared; small differences are usually ignored.) The sign of the
difference is chosen so that a positive difference indicates greater
negative potential (greater activation) over the motor cortex controlling
the movement, i.e., contralateral to the side on which the hand movement
is made. The voltage difference is called the Lateralized Readiness
Potential (LRP).

Interval from stimulus to onset of movement-related brain potential

There is disagreement about the nature of central processing, but a term
is needed for it. Hick and Welford (1956) said decisions are central and
executed sequentially. “Decision” has been replaced by “response
selection” because considerable evidence suggests that selection of the
response for the second subtask is delayed in performance of a dual task.
We will use the neutral term central processing when, as in this section,
questions about its nature are the focus of discussion.
By measuring the LRP during performance of a dual task, Osman and
Moore (1993) concluded that preparation of the second response begins
after central processing in the first task is finished. From their results, it
is reasonable to assume LRP-onset is the start of process C2 (see Figure
5.2). We discuss their Experiment 2. The first subtask in their dual task
was to identify a high or low pitched tone by pressing a button with the
left or right foot. The second subtask was to press a button with the left
or right index finger to identify the second stimulus as an X or an O. The
SOA values were 50, 200 and 500 msec. The SOA was constant
throughout a block of trials. Note that Task 1 and Task 2 use different
sense modalities and different responding limbs. A delay in Response 2
produced by performance of Task 1 cannot be due to conflicting
demands for the same organ or limb.
Figure 5.6 (adapted from Osman & Moore, 1993, Figure 1) illustrates
predictions about the effect of the SOA on the onset of the Task 2 LRP
(LRP2-onset). The top panel illustrates the hypothesis that preparation
of the second response begins in the processing prior to the delayed
central process B2. In that case, the interval from s2, presentation of the
second stimulus, to LRP2-onset would not change with SOA. But the
interval from LRP2-onset to r2, the second response, would decrease as
SOA increases.
The bottom panel illustrates the hypothesis that preparation of the
second response begins sometime during the delayed central process B2,
or (not illustrated) sometime during response process C2. In the bottom
panel case, the interval from s2 to LRP2-onset would decrease as SOA
increases. But the interval from LRP2-onset to r2 would not change as
SOA increases.
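These two hypotheses can be sketched numerically. The durations below are invented, and the LRP onset is placed arbitrarily at the midpoint of A2 or of B2; the point is only the diverging predictions. Under the first hypothesis the s2-to-onset interval is constant across SOA; under the second, the onset-to-r2 interval is constant.

```python
# Predictions of Figure 5.6 with hypothetical durations (msec).
# Hypothesis 1: LRP2 arises during A2; Hypothesis 2: during B2.

def intervals(soa, a1=100, b1=150, a2=70, b2=120, c2=80):
    s2 = soa
    start_b2 = max(s2 + a2, a1 + b1)   # bottleneck: B2 waits for B1
    r2 = start_b2 + b2 + c2
    onset_h1 = s2 + a2 / 2             # onset placed mid-A2 (arbitrary)
    onset_h2 = start_b2 + b2 / 2       # onset placed mid-B2 (arbitrary)
    return (onset_h1 - s2, r2 - onset_h1,   # hypothesis 1 intervals
            onset_h2 - s2, r2 - onset_h2)   # hypothesis 2 intervals

for soa in (50, 200, 500):
    print(soa, intervals(soa))
```

Osman and Moore's data matched the second pattern: the s2-locked LRP onset varied with SOA while the onset-to-r2 interval did not.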
Osman and Moore (1993) also state that if the hypothesis considered
in the bottom panel is true, the interval from the completion of the central
processing for Task 1 to r1, the response for Task 1, is less than or equal
to the interval from LRP-onset to r1. Further, if processing is as
illustrated in either panel, and the LRP for the first task arises during the
central process B1, then neither the interval from s1 to LRP1-onset, nor
the interval from LRP1-onset to r1 would change as SOA changes.
We have space only for major results; see Osman and Moore (1993)
for more details. As usual in dual tasks, mean RT2 decreased as SOA
increased. Contrary to both models illustrated in Figure 5.6 there was a
significant effect of SOA on mean RT1; because it is small it does not
affect the main conclusions.

Fig. 5.6. Top. Model for Task 2 Lateralized Readiness Potential (curved line, LRP)
arising during Sensory Processing of stimulus 2 (A2). Time from s2 to LRP onset is
invariant with SOA. Bottom. Model for Task 2 Lateralized Readiness Potential arising
during Central Processing of stimulus 2 (B2). Time from LRP onset to r2 is invariant
with SOA. Notation as in Figure 5.2. Based on Osman & Moore (1993) Figure 1.

Cumulative distribution functions for the response times are in Figure
5.7 (Osman & Moore, 1993, Figure 6). In some blocks of trials the only
task was Task 1, and in other blocks the only task was Task 2; such trials
are called single trials. In other blocks most trials were dual task trials,
with an occasional catch trial in which the stimulus for Task 2 was not
presented and no Task 2 response was required. We note that, although

Fig. 5.7. Vincentized cumulative distribution functions (CDFs) of reaction times for the
first (RT1) and second (RT2) tasks. (CDFs are shown for each task at each stimulus onset
asynchrony [SOA], for each task on single-task blocks, and for the first task on catch
trials. [Note: The first-task CDFs for the long SOA and catch trials are so similar that
they are difficult to distinguish from a single CDF.] From Osman, A., & Moore, C. M.,
1993, The locus of dual-task interference: Psychological refractory effects on movement-
related brain potentials. Journal of Experimental Psychology: Human Perception and
Performance, 19, Fig. 6. Copyright 1993 American Psychological Association.
Reproduced with permission.

not important for the purposes of Osman and Moore (1993), the
cumulative distribution functions test an assumption important for
analysis of selective influence by increments on reaction time. The
cumulative distribution functions do not cross, except at the tails for
RT1. Except for these tails, reaction times for the SOA and the trial
types of single and catch support the “usual stochastic ordering”
assumption (also called stochastic dominance, see Chapter 4).
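The no-crossing check can be written directly from the definition of the empirical CDF. The RT samples below are invented; the function simply verifies that the distribution function of the faster condition lies at or above that of the slower condition at every observed value.

```python
# Checking the "usual stochastic ordering" between two RT samples:
# the empirical CDF of the faster condition should lie at or above
# the CDF of the slower condition everywhere (no crossing).

def ecdf(sample, t):
    """Empirical CDF of sample at time t."""
    return sum(x <= t for x in sample) / len(sample)

def dominates(fast, slow):
    """True if F_fast(t) >= F_slow(t) at every observed time point."""
    grid = sorted(set(fast) | set(slow))
    return all(ecdf(fast, t) >= ecdf(slow, t) for t in grid)

rt_long_soa  = [320, 350, 380, 410, 460]   # hypothetical RT2, long SOA
rt_short_soa = [420, 450, 480, 510, 560]   # hypothetical RT2, short SOA

print(dominates(rt_long_soa, rt_short_soa))   # True: the CDFs do not cross
```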

Fig. 5.8. Stimulus (S)-locked and response (R)-locked lateralized readiness potentials
(LRPs) for the first and second tasks. (LRPs are shown for each task at each stimulus
onset asynchrony, for each task on single-task blocks, and for the first-task on catch
trials. The top portion of each panel shows the LRPs, and the bottom portion shows the
effects on these LRPs of which response occurred in the other task. S1 = onset of first
stimulus; S2 = onset of second stimulus; RT1 = first reaction time; RT2 = second reaction
time.) From Osman, A., & Moore, C. M., 1993, The locus of dual-task interference:
Psychological refractory effects on movement-related brain potentials. Journal of
Experimental Psychology: Human Perception and Performance, 19, Fig. 7. Copyright
1993 American Psychological Association. Reproduced with permission.

The LRPs are in Figure 5.8 (Osman & Moore, 1993, Figure 7). A
question arises whenever brain potentials are averaged over trials at each
time t. The potentials are measured throughout the performance of the
task, but performance on some trials stops sooner than on others. All
trials are available for averaging at stimulus onset, but as time goes on
fewer and fewer trials remain available. Osman and Moore (1993)
followed a common procedure. When the early part of the potential is
important, the average is calculated over every available trial at time t
after stimulus onset. These average potentials are called S-locked. But
when the late part of the potential is important, for each trial the time at
which the response was made is considered 0, and earlier times are
measured backwards with respect to this. (In effect, potentials are lined
up at their finishing times before the averaging is done.) These average
potentials are called R-locked.
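The two averaging schemes can be sketched for toy single-trial waveforms of unequal length (all values invented). S-locked averaging aligns trials at sample 0 and averages over whatever trials are still available at each time point; R-locked averaging aligns trials at their final (response) sample and counts backwards.

```python
# S-locked vs. R-locked averaging of single-trial waveforms of
# unequal length. Values and lengths are made up for illustration.

trials = [
    [0.0, 0.5, 1.0, 1.5, 2.0],             # trial ending after 5 samples
    [0.0, 0.2, 0.6, 1.2, 1.6, 2.0],        # trial ending after 6 samples
    [0.0, 0.1, 0.4, 0.9, 1.5, 1.8, 2.0],   # trial ending after 7 samples
]

def s_locked(trials):
    """Average aligned at stimulus onset (index 0)."""
    n = max(len(tr) for tr in trials)
    return [sum(tr[t] for tr in trials if len(tr) > t) /
            sum(1 for tr in trials if len(tr) > t)
            for t in range(n)]

def r_locked(trials):
    """Average aligned at the response (last sample of each trial)."""
    n = max(len(tr) for tr in trials)
    return [sum(tr[-k] for tr in trials if len(tr) >= k) /
            sum(1 for tr in trials if len(tr) >= k)
            for k in range(n, 0, -1)]

print(s_locked(trials))
print(r_locked(trials))
```

Note that both averages use fewer trials in their sparse region: the S-locked average late in the epoch, the R-locked average early in it.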
The LRPs for Task 1 are in the top two panels of Figure 5.8. As a
technical detail, these LRPs have units of negative voltage, because the
responding limb is the foot. The motor cortex corresponding to a foot is
in the longitudinal fissure of the contralateral side of the brain. Although
this piece of cortex is physically close to the scalp over it, its orientation
leads to stronger potentials being recorded at the scalp on the opposite
side of the head, that is, at the recording site ipsilateral to the responding
foot. The convention of subtracting the ipsilateral potential from the
contralateral potential results in a negative voltage.
To the eye, the LRP waveforms for Response 1 differ little at the
different SOAs, and this is supported by statistical analyses. In
particular, the time at which the second stimulus is presented does not
affect the onset of the LRP for the first response, whether S-locked (to
the first stimulus) or R-locked (to the first response).
The bottom two panels of Figure 5.8 show the LRPs for the second
response. It is clear that the S-locked LRP (locked to S2) differs for
different SOAs (and for single task trials compared with trials having an
SOA). In particular, the SOA had a statistically significant effect on the
S-locked LRP-onsets. On the other hand, there is little difference
between the R-locked LRPs (locked to R2). In particular, SOA did
not have a statistically significant effect on the R-locked LRP-onsets. In
summary, effects are as expected if the LRP onset for the second
response occurs within the central processing of Task 2 (called B2 in
Figure 5.6).

Post-Task 1 central processing: The residual PRP effect


Early in the study of the Psychological Refractory Period, it was
discovered that the response to s2 took longer when it was presented in a
dual task than in a single task, even when in the dual task the response to
s1 was completed before s2 was presented. Further, in a dual task the
response time to the second stimulus typically decreases as SOA
increases even if the response for the first task is completed before the
stimulus for the second task is presented.
When stimulus 2 is presented after the response to s1, Jentzsch,
Leuthold and Ulrich (2007) called a delay in response 2 a residual PRP
effect, and investigated its source and location. The residual PRP effect
would not occur if the refractory period were due only to waiting for
selection of the response for the first task to be finished. Suppose after
response selection for the first task is completed, response selection for
the second task does not start immediately, but must wait for some
further process to finish. One possibility for such a process is the
refractory interval in the model of de Jong (1993), from the end of B1 to
the start of C2 (Figures 5.3, 5.4). Another possibility is the process SW
in the de Jong model and in the Single Central Bottleneck model of
Figure 5.2 (the Response Selection Bottleneck model); this process
extends from the end of B1 to the start of B2, and is sometimes
interpreted as task switching. Two other possibilities were suggested by
Welford: Feedback time from (1) the beginning or (2) the end of the first
response must elapse before Task 2 Central Processing, B2, can start.
The latter two, labeled FTb and FTe, respectively, are incorporated in the
Extended Selection Bottleneck Model of Jentzsch, et al. (2007), see
Figure 5.9. The figure uses notation of Pashler and Johnston (1989), but
C1 denotes the motor movement of the response for Task 1 rather than
its preparation. Note that if FTe were not present and there is no pause
between the end of B1 and the start of C1, process FTb is structurally
equivalent to process SW in the Single Central Bottleneck Model.
We discuss two predictions Jentzsch, et al. (2007) tested for trials
when r1 is completed before s2 is presented. First, a factor selectively
influencing A2 will have an underadditive interaction with SOA (i.e., a
positive interaction). Second, a factor selectively influencing B2 will
have additive effects with SOA. (See Chapter 3 for the basis of these
predictions.)
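These predictions can be checked on a toy version of the model. Durations below are hypothetical, and the feedback process is written as a single process FT starting at r1, concurrent with the SOA and with A2, as in Figure 5.9; the SOA values exceed RT1, so r1 precedes s2 on every trial.

```python
# Residual PRP sketch: even when r1 precedes s2, B2 must wait for a
# feedback process FT that starts at r1. Durations are hypothetical.

def rt2(soa, a2, r1=300, ft=250, b2=120, c2=80):
    s2 = soa                          # soa > r1 here, so s2 follows r1
    start_b2 = max(s2 + a2, r1 + ft)  # FT is concurrent with SOA and A2
    return start_b2 + b2 + c2 - s2

# 2 x 2 design: SOA (400, 600) x s2 contrast (A2 = 70 high, 130 low)
cells = {(soa, a2): rt2(soa, a2) for soa in (400, 600) for a2 in (70, 130)}

# Interaction contrast is positive (underadditive), as predicted when
# the residual waiting time is concurrent with both SOA and A2.
contrast = (cells[(600, 130)] - cells[(600, 70)]
            - cells[(400, 130)] + cells[(400, 70)])
print(contrast)
```

In this sketch RT2 still decreases as SOA increases (a residual PRP effect) even though r1 is over before s2 appears, and the SOA-by-contrast interaction is underadditive, the two signatures Jentzsch, et al. (2007) tested for.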

Fig. 5.9. Extended Selection Bottleneck Model of Jentzsch, et al. (2007). A
psychological refractory period occurs when s2 is presented after r1 is made. Here, C1
denotes motor movement for response 1. Feedback from the beginning or end of C1,
denoted FTb and FTe, delay r2. (Based on Jentzsch, et al., 2007, Figure 1.)

The prediction about A2 (s2 sensory processing) was tested in
Experiment 2 of Jentzsch, et al. (2007). Subjects identified a high or low
pitched tone in Task 1, and identified an X or an O displayed visually in
Task 2. One factor was the SOA, with values of 400 or 600 msec. The
other factor was the contrast, high or low, of the second stimulus. To
enable LRP recording, responses were made with the two hands. For one
group of subjects responses to high and low tones were made with the
left and right index fingers, respectively, while the X and O were
responded to with the left and right middle fingers, respectively. Other
groups of subjects were given other response assignments in such a way
as to counterbalance which finger (index or middle) was used for which
task, and which side (left or right) was used for which stimulus. The
SOA values and the s2 contrast values were presented randomly in each
block of trials.
Trials on which RT1 was less than both SOAs (i.e., less than 400
msec) were analyzed. As one would expect, there was no effect of SOA
or s2 contrast on RT1. Also, as one would expect, there was an effect of
s2 contrast on RT2. Of importance for the authors’ purposes, when SOA
increased, RT2 decreased; that is, there was a residual PRP effect.
Moreover, the combined effect of SOA and s2 contrast was as predicted
from the Extended Response Selection Bottleneck model; that is, there
was an underadditive interaction (a positive interaction), as predicted if
the source of the residual PRP effect is a process (or more than one
process) concurrent with both SOA and A2 (see Figure 5.9). Results do
not distinguish between possibilities of FTb, or FTe, or both in Figure
5.9. We will speak of a single process FTx, keeping in mind that there may
be more than one.
Clearly the required process FTx must precede r2. Further
information comes from LRP analysis. Suppose FTx must precede the
onset of the LRP for r2. Predictions are that (1) increasing the SOA will
decrease the interval between s2 and LRP-onset, and (2) prolonging A2
by changing the contrast of s2 will increase the interval between s2 and
LRP-onset. But neither manipulation will change the interval between
LRP-onset and r2. Results were as predicted. Note that the results have
the same form as found by Osman and Moore (1993) (although all their
trials were combined).
On the other hand, suppose, as before, that FTx is concurrent with SOA and
A2, but FTx need not precede the LRP-onset. There are several ways this
could happen while still satisfying the constraint that FTx precedes r2.
Suppose, for example, that the LRP-onset is an event within process B2,
and FTx is concurrent with B2, but precedes C2. (In that case, FTx
would be the refractory interval in the model of de Jong, 1993, see
Figure 5.4). Because FTx is relatively long, such an arrangement
predicts that neither SOA nor s2 contrast will affect the interval between
s1 and LRP-onset. However, such an arrangement also predicts that
increasing SOA will decrease the interval between s2 and LRP-onset, as
will increasing s2 sensory processing difficulty by changing s2 contrast.
Results did not satisfy these predictions. The conclusion is that FTx
precedes the LRP-onset. Because the model makes the same predictions
for trials on which r1 follows s2, analyses were also done for all trials
combined. Those analyses are consistent with those we summarize here;
see Jentzsch, et al. (2007) for details.

The scheduling of Task 2 sensory processing, A2

Further electrophysiological results of Jentzsch, et al. (2007) are
important. Recall that the second stimulus was visual while the first
stimulus was auditory. Evoked-potential components P1 and N1 arise in
the visual system. As one would expect, P1 and N1 peak latencies were
later when s2 contrast was low (for all trials combined). In the model, P1
and N1 would be produced in process A2, sensory processing of s2. The
scheduling of A2 could be as illustrated in Figure 5.9; that is, A2 begins
as soon as s2 is presented. Another option is just-in-time scheduling.
The subject could try to delay A2 a little, so that it finishes just before its
output is needed at the start of B2. This would be advantageous if there
is some chance the output of A2 degrades if it is stored in memory for the
brief interval before B2 starts. With just-in-time scheduling, the interval
from s2 to P1 and N1 would be longer when SOA is shorter. This was
not found. Instead, the duration of the SOA had no effect on the time of
the peaks of P1 and N1. This was found for trials on which r1 preceded
s2 and also for all trials combined. (To obtain enough data for analysis of
trials on which r1 preceded s2, trials were pooled over levels of s2
contrast, so effect of s2 contrast cannot be analyzed for these trials.)
Perhaps some part of A2 is scheduled later when s2 is presented early, but
P1 and N1 are not.
The Extended Selection Bottleneck Model is a double bottleneck
model; B1 must precede B2, and there is a response constraint whose
details are unknown; for example, perhaps the start of C1 must precede
the start of C2 (response interdiction, represented by RI in Figure 5.3).
Consider for comparison a model with only a response constraint; to be
concrete, suppose B1 need not precede B2, but process RI exists as in
Figure 5.3. There are two possibilities. (1) The SOA is so long that RI
is never relevant. In that case, further increase in SOA will have no
Critical Path Models of Dual Tasks and Locus of Slack Analysis 137

effect on RT2; that is, there will be no residual PRP effect. (2) The SOA
is short enough that RI is relevant, at least sometimes. In that case,
further increase in SOA can lead to a decrease in RT2 (a residual PRP
effect can occur); moreover, increasing SOA and prolonging B2 will
have underadditive effects (a positive interaction). With such a model, it
is not possible to observe both (1) a residual PRP effect and (2) additive
effects of SOA and a factor selectively influencing B2. With the
extended selection bottleneck model, on the contrary, both effects can
occur together.
Let us assume, as is generally agreed, that stimulus-response
compatibility for Task 2 (s2-r2 compatibility) selectively influences
process B2. Experiment 3 of Jentzsch, et al. (2007) tested the effects of
SOA and s2-r2 compatibility. Most aspects of the design were the same
as in their Experiment 2. In particular, the SOAs were 400 and 600
msec. And at low s2-r2 compatibility, s2 was either an X or an O,
presented in high contrast. A new aspect of Experiment 3 was that at
high s2-r2 compatibility, s2 was an arrow pointing either left or right, in
the direction of the response to be made. The various s2 stimuli were
randomly presented in each block, as were the two SOAs. (There was no
electrophysiological recording.)
With all trials combined, SOA and s2-r2 compatibility had additive
effects on RT2, indicating that SOA and B2 are sequential processes with
no long path of processes concurrent with them. This is what McCann &
Johnston (1992) found. Of particular interest is what happened when
Response 1 preceded presentation of Stimulus 2.
For trials in which RT1 was less than the shorter SOA (400 msec),
increasing SOA significantly decreased RT2, a residual PRP effect.
Further, decreasing s2-r2 compatibility significantly increased RT2.
However, these did not interact. (For these trials, RT1 was not affected
by SOA or by s2-r2 compatibility.)
In conclusion, the extended selection bottleneck model in Figure 5.9
is supported and models with a single bottleneck sufficiently late that B1
need not precede B2 are not supported.

Post Task 1 central processing: Task switching and Task 2 sensory processing <SW, A2>
Earlier, we said a possible source of the residual PRP effect is the
process SW in the Single Central Bottleneck model (Figure 5.2) and in
the de Jong model (Figure 5.3). Oriet and Jolicoeur (2003) report a
surprising relevant finding, although their experiments did not use a
psychological refractory period paradigm. Briefly, a single digit was
presented on every trial in a task switching paradigm. The digit was of
high contrast or low contrast. The position of the digit changed from
trial to trial in a predictable fashion. The position cued the task to be
done, either (1) to respond that the digit was odd or even or (2) to
respond that the digit was greater than 5 or less than 5.
We can consider the current trial and the previous trial as a dual task
in which the second stimulus is presented after the response to the first
stimulus is made. If a task switch prolongs process SW1 in Figure 5.2 or
5.3 and digit contrast prolongs process A2, the factors are predicted to
interact with a negative interaction. Instead, Oriet and Jolicoeur (2003)
found additive effects. Additivity is explained if a task switch and digit
contrast selectively influence sequential processes. But it is puzzling that
sensory processing of the current digit would be delayed by a task switch
from the previous trial. The finding makes it unlikely that a task switch
is the sole source of the residual psychological refractory period effect,
because in Experiment 2 of Jentzsch, et al. (2007) this source was
concurrent with A2.
Speaking generally, task switching has complex effects. These can
prevent factors from selectively influencing processes; see Logan and
Gordon (2001), Logan and Schulkind (2000), and Lien, Schweickert and
Proctor (2003).

SOA and Response 1 movement time (SOA, MT1)

When response movement matters, the reaction time is measured to the
onset of the response and the time between response onset and offset is
called motor time. To represent motor time for the first response in the
Single Central Bottleneck model (Figure 5.2), let C1 denote preparation
of the first response, let r1 denote the onset of the first response, and then
add a further process following r1 to denote movement; call it MT1. The
end of this further process is the response offset, and its duration is the
motor time.
In the Single Central Bottleneck model, the last Task 1 process to
precede r2 is B1, the central processing of Stimulus 1. According to the
model, Response 1 preparation and movement occur too late to have an
effect on response time 2. However, increasing the movement time of
Response 1 increased RT2 in an experiment by Ulrich, Fernández,
Jentzsch, Rolke, Schröter and Leuthold (2006). They explain their
results by modifying the Single Central Bottleneck model so Response 1
offset precedes Task 2 response preparation; that is, MT1 precedes C2.
In the modified model, the movement following preparation for each
response is represented.
In more detail, Task 1 was identification of a tone as high or low
pitched. The response was made with the left hand by sliding a handle
along a smooth track on a box, forward for one tone and backward for
the other tone. The response ended when the handle reached the end of
the box; the starting position of the handle made the distance longer to
one end than to the other end. The stimulus for Task 2 was an X or an O;
the subject identified it by pressing a button with a finger of the right
hand. Response 2 movement time was not measured.
Results are in Figure 5.10 (Ulrich, et al., 2006, Figure 2). The middle
panel shows the movement time for response 1 (MT1); it is longer when
the response distance is longer, as intended with the apparatus. The time
of onset of Response 1, RT1, is actually shorter when the distance to be
moved is longer. Ulrich, et al. (2006) explain that greater force is needed
to start a long movement than a short one (e.g., Schmidt, Zelaznik,
Hawkins, Frank & Quinn, 1979), and greater force produces a shorter
time for response onset criterion to be recorded (Ulrich & Wing, 1991).
The crucial finding is that Response 2 reaction time is longer when
the movement time for Response 1 is longer, contrary to the Single
Central Bottleneck model, but explained by the modification of Ulrich, et
al. (2006). Recall that with their modification, the movement for
Response 1, MT1, immediately precedes preparation for response 2, C2.
With the modification, the movement for Response 1 precedes Response
2, so an increase in movement time for Response 1 increases RT2. With
the modification, SOA and Response 1 movement are concurrent
processes, so a negative interaction is predicted for effects of prolonging
them, as seen in Figure 5.10.
We caution that neither the Single Central Bottleneck model nor this
modification explain a small but significant effect of SOA on RT1 and
MT1 nor a small but significant interaction of SOA and movement
distance on MT1.

Fig. 5.10. Task 1 reaction time (RT1; top panel), Task 1 movement time (MT1; middle
panel), and Task 2 reaction time (RT2; bottom panel) as a function of stimulus onset
asynchrony and movement distance of the Task 1 response. From Ulrich, R., et al., 2006,
Motor limitation in dual-task processing under ballistic movement conditions,
Psychological Science, 17, Fig. 2. Copyright 2006 Wiley. Reproduced with permission.

Response grouping

Responses in dual tasks are not tidy with respect to factors selectively
influencing processes because subjects sometimes delay the first
response and make the two responses close together. This is called
grouping. Grouping is comfortable for some reason. Control is more
efficient if two responses are produced together (Rinkenauer, Ulrich &
Wing, 2001). Control may be intermittent (Craik, 1948), so responses
may be made together to bundle signals. Nothing observable indicates
whether a subject grouped responses or simply made them close together
because the tasks happened to finish at about the same time. Grouping
makes interpreting response times difficult, particularly those to the first
stimulus.
Because there is little theory or data for guidance, Ulrich and Miller
(2008) used simulations to study a grouping model and some of its
variants. They assumed that when the subject did not group, he or she
used the Single Central Bottleneck model. When the subject grouped, he
or she used a model proposed by Borger (1963) as formulated by Pashler
and Johnston (1989). Borger’s model is in Figure 5.11 (based on Ulrich
& Miller, 2008, Figure 2). The duration of the interval between r1 and
r2 is a nonnegative random variable, D. For response time 2, with
Borger’s model it essentially makes no difference whether subjects group
or not. With the grouping schedule, response time 2 is

RT2 = max{A1 + B1, SOA + A2} + B2 + C1' + D − SOA.

If we let C2 = C1' + D, we obtain the formula for RT2 according to the
Single Central Bottleneck Model schedule (Figure 5.2). In the equation
the duration of a process is indicated by its label. (In the notation of
Ulrich and Miller, C1' denotes a version of process C1, and not, as in our
usual notation, the starting vertex of process C1.) Random variable C1'
takes on the same value when it is the duration of response 1 preparation
and when it is an addend of the duration of response 2 preparation C1' +
D.
With the grouping schedule, response time 1 is

RT1 = max{A1 + B1, SOA + A2} + B2 + C1'.



Fig. 5.11. The Borger (1963) model for response grouping. Motor preparation for both
responses begins after B2. Notation as in Figure 5.2, except duration of response 1 motor
preparation is C1'. Duration of response 2 motor preparation consists of the same amount
of time C1' plus a nonnegative random variable D. (Based on Ulrich and Miller, 2008,
Figure 2.)

It is different with the Single Central Bottleneck Model schedule,

RT1 = A1 + B1 + C1.
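The two schedules can be compared with a short numerical sketch. This is our own illustration: the function names and the constant durations are hypothetical, chosen only to show that setting C2 = C1' + D equates RT2 under the two schedules while RT1 differs.

```python
# Illustrative sketch (hypothetical durations) of the two schedules.
def scb_times(A1, B1, C1, A2, B2, C2, SOA):
    """Single Central Bottleneck schedule (Figure 5.2)."""
    rt1 = A1 + B1 + C1
    rt2 = max(A1 + B1, SOA + A2) + B2 + C2 - SOA
    return rt1, rt2

def borger_times(A1, B1, C1p, A2, B2, D, SOA):
    """Borger grouping schedule (Figure 5.11); C1p is C1', D the r1-r2 gap."""
    core = max(A1 + B1, SOA + A2) + B2 + C1p
    return core, core + D - SOA  # (RT1, RT2)

rt1_g, rt2_g = borger_times(100, 150, 60, 80, 120, 40, SOA=50)
rt1_s, rt2_s = scb_times(100, 150, 60, 80, 120, 60 + 40, SOA=50)  # C2 = C1' + D
```

With these numbers RT2 is the same under both schedules, but RT1 is much longer when the responses are grouped, which is why grouping complicates interpretation of RT1.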

Ulrich and Miller (2008) considered various probabilities for the
subject to use the grouping schedule. In the simplest, model Bor, the
probability of grouping is fixed. This is contrary to a finding in the
literature, that grouping decreases as SOA increases, where grouping is
considered to occur when the two responses occur within a certain short
time window. Each of their other variations predicts that the probability
of grouping decreases as SOA increases, consistent with the literature. A
common notion to all variants is that the subject waits to find out
whether two crucial events in the processing of the two tasks occur close
together; if so, the subject groups the responses in the manner of model
Bor; if not, he does not group the responses.
The variants differ according to what the two crucial events are and
when they occur. For model BorS1, after s1 is presented the subject waits
for a uniformly distributed time W. If s2 appears before the waiting time
is over, the subject groups. That is, the probability of using the grouping
schedule is P[W ≥ SOA]. For model WA the subject starts a waiting time
when s1 is presented, and if he or she perceives the two stimuli at about
the same time, the subject groups the two responses. Perception is
assumed to occur at the end of process A for each stimulus, so the
subject groups if SOA + A2 < A1 + W. (In simulations of this model and
the following ones, W was set to a constant.) For model WB the subject
waits to determine whether the two responses are selected at about the
same time, and if so, he groups them. That is, the subject uses the
grouping schedule if max{A1 + B1, SOA + A2} + B2 < A1 + B1 + W. A
few other variants are discussed as well.
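The decision rules of the variants just described can be stated compactly. The following sketch is ours; the duration and waiting-time values used in the Monte Carlo check are arbitrary.

```python
import random

# Grouping-decision rules for three variants (durations are illustrative).
def groups_bor_s1(SOA, W):
    return W >= SOA  # BorS1: s2 arrives before the wait after s1 is over

def groups_wa(SOA, A1, A2, W):
    return SOA + A2 < A1 + W  # WA: stimuli perceived at about the same time

def groups_wb(SOA, A1, B1, A2, B2, W):
    # WB: the two responses are selected at about the same time
    return max(A1 + B1, SOA + A2) + B2 < A1 + B1 + W

# For BorS1, P[group] = P[W >= SOA]; Monte Carlo check with W ~ U(0, 300):
rng = random.Random(7)
p_group = sum(rng.uniform(0, 300) >= 150 for _ in range(100_000)) / 100_000
```

All three rules make grouping less likely as SOA grows, in line with the finding that grouping decreases as SOA increases.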
Each grouping model variant is a mixture of two critical path
networks, the Single Central Bottleneck model (Figure 5.2) and the
Borger model (Figure 5.11). The variants differ in their mixing
probabilities; in all but the simplest these depend on process durations, so
predicting the effects of factors changing process durations is rather
complicated. In the simulations of Ulrich and Miller (2008), process
durations were lognormal; results were similar with gamma distributions
(Ulrich & Miller, 2008, p. 85). Plausible values for process means were
chosen. The coefficient of variation of each process duration was .2. For
each variant, the probability of grouping when SOA = 0 was set to .5.
By design, Response 2 behaves the same way whether there is
grouping or not. The main simulation results are about Response 1 and
the relation between Responses 1 and 2. Of particular interest are the
correlation between the two response times and whether the interval
between them is short.

Response Time 1
In the simulations, as SOA increased RT1 decreased for some models
(including BorS1 and WA mentioned above) and increased for other
models (including WB mentioned above).

Short interresponse intervals

Typically if grouping is a concern, trials on which the two responses
occurred within a short critical time window are dropped. The
simulations investigated windows of 100 and of 200 msec, typical
values. It is reassuring that for most variants, simulated trials with short
interresponse intervals were those on which grouping occurred. But for
two variants in which the probability of grouping depended on central
process durations (one is model WB mentioned above), the proportion of
simulated trials with responses in the critical window exceeded the
proportion of actually grouped responses. For these two variants,
removing trials with the two responses made within a short critical time
window “appears to worsen the contamination” (Ulrich & Miller, 2008,
p. 94).

Correlation between RT1 and RT2

With the Single Central Bottleneck model, a source of correlation
between RT1 and RT2 is that the value of RT2 sometimes depends on
durations of Task 1 processes (see equations above). As SOA increases,
the probability decreases that Task 1 process durations affect the time of
Response 2, so the correlation between RT1 and RT2 decreases as SOA
increases. This occurred in the simulations for all but two variants, one
of which is WA mentioned above.
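This argument can be checked with a small simulation of our own; the lognormal parameters below are arbitrary, chosen so that at the long SOA the quantity SOA + A2 almost always exceeds A1 + B1.

```python
import random
import statistics

def pearson(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))

def rt_correlation(SOA, n=20_000, seed=0):
    """Correlation of RT1 and RT2 under the Single Central Bottleneck model."""
    rng = random.Random(seed)
    rt1s, rt2s = [], []
    for _ in range(n):
        A1, B1, C1, A2, B2, C2 = (rng.lognormvariate(4.5, 0.2) for _ in range(6))
        rt1s.append(A1 + B1 + C1)
        rt2s.append(max(A1 + B1, SOA + A2) + B2 + C2 - SOA)
    return pearson(rt1s, rt2s)
```

At SOA = 0 the shared Task 1 terms produce a substantial positive correlation; at an SOA long enough that Task 1 durations never affect Response 2, the correlation is near zero.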

Grouping and Locus of Slack

To simulate a factor prolonging a process, the mean and standard
deviation of the influenced process were doubled (this maintains a
constant coefficient of variation). The processes preceding r2 have
essentially the same arrangement in the Single Central Bottleneck model
(Figure 5.2) and in Borger’s model (Figure 5.11). It is not surprising that
simulations of mixtures of the two show that effects on RT2 of
prolonging processes are as predicted by the former, whether or not trials
with short interresponse intervals are omitted. In other words,
predictions for RT2 on which Locus of Slack analysis are based would be
unaffected if on a proportion of trials grouping in the manner of the Bor
model is used.
For RT1, the Single Central Bottleneck model predicts that effects of
selectively influencing a process A1, B1 or C1 will have the same effect
on RT1 at every SOA. (Ulrich and Miller say the effect of manipulating
such a process will be additive with changes in SOA; this is a little
confusing, although technically true because the effect of changing SOA
is 0.) The prediction failed in the simulations. For some variants of the
grouping model, the effect of such a manipulation increased with SOA
and for others it decreased. The failure was not eliminated by removing
trials with short interresponse intervals.


For RT1, the Single Central Bottleneck model predicts there will be
no effect of selectively influencing a process A2, B2 or C2. It is clear
that this will not be true when the grouping schedule is used on some
proportion of the trials, and indeed the prediction failed in the
simulations. Effects are complex and depend on the variant of the
grouping model. Effects are sometimes additive with SOA, sometimes
increasing with SOA, and sometimes decreasing with SOA. Some
failures are eliminated by removing trials with short interresponse
intervals, but some are not.
For grouping in the manner simulated by Ulrich and Miller (2008),
predictions of the Single Central Bottleneck model are robust for RT2 but
not for RT1. This is neither reassuring nor surprising because their
model was designed so RT2 would behave about the same way with or
without grouping. As we noted at the outset, results for RT1 and its
relation to RT2 show that grouping, in theory, can undermine attempts to
interpret experimental results in terms of the Single Central Bottleneck
model. Response Time 1 increased with SOA for some variants of their
model but decreased for others, effects on RT1 of changing Task 1
process durations sometimes varied with SOA, and changing Task 2
process durations sometimes changed RT1. The correlation between
RT1 and RT2 decreased as SOA increased for most variants of their
model but not for all. The accepted practice of dropping trials in which
RT1 and RT2 were made within a short time of each other effectively
removed trials on which grouping occurred for some variants, but
increased contamination due to grouping with others. A prediction of the
Single Central Bottleneck model satisfied for all variants was that the
effect of a prolongation of A1 or B1 on RT2 was less than or equal to its
effect on RT1. Part of the difficulty in interpreting effects of factors is
that although the variants of their model are all mixtures of critical path
networks, a factor changing the duration of a process also changes the
mixing probabilities. Progress will require more data on the way
grouping is actually done.

Remarks

With logically independent dual tasks, there is no output from a Task 1
process that is needed as input to some Task 2 process. No logical
constraint prevents Task 1 processes from proceeding concurrently with
Task 2 processes at the same rate as when the tasks are done singly. Yet
this does not happen, so resource constraints must prevent it. The system
is no doubt capable of fancy scheduling of resources, such as time
sharing, in which a process is broken into parts and parts of various
processes are executed in an interleaved manner. But a simple schedule
may be optimal. Time sharing is not ordinarily optimal if process
durations can be well estimated ahead of time. Perhaps counter-
intuitively, little is usually gained by having two processes share a
resource and execute simultaneously, if they execute more slowly than
they would execute alone. Welford’s hypothesis that certain processes
from one task are sequential with certain processes from the other task is
still viable. Places where sequential processing arises depend on the
tasks, but are commonly found at response selection and response
preparation. Curiously, few experiments directly manipulate the
difficulty of central processing for both Task 1 and Task 2, to verify they
are sequential. Despite little direct evidence about central constraints,
they are important and response resource competition is rarely, if ever,
the only constraint.
At the time of Welford and Hick the central resource constraint was
thought to be channel capacity in the information theory sense. But a
single neuron is capable of accumulating evidence and firing when the
evidence reaches a threshold, so if mechanisms for decisions were the
only constraint thousands of decisions could proceed simultaneously.
For a mechanism to decide on a response by accumulation of evidence,
inputs must be connected to the mechanism along with outputs to all
possible responses, and criteria for all responses must be set. For a
laboratory task, these settings are temporary. There is growing support
for the hypothesis that the constraint on response selection is in
establishing and maintaining these task sets (e.g., Duncan, 1979; see Lien
and Proctor, 2002, for review and related material). With this
interpretation, selective influence of a response selection process
becomes more technically demanding. Instead of prolonging the time a
process uses a resource in isolation, the experimenter needs to think
about prolonging resource preparation, tear down for the process that just
finished and set up for the process scheduled to start. A factor
influencing this task switch may easily have effects at more than one
locus and prevent factors from selectively influencing processes (Logan
& Gordon, 2001; Logan & Schulkind, 2000).
One can argue that the brain operates smoothly like the gut, and
divisions of perception, response selection and movement are arbitrary.
But if there is segmentation, as in the limbs, then it is not surprising to
find perception, response selection and movement proceeding in that
order for each task. What is surprising is evidence that factors selectively
influence processes, i.e., that processes are separately modifiable
(Sternberg, 1998). Considerable evidence suggests that processes
communicate and their outputs combine; see Miller (1982) for a general
approach, e.g., Diederich (1995) and Colonius and Diederich (2009) for
sensory processes, and e.g., Hommel (1998) for central processes.
Despite potential entanglement, the evidence indicates that factors in
certain situations selectively influence processes. Considerable effort
will be needed to sort out the situations.
In dual task studies, conclusions are usually based on mean reaction
times. More information is available from entire reaction time
distributions, to which we now turn. We will see that correlations
between process durations do not preclude selective influence.

Appendix

Why use sequential processing if concurrent processing is possible?

It would be better to execute processes simultaneously rather than one by
one if there is no cost for simultaneous execution. But ordinarily a
process is faster if it is executed alone than if it is executed
simultaneously with others. The result is that sequential scheduling is
ordinarily better than concurrent. In Figure 5A.1, for example, suppose
processes A and B each have duration 1 when executed alone, but have
duration 2 when executed together, because capacity is split between
them. For the time at which A and B are both completed it makes no
difference whether they are sequential or concurrent; this completion
time is 2 in either case. But if A is executed first, followed by B, then A
completes at time 1. If A and B are executed concurrently A completes at
time 2. If the completion time of each process matters, e.g. if the
objective is to minimize the average time at which the processes are
completed, then a sequential schedule is better than a concurrent
schedule.
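The example can be written out as simple arithmetic (a trivial sketch of the Figure 5A.1 numbers):

```python
# Figure 5A.1 example: A and B each take 1 time unit alone, or 2 units
# each when executed together with capacity split between them.
finish_seq = {"A": 1, "B": 1 + 1}    # A runs first, then B
finish_conc = {"A": 2, "B": 2}       # both run together at half rate

both_done_seq = max(finish_seq.values())       # 2
both_done_conc = max(finish_conc.values())     # 2
mean_finish_seq = sum(finish_seq.values()) / 2
mean_finish_conc = sum(finish_conc.values()) / 2
```

The time at which both processes are done is the same either way, but the average completion time favors the sequential schedule.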

Fig. 5A.1. If the rate of concurrent processing is slower than that of sequential
processing, then the time at which the first process is finished is later with concurrent
processing.

Discovering Process Order

Information about the order in which processes are executed can be
found in three main ways.

Selectively influencing three processes

Suppose process A precedes process B, which precedes process C. By
prolonging the processes two at a time, the three coupled slacks k(A, B),
k(A, C) and k(B, C) can be estimated. Denote the reaction time by T(0, 0,
0) when no processes are prolonged, and by T(ΔA, 0, 0) when process A
is prolonged by amount ΔA, with the other processes not prolonged. The
effect of prolonging process A by ΔA is ΔT(ΔA, 0, 0) = T(ΔA, 0, 0) − T(0,
0, 0). Other times and effects are denoted analogously. When all three
processes are prolonged by amounts large enough to overcome the
relevant slacks, the combined effect is

ΔT(ΔA, ΔB, ΔC) = ΔT(ΔA, 0, 0) + k(A, B) + ΔT(0, ΔB, 0) + k(B, C) + ΔT(0, 0, ΔC). (5A.1)

The term k(A, C) does not appear, and its absence indicates that A and C
are at the extremes with B in the middle. Each coupled slack parameter
arises from slack between two processes. The three processes have only
two spaces between them, so one of the coupled slacks is irrelevant.
Derivation in Schweickert (1978).
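Equation (5A.1) can be expressed directly as a function; the effect sizes and coupled slacks below are invented for illustration.

```python
# Equation (5A.1) as a function (illustrative numbers, not estimates).
def combined_effect(dT_A, dT_B, dT_C, k_AB, k_BC):
    # k(A, C) is absent: only the two gaps between adjacent processes
    # contribute, which is what identifies B as the middle process.
    return dT_A + k_AB + dT_B + k_BC + dT_C

effect = combined_effect(dT_A=40, dT_B=30, dT_C=50, k_AB=10, k_BC=5)
```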

Analyzing two responses

Consider a task with two responses, r1 and r2. Suppose process A
precedes process B, which precedes both responses. The slack from A to
B does not depend on which response is made, but the total slack from A
to r1 may not be the same as the total slack from A to r2. There may be a
different coupled slack for each response.

k1(A, B) = s(A, r1) − s(A, B)

k2(A, B) = s(A, r2) − s(A, B).

Suppose process A is prolonged by ΔA, which is larger than both s(A, r1)
and s(A, r2). Let t1(ΔA, 0) denote the response time for r1 when A is
prolonged by ΔA, and denote other times similarly. Then (see Chapter 3)

t1(ΔA, 0) − t1(0, 0) = ΔA − s(A, r1)

t2(ΔA, 0) − t2(0, 0) = ΔA − s(A, r2).

Hence,

t1(ΔA, 0) − t1(0, 0) + k1(A, B) = t2(ΔA, 0) − t2(0, 0) + k2(A, B). (5A.2)

The analogous equation if process B precedes process A is

t1(0, ΔB) − t1(0, 0) + k1(A, B) = t2(0, ΔB) − t2(0, 0) + k2(A, B). (5A.3)

Both Equations (5A.2) and (5A.3) can be true; this happens for example
if A and B are in series. But if Equation (5A.2) is true and Equation
(5A.3) is false, then process A precedes process B.
If process A precedes process B, an equation similar to Equation
(5A.2) is useful because it does not require estimates of coupled slacks,

t1(ΔA, ΔB) − t1(0, ΔB) = t2(ΔA, ΔB) − t2(0, ΔB). (5A.4)

Derivation in Schweickert (1978).
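Equations (5A.2) and (5A.4) can be checked numerically on a toy network. The sketch below is ours: A precedes B, B precedes both responses, and two concurrent paths of hypothetical durations q1 and q2 create different total slacks to r1 and r2.

```python
# Toy critical path network (all durations hypothetical): A precedes B,
# B precedes both responses; q1 and q2 are concurrent paths to r1 and r2.
a, b, x1, x2, q1, q2 = 100, 100, 50, 80, 300, 350

def t1(dA, dB):  # response time for r1 with A prolonged by dA, B by dB
    return max(a + dA + b + dB + x1, q1)

def t2(dA, dB):  # response time for r2
    return max(a + dA + b + dB + x2, q2)

# Total slacks from A to each response; s(A, B) = 0 here because nothing
# runs concurrently with A before B starts, so k1 and k2 equal the slacks.
k1 = q1 - (a + b + x1)   # s(A, r1)
k2 = q2 - (a + b + x2)   # s(A, r2)

dA = 100   # larger than both slacks, as the derivation requires
lhs_2 = t1(dA, 0) - t1(0, 0) + k1   # Equation (5A.2), left side
rhs_2 = t2(dA, 0) - t2(0, 0) + k2   # Equation (5A.2), right side

dB = 150   # large enough that the path through A and B is critical
lhs_4 = t1(dA, dB) - t1(0, dB)      # Equation (5A.4): no slacks needed
rhs_4 = t2(dA, dB) - t2(0, dB)
```

Both sides of Equation (5A.4) reduce to ΔA here, since prolonging B enough makes the path through A and B critical for both responses.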

Analyzing comparability

The two quantitative analyses above require good estimates of response
times. The following is a simple qualitative way to determine process
order (see Golumbic, 1980).
Suppose it is known somehow that process A precedes process B.
Suppose it is also known that process C is concurrent with A and
sequential with B. If C followed B, then C would follow A. This
contradicts C concurrent with A. Therefore, C precedes B.
Chapter 6

Effects of Factors on Distribution Functions
and Consideration of Process Dependence

Previous chapters discussed effects on mean reaction times of selectively
influencing processes and what the effects reveal about process
organization. But a mean is only a summary, and the complete
information about a distribution is in its cumulative distribution function
(cdf). We turn to what these reveal about process organization. For a
random variable T, the cumulative distribution function is F(t) = P[T ≤ t].
For a continuous random variable T the density function, f(t), is the
derivative of the cumulative distribution function.
We begin with processes in series and turn to mixtures of processes.
Both arrangements predict additive effect of factors selectively
influencing processes on mean reaction times. But cumulative
distribution functions can discriminate between the arrangements.

Tests of Equal Distribution Functions

Ashby and Townsend (1980) found a simple but important relation for
cumulative distribution functions for processes in series. It is easy to
explain the relation intuitively. With our usual notation, suppose a task is
performed by executing process A followed by process B. Suppose
Factor Α changes the duration of process A, leaving process B
unchanged, and Factor Β changes the duration of process B, leaving
process A unchanged. Suppose when Factor Α is at level i and Factor Β
is at level j the response time is Tij = Ai + Bj. Additive effects of the
factors on mean reaction times are predicted, that is E[T22] + E[T11] =
E[T12] + E[T21], the basis of Sternberg’s (1969) Additive Factor Method.
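A quick simulation illustrates the additive predictions. This is our sketch; the normal distributions and parameter values are arbitrary, and the standard deviations also change across factor levels to show that, with independence, variances are additive as well.

```python
import random
import statistics

# Tij = Ai + Bj with independent durations (made-up normal parameters).
def sample(mean_a, sd_a, mean_b, sd_b, n=50_000, seed=1):
    rng = random.Random(seed)
    return [rng.gauss(mean_a, sd_a) + rng.gauss(mean_b, sd_b) for _ in range(n)]

T11 = sample(100, 20, 200, 30, seed=1)
T12 = sample(100, 20, 250, 45, seed=2)
T21 = sample(140, 35, 200, 30, seed=3)
T22 = sample(140, 35, 250, 45, seed=4)

# Both interaction contrasts should be near zero (additivity).
mean_contrast = (statistics.mean(T22) + statistics.mean(T11)
                 - statistics.mean(T12) - statistics.mean(T21))
var_contrast = (statistics.pvariance(T22) + statistics.pvariance(T11)
                - statistics.pvariance(T12) - statistics.pvariance(T21))
```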

Processes in series: Cumulative distribution functions

Consider an analogous equation for cumulative distribution functions.
Values of response times at different levels of the factors are obtained
from different trials of an experiment, so it is not obvious what it would
mean to form a sum of response times such as T22 + T11. (Would one add
the reaction time on the first occurrence of factor level combination (2,
2) to that on the first occurrence of combination (1, 1), and so on? What
if the number of trials for the two combinations is not the same?) If a
way to form such sums could be found, it is easy to see that

T22 + T11 = A2 + B2 + A1 + B1,

and

T12 + T21 = A1 + B2 + A2 + B1.

The two sums consist of the same terms in different orders. This
suggests that under suitable assumptions,

T22 + T11 ≈ T12 + T21,

where “≈” means “has the same distribution as.” That is, for every time
t, if we let FT22+T11(t) and FT12+T21(t) denote the cumulative distribution
functions of T22 + T11 and T12 + T21, respectively, then for every time t

FT22+T11(t) = FT12+T21(t).

To put this on a rigorous footing and develop a practical test, Ashby
and Townsend (1980) assumed that for every combination (i, j) of levels
of the factors Α and Β, the durations Ai and Bj are independent. A sample
of response times can be obtained for every combination (i, j) of levels of
the factors, allowing an empirical estimate of the cumulative distribution
function Fij(t) and density function fij(t) of Tij. If one assumes process
durations are independent, a substantial assumption,

F22(t) * f11(t) = F12(t) * f21(t).

Here, * denotes the convolution operation; see Equation (6.4). The
equation is testable by carrying out numerical convolutions with the
estimated cumulative distribution and density functions.
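The equality can be illustrated numerically. The sketch below is our illustration, not code from Ashby and Townsend (1980); the exponential stage durations, their means, and the grid step are arbitrary choices. It builds the density and cumulative distribution function of each Tij = Ai + Bj on a discrete grid and compares the two convolutions.

```python
import numpy as np

# Discrete time grid (msec) for the numerical convolutions
dt = 0.5
t = np.arange(0.0, 600.0, dt)

def exp_pdf(grid, mean):
    """Density of a hypothetical exponential stage duration."""
    return np.exp(-grid / mean) / mean

def conv(a, b):
    """Numerical convolution of two functions sampled on the grid."""
    return np.convolve(a, b)[: len(t)] * dt

# Hypothetical stage-duration means for the two levels of each factor
fA = {1: exp_pdf(t, 60.0), 2: exp_pdf(t, 90.0)}   # process A
fB = {1: exp_pdf(t, 40.0), 2: exp_pdf(t, 70.0)}   # process B

# Density and cdf of Tij = Ai + Bj for independent serial stages
f = {(i, j): conv(fA[i], fB[j]) for i in (1, 2) for j in (1, 2)}
F = {ij: np.cumsum(fij) * dt for ij, fij in f.items()}

# The testable equality: F22 * f11 should equal F12 * f21
left = conv(F[(2, 2)], f[(1, 1)])
right = conv(F[(1, 2)], f[(2, 1)])
gap = np.max(np.abs(left - right))
print(gap)  # numerical error only
```

Because discrete convolution is commutative and associative, the two sides agree here up to floating-point error; with estimated functions from data, the comparison would instead be statistical.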

Processes in series: The Summation Test

Roberts and Sternberg (1993) proposed a test based more directly on the
sampled response times. Their test was derived using the assumption
that for every combination (i, j) of levels of the factors Α and Β, the
durations Ai and Bj are independent. Independence is a sufficient, but not
necessary condition for their test of FT22+T11(t) = FT12+T21(t); see their
chapter for discussion. Independence of process durations is a strong
assumption, but there is evidence that it sometimes occurs. Sternberg
(1969) pointed out that factors selectively influencing processes in series
with independent process durations would have not only additive effects
on means, but on variances and cumulants at all levels. Additive effects
of factors on both means and variances were reported by Sternberg
(1969, p. 305); further reports are mentioned in the section on Process
Dependence below.
We first describe the test of Roberts and Sternberg (1993) and then
give their explanation of it. To start, every possible pair is formed with
the first member of the pair a response time from the set of response
times observed in combination (2,2) of factor levels and the second
member of the pair a response time from the set of response times
observed in combination (1,1). This is the Cartesian product of the two
response time sets. Then every pair of response times in this Cartesian
product is added. The result is a sample of values of T22 + T11. From this
sample an estimate of the distribution function FT22+T11(t) is formed; for
every t, the estimated value of FT22+T11(t) is the proportion of summed
response times less than or equal to t. The same procedure is followed
again, starting with the Cartesian product of the set of response times
observed in combination (1,2) of levels and the set of response times
observed in combination (2,1) of levels. An estimate of the distribution
function FT12+T21(t) is formed. If the assumptions are met, the estimated
cumulative distribution functions will be equal.
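The procedure can be sketched in a few lines. Here the "observed" response times are simulated from a serial model with independent gamma stage durations; the shapes, means, sample size, and grid are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Serial model: T = A + B with independent gamma-distributed stages.
# Hypothetical stage means (msec) for the two levels of each factor.
a_mean = {1: 100.0, 2: 150.0}
b_mean = {1: 80.0, 2: 130.0}

def rts(i, j, n=300):
    return rng.gamma(4.0, a_mean[i] / 4.0, n) + rng.gamma(4.0, b_mean[j] / 4.0, n)

T = {(i, j): rts(i, j) for i in (1, 2) for j in (1, 2)}

def summed_cdf(x, y, grid):
    """ECDF of the sums of all pairs in the Cartesian product of x and y."""
    sums = (x[:, None] + y[None, :]).ravel()
    return np.array([np.mean(sums <= v) for v in grid])

grid = np.linspace(0.0, 1200.0, 121)
lhs = summed_cdf(T[(2, 2)], T[(1, 1)], grid)   # estimates the cdf of T22 + T11
rhs = summed_cdf(T[(1, 2)], T[(2, 1)], grid)   # estimates the cdf of T12 + T21
gap = np.max(np.abs(lhs - rhs))
print(gap)  # sampling error only, for this serial model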
The rationale for the Cartesian products is as follows. For every
combination (i, j), the random variable Ai is assumed to be independent
of the random variable Bj. Consider combination (1,2). A value a1 of A1
is equally likely to occur with any value b2 of B2. To approximate this
property in the samples, the Cartesian product is formed of pairs <t11, t22>
= <a1 + b1, a2 + b2>. Observed response times from combination (1, 1)
are paired with observed response times from combination (2, 2). In the
pairs of this Cartesian product, each observed value a1 occurs exactly
once with each observed value b2. The sum of the pairs in the Cartesian
product is a1 + b1 + a2 + b2. This sum rearranged equals a1 + b2 + a2 + b1.
Sampled values of the rearranged sum are formed by adding the pairs in
the Cartesian product <t12, t21> = <a1 + b2, a2 + b1>. The two Cartesian
products are different, but the sums of their pairs are predicted to have
the same cumulative distribution functions.
Results of the Roberts and Sternberg (1993) Summation Test for
three data sets are in Figure 6.1. The top panel illustrates results for
manipulation of two factors in a detection experiment (Backus &
Sternberg, 1988, Experiment 1). Subjects responded to a light flash by
pulling a lever. The foreperiod interval between a warning signal and the
flash was varied, as was the intensity of the flash. The bottom two
panels illustrate results of manipulating two factors in an identification
experiment (Sternberg, 1969, Experiment V). Subjects responded to a
visually presented number by saying a number. The middle panel is for
the case where two numbers were possible on any trial, the bottom panel
is for the case where eight numbers were possible. For each number of
alternatives, stimulus quality was either high or low, and the subject
either named the number (easy compatibility) or named its successor
(difficult compatibility). In each panel of Figure 6.1 agreement between
the cumulative distribution functions predicted to be equal is striking,
support for the independent serial stage model. Visual agreement is
reinforced by statistical tests, see Roberts and Sternberg (1993) and
Table 6.1 for details.

Fig. 6.1. Results of Summation Test. Panel A: detection data. Panels B and C:
identification data with 2 and 8 alternatives, respectively. Left of each panel: cdfs for
four factor level combinations. Right of each panel: cdfs for summations. Note: From
Roberts and Sternberg (1993), The meaning of additive reaction-time effects: Tests of
three alternatives. In Meyer, David E., and Sylvan Kornblum (Eds.), Attention and
Performance XIV: Synergies in experimental psychology, artificial intelligence, and
cognitive neuroscience, figure 26.2. Copyright 1993, Massachusetts Institute of
Technology, by permission of The MIT Press.

Subjects were very highly practiced, and data were combined over
different sessions. It is likely that the durations of both processes A and
B differ for different subjects; in other words, different subjects may
induce a covariance between process durations. Likewise, the different
stimulus numerals might induce a covariance. To avoid this, calculations
were done for combinations of levels of nuisance factors separately, then
averaged over combinations. For example, with eight alternatives in the
identification experiment, calculations were done for each subject and
numeral separately, then averaged over subjects and numerals. Before
averaging, observations were rescaled with a linear transformation
intended to increase the sensitivity of the test. See Roberts and Sternberg
(1993) for details.

Mixtures of processes: The Mixture Test

The Alternate Pathways Model of Roberts and Sternberg (1993) has


quite different architecture, but also produces additivity for mean
reaction times. Suppose a subject can perform a task in more than one
way. In particular, suppose on a given trial the subject chooses with
probability p to perform the task in such a way that the response time has
the cumulative distribution function FA(t). Alternatively, suppose the subject
chooses with probability 1 − p to perform the task in such a way that the
response time has the cumulative distribution function FB(t). Over all
trials, the response time has the cumulative distribution

F(t) = pFA(t) + (1 − p)FB(t).

The overall cumulative distribution is a mixture of FA(t) and FB(t).


Suppose Factor Α changes FA(t), Factor Β changes FB(t), and neither
factor changes p. Suppose when Factor Α is at level i and Factor Β is at
level j the overall cumulative distribution is

Fij(t) = pFAi(t) + (1 − p)FBj(t).



The Alternate Pathways Model predicts additive effects of the factors


on means. Let Tij be the reaction time when Factor Α is at level i and
Factor Β is at level j. The reaction time density function is obtained by
differentiation;

fij(t) = pfAi(t) + (1 − p)fBj(t).

Then

E[Tij] = ∫0^∞ t p fAi(t) dt + ∫0^∞ t (1 − p) fBj(t) dt = pE[Ai] + (1 − p)E[Bj].

Additivity of expected values follows immediately.
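A small simulation makes the additivity concrete. The mixture probability and pathway parameters below are hypothetical; each factor shifts the mean of one pathway only, and the mean interaction contrast comes out near zero.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 0.4   # probability of choosing pathway A; unaffected by either factor

# Hypothetical pathway means: Factor A's level moves pathway A only,
# Factor B's level moves pathway B only.
mu_a = {1: 300.0, 2: 380.0}
mu_b = {1: 250.0, 2: 360.0}

def mixture_rts(i, j, n=200_000):
    use_a = rng.random(n) < p
    return np.where(use_a,
                    rng.normal(mu_a[i], 30.0, n),
                    rng.normal(mu_b[j], 30.0, n))

m = {(i, j): mixture_rts(i, j).mean() for i in (1, 2) for j in (1, 2)}
mic = m[(2, 2)] - m[(2, 1)] - m[(1, 2)] + m[(1, 1)]
print(mic)  # close to zero: additive effects on the means
```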

Distinguishing serial processes from mixtures

Cumulative distribution functions can distinguish the Alternate Pathways


Model and the serial independent processes model. For any time t, the
cumulative distribution function interaction contrast is

C(t) = F22(t) − F21(t) − F12(t) + F11(t). (6.1)

With the Alternate Pathways Model, for every time t, C(t) = 0. To
facilitate comparison with the summation test, the above equation can be
written as

(1/2)[F22(t) + F11(t)] = (1/2)[F21(t) + F12(t)].

Roberts and Sternberg (1993) call this the Mixture Test. Note that the
cumulative distribution function for each combination of levels can be
estimated separately; that is, there is no need for the Cartesian products
used in the Summation Test. Results are in Figure 6.2 for the same three
data sets as the Summation Test was applied to in Figure 6.1. Clearly,
the mixture test fails, so the Alternate Pathways Model can be rejected
for these data. See Roberts and Sternberg (1993) for more details about
carrying out the test.
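A sketch of the contrast computation is below; it is our illustration, with hypothetical parameters. It estimates C(t) from the four empirical distribution functions for mixture data, where C(t) is near 0, and for serial data of the same size, where C(t) departs clearly from 0, as in the experiments discussed above.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
grid = np.linspace(0.0, 900.0, 91)

def ecdf(x):
    return np.array([np.mean(x <= v) for v in grid])

def contrast(T):
    """C(t) = F22(t) - F21(t) - F12(t) + F11(t) from the four samples."""
    F = {ij: ecdf(x) for ij, x in T.items()}
    return F[(2, 2)] - F[(2, 1)] - F[(1, 2)] + F[(1, 1)]

# Mixture (Alternate Pathways) data: C(t) near 0 at every t
p = 0.5
mu_a = {1: 250.0, 2: 400.0}
mu_b = {1: 200.0, 2: 350.0}
def mix(i, j):
    use_a = rng.random(n) < p
    return np.where(use_a, rng.gamma(9, mu_a[i] / 9, n),
                           rng.gamma(9, mu_b[j] / 9, n))
C_mix = contrast({(i, j): mix(i, j) for i in (1, 2) for j in (1, 2)})

# Serial data of the same size: C(t) departs clearly from 0
a = {1: 60.0, 2: 160.0}
b = {1: 50.0, 2: 150.0}
def serial(i, j):
    return rng.gamma(4, a[i] / 4, n) + rng.gamma(4, b[j] / 4, n)
C_serial = contrast({(i, j): serial(i, j) for i in (1, 2) for j in (1, 2)})

print(np.max(np.abs(C_mix)), np.max(np.abs(C_serial)))
```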
The Alternate Pathways Model and the independent serial processes
(stages) model lead to contrary predictions. For example, they make
different predictions about variances. With independent serial processes,

V[Tij] = V[Ai] + V[Bj].

It follows immediately that the factors have additive effects on


variances. But with the Alternate Pathways Model, the factors are
predicted to have interactive effects on the variances. The interaction
follows from

V[Tij] = E[Tij2] − (E[Tij])2.

Fig. 6.2. Results of Mixture Test. Panel A: detection data. Panels B and C:
identification data with 2 and 8 alternatives, respectively. Note: From Roberts and
Sternberg (1993), The meaning of additive reaction-time effects: Tests of three
alternatives. In Meyer, David E., and Sylvan Kornblum (Eds.), Attention and
Performance XIV: Synergies in experimental psychology, artificial intelligence, and
cognitive neuroscience, figure 26.4. Copyright 1993, Massachusetts Institute of
Technology, by permission of The MIT Press.

For the Alternate Pathways Model, with levels 1 and 2 of each factor,
the interaction contrast for the variances is

V[T22] − V[T21] − V[T12] + V[T11] =


− 2p(1 − p)(E[A2] − E[A1]) (E[B2] − E[B1]).

The factors will not have additive effects on the variance except trivially,
when one or the other has no effect on the mean.
Interestingly, the Alternate Pathways Model predicts the factors will
have additive effects on the squared reaction times (the second raw
moments). Reasoning as with expected values, additivity follows from

E[Tij2] = pE[Ai2] + (1 − p)E[Bj2].
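These moment predictions can be verified by direct arithmetic. The sketch below uses hypothetical pathway means and variances; it checks that the second raw moments are additive while the variance interaction contrast equals −2p(1 − p)(E[A2] − E[A1])(E[B2] − E[B1]).

```python
# Moment bookkeeping for the Alternate Pathways Model; the mixture
# probability and pathway moments below are hypothetical choices.
p = 0.3
EA = {1: 300.0, 2: 360.0}; VA = 900.0   # pathway A: E[Ai] by level, V[Ai]
EB = {1: 250.0, 2: 340.0}; VB = 400.0   # pathway B: E[Bj] by level, V[Bj]

def second_moment(i, j):
    # E[T^2] = p E[A^2] + (1 - p) E[B^2]: additive in the two factors
    return p * (VA + EA[i] ** 2) + (1 - p) * (VB + EB[j] ** 2)

def variance(i, j):
    mean = p * EA[i] + (1 - p) * EB[j]
    return second_moment(i, j) - mean ** 2

vic = variance(2, 2) - variance(2, 1) - variance(1, 2) + variance(1, 1)
predicted = -2 * p * (1 - p) * (EA[2] - EA[1]) * (EB[2] - EB[1])
m2ic = (second_moment(2, 2) - second_moment(2, 1)
        - second_moment(1, 2) + second_moment(1, 1))
print(vic, predicted, m2ic)  # vic equals predicted; m2ic is 0
```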

Simulations satisfying the Mixture Test1

Data cannot simultaneously satisfy both the Summation Test and the
Mixture Test, except trivially, when, for example, there are no effects.
The data sets analyzed by Roberts and Sternberg (1993) passed the
Summation Test but failed the Mixture Test. To consider the opposite
situation, Roberts and Sternberg (1993, Note 20) generated simulated
data that pass the Mixture Test. Failure of the Summation Test is
predicted, but would the failure be noticeable?
To ensure realistic effect sizes, variances and so on, simulated data
were generated from the actual data sets discussed earlier, which passed
the Summation Test. Recall that for each data set there were two
experimental factors, each with two levels. In a particular data set, let Tij
denote the reaction time when the first factor is at level i and the second
at level j. Simulated data for the four factor level combinations were
generated as follows. The observed reaction times T11 and T22 were used
directly as simulated reaction times T*11 and T*22, respectively. Then the
observed reaction times T11 and T22 were pooled, and the pool randomly
divided into two halves. One half was used as T*12, the other half as
T*21. For each data set, the procedure was carried out once for each
subject and stimulus type separately. The number of observations in
conditions (1, 1) and (2, 2) determined the number of simulated trials.
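The pooling construction is easy to sketch. Below, simulated gamma samples stand in for the observed (1, 1) and (2, 2) response times; the parameters are hypothetical. Because the two halves of the pool together contain exactly the pooled observations, the cumulative distribution contrast is zero by construction.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400

# Stand-ins for the observed samples in conditions (1,1) and (2,2)
t11 = rng.gamma(8, 40.0, n)
t22 = rng.gamma(8, 55.0, n)

# Pool the two conditions, then split the pool into random halves
pool = rng.permutation(np.concatenate([t11, t22]))
t12, t21 = pool[:n], pool[n:]

grid = np.linspace(0.0, 1000.0, 101)
def ecdf(x):
    return np.array([np.mean(x <= v) for v in grid])

C = ecdf(t22) - ecdf(t21) - ecdf(t12) + ecdf(t11)
print(np.max(np.abs(C)))  # zero up to rounding: the Mixture Test is passed
```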
To see that the Mixture Test is satisfied with this procedure, let F*ij(t)
be the cumulative distribution function of the simulated reaction time
T*ij. Then

F*12(t) = F*21(t) = (1/2)F*11(t) + (1/2)F*22(t);

so,

F*22(t) − F*21(t) − F*12(t) + F*11(t) = 0.

The cumulative distribution contrast of Equation (6.1) is 0 as required for


passing the Mixture Test.
The Summation Test was carried out, in the way described earlier, for
each of the three simulated data sets. Results are in Figure 6.3 for the
detection data and Figures 6.4 and 6.5 for the data on identification of
numerals by naming, with number of alternatives 2 and 8, respectively.
It is obvious visually that the Summation Test fails. (In the figures,
F*11(t), F*12(t),... are denoted F11, F12,....)
Statistical tests support this conclusion. The Summation Test predicts
F*T22+T11(t) = F*T21+T12(t). At 10 msec intervals, t-tests were conducted
on the difference between the left and right hand side, whenever the
between-subject standard error was nonzero. Results are in Table 6.1.
The Summation Test was satisfied for the actual experimental data, but is
plainly rejected for the simulated data. The results show that simulated,
but realistic, data satisfying the Mixture Test conspicuously failed the
Summation Test.

Table 6.1
Statistical Tests for the Summation Test

                         Experimental Data               Simulated Data
                   (Not Satisfying Mixture Test)   (Satisfying Mixture Test)
Data Set              Tests   Significant Tests     Tests   Significant Tests

Detection               42            0               39           29
Numeral Naming
  2 Alternatives        66            0               69           37
  8 Alternatives        61            0               91           57
Note: From Roberts & Sternberg (1993, Note 20) and S. Sternberg (personal
communication, June 21, 2011).

Fig. 6.3. The Summation Test fails for Simulated Data by Roberts and Sternberg that pass
the Mixture Test: Detection. S. Sternberg (personal communication, June 21, 2011).

Fig. 6.4. The Summation Test fails for Simulated Data by Roberts and Sternberg that pass
the Mixture Test: Identification, 2 Alternatives. S. Sternberg (personal communication,
June 21, 2011).

Fig. 6.5. The Summation Test fails for Simulated Data by Roberts and Sternberg that
pass the Mixture Test: Identification, 8 Alternatives. S. Sternberg (personal
communication, June 27, 2011).

Statistical mimicking

Although the Alternate Pathways Model cannot precisely mimic the


independent serial processes model, Van Zandt and Ratcliff (1995)
showed that simulated data generated by an Alternate Pathway Model
can statistically pass both the Mixture Test (passing predicted, as
described above) and the Summation Test (passing not predicted, as
described above).
The simulation of the Alternate Pathways Model was of a 2 × 2
design with two factors. A third factor was included to simulate the
nuisance factors in the experiments analyzed by Roberts and Sternberg
(1993). The procedure for simulating the third factor is not completely
described in Van Zandt and Ratcliff (1995). They say it interacted with
the other two factors in their simulations, but it is not clear from their
description whether an interaction was produced, and if so how.
(Roberts and Sternberg (1993) did not report that any nuisance factor
they analyzed interacted with an experimental factor.) For analysis of
the third factor, Van Zandt and Ratcliff (1995) say they followed the
procedure of Roberts and Sternberg (1993).
In the simulation, a process A was selected with probability .427 and
a process B with probability .573; these probabilities did not change with
the factor levels. Each process duration was ex-Gaussian, i.e., the sum of
a normal and an independent exponential random variable. In the
simulated 2 × 2 design each factor increased the mean of the exponential
random variable for one of the processes, leaving all else unchanged.
This model satisfies the assumptions of the Mixture Test, but not of the
Summation Test. However, in several statistical analyses the simulated
data passed both tests. Passing the summation test is not a consequence
of following the procedure of Roberts and Sternberg (1993) for the
nuisance factor and rescaling, because it was passed whether or not the
procedure was applied.
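The structure of such a simulation can be sketched as follows. This is our illustration of an Alternate Pathways Model with ex-Gaussian pathway durations, in the spirit of the Van Zandt and Ratcliff (1995) simulation but with hypothetical parameters (only the mixing probability .427 is taken from their description); each factor raises the exponential mean of one pathway only, and the mean effects come out additive.

```python
import numpy as np

rng = np.random.default_rng(5)

def ex_gauss(mu, sigma, tau, n):
    """Ex-Gaussian duration: a normal plus an independent exponential."""
    return rng.normal(mu, sigma, n) + rng.exponential(tau, n)

# Hypothetical parameters; each factor raises tau for one pathway only
p_a = 0.427                       # probability of selecting process A
tau_a = {1: 100.0, 2: 200.0}      # exponential mean of A by Factor A level
tau_b = {1: 120.0, 2: 260.0}      # exponential mean of B by Factor B level

def rts(i, j, n=100_000):
    use_a = rng.random(n) < p_a
    return np.where(use_a, ex_gauss(350.0, 40.0, tau_a[i], n),
                           ex_gauss(400.0, 40.0, tau_b[j], n))

m = {(i, j): rts(i, j).mean() for i in (1, 2) for j in (1, 2)}
mic = m[(2, 2)] - m[(2, 1)] - m[(1, 2)] + m[(1, 1)]
print(mic)  # near zero although there are no serial stages
```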
The simulation demonstrates that an investigator can be misled. With
the Alternate Pathways Model each factor selectively influences a
different process, and the factors are predicted to have additive effects on
reaction time means (although the processes are not in series). An
investigator finding mean additivity might decide that the most plausible
explanation is that the factors selectively influence serial stages. It
would be natural for the investigator to use the Summation Test as a
further diagnostic. But with the simulation it led to erroneous
confirmation of serial stages.
There are warnings to the investigator in the simulated data. One
warning is large reaction time variances, signaling low power. A
measure of variability is the coefficient of variation, the standard
deviation divided by the mean. For reaction times, it is typically less
than .5 (some examples are in Luce, 1986). For the three data sets in
Figure 6.1 A, B and C, the coefficients of variation are .11, .14 and .08,
respectively, from the grand means and variances in Tables 26.2 and 26.3
of Roberts and Sternberg (1993). But for the simulated reaction times it
is unusually large, .86, when both factors are at their high levels. This
suggests an unusually large number of observations would be needed to
achieve good power. Indeed, an ANOVA on variances failed to detect
the interaction that is present for the Alternate Pathways Model. Another
warning to the investigator is that both the Summation Test and Mixture
Test were passed, theoretically not possible. The simulations
demonstrate that an investigator ignoring the warnings could be misled;
visually the agreement between the two sides of the Summation Test is
remarkably good for the Alternate Pathways Model simulation (Van
Zandt & Ratcliff, 1995, Figure 14).
To summarize, the distribution equality tests of Ashby and Townsend
(1980) and Roberts and Sternberg (1993) are valuable because they
provide radically different information from the usual analysis of mean
reaction times. Success is evidence that factors selectively influence
different serial processes, whose durations are stochastically
independent. Failure is evidence that one or the other assumption is
wrong. In particular, failure together with additive effects of the factors
on mean reaction times could be the result of factors selectively
influencing serial processes with dependent response times. Van Zandt
and Ratcliff (1995) demonstrate that, as with all tests, attention to both
incorrect acceptance and incorrect rejection is required.

Commutative, associative operations: The Decomposition Test

The summation test can be generalized to operations other than +. Other


operations arise. For example, if two processes are in parallel and the
response is made when both are finished, then the reaction time equals
the maximum of the durations of the two processes. The operation max
is associative and commutative; these are the crucial properties of the
operation + in the reasoning of Roberts and Sternberg (1993) underlying
the summation test.
Dzhafarov and Schweickert (1995) considered process durations
combined with an arbitrary binary operation ♦ that is associative and
commutative. Suppose when Factor Α is at level i and Factor Β is at
level j the reaction time is

Tij = Ai ♦ Bj.

Because ♦ is associative and commutative, we conclude, as for the


summation test,

T11 ♦ T22 ≈ T12 ♦ T21.

Suppose process durations are stochastically independent. Then to


carry out the Decomposition Test for ♦, Cartesian products of the
observed reaction times are formed in the same way as for the
Summation Test. That is, one forms {<t11, t22>} and {<t12, t21>}. At the
next step, instead of adding the pairs in each Cartesian product, one
combines them with ♦, to form expressions t11 ♦ t22 and t12 ♦ t21. The
cumulative distribution functions for these expressions are predicted to
be equal.
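For a concrete sketch, take ♦ = max, the parallel exhaustive case. The data below are simulated from Tij = max(Ai, Bj) with independent hypothetical gamma durations; the Cartesian products are combined with max instead of +, and the two estimated cumulative distribution functions agree up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 800

# Parallel exhaustive model: Tij = max(Ai, Bj), independent durations;
# the gamma shape and scales are hypothetical choices.
a_scale = {1: 30.0, 2: 45.0}
b_scale = {1: 25.0, 2: 40.0}

def rts(i, j):
    return np.maximum(rng.gamma(5, a_scale[i], n), rng.gamma(5, b_scale[j], n))

T = {(i, j): rts(i, j) for i in (1, 2) for j in (1, 2)}

def combined_cdf(x, y, grid):
    """ECDF of x ♦ y over the Cartesian product, here with ♦ = max."""
    vals = np.maximum(x[:, None], y[None, :]).ravel()
    return np.array([np.mean(vals <= v) for v in grid])

grid = np.linspace(0.0, 700.0, 71)
lhs = combined_cdf(T[(1, 1)], T[(2, 2)], grid)   # T11 ♦ T22
rhs = combined_cdf(T[(1, 2)], T[(2, 1)], grid)   # T12 ♦ T21
gap = np.max(np.abs(lhs - rhs))
print(gap)  # sampling error only
```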
Stochastic independence of process durations is not required for a
variation of the test. Process durations satisfy perfect positive stochastic
interdependence (Dzhafarov, 1992) if there exists a random variable X
and functions f(i, x) and g(j, x), each increasing in x, such that for
every level i of Factor Α and every level j of Factor Β

<Ai, Bj> ≈ <f(i, X), g(j, X)>.

Perfect positive stochastic interdependence is sufficient for the


Decomposition Test provided the Cartesian products are formed
accordingly; see Dzhafarov and Schweickert (1995) for details.
See the paper also for discussion of representation and uniqueness.
Recall, for example, that Roberts and Sternberg (1993) found that the
mixture model will produce additive effects of factors on mean reaction
times, yet in that model the reaction time is not expressed as the sum of
the durations of two random variables. This raises the question of when
component random variables are guaranteed to exist, combined with +
when the Summation Test is passed, or combined with ♦ when the
Decomposition Test is passed. With perfect positive stochastic
interdependence, if the Decomposition Test is successful for operation ♦
then the reaction times can be expressed in terms of component random
variables combined with ♦. Further, under general conditions, if the
Decomposition Test is successful for an operation ♦ it cannot also be
successful for a distinct operation ◊.
Distribution function equalities give more information than analysis
of means, and they lead to valuable representation and uniqueness
properties. As developed so far the tests require a strong assumption
such as process independence, or knowing the form of process
dependence. We turn now to the other main type of distribution test,
based on distribution function inequalities. For these tests, the form of
process dependence need not be known.

Distribution Function Interaction Contrasts

Nozawa (1992) and Townsend and Nozawa (1995) developed a test


carried out with survivor functions, or equivalently, with cumulative
distribution functions. The survivor function for a random variable T is
1 minus its cumulative distribution function. That is, for every real
number t,

S(t) = 1 − F(t) = P[T > t].

Conveniently, the expected value of a continuous nonnegative random


variable T is the integral of its survivor function. That is,

E[T] = ∫0^∞ S(t) dt, (6.2)

see, e.g., Cinlar (1975).
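Equation (6.2) also suggests a simple numerical check, sketched below with a hypothetical gamma sample: the sample mean and the numerically integrated empirical survivor function give nearly the same value.

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.gamma(4, 50.0, 100_000)   # hypothetical nonnegative RTs, mean 200

dt = 1.0
grid = np.arange(0.0, 2000.0, dt)
S = np.array([np.mean(x > v) for v in grid])   # empirical survivor function

print(x.mean(), S.sum() * dt)  # the two estimates of E[T] nearly coincide
```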


When Factor Α is at level i and Factor Β is at level j let the reaction
time be Tij, with survivor function STij(t). Consider levels i* and i of
Factor Α and levels j* and j of Factor Β. In earlier chapters we worked
with the mean interaction contrast,

E[Tij] − E[Ti*j] − E[Tij*] + E[Ti*j*].

By analogy, for every time t, the survivor interaction contrast is

SIC(t) = STij(t) − STi*j(t) − STij*(t) + STi*j*(t). (6.3)

Because the cumulative distribution function is 1 minus the survivor


function, the equivalent cumulative distribution function interaction
contrast of Equation (6.1) is

C(t) = FTij(t) − FTi*j(t) − FTij*(t) + FTi*j*(t) = − SIC(t).

Suppose two factors selectively influence two different processes and


the response is made when both are completed. Nozawa and Townsend
showed that under plausible assumptions, if the two processes are in
parallel the survivor interaction contrast is negative or zero at all times t.
If the two processes are in series, the survivor interaction contrast is
negative or zero for small times t and positive or zero for large times t.
Further, for serial processes the net area bounded by the survivor
interaction contrast is zero. Statistical tests for these properties have
recently become available, see Houpt and Townsend (2010).
To explain the predictions, let us begin with a simple situation for
which we make three major assumptions. Let X denote both the name of
process X and also the random variable for its duration. When Factor Α
is at level i and Factor Β is at level j denote the duration of process A as
Aij, with survivor function SAij(t) and denote the duration of process B as
Bij with survivor function SBij(t). We also write, e.g., Pij[A > a] to denote
the probability that when Factor Α is at level i and Factor Β is at level j
the duration of process A is greater than a.

Assumption 1. Independence of Process Durations

For all levels i and j, and all times a and b,

Pij[A > a and B > b] = Pij[A > a]Pij[B > b].

Assumption 2. Marginal Selective Influence

For all levels i and j, for every t, the survivor function for the duration
of process A does not depend on j,

SAij(t) = SAi(t).

The analogous statement is true for the duration of process B,

SBij(t) = SBj(t).

Assumption 3. Stochastic Dominance (a.k.a. The Usual Stochastic


Ordering)

Suppose Marginal Selective Influence holds. The levels of Factor Α


can be ordered i = 1, 2,... so that if i < i' then for every time t

SAi(t) < SAi'(t).

Likewise, the levels of Factor Β can be ordered j = 1, 2,... so that if j < j'
then for every time t

SBj(t) < SBj'(t).

For recent statistical tests of stochastic dominance, see Heathcote,


Brown, Wagenmakers and Eidels (2010).

Processes in parallel or in series

With two processes there are two cases, parallel and serial, both
considered by Townsend and Nozawa (1995). The following theorem
based on their work describes the survivor interaction contrast for two
parallel processes.

Theorem 6.1 Suppose processes A and B are in parallel with


stochastically independent process durations. Suppose Factor Α
selectively influences process A and Factor Β selectively influences
process B so that marginal selectivity is satisfied. Consider levels i* and
i of Factor Α with i* < i so for every time t, SAi*(t) ≤ SAi(t). Consider
levels j* and j of Factor Β with j* < j so for every time t, SBj*(t) ≤ SBj(t).

(a) If the response is made as soon as either A or B finishes, then for


every time t
SIC(t) ≥ 0.
(b) If the response is made as soon as both A and B finish, then for
every time t
SIC(t) ≤ 0.
(c) Further, suppose there is an interval I of times over which SAi*(t)
< SAi(t) and SBj*(t) < SBj(t). Then the inequalities above for SIC(t)
are strict for t in I.

(d) The same conclusions follow for C(t), but with the signs reversed.

Proof: Suppose processes A and B are in parallel, and the response is


made as soon as either A or B is finished. When Factor Α is at level i and
Factor Β is at level j the reaction time is

Tij = min{Aij, Bij}.

The probability that the task is not finished at time t is the probability
that A is not finished at time t and B is not finished at time t. With
marginal selective influence and stochastic independence, the survivor
function of the reaction time is

STij(t) = SAi(t)SBj(t).

Consider levels i* and i of Factor Α and levels j* and j of Factor Β as


described in the theorem. For every time t, the survivor function
interaction contrast is

SIC(t) = STij(t) −STi*j(t) − STij*(t) + STi*j*(t)


= SAi(t)SBj(t) −SAi*(t)SBj(t) −SAi(t)SBj*(t) + SAi*(t)SBj*(t)
= [SAi(t) − SAi*(t)][SBj(t) − SBj*(t)] ≥ 0.

The inequality follows from stochastic dominance.


It is immediate that the inequality is strict as described. Reasoning is
similar when the response is made as soon as both A and B are finished.
The statement about the cumulative distribution function interaction
contrast follows from a change of sign. ∎
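Theorem 6.1 is easy to verify by simulation. The sketch below is our illustration with hypothetical exponential durations; the higher level of each factor is the slower one, so stochastic dominance holds, and the estimated SIC(t) is nonnegative for the first-terminating (minimum) rule and nonpositive for the exhaustive (maximum) rule, up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000
grid = np.linspace(0.0, 600.0, 61)

def surv(x):
    return np.array([np.mean(x > v) for v in grid])

# Exponential durations; the higher level of each factor is slower,
# so S_A1(t) <= S_A2(t) and S_B1(t) <= S_B2(t) (stochastic dominance)
mean_a = {1: 80.0, 2: 140.0}
mean_b = {1: 60.0, 2: 110.0}

def sic(combine):
    S = {}
    for i in (1, 2):
        for j in (1, 2):
            a = rng.exponential(mean_a[i], n)   # fresh independent draws
            b = rng.exponential(mean_b[j], n)
            S[(i, j)] = surv(combine(a, b))
    return S[(2, 2)] - S[(1, 2)] - S[(2, 1)] + S[(1, 1)]

sic_or = sic(np.minimum)    # respond when either process finishes
sic_and = sic(np.maximum)   # respond when both processes finish
print(sic_or.min(), sic_and.max())
```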

Now consider processes A and B in series. Suppose A precedes B and


the response is made as soon as B is finished. The reaction time is the
sum of the durations of A and B, T = A + B. The cumulative distribution
function for the reaction time is given by the convolution; that is, if the
process durations are stochastically independent, the cumulative
distribution function is

FT(t) = P[T ≤ t] = ∫0^∞ FA(t − x) fB(x) dx = ∫0^t FA(t − x) fB(x) dx. (6.4)

See, e.g., Feller (1971). The integral in Expression (6.4) is the


convolution, denoted FA*fB(t). Capital letter F denotes a cumulative
distribution function and small letter f denotes the corresponding density
function. The upper limit of the integral can be changed here from ∞ to t
because A does not take on negative values.
To reach conclusions about the convolution, we must sometimes
make assumptions about the density function for a process duration.
Suppose there is an interval of time I = (0, τ) over which fXi(t) > fXi*(t).
Then over this interval stochastic dominance is satisfied, that is,

FXi(t) = ∫0^t fXi(t′) dt′ > FXi*(t) = ∫0^t fXi*(t′) dt′ (6.5)

for all t ∈ (0, τ).


Note that the converse is not true. It may be that over an interval (0,
τ), FXi(t) > FXi*(t), yet the density functions cross repeatedly so there is
no interval (0,υ) over which fXi(t) > fXi*(t). (The assumption that there is
such an interval seems to be implicit in the proof of Theorem 4 of
Townsend and Nozawa, 1995, p. 354.)
Suppose the density functions fXi(t) and fXi*(t) cross only at one
nonzero time τ, that is, for 0 < t < τ, fXi(t) > fXi*(t); for t = τ, fXi(t) =
fXi*(t); and for τ < t, fXi(t) < fXi*(t). Then stochastic dominance holds for
all t; that is, for all t, FXi(t) ≥ FXi*(t). This follows from Inequality (6.5)
and the fact that the total area under a density function must be 1.
The following theorem based on work of Townsend and Nozawa
(1995) describes the survivor interaction contrast for two serial
processes.

Theorem 6.2 Suppose process A precedes process B and the


response is made when process B finishes. Suppose process durations A
and B are stochastically independent. Suppose Factor Α selectively
influences process A and Factor Β selectively influences process B so that
marginal selectivity is satisfied for both.

Suppose for levels i* and i of Factor Α, for every time t, SAi*(t) ≤


SAi(t). Suppose for levels j* and j of Factor Β the densities fBj(t) and fBj*(t)
cross one time, at time τ, with fBj(t) < fBj*(t) for 0 < t < τ.
Then

(a) For times t in the interval (0,τ), SIC(t) ≤ 0.

(b) If the above inequality is strict over an interval, then there is an
interval over which SIC(t) > 0.
(c) The net area bounded by SIC(t) is 0.
(d) The same conclusions follow for C(t), but with the signs reversed.

Proof: The survivor interaction contrast at time t is

SIC(t) = STij(t) − STi*j(t) − STij*(t) + STi*j*(t)
= [1 − FTij(t)] − [1 − FTi*j(t)] − [1 − FTij*(t)] + [1 − FTi*j*(t)]
= − ∫0^t FAi(t − x) fBj(x) dx + ∫0^t FAi*(t − x) fBj(x) dx
+ ∫0^t FAi(t − x) fBj*(x) dx − ∫0^t FAi*(t − x) fBj*(x) dx
= − ∫0^t [FAi*(t − x) − FAi(t − x)][fBj*(x) − fBj(x)] dx.

By stochastic dominance the first multiplier in the integrand is
nonnegative for all t, and for all x in the interval (0, τ) the second
multiplier is positive. Hence for t in the interval (0, τ), SIC(t) ≤ 0.

The mean interaction contrast is 0, that is,

E[Tij] − E[Ti*j] − E[Tij*] + E[Ti*j*] = 0,

because Tij = Ai + Bj, so the factors have additive effects on the means.
Because the expected value of a nonnegative random variable equals the
integral of its survivor function (Equation (6.2)), the mean interaction
contrast equals the integral of the survivor interaction contrast; hence

∫0^∞ SIC(t) dt = 0.

Finally, because the net area bounded by SIC(t) is 0, if it is negative


over some interval there must be some interval over which it is positive.
The further conclusions follow immediately. ∎
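The signature of Theorem 6.2 can be checked numerically. The following Monte Carlo sketch is illustrative only (it is not from the text; the gamma distributions, shape parameters, and time grid are arbitrary choices): it simulates T = A + B for two independent serial processes, with the starred level of each factor stochastically faster, and estimates SIC(t) from the empirical survivor functions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
t_grid = np.linspace(0.0, 40.0, 400)

def survivor(rt):
    """Empirical survivor function S_T(t) evaluated on t_grid."""
    return (rt[:, None] > t_grid).mean(axis=0)

def sic_serial(shapes_a=(2.0, 6.0), shapes_b=(2.0, 6.0)):
    """SIC(t) for serial T = A + B; index 0 is the starred (fast) level."""
    s = {}
    for ia in (0, 1):
        for jb in (0, 1):
            rt = rng.gamma(shapes_a[ia], 1.0, n) + rng.gamma(shapes_b[jb], 1.0, n)
            s[ia, jb] = survivor(rt)
    # SIC(t) = S_Tij - S_Ti*j - S_Tij* + S_Ti*j*, with 1 = unstarred level
    return s[1, 1] - s[0, 1] - s[1, 0] + s[0, 0]

sic = sic_serial()
dt = t_grid[1] - t_grid[0]
# Expected pattern for serial (AND) processes: negative at small t,
# positive at large t, with net area near 0 (the mean interaction contrast).
```

In runs like this the contrast is negative near t = 0, changes sign once, and its net area is near 0, as the theorem requires.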

Townsend and Nozawa (1995) showed that results of Theorems 6.1


and 6.2 are true under more general conditions. For Theorem 6.1, the
conclusions are true for processes A and B in parallel, followed (or
preceded) by a base process C. Further, stochastic independence of
durations A and B can be weakened to conditional independence, as
follows:
For all levels i of Factor Α and j of Factor Β,

Pij[A > a and B > b|C = c] = Pij[A > a|C = c]Pij[B > b|C = c].

Finally, stochastic dominance can be weakened to the following assumption of conditional stochastic dominance:

Suppose marginal selective influence holds. The levels of Factor Α can be ordered so if i* < i then for every time t

SAi*(t|C = c) ≤ SAi(t|C = c).

Likewise, the levels of Factor Β can be ordered so if j* < j then for every time t

SBj*(t|C = c) ≤ SBj(t|C = c).
The proof is similar to that of Theorem 6.1 here; see their Theorem 1 and A-4 for details.
For Theorem 6.2, the conclusions are true for processes A and B in
series, but with stochastic independence not required, and stochastic
dominance weakened to conditional stochastic dominance, stated with
notation of this case as:

Suppose marginal selective influence holds. The levels of Factor Α


can be ordered so that if i* < i then for every time t

SAi*(t|B = b) ≤ SAi(t|B = b).

To see that conclusions of Theorem 6.2 follow with these weaker conditions, note that if process durations A and B are not independent, the cumulative distribution function for the reaction time T = A + B can be written

P[T ≤ t] = ∫₀^∞ FA(t − x | B = x) fB(x) dx.

When the expression for P[T ≤ t] in the proof of Theorem 6.2 is replaced by the expression above, and conditional stochastic dominance is assumed, the conclusions follow immediately.

Task networks

The situation is more complex when there are several processes in a


directed acyclic network. Nonetheless, behavior of reaction time
distribution functions can usually distinguish whether selectively
influenced processes are concurrent or sequential, and whether there are AND gates or OR gates (Dzhafarov, Schweickert & Sung, 2004; Schweickert & Giorgini, 1999; Schweickert, Giorgini & Dzhafarov, 2000).
To explain the results we need to explain serial-parallel networks. A
single arc is a serial-parallel network. Given two serial-parallel
networks, they can be connected in parallel, so no arc in one is sequential
with an arc in the other. Alternatively, they can be connected in series,
so every arc in one is sequential with every arc in the other. If they are
connected in series or in parallel the result is a serial-parallel network. A
directed acyclic network is serial-parallel if it can be constructed in a
finite number of such steps. (All networks we consider are assumed to
be finite.) Networks that are not serial-parallel can be easily
characterized; they have at least one subnetwork in the form of
(homeomorphic to) a Wheatstone bridge (Kaerkes & Mohring, 1978;
Dodin, 1985).
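The recursive definition can be made concrete with a small sketch. The encoding below is hypothetical (the tuple representation, the example network, and the exponential durations are illustrative choices, not the book's): a serial-parallel network is either a single process or two serial-parallel networks joined in series or in parallel, and its completion time follows the same recursion, with a parallel junction taking the maximum of its parts under AND gates and the minimum under OR gates.

```python
import random

def completion_time(net, gate="AND"):
    """Sample one completion time of a serial-parallel network.

    A network is ("proc", sampler), ("series", n1, n2), or
    ("parallel", n1, n2).  Under AND gates a parallel junction waits
    for both subnetworks; under OR gates it waits for either.
    """
    kind = net[0]
    if kind == "proc":
        return net[1]()              # leaf process: sample one duration
    left = completion_time(net[1], gate)
    right = completion_time(net[2], gate)
    if kind == "series":
        return left + right          # sequential subnetworks: durations add
    join = max if gate == "AND" else min
    return join(left, right)         # parallel junction

# Hypothetical example: (A then B) in parallel with C.
A = ("proc", lambda: random.expovariate(1 / 100))   # mean 100 ms
B = ("proc", lambda: random.expovariate(1 / 150))   # mean 150 ms
C = ("proc", lambda: random.expovariate(1 / 300))   # mean 300 ms
net = ("parallel", ("series", A, B), C)
```

Averaging many sampled completion times shows the AND-gate version of this network is slower than the OR-gate version, as the maximum/minimum semantics require.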
We begin by stating the main results and illustrating them with
simulations. We suppose processes A, B,..., Z are represented by arcs in a
directed acyclic task network in which all gates are AND gates or all are
OR gates. In the theoretical results when we refer to a serial-parallel
network, we mean an arbitrary finite serial-parallel network; when we
refer to a Wheatstone bridge, we mean specifically the Wheatstone
bridge in Figure 3.4 or a network homeomorphic to it such as that in
Figure 3.5. Processes A and B in Figure 3.4 are said to be on opposite
sides of the Wheatstone bridge.
The serial-parallel network simulated is the Response Selection
Bottleneck model of Figure 5.2. Process SWb in that figure is omitted.
The simulated network homeomorphic to the Wheatstone bridge is the
modified Response Selection Bottleneck model (Figure 3.5) in which the
subject is instructed to make response r1 before making response r2. In
that network, process C1 must finish before r2 is made. For both
models, the simulated results are for the time to make response r2. Note
the results are specifically for the time from the onset of stimulus s1 to
response r2. The usual reaction time t2 would be obtained from these
times by subtracting the SOA; none of the following results would be
changed by subtracting this constant.

Synopsis of results for task networks

A glance at Figures 6.7 and 6.9 shows how survivor interaction contrasts
differ for concurrent and sequential processes. The following
summarizes results for nonzero interaction contrasts. Simulations
illustrating the results are described in detail in a later section and in the
Appendix.
For concurrent processes, the survivor interaction contrast is simple;
it is positive with OR gates and negative with AND gates. For the latter,
see Figures 6.6 and 6.7. For sequential processes in a serial-parallel
network with OR gates, the survivor interaction contrast is negative over
an interval close to time 0. It may or may not change signs as t increases.
For any time t greater than 0, the area from 0 to t bounded by the
survivor interaction contrast and the time-axis is negative.
For sequential processes in a serial-parallel network with AND gates,
the survivor interaction contrast is negative over a short interval after 0
and then changes sign at least once. For any time t greater than 0, the
area from t to ∞ bounded by the survivor interaction contrast and the
time-axis is positive. See Figures 6.8 and 6.9.
In a Wheatstone bridge, when a pair of sequential processes are
selectively influenced the results are the same as if they were in a serial-
parallel network, with one exception. For two sequential processes on
opposite sides of the bridge the survivor interaction contrast at small
values of t is positive with OR gates (Figures 6.10 and 6.11) and
negative with AND gates (Figures 6.12 and 6.13).
For sequential processes on opposite sides of the bridge, the survivor
interaction contrast need not change signs as prolongations become long.
In simulations with long prolongations with OR gates, it remained
positive (Figures 6.14 and 6.15). However, in simulations with AND
gates when prolongations are large, the survivor interaction contrast can
be made to change sign for large values of t (Figures 6.16 and 6.17).
These results do not require independent process durations. The figures
are from simulations with dependent process durations, described below.
The remainder of the chapter is organized as follows. Results for
survivor interaction contrasts are stated in more detail. A proof assuming stochastic independence is given as an example. Sources of process dependence are then discussed. Some forms of dependence lead to violations of results predicted for factors selectively influencing processes. A form of dependence that does not lead to violations is then described and demonstrated with simulations. Finally, an example of a proof not assuming independence is given.

Results for task networks

This section fills in details of the synopsis above and can be skipped
without loss of continuity. As before, the density function for a random
variable X is denoted by fX(t), the cumulative distribution function by
FX(t), and the survivor function by SX(t). A random variable X will be
indexed by a level k of a factor if the distribution of X depends on the
level of the factor, or potentially might depend on it. For example, if
Factor Α has level i and Factor Β has level j, then Tij denotes the reaction
time, a random variable, when Factor Α has level i and Factor Β has level
j. The cumulative distribution function of Tij is usually denoted GTij(t)
with corresponding density gTij(t). In an independent network, process
durations are mutually stochastically independent.
Each inequality for a survivor function holds for the corresponding
cumulative distribution function with the direction of the inequality
reversed. Likewise, each inequality stated for the survivor interaction
contrast, SIC(t), holds for the cumulative distribution interaction contrast,
C(t), with the direction of the inequality reversed. More details can be
found in Schweickert, Giorgini and Dzhafarov (2000), Schweickert and
Giorgini (1999) and Dzhafarov, Schweickert and Sung (2004).

Selectively influencing a single process

Result 1. Suppose Factor Α with levels i* and i selectively influences


a process A in an independent serial-parallel network. Suppose for all t < τ, the densities for process A durations are related as fAi*(t) ≤ fAi(t).
Then for all t < τ, the densities for the reaction times are related as
fTi*(t) ≤ fTi(t).

Fig. 6.6. Survivor functions of simulated reaction times. Factors selectively influence
concurrent processes, B1 and A2 in the Response Selection Bottleneck Model. Functions
for (Low, High) and (Mid, High) are nearly superimposed, because reaction time is
governed by the maximum of the prolongations of B1 and A2. Parameters: Unique
component of B1 has α of 4 and 36 at Low and Mid levels, respectively. Unique
component of A2 has α of 4 and 96 at Low and High levels, respectively.

Fig. 6.7. Interaction contrasts for survivor functions of concurrent processes in Figure
6.6 are never positive.

Fig. 6.8. Survivor functions of simulated reaction times. Factors selectively influence
sequential processes, A1 and B1 in the Response Selection Bottleneck Model.
Parameters: Unique component of A1 has α of 4 and 96 at Low and High levels,
respectively. Unique component of B1 has α of 4 and 36 at Low and Mid levels,
respectively.

Fig. 6.9. Interaction contrasts for survivor functions of sequential processes in Figure 6.8
are negative at low times and then change sign.

Fig. 6.10. Survivor functions of simulated reaction times. Process prolongations are
short. Factors selectively influence sequential processes B1 and B2 on opposite sides of
an OR Wheatstone bridge. Parameters: Unique component of B1 has α of 48 and 96 at
Mid and High levels, respectively. Unique component of B2 has α of 48 and 96 at Mid
and High levels, respectively.

Fig. 6.11. Interaction contrasts for survivor functions in Figure 6.10 are nonnegative.

Fig. 6.12. Survivor functions of simulated reaction times. Process prolongations are
short. Factors selectively influence sequential processes B1 and B2 in an AND
Wheatstone bridge. Functions at (Mid, Low) and (Mid, Mid) level combinations are
superimposed because reaction times are determined by the maximum of the
prolongations of B1 and B2. Parameters: Unique component of B1 has α of 4 and 36 at
Low and Mid levels, respectively. Unique component of B2 has α of 4 and 12 at Low
and Mid levels, respectively.

Fig. 6.13. Interaction contrast for survivor functions in Figure 6.12 is negative at small
times. It is slightly positive at large times.

Fig. 6.14. Survivor functions of simulated reaction times. Process prolongations are long.
Factors selectively influence sequential processes B1 and B2 in an OR Wheatstone
bridge. Parameters: Unique component of B1 has α of 4 and 96 at Low and High levels,
respectively. Unique component of B2 has α of 4 and 96 at Low and High levels,
respectively.

Fig. 6.15. Interaction contrasts for survivor functions of Figure 6.14 are nonnegative. It
is not known whether such interaction contrasts can change sign.

Fig. 6.16. Survivor functions of simulated reaction times. Process prolongations are long.
Factors selectively influence sequential processes B1 and B2 on opposite sides of an
AND Wheatstone bridge. Parameters: Unique component of B1 has α of 4 and 96 at Low
and High levels, respectively. Unique component of B2 has α of 4 and 96 at Low and
High levels, respectively.

Fig. 6.17. Interaction contrasts for survivor functions of Figure 6.16 are negative at small
times and then change sign.

Result 2. Suppose Factor Α with levels i* and i selectively influences


a process A in an independent serial-parallel network or Wheatstone
bridge. Suppose for all t < τ, the survivor functions for process A
duration are related as SAi*(t) ≤ SAi(t).
Then for all t < τ, the survivor functions for reaction times are related
as STi*(t) ≤ STi(t).

Result 2 is the Long RT property of Sternberg (1973), also called


stochastic dominance or the usual stochastic ordering for reaction times.

Selectively influencing two concurrent processes

Result 3. Suppose processes A and B are concurrent in an independent


serial-parallel network or Wheatstone bridge. Suppose Factor Α with
levels i* and i selectively influences process A and Factor Β with levels
j* and j selectively influences process B, producing marginal selective
influence. Suppose for all t, SAi*(t) ≤ SAi(t) and SBj*(t) ≤ SBj(t).
If all the gates are OR gates, SIC(t) ≥ 0 and the mean interaction
contrast is nonnegative.
If all the gates are AND gates, SIC(t) ≤ 0 and the mean interaction
contrast is nonpositive.
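Result 3 can be illustrated with a small simulation. The sketch below is illustrative, not the book's code (gamma durations are an arbitrary choice, with shape 2 at the starred, faster level and shape 6 at the slower level): two concurrent processes finish at a single gate, minimum for OR and maximum for AND.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
t_grid = np.linspace(0.0, 30.0, 300)

def sic_parallel(gate):
    """SIC(t) for two concurrent gamma-distributed processes.

    'OR' takes the minimum of the two durations, 'AND' the maximum.
    Level 0 of each factor is the starred (faster) level.
    """
    combine = np.minimum if gate == "OR" else np.maximum
    shapes = {(1, 1): (6.0, 6.0), (0, 1): (2.0, 6.0),
              (1, 0): (6.0, 2.0), (0, 0): (2.0, 2.0)}
    surv = {}
    for ij, (sa, sb) in shapes.items():
        rt = combine(rng.gamma(sa, 1.0, n), rng.gamma(sb, 1.0, n))
        surv[ij] = (rt[:, None] > t_grid).mean(axis=0)
    return surv[1, 1] - surv[0, 1] - surv[1, 0] + surv[0, 0]

sic_or = sic_parallel("OR")    # Result 3: nonnegative everywhere
sic_and = sic_parallel("AND")  # Result 3: nonpositive everywhere
```

Up to Monte Carlo noise, the OR contrast stays nonnegative and the AND contrast stays nonpositive over the whole grid.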

Selectively influencing two sequential processes

For the remaining Results 4 through 9, suppose processes A and B are


sequential in an independent network. Suppose Factor Α with levels i*
and i selectively influences process A and Factor Β with levels j* and j
selectively influences process B, both producing marginal selective
influence. Suppose for all t, SAi*(t) ≤ SAi(t) and SBj*(t) ≤ SBj(t).

Sequential processes: Small times

Result 4. Suppose all gates are AND gates in a serial-parallel network. Suppose for all t < τ, fAi*(t) ≥ fAi(t).
Then for all t in the interval (0, τ), SIC(t) ≤ 0.
The inequality need not hold in an OR serial-parallel network.

Result 5. Suppose all gates are AND gates in a Wheatstone bridge,


with processes A and B on opposite sides of the bridge. If one of the
selectively influenced processes is A, suppose there is a time τA such that
for all t in the interval (0, τA), fAi*(t) ≥ fAi(t). If one of the selectively influenced processes is B, suppose there is a time τB such that for all t in the interval (0, τB), fBj*(t) ≥ fBj(t).
Then there is an interval (0,υ) over which SIC(t) ≤ 0.
If exactly one of the two selectively influenced processes is A, υ = τA;
if exactly one of the two selectively influenced processes is B, υ = τB; if
one is A and one is B, υ = min{τA, τB}.

Result 6. Suppose the gates are OR gates in a Wheatstone bridge.


Suppose the assumptions of Result 5 are met. Suppose one of the
selectively influenced processes is A and the other is B.
Then over the interval (0, min{τA, τB}), SIC(t) ≥ 0.
For any other pair of selectively influenced sequential processes in
the Wheatstone bridge, there is an interval (0,υ) over which SIC(t) ≤ 0.

Sequential processes: All times

Suppose it is not the case that one of the selectively influenced


sequential processes is A and the other is B on opposite sides of a
Wheatstone bridge.

Result 7. Suppose all gates are AND gates.
Then for all t ≥ 0,

∫ₜ^∞ SIC(u) du ≥ 0.

Further, the mean interaction contrast is nonnegative.

Result 8. Suppose all gates are OR gates.
Then for all t ≥ 0,

∫₀ᵗ SIC(u) du ≤ 0.

Further, the mean interaction contrast is nonpositive.

Result 9. Suppose the selectively influenced processes are A and B, on opposite sides of the Wheatstone bridge. Suppose all gates are AND gates.
Simulations suggest that for all t ≥ 0,

∫₀ᵗ SIC(u) du ≤ 0.

An example proof assuming independence

To illustrate the method of proof, we prove Result 2 using the


assumption of stochastically independent process durations as in
Schweickert, Giorgini and Dzhafarov (2000). Later we derive the result
as in Dzhafarov, Schweickert and Sung (2004) using the weaker
assumption of conditional independence of Dzhafarov (2003a).

Theorem 6.3 Suppose Factor Α with levels i* and i selectively


influences a process A in an independent serial-parallel network in which
all gates are OR gates or all gates are AND gates. Suppose for all t < τ,
SAi*(t) ≤ SAi(t). Then for all t < τ, STi*(t) ≤ STi(t).

Proof. The proof is by induction on the number of processes in the network. If A is the only process, the conclusion is immediate.


Suppose the conclusion is true when the number of processes is n or
fewer. Consider a network N meeting the assumptions of the theorem,
with n + 1 processes. There are two cases: (a) N is formed by connecting
two serial-parallel networks in parallel and (b) N is formed by connecting
two serial-parallel networks in series.

(a) Suppose N consists of a serial-parallel network NA* having n or


fewer processes, one of which is A, and another serial-parallel network
N** in parallel with NA*. Let the survivor function for the completion
time of network N** alone be S**(t). When Factor Α is at level i, let the
survivor function for the completion time of network NA* alone be S*i(t)
and let the completion time of network N be Ti.
Suppose the gates are OR gates. Then the completion time of
network N is the minimum of the completion times of networks NA* and
N**. The probability the completion time of N is greater than t is the
probability the completion time of NA* is greater than t and the
completion time of N** is greater than t. By the assumption of
stochastically independent process durations, when Factor Α is at level i,

STi(t) = S*i(t) S**(t).

By the induction hypothesis, S*i*(t) ≤ S*i(t).

Then

STi*(t) − STi(t) = S*i*(t)S**(t) − S*i(t)S**(t) ≤ 0.

The proof is analogous when all the gates are AND gates.

(b) Suppose N consists of a serial-parallel network NA* having n or fewer processes, one of which is A, and another serial-parallel network N** in series with NA*. Let the density function for the completion time of network N** be g**(t). When Factor Α is at level i, let the cumulative distribution function for the completion time of network NA* be F*i(t) and let that of network N be FTi(t). Other notation is as in Case (a).
Whether there are OR gates or AND gates, with the assumption of
stochastically independent process durations, the cumulative distribution
function for the completion time of network N is


FTi(t) = ∫₀^∞ F*i(x) g**(t − x) dx;

see Equation (6.4).

The corresponding survivor function is STi(t) = 1 − FTi(t).
By the induction hypothesis, for all x, S*i*(x) ≤ S*i(x), so F*i*(x) ≥ F*i(x).

Then

STi*(t) − STi(t) = [1 − FTi*(t)] − [1 − FTi(t)]

= ∫₀^∞ [F*i(x) − F*i*(x)] g**(t − x) dx ≤ 0.

The proof is analogous if all the gates are AND gates. ∎
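Theorem 6.3 can be spot-checked numerically. In the hypothetical sketch below (the three-process network, the AND gate, and the gamma durations are illustrative choices, not the book's), only process A is influenced, and the slower level of A stochastically dominates the faster one; the theorem then says the simulated reaction time survivor functions must be ordered the same way at every t.

```python
import numpy as np

rng = np.random.default_rng(3)
t_grid = np.linspace(0.0, 25.0, 100)

def network_survivor(shape_a, n=50_000):
    """Survivor function of T = max(A + B, C): the serial pair (A, B)
    in parallel with C under an AND gate.  Only A's gamma shape varies
    with the level of the factor."""
    a = rng.gamma(shape_a, 1.0, n)
    b = rng.gamma(3.0, 1.0, n)
    c = rng.gamma(5.0, 1.0, n)
    rt = np.maximum(a + b, c)
    return (rt[:, None] > t_grid).mean(axis=0)

s_fast = network_survivor(2.0)   # level i*: S_Ai*(t) <= S_Ai(t)
s_slow = network_survivor(6.0)   # level i
```

Up to sampling noise, s_slow lies above s_fast at every point of the grid, the stochastic dominance (Long RT) property of Result 2.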

Process Dependence

Although stochastic independence of process durations is a strong


assumption, it is supported in some situations. It is not tested directly
because that would require observing individual process durations, but
there have been a few indirect tests. (a) Sternberg (1969, p. 305),
Shwartz, Pomerantz and Egeth (1977) and Roberts and Sternberg (1993)
found additive effects of factors on both means and variances of reaction
times. Additive effects on means are predicted if the factors selectively
influence processes in series, whether their durations are independent or
not. But additive effects of factors on variances are not predicted unless

process durations are independent or their covariances are restricted (e.g.,


COV[Ai, Bj] is invariant with i and j). (b) The beginning of this chapter
summarized a successful test by Roberts and Sternberg (1993) of a
prediction by Ashby and Townsend (1980). The test assumes
independent process durations. (c) Three studies found multiplicative
effects of factors on the probability of a correct response together with
additive effects of the factors on mean reaction time. Such effects can be
readily explained if process durations are independent (Schweickert,
1985). The experiments are on identification (Shwartz, Pomerantz &
Egeth, 1977), lexical decision (Schuberth, Spoehr & Lane, 1981) and
memory scanning (Lively, 1972); see Chapter 9 for more about them.
On the other hand, considerable evidence indicates that process
durations are not always stochastically independent. For example,
Dzhafarov and Rouder (1996) used a test developed by Dzhafarov (1996)
to show that simple response times to a step signal could be accounted
for by assuming two process durations were increasing functions of the
same random variable, rather than stochastically independent. As
Townsend and Thomas (1994) show, if processes are interdependent,
influencing one process can lead to changes in other processes,
producing outcomes difficult to interpret. Similar objections are raised
by Logan and Schulkind (2000) and Logan and Delheimer (2001).
Logan and Schulkind (2000, p. 1075) explain the problem well for the
Single Central Bottleneck Model, “The locus of slack analysis of Task 2
difficulty effects relies on the assumption that the difficulty manipulation
affects one and only one stage. Cross-talk between stages of different
tasks suggests that stage durations are correlated such that factors that
affect one stage also affect the other in the same manner. That is, factors
can no longer selectively influence one stage and not the other. This
violation of the assumption of selective influence means that the locus of
slack logic cannot be applied properly to situations in which there is
crosstalk from Task 2 to Task 1 (e.g., Hommel, 1998).” The results of
Hommel (1998) will be described below.
The problem of process dependence has led to intense theoretical work to clarify the notion of selective influence. Chapter 10 explains in particular work of Dzhafarov and colleagues (Dzhafarov, 2003a;

Dzhafarov & Gluhovsky, 2006; Dzhafarov & Kujala, 2010; Dzhafarov,


Schweickert & Sung, 2004; Kujala & Dzhafarov, 2008, 2010). Later we
show that (a) a factor that influences a process and as a byproduct
changes another process can indeed violate predictions based on
selective influence, but (b) for the same system, there may be a different
factor that selectively influences an individual processes in such a way
that predictions are satisfied. Objections are well founded, but apply in
some situations and not in others. Next we discuss a few forms of
process dependence. Then we take up selective influence again, in the
context of process dependence.

Capacity

Capacity limits are one reason process durations would be stochastically


dependent. Some parts of the human information processing system may
be limited to carrying out only one process at a time (e.g., Welford
(1959), or a few at a time (Fisher, 1984). Other parts might execute
more than one process simultaneously, but because of limited capacity,
an increase in the rate of one process leads to a decrease in the
processing rate of another.
As we said in Chapter 5, some theories say the delay in reaction times
when subjects perform a dual task rather than a single task is due to
processes going more slowly when they are carried out simultaneously
than when alone. Several dual task phenomena can be explained by
assuming central processing is concurrent with shared capacity; Tombu
and Jolicoeur (2003) provide a model. However, as Miller, Ulrich and
Rolke (2009) point out, if the price of concurrency is slowness, it may be
optimal to schedule processes sequentially so they do not share capacity
(see Chapter 5, Appendix). This matters because if processes share
capacity, it may be difficult to selectively influence them.
There is a way to determine whether processes share capacity,
through a measure defined in terms of distribution functions (Townsend
& Ashby, 1983; Townsend & Nozawa, 1995; Townsend & Wenger,
2004). It is often used in factorial experiments to learn about the way
processes depend on each other. It would take us too far from our topic

of selective influence to discuss capacity more than briefly, so a reader


wanting more information is referred to the original papers.
Consider a redundant signal paradigm. One stimulus, another, or
both are presented. The task is to respond as soon as either stimulus is
detected. When stimulus sa is presented alone, suppose the response is
made when a single process A is completed. Let Aalone denote the
reaction time when stimulus sa is presented alone. Likewise, when
stimulus sb is presented, suppose the response is made when a single
process B is completed, and denote the reaction time by Balone. When
stimuli sa and sb are both presented, suppose processes A and B are
executed in parallel, with durations A and B, respectively, and the
response is made as soon as either finishes. Denote the reaction time
when stimuli sa and sb are both presented as TAorB. Then TAorB = min{A,
B}. Let SAorB(t) denote the survivor function of TAorB. Then

SAorB(t) = P[TAorB > t] = P[A > t and B > t].

Suppose the durations of A and B are stochastically independent. Then


P[A > t and B > t] = P[A > t] P[B > t]. In terms of survivor functions,

SAorB(t) = SA(t)SB(t), (6.6)

where SA(t) and SB(t) are the survivor functions for the individual
durations of A and B, respectively. Suppose when A and B are executed simultaneously, they are positively dependent; that is, as the duration of A increases the duration of B tends to increase. Then the probability that neither process has finished by time t is larger than it would be under independence. That is,

SAorB(t) > SAalone(t)SBalone(t).

On the other hand, when A and B are executed simultaneously if they are
negatively dependent,
SAorB(t) < SAalone(t)SBalone(t).

This suggests comparing survivor functions of processing time when


both processes are parallel with survivor functions for their individual
processing times. In the measure of capacity this is done through their
hazard functions.
The hazard function of a random variable X with density function f(t)
and survivor function S(t) is

h(t) = f(t)/S(t).

The integrated hazard function for a nonnegative random variable X is

H(t) = ∫₀ᵗ h(t′) dt′.

Because dS(t)/dt = −f(t), it follows that

H(t) = ∫₀ᵗ [f(t′)/S(t′)] dt′ = −ln S(t).

Now if processing durations of A and B are independent,

SAorB(t) = SA(t)SB(t),

so

HAorB(t) = HA(t) + HB(t).

For two parallel processes finishing at an OR gate, the capacity coefficient is

Co(t) = HAorB(t)/[HA(t) + HB(t)].

(Subscript o stands for “or.”) Hazard functions in the numerator are


calculated from response times when both stimuli are presented; those in
the denominator are calculated from response times when sa is presented
alone and when sb is presented alone.
If durations of processes A and B are independent, Co(t) = 1, and processing is called unlimited capacity. If they are negatively dependent, Co(t) < 1, and processing is called limited capacity. The intuition is that when limited capacity processes are executed simultaneously, if one is faster the other will be slower. If process durations are positively dependent, Co(t) > 1, and processing is called supercapacity. Because Co(t) can be calculated at every time t, it can track changes in dependence over time.
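The capacity coefficient can be estimated directly from simulated reaction times. In this illustrative sketch (not from the text; the exponential channel durations and parameter values are arbitrary), each channel is unchanged whether it runs alone or in the redundant condition and the OR response takes the faster channel, so Co(t) should stay near 1, the unlimited capacity benchmark.

```python
import numpy as np

rng = np.random.default_rng(2)

def integrated_hazard(samples, t):
    """H(t) = -ln S(t), estimated from a sample of reaction times."""
    s = (samples[:, None] > t).mean(axis=0)   # empirical survivor function
    return -np.log(s)

# Redundant-signal paradigm with unlimited capacity: independent
# exponential channels, identical alone and in parallel.
n = 100_000
a_alone = rng.exponential(200.0, n)                       # channel A alone
b_alone = rng.exponential(300.0, n)                       # channel B alone
a_or_b = np.minimum(rng.exponential(200.0, n),            # both present:
                    rng.exponential(300.0, n))            # OR gate = minimum

t = np.linspace(50.0, 400.0, 50)
co = integrated_hazard(a_or_b, t) / (
    integrated_hazard(a_alone, t) + integrated_hazard(b_alone, t))
```

Because the channels are independent, HAorB(t) = HA(t) + HB(t) here, so the estimated coefficient hovers around 1 at every t; limited or supercapacity processing would pull it below or above 1.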
Analogously, with an AND gate, suppose the response is made as soon as both of the parallel processes A and B finish. The reaction time is TAandB = max{A, B}. The analogous measure of capacity is

Ca(t) = [KA(t) + KB(t)]/KAandB(t)

(Townsend & Wenger, 2004). Here KX(t) is analogous to the integrated


hazard function. These functions are calculated in the numerator from
response times when sa is presented alone and when sb is presented alone,
and in the denominator from response times when both stimuli are
presented. For a nonnegative random variable X with density function
f(t) and cumulative distribution F(t)

KX(t) = ∫₀ᵗ [f(t′)/F(t′)] dt′.

Subscript a in Ca(t) stands for "and." Expression f(t)/F(t) is the reverse hazard function. As with Co(t), values of Ca(t) greater than 1, equal to 1, and less than 1 indicate supercapacity, unlimited capacity and limited capacity, respectively. The capacity measure is related to several reaction time inequalities associated with capacity and process dependence; see Colonius and Vorberg (1994) and Townsend and Wenger (2004).
With the following design the capacity measure and distribution
interaction contrasts can be estimated in the same experiment. There are
two positions, say, left and right. In each position, a stimulus may be
present or absent. When a stimulus is presented, it can be either high
intensity or low intensity. The task is to respond as soon as either
stimulus is presented.
Position is of secondary importance in the design. The
Presence/Absence Factor allows the capacity measure to be estimated at
each level of intensity. At high intensity, for example, reaction time for
trials on which a stimulus is present in each position can be compared
with reaction times for trials on which only one stimulus is present, on
the left, and only one stimulus is present, on the right. The Intensity
Factor allows the survivor or cdf interaction contrast to be estimated
from trials on which both stimuli are present. The left stimulus can be of
high or low intensity, so can the right stimulus. An interaction contrast
for a particular stimulus is formed from reaction times in the conditions
<High, High>, <High, Low>, <Low, High> and <Low, Low>. The
design is a double factorial paradigm (Townsend & Nozawa, 1995). This is the prototype design; there are several variations.

Cross-talk

Another source of process dependence is cross-talk. Cross-talk is an


informal term for unnecessary information sent to a process from a
stimulus or another process. It was found in dual tasks by Navon and
Miller (1987). Another example is in Experiment 1 by Hommel (1998).
The stimulus was an S or H centered on the screen, either red or green.
The first task was to press a metal plate on the left or right, using the left
or right hand, to identify the color. The second task was to say either left
or right (the German words “links” or “rechts”) to identify the letter,

saying, e.g., left for H, right for S. The words for the second response
were deliberately chosen to be names of locations for the first response.
The two stimulus features were orthogonal so information about the
correct first response is irrelevant to the correct second response.
If the participant responded to a red H with his left hand to indicate
red and said left to indicate H the two responses were compatible. But if
the participant responded to a red S with his left hand to indicate red and
said right to indicate S the two responses were incompatible. By design,
the color and the letter identity were statistically independent.
Nonetheless, both responses were faster if the responses were
compatible. An explanation in terms of cross-talk is that while the first
response is being selected, the second response is also being selected.
Irrelevant information about the likely second response is transmitted to
the process selecting the first response, increasing or decreasing the
duration of the first response selection process.
Further evidence for cross-talk was found by Logan and Schulkind
(2000), who found a role for task set in producing it. In the dual task of
their Experiment 2 stimuli for both tasks were digits. For a magnitude
judgment task, the response was to indicate whether the digit was large
or small. For a parity judgment task, the response was to indicate
whether the digit was odd or even. In one condition, both tasks in the
dual task were magnitude judgment or both were parity judgment. In the
other condition, one task was magnitude judgment and one task was
parity judgment. They found an effect of the two stimuli being from the
same category (same magnitude or same parity) when the tasks were the
same, but not when they were different. In other words, cross-talk
occurred when the set for the tasks was the same, but not when it was
different. A similar result was reported later by Lien, Schweickert and
Proctor (2003). There are now many reports of cross-talk (e.g., Dutta,
Schweickert, Choi & Proctor, 1995; Logan & Delheimer, 2001; Logan
& Gordon, 2001; Schweickert, Fortin & Sung, 2007).
Two issues are raised by the presence of cross-talk in dual tasks. The
first is whether response selection for the two tasks goes on concurrently,
rather than sequentially. Hommel (1998) proposes such concurrency,
but says there is still a limitation in processing because selection of the
Effects of Factors on Distribution Functions 197

first response must finish before the selection of the second response can
finish. This proposal can easily be represented by modifying the Single
Central Bottleneck model (Figure 5.2). An arrow (SWa) now indicates
that B1 must finish before B2 starts. The arrow is simply moved so it
indicates that B1 must finish before B2 finishes. (The resulting model is
similar to the 1973 response interdiction model of Keele). This issue, the
change in architecture from sequential to concurrent processing, presents
no problem for selective influence. The second issue is whether cross-
talk ruins selective influence, and we will see below that it can.

Coactivation

Cross-talk of the form considered by Hommel (1998) is often called
coactivation. The idea underlying coactivation is that parallel processes
send activation to each other, increasing each other’s rates (Miller, 1982;
Colonius, 1990; Townsend & Nozawa, 1995). A coactive processing
model is discriminated from other parallel models through the behavior
of the RT cumulative distribution functions in a redundant target
paradigm (Miller, 1982; Colonius, 1990). In a redundant target
paradigm, the subject is presented with one stimulus or with two stimuli
simultaneously, with instructions to respond as soon as any stimulus is
detected. The time to respond when both stimuli are presented is
expected to be faster than when only one stimulus is presented, of course.
The response may be faster simply because when two stimuli are
presented, a process for each is executed in the same way as it would be
if the corresponding stimulus were presented alone, but the response is
made as soon as the first of these two processes is finished (Raab, 1962).
In an influential paper, Miller (1982) pointed out that with this race
model, the following Race Model Inequality would be satisfied. (The
inequality is called Boole’s inequality or the union bound in probability
theory, and sometimes called Miller’s inequality.) Let TAorB be the
reaction time to respond when both stimuli are presented, and let TA and
TB be, respectively, the time to respond when stimulus sa is presented
alone and when stimulus sb is presented alone.

P[TAorB ≤ t] = P[TA ≤ t or TB ≤ t] ≤ P[TA ≤ t] + P[TB ≤ t].

Assuming two parallel processes, the left side of the inequality states
that the probability the system’s RT is less than t is the probability one of
the process durations, TA or TB, is less than t. In other words, this term
says the RT of a parallel self-terminating model is the duration of the
target process that finishes first when there are two targets available. This
probability is clearly bounded by the right side of the inequality because

P(TA ≤ t or TB ≤ t) = P(TA ≤ t) + P(TB ≤ t) − P(TA ≤ t and TB ≤ t)
≤ P(TA ≤ t) + P(TB ≤ t).

(See Colonius, 1990; Colonius & Vorberg, 1994; Townsend & Wenger,
2004, and Ulrich & Miller, 1997, for other related bounds.)
Violations of this inequality reject the race model. At any time t the
amount of violation is P[TAorB ≤ t] − P[TA ≤ t] − P[TB ≤ t]; see Colonius
and Diederich (2006) for an interpretation of this quantity. Violations
suggest a coactivation system, although other systems are logically
possible (Fific, Nosofsky & Townsend, 2008). Behaviorally, the
violation means that when there are two redundant targets, RT for
detecting either target tends to be faster than the minimum of two target
processes, each executed separately in the single target condition.
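A quick simulation sketch (with arbitrary exponential channel durations of our own choosing) confirms that a pure race model respects this bound at every time point:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Single-target conditions: hypothetical channel finishing times (ms).
TA = rng.exponential(300.0, n)
TB = rng.exponential(350.0, n)

# Redundant-target condition under the race model: independent copies of
# the single-target processes, response at the first finish (Raab, 1962).
TAorB = np.minimum(rng.exponential(300.0, n), rng.exponential(350.0, n))

def cdf(x, t):
    return (x <= t).mean()

# Race Model Inequality: P[TAorB <= t] <= P[TA <= t] + P[TB <= t].
ts = np.linspace(25, 1500, 60)
slack = [cdf(TA, t) + cdf(TB, t) - cdf(TAorB, t) for t in ts]
print(min(slack))  # stays above zero: the bound holds for a race model
```

For independent channels the slack equals P[TA ≤ t]·P[TB ≤ t], which is never negative; a reliably negative value in data would reject the race model.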

Selective influence with coactivation

It is possible for factors to selectively influence processes in the presence
of coactivation. For example, suppose two stimuli are presented side by
side in a redundant target paradigm (e.g., experiment 4 in Miller, 1982)
and the visual quality (high and low) of each stimulus is manipulated
(e.g., Egeth & Dagenbach, 1991). Consider a redundant target condition
only, in which two targets are presented to observers. For a simple
model, let A and B be the two process durations, for left and right side
targets, respectively, and let T11 denote the RT when two targets are of
high visual quality. When there is no coactivation, T11 = min{A1, B1},
where A1 and B1 are the process durations when both stimuli are of high

visual quality. Similarly, T12 = min{A1, B2} when the second stimulus is
of low visual quality only, and so on. Suppose when there is coactivation
it occurs in such a way that T11 can be written c min{A1, B1}, where c is a
constant, 0 < c < 1. Note that for all t > 0, t/c > t. Suppose coactivation
occurs in the same way for other conditions, e.g., P[T12  t] = P[c
min{A1, B2}  t] = P[min{A1, B2}  t/c].
Suppose process durations are stochastically independent. Suppose
stochastic dominance is produced so for all t, SA1(t) ≤ SA2(t) and
SB1(t) ≤ SB2(t). Then the survivor interaction contrast when there is
coactivation is nonnegative for all t, as it would be without coactivation.

ST22(t) − ST21(t) − ST12(t) + ST11(t)
= P[c min{A2, B2} > t] − P[c min{A2, B1} > t] − P[c min{A1, B2} > t]
+ P[c min{A1, B1} > t]
= P[min{A2, B2} > t/c] − P[min{A2, B1} > t/c] − P[min{A1, B2} > t/c]
+ P[min{A1, B1} > t/c]
= SA2(t/c)SB2(t/c) − SA2(t/c)SB1(t/c) − SA1(t/c)SB2(t/c) + SA1(t/c)SB1(t/c)
= [SA2(t/c) − SA1(t/c)][SB2(t/c) − SB1(t/c)] ≥ 0.

The survivor functions behave as predicted when two parallel processes
followed by an OR gate are selectively influenced in the absence of
coactivation (see Theorem 6.1).
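A simulation sketch of this scaled-minimum form of coactivation (gamma channel durations, c = 0.8, and all parameter values being illustrative assumptions) shows the survivor interaction contrast staying nonnegative even while the race model inequality is violated:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
c = 0.8  # coactivation factor, 0 < c < 1 (illustrative)

def channel(mean):
    # Gamma durations; a higher factor level raises the mean, producing
    # stochastic dominance on that channel alone.
    return rng.gamma(shape=4.0, scale=mean / 4.0, size=n)

A = {1: channel(200.0), 2: channel(280.0)}  # left channel, levels 1 and 2
B = {1: channel(200.0), 2: channel(280.0)}  # right channel, levels 1 and 2

# Coactive reaction time: Tij = c * min{Ai, Bj}.
T = {(i, j): c * np.minimum(A[i], B[j]) for i in (1, 2) for j in (1, 2)}

def surv(x, t):
    return (x > t).mean()

# Survivor interaction contrast remains nonnegative at every t checked.
ts = np.linspace(100, 500, 9)
sic = [surv(T[2, 2], t) - surv(T[2, 1], t) - surv(T[1, 2], t)
       + surv(T[1, 1], t) for t in ts]

# Yet the race model inequality is violated at, e.g., t = 120:
# P[T11 <= t] exceeds P[A1 <= t] + P[B1 <= t].
t0 = 120.0
violation = (T[1, 1] <= t0).mean() - (A[1] <= t0).mean() - (B[1] <= t0).mean()
print(min(sic), violation)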
Yet there is coactivation. For some t (not all) the following inequality
is satisfied, indicating coactivation.

P(T11 ≤ t) = P[c min{A1, B1} ≤ t]
= P[min{A1, B1} ≤ t/c]
= P[A1 ≤ t/c or B1 ≤ t/c]
> P(A1 ≤ t) + P(B1 ≤ t).

A failure of selective influence with coactivation: The Channel
Summation Model

Coactivation need not make selective influence impossible, but some
forms of it do. Consider a simple reaction time task, in which the subject
responds as soon as a stimulus is detected. When a stimulus is presented,
neurons stimulated by it fire. The more intense the stimulus, the faster
the firing rate. A natural counter model for the task is that when the
stimulus is presented discrete events are generated by a process with rate
u, so the expected number of events over a time interval t is ut. A
criterion k is set. When the count of events reaches k the subject
responds that the stimulus is present. Let the expected time to respond
be E[T]. Then uE[T] = k, so

E[T] = k/u.

Now consider two stimuli presented simultaneously, say one on the
left and one on the right. Suppose the one on the left generates events
through a process with rate uL and the one on the right generates events
through a process with rate uR. Consider the task of responding as soon
as either stimulus is detected. In a Channel Summation Model (Schwarz,
1989; Diederich and Colonius, 1991), events generated by the two
stimuli are sent to a common counter and the response is made when the
total number of events generated by both stimuli equals the criterion k.
Events reach the common counter at the rate uL + uR. Hence, the
expected time to reach k events is

E[T] = k/(uL + uR).

Consider two factors, each of which changes the rate of one process.
Suppose one factor with levels 1 and i leads to rates uL1 and uLi, with uL1
> uLi. Suppose the other factor with levels 1 and j leads to rates uR1 and
uRj, with uR1 > uRj. Note that as the level of a factor goes up, the
corresponding rate goes down, so the corresponding time to reach
criterion goes up. Let T11 denote the reaction time when both factors are
at level 1; other reaction times are denoted similarly. Townsend and
Nozawa (1995) showed that the mean interaction contrast is positive,
because

E[Tij] − E[Ti1] − E[T1j] + E[T11]
= k/(uLi + uRj) − k/(uLi + uR1) − k/(uL1 + uRj) + k/(uL1 + uR1) > 0.

Suppose the processes generating the events are independent Poisson
processes. Then events arriving at the common counter form the
superposition of two Poisson processes; the result is a Poisson process
whose rate is the sum of the rates of the two processes (e.g., Cinlar,
1975, p. 87). With our usual notation, let STij(t) denote the survivor
function of Tij; other survivor functions are denoted analogously. Recall
that the survivor interaction contrast is

ST22(t) − ST21(t) − ST12(t) + ST11(t).

Townsend and Nozawa (1995, Theorem 5) showed the following:

The survivor interaction contrast is negative for times near 0.
The survivor interaction contrast is positive for larger times.

These properties distinguish processes with coactivation from
processes in series or in parallel without coactivation. But they do not
distinguish processes with coactivation from processes that are not in
series, but are sequential in a task network. All three properties, i.e.,
positive mean interaction contrast, negative survivor interaction contrast
at times near 0 and positive interaction contrast at larger times, can be
produced by factors selectively influencing sequential processes in an
AND network.
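Both signatures can be verified numerically for the Channel Summation Model, since with Poisson channels Tij is an Erlang (gamma) waiting time with rate uLi + uRj; the criterion k and the rates below are illustrative choices:

```python
import math

k = 5  # criterion count (illustrative)
uL = {1: 0.05, 2: 0.03}  # left-channel event rates per ms, level 1 faster
uR = {1: 0.05, 2: 0.03}  # right-channel event rates per ms, level 1 faster

def mean_rt(i, j):
    # Expected time for the summed channel to accumulate k counts.
    return k / (uL[i] + uR[j])

def surv(t, rate):
    # Erlang(k, rate) survivor function: fewer than k Poisson events by t.
    return sum(math.exp(-rate * t) * (rate * t) ** m / math.factorial(m)
               for m in range(k))

# Mean interaction contrast: positive, and it grows without bound as the
# rates at the higher factor levels shrink toward 0.
mic = mean_rt(2, 2) - mean_rt(2, 1) - mean_rt(1, 2) + mean_rt(1, 1)

def sic(t):
    return (surv(t, uL[2] + uR[2]) - surv(t, uL[2] + uR[1])
            - surv(t, uL[1] + uR[2]) + surv(t, uL[1] + uR[1]))

early, late = sic(20.0), sic(150.0)
print(f"MIC = {mic:.2f}, SIC(20) = {early:.4f}, SIC(150) = {late:.4f}")
```

With these rates the printed contrast is negative at the early time point and positive at the later one, the Townsend and Nozawa (1995) signature.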
The architectures can be distinguished, however. In the Channel
Summation Model, the mean interaction contrast does not approach a
limit as the levels of the factors increase. That is, as levels i and j

increase, uLi and uRj approach 0, so the mean interaction contrast
approaches infinity. But if factors selectively influence sequential
processes in an AND network, the mean interaction contrast approaches
a limit (see tables in the Appendix to this chapter, and Chapter 3).

Insertions

Another way processes interfere with each other is by inserting messages
or interruptions (Valls, Laguna, Lino, Pérez & Quintanilla, 1998). For
example, Fortin, Rousseau, Bourque and Kirouac (1993) found that when
a participant produces a target time interval, the interval is longer if a
memory search task is carried out in the middle of it. An explanation is
that timing stops while the memory comparison is carried out, and
resumes when it is finished. (For more on interruptions during timing,
see Fortin, Bedard and Champagne, 2005.) It is obvious that changing
the duration of a process (here, memory comparison) that interrupts
another process (here, timing) can change the duration of the interrupted
process. Selectively influencing the interrupting or interrupted process
may not be possible, but sometimes it is.

A failure of selective influence produced by dependencies

As an example of how things can go awry when there are insertions,
consider a model in which processes communicate about their status and
resource needs. For example, a process might generate a message “I will
require resource R.” Part of the time a process is busy is occupied in
generating such messages and reading those sent by others. Such
messages might be generated during task preparation, before stimulus
processing itself starts, and read by relevant processes when
they start.
Suppose the duration of process A consists of a time uA doing its
special job, a time cA generating a message for other processes, and a
time λcB reading a message generated by process B. The time process A
spends reading the message from process B is proportional to the time cB
spent by B generating the message. The duration of process A is then

A = uA + cA + λcB.

With analogous notation, suppose the duration of process B is

B = uB + cB + λcA.

Clearly, if a factor changes the duration of process B by increasing the
time B spends generating a message, the factor will also change the
durations of other processes that spend time reading the message from B.
To see what can happen in such a system, consider the Response
Selection Bottleneck Model of Figure 5.2. Processes B1 and A2 are
concurrent. If they are selectively influenced by two different factors the
ordinary prediction is that the combined effect of prolonging both B1 and
A2 would be less than the sum of the effects of prolonging them
individually. A negative interaction is predicted (see Chapter 3). This is
the prediction of locus of slack logic without insertions.
But suppose processes communicate (e.g., Logan and Schulkind,
2000). Suppose A1 and B1 send messages to A2, and A2 sends a
message to A1 and B1. To avoid further complications, suppose there are
no other messages and suppose the process durations are fixed numbers
rather than random variables.
Let the durations of the process be

A1 = uA1 + cA1 + λcA2
B1 = uB1 + cB1 + λcA2
C1 = 200
SOA = 50 (6.7)
A2 = uA2 + cA2 + λcA1 + λcB1
B2 = 200
C2 = 200.

Let process SWa have duration 0 and omit process SWb.


Response Time 1 equals A1 + B1 + C1. It is immediately clear that a
factor prolonging process A2 of Task 2 by increasing cA2 will produce an
increase in Response Time 1. This is not the ordinary prediction of the
Response Selection Bottleneck model, of course.
Response Time 2 equals max{A1 + B1, SOA + A2} + B2 + C2 − SOA.
Suppose one experimental factor influences process B1 by prolonging
cB1, the component of B1 that is inserted into other processes, and
suppose another experimental factor influences process A2, by
prolonging cA2, the duration of the component of A2 that is inserted into
other processes. Depending on the parameter values, the factors can be
additive, or can interact positively or negatively. The result can violate
the predicted effect of selectively influencing the processes, a violation
of locus of slack logic.
An example in which the combined effect is greater than the sum of
the individual effects for Response Time 2 is in Figure 6.18. (The figure
illustrates Response Time 2 + SOA. The interaction is not affected by
adding the SOA, of course.) Parameter values for this example are as
follows: uA1 = 15, cA1 = 50, uB1 = 15, cB1 = 50, SOA = 50, uA2 = 200, cA2 =
50, uC1 = uC2 = 200 and λ = .7. The increase in cB1 produced by changing
the level of the factor influencing it is 600. The increase in cA2 produced
by changing the level of the factor influencing it is 500. Changing λ
changes the interaction from positive to zero to negative.
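The sign change can be reproduced directly from Equations (6.7) with the parameter values just listed (the durations here are deterministic, so no simulation is needed):

```python
def rt2_plus_soa(c_b1, c_a2, lam):
    # Durations from Equations (6.7), with the parameter values used for
    # Figure 6.18: uA1 = uB1 = 15, cA1 = 50, uA2 = 200, B2 = C2 = 200.
    u_a1, c_a1, u_b1 = 15.0, 50.0, 15.0
    u_a2, b2, c2, soa = 200.0, 200.0, 200.0, 50.0
    a1 = u_a1 + c_a1 + lam * c_a2
    b1 = u_b1 + c_b1 + lam * c_a2
    a2 = u_a2 + c_a2 + lam * c_a1 + lam * c_b1
    # Response Time 2 + SOA for the bottleneck network.
    return max(a1 + b1, soa + a2) + b2 + c2

def interaction(lam):
    lo_b1, hi_b1 = 50.0, 50.0 + 600.0  # prolongation of cB1 is 600
    lo_a2, hi_a2 = 50.0, 50.0 + 500.0  # prolongation of cA2 is 500
    return (rt2_plus_soa(hi_b1, hi_a2, lam) - rt2_plus_soa(hi_b1, lo_a2, lam)
            - rt2_plus_soa(lo_b1, hi_a2, lam) + rt2_plus_soa(lo_b1, lo_a2, lam))

# Positive with the insertions (lam = .7), negative without them (lam = 0),
# up to floating-point rounding: 170 versus -430.
print(interaction(0.7), interaction(0.0))
```

With λ = .7 the contrast is overadditive, as in Figure 6.18; with λ = 0 the usual underadditive bottleneck prediction of locus of slack logic is recovered.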
Dependencies between concurrent processes are not the only source
of potential problems because priming and other memory effects suggest
that dependencies may arise between sequential processes. Clearly, there
are many ways dependencies can make selective influence fail. We turn
to a form of dependency with which it can succeed.

Successful selective influence with process dependence

The model of Equations (6.7) has process dependencies because a
process duration depends on variables shared with other processes. But
in the model a process duration also has a unique component shared with
no other process. For example, the durations of processes A and B both
depend on cA, making cA a common variable. But the duration of process
A, and of no other process, depends on uA, making variable uA unique to
process A.

Fig. 6.18. The combined effect of prolonging two concurrent processes in an AND
network can be greater than the sum of the effects of prolonging them individually when
the process durations are correlated.

An experimental factor that changes the unique random variable
associated with the duration of a process selectively influences that
process (Dzhafarov, 2003a). This allows two different factors to
selectively influence two different processes, even though the durations
of the two processes may be highly correlated. In examples above, the
unique part and the common part are added to produce the process
duration, but the parts need not be related by addition. The duration of
one process might be eU log W and that of another eW log V, where U and
V are the unique parts of the two processes and W is the common part.
Figures 6.6 through 6.9 show results of simulations of the processes
in the Response Selection Bottleneck model, when processes have
unique and common components. (See Appendix for details.) This time
the factor influencing process B1 changed the unique component of the
duration of B1, uB1, and the factor influencing process A2 changed the
unique component of the duration of A2, uA2. The factor influencing A2
has an effect on the correlation between the durations of A2 and B1, so,
changing the levels of this factor has an effect on the relationship
between Task 1 and Task 2. Nonetheless, the factors behave as factors
selectively influencing concurrent processes; see Figures 6.6 and 6.7.

(Survivor functions in Figure 6.6 are for the time from the onset of the
first stimulus to the second response, RT2 + SOA.)
Figures 6.10 through 6.17 show results when the Response Selection
Bottleneck Model is modified so Response 1 must be made before
Response 2. With the modification it becomes a double bottleneck
model in the form of a Wheatstone bridge (Figure 3.5). Figures 6.10 and
6.12 are for short prolongations of processes B1 and B2 in, respectively,
an OR and an AND Wheatstone bridge; Figures 6.14 and 6.16 are for
long prolongations. With this model Response Time 2 is

RT2 = max{A1 + B1 + C1, A1 + B1 + B2 + C2, SOA + A2 + B2 + C2} − SOA.

(Survivor functions illustrated are for the time from the onset of the first
stimulus to the second response, RT2 + SOA.)
When the factors change the unique component of the duration of
each process, they behave as predicted for factors selectively influencing
processes, despite high correlations between the process durations and
changes in the correlations with factor levels. Correlations are in Tables
6.2 to 6.4 and, for the Wheatstone bridge, in Tables 6.A3, 6.A4, 6.A6 and
6.A7.
In the simulations, components of the process durations have gamma
distributions. Adding a common gamma distributed random variable to
unique gamma distributed random variables is one way of producing a
multivariate gamma distribution (Kotz, Balakrishnan & Johnson, 2000).
Details of the simulations are in the Appendix.
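A sketch of the common-plus-unique construction (with gamma shapes and scales chosen arbitrarily, not the Appendix values) shows correlated durations that are nonetheless selectively influenced:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300_000

# Common source of randomness, shared by both process durations.
C = rng.gamma(shape=3.0, scale=20.0, size=n)

def duration(unique_shape):
    # Common part plus an independent unique gamma part: one way to
    # build a multivariate gamma (Kotz, Balakrishnan & Johnson, 2000).
    return C + rng.gamma(shape=unique_shape, scale=20.0, size=n)

# The factor on process A changes only A's unique component.
A_low, A_high = duration(2.0), duration(4.0)
B = duration(2.0)  # factor on B held at a single level

corr = np.corrcoef(A_low, B)[0, 1]               # durations are correlated
cond_corr = np.corrcoef(A_low - C, B - C)[0, 1]  # unique parts are not
shift = A_high.mean() - A_low.mean()             # factor slows A only

print(f"corr = {corr:.2f}, corr given C = {cond_corr:.3f}, "
      f"mean shift on A = {shift:.1f}")
```

The marginal distribution of B is the same at either level of the factor on A, so the factor selectively influences A despite a correlation of about .6 between the durations; conditioning on C removes the dependence.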
For the simulated reaction times, means and mean interaction
contrasts are reported in the Appendix. For sequential processes X and Y,
as factor levels increase the mean interaction contrast approaches the
coupled slack, k(X,Y), despite correlations between process durations.
This is as predicted in Chapters 3 and 4 for factors selectively
influencing processes.

Table 6.2
Response Bottleneck Model
Correlations Between Process Durations in Simulations
All Factors at Lowest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.3925 0.2846 0.5817 -0.0099 -0.0222
B1 0.3925 1.0000 0.2774 0.5816 -0.0244 -0.0039
C1 0.2846 0.2774 1.0000 0.4254 0.5938 0.6047
A2 0.5817 0.5816 0.4254 1.0000 0.2895 0.2823
B2 -0.0099 -0.0244 0.5938 0.2895 1.0000 0.3943
C2 -0.0222 -0.0039 0.6047 0.2823 0.3943 1.0000

Table 6.3
Response Bottleneck Model
Correlations Between Process Durations in Simulations
Factors Selectively Influencing B1 and A2 at Highest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.1684 0.3120 0.3014 -0.0106 0.0148
B1 0.1684 1.0000 0.1227 0.1013 -0.0084 0.0103
C1 0.3120 0.1227 1.0000 0.2294 0.5903 0.5959
A2 0.3014 0.1013 0.2294 1.0000 0.1708 0.1747
B2 -0.0106 -0.0084 0.5903 0.1708 1.0000 0.3979
C2 0.0148 0.0103 0.5959 0.1747 0.3979 1.0000

Table 6.4
Response Bottleneck Model
Correlations Between Process Durations in Simulations
Factors Selectively Influencing A1 and B1 at Highest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.0783 0.1371 0.2583 -0.0035 0.0418
B1 0.0783 1.0000 0.1425 0.2737 0.0335 0.0174
C1 0.1371 0.1425 1.0000 0.4437 0.5882 0.5984
A2 0.2583 0.2737 0.4437 1.0000 0.3040 0.3044
B2 -0.0035 0.0335 0.5882 0.3040 1.0000 0.3990
C2 0.0418 0.0174 0.5984 0.3044 0.3990 1.0000

Selective Influence and Conditional Independence

A main point of this chapter is that factors can selectively influence
processes even when their durations are not stochastically independent.
Of course, with some forms of dependency selective influence will not
be possible. A fruitful alternative to stochastic independence is the
assumption that process durations are conditionally stochastically
independent. Such conditional independence follows directly from the
following definition of factors selectively influencing processes. The
definition is by Dzhafarov (2003a).
The definition is for selectively influenced random vectors X1,..., Xn.
A random vector is a list of random variables that have a joint
cumulative distribution function. As a special case, a random variable
can be considered to be a random vector with one random variable in its
list. If random vectors Y and Z have the same joint cumulative
distribution function, we write Y ∼ Z. The definition of selective
influence uses the notion of a measurable function, defined in the
Appendix. The role of measurable functions in the definition is to return
values for the components of a random vector Xk given values of random
vectors C and Sk. A vector of random vectors <T,..., W> has a joint
distribution P[T ≤ t, ..., W ≤ w]. Random vectors {T,..., W} are
mutually stochastically independent if their joint distribution is the
product of their individual distributions; i.e., P[T ≤ t,..., W ≤ w] =
P[T ≤ t] ... P[W ≤ w].

Definition 6.1 (Specialized for Random Vectors) Suppose the
distributions of random vectors X1,..., Xn do not depend on factors other
distributions of random vectors X1,..., Xn do not depend on factors other
than Α1,..., Αn. When Factor Α1 is at level i,..., and Factor Αn is at level m,
denote the random vector Xk by Xk<i,...,m>. Random vectors X1,..., Xn are
selectively influenced by factors Α1,..., Αn respectively, if there are
mutually stochastically independent random vectors C, S1,..., Sn, whose
distributions do not depend on the factors, and there are measurable
functions f1,i, ..., fn,m such that

<X1<i,...,m>,..., Xn<i,...,m>> ∼ <f1,i(C, S1),..., fn,m(C, Sn)>.

The function f1,i(C,S1) has the same distribution as the random vector

X1<i,..., m>. When the definition holds, random vector X1<i,..., m> does not
depend on the level of any factor except level i of the first factor, so we
can write X1<i,..., m> simply as X1<i>. We can do the same for each random
vector Xk. According to the definition, each random vector Xk has the
same distribution as does a function of two arguments, a source C of
randomness common to all of X1,..., Xn and a source of randomness Sk
unique to Xk. The common source of randomness allows the random
vectors to be interdependent. The unique source of randomness for
random vector Xk allows its distribution to change when the level of the
factor selectively influencing it changes, with no change in other
distributions.
Because we mainly use the definition for factors selectively
influencing random variables, for convenience we restate it in a form
specifically for them.

Definition 6.2 (Specialized for Random Variables): Suppose the
distributions of random variables A, B,..., Z do not depend on factors
distributions of random variables A, B,..., Z do not depend on factors
other than Α, Β,..., Ζ. Let i be a level of Factor Α, j be a level of Factor
Β,..., and m be a level of Factor Ζ. When the factors have these levels,
denote the random variable A by A<i,..., m>; other notation is analogous.
Random variables A, B,..., Z are selectively influenced by factors Α, Β,...,
Ζ respectively, if there are mutually stochastically independent random
vectors C, S1,..., Sn, whose distributions do not depend on the factors,
and there are measurable functions f1,i, ..., fn,m such that

<A<i,...,m>,..., Z<i,...,m>> ∼ <f1,i(C, S1),..., fn,m(C, Sn)>.

The notation ∼ means that the random vector on the left-hand side has
the same distribution as the random vector on the right-hand side.
Here, the distribution of function f1,i(C, S1) is the same as that of
random variable A<i,..., m>, which we can write simply as A<i> when the
definition holds.
Now suppose with Definition 6.1, C takes on the value c, where c is a
vector of real numbers, a value for every component of C. Given c,

<X1<i,...,m>,..., Xn<i,...,m>> ∼ <f1,i(c, S1),..., fn,m(c, Sn)>.

Because S1,..., Sn are mutually stochastically independent, so are
X1<i,...,m>,..., Xn<i,...,m> given c. Consequently, the joint distribution of
X1<i,...,m>,..., Xn<i,...,m> conditional on c can be written as the product of their
individual marginal distributions conditional on c. Such conditional
independence not only follows from selective influence, but is equivalent
to it as the following lemma states.

Lemma 6.1 (Dzhafarov, 2003a) Random vectors X1,..., Xn are
selectively influenced by factors Α1,..., Αn respectively, if and only if there
selectively influenced by factors Α1,..., Αn respectively, if and only if there
are mutually stochastically independent random vectors C, S1,..., Sn,
whose distributions do not depend on the factors, such that X1,..., Xn are
conditionally mutually stochastically independent given any value c of
C.
For proof, see Dzhafarov (2003a) Proposition 1.

This definition of the selective influence of factors on random
variables is important for the study of process architecture, because it
variables is important for the study of process architecture, because it
allows processes to communicate, share resources and otherwise depend
on one another and yet to be selectively influenced.
The definition above is sufficient for this chapter. Over time it has
been written in more general ways (Dzhafarov & Gluhovsky, 2006;
Kujala & Dzhafarov, 2008; Dzhafarov & Kujala, 2010). In particular,
the random vectors in the above definition have been generalized to
random entities, defined in the Appendix. For further consequences see
Chapter 10.
In earlier chapters, we defined selective influence by increments. The
gist was to assume factors influencing processes produced stochastic
dominance. This single assumption had two jobs. First, it ordered the
levels of the factors in accord with the “usual stochastic ordering” of the
process durations. Second, it assured the existence of a common
probability space on which process durations are defined at different
levels of the factors. For the results in earlier chapters on mean reaction
times and mean interaction contrasts, this form of selective influence is
sufficient, with no need to assume stochastic independence of process
durations.
To use distribution functions as a dependent measure, rather than
means, it is useful to define selective influence with Definition 6.1 (or
6.2). This does the job of assuring the existence of a common
probability space on which process durations are defined at different
factor levels. The other job, ordering the levels of the factors, must now
be done with a separate additional assumption, conditional stochastic
dominance. Again, there is no need to assume stochastic independence
of process durations.
We need to explain a phrase used in the following definition of
conditional stochastic dominance. With process dependencies, it is not
obvious what it means to say a random variable is “invariant” when a
factor level changes. For one thing, the correlation of that process’s
duration with other process durations may change with the factor level.
One way to make “invariance” precise is to say a process is selectively
influenced by a factor, but that factor has only one level. The following
definition uses that idea.

Definition 6.3 Suppose random variables <A, B, Z1,..., Zn> are
selectively influenced in the sense of Definition 6.2 by Factors <Α, Β,
selectively influenced in the sense of Definition 6.2 by Factors <Α, Β,
Ζ,..., Ζ>, respectively, where Ζ is a factor with one level. With C as in
the definition of selective influence, we say conditional stochastic
dominance is satisfied for levels i* and i of Factor Α if for every value c
of C,

Pi*[A > t|C = c] ≥ Pi[A > t|C = c].

Likewise, we say conditional stochastic dominance is satisfied for levels
j* and j of Factor Β if for every value c of C,

Pj*[B > t|C = c] ≥ Pj[B > t|C = c].



An example proof not requiring independence

The following is an example of how results based on stochastically
independent process durations and stochastic dominance can
immediately be reestablished based on conditional independence and
conditional stochastic dominance. The theorem is Result 2 above, for
selective influence of a single process. All results for task networks
stated above can be established similarly (Dzhafarov, Schweickert &
Sung, 2004). The theorem uses a slightly weaker form of conditional
stochastic dominance than in the definition above. Recall that S(t)
denotes a survivor function.

Theorem 6.4 Suppose random variables A, Z1,..., Zn are the durations
of the processes in a directed acyclic task network whose gates are all
of the processes in a directed acyclic task network whose gates are all
OR gates or all AND gates. Suppose A, Z1,..., Zn are selectively
influenced in the sense of Definition 6.1 by factors Α, Ζ,..., Ζ,
respectively, where Factor Ζ has one level, and suppose A, Z1,..., Zn do
not depend on other factors.
Suppose C as in Definition 6.1 is a random vector with joint density
function f(c). Suppose for levels i* and i of Factor Α, for all t ≤ τ, for
every value c of C,

SAi*(t|C = c) ≥ SAi(t|C = c).

Then for all t ≤ τ, STi*(t) ≥ STi(t).

Proof: Suppose C as in Definition 6.1 is a random vector with joint
density function f(c). Consider any value t, with t ≤ τ. For any value c of
C, for level i of Factor Α

Pi[A > a, Z1 > z1,..., Zn > zn|C = c] =
Pi[A > a|C = c]P[Z1 > z1|C = c] ... P[Zn > zn|C = c].

The analogous equality is true for level i* of Factor A.


Effects of Factors on Distribution Functions 213

Then by Theorem 6.2, for any value t, with t ≤ τ, for any value c of C,

STi*(t|C = c) ≥ STi(t|C = c).

Then

STi*(t) = ∫R STi*(t|C = c) f(c) dc ≥ ∫R STi(t|C = c) f(c) dc = STi(t),

where R is the set of all possible values of the vector c. ∎
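The marginalization step in this proof can be illustrated numerically. The following sketch is in Python rather than the MATLAB the authors used for their simulations, and every distribution and parameter value in it is invented for illustration. It builds a serial two-process network in which A and Z1 both depend on a common random source C, so they are dependent marginally but conditionally independent given C = c; conditional dominance of A at the higher factor level then yields dominance of the marginal survivor functions of T.

```python
import random

random.seed(1)

def simulate(shift, n=100_000):
    """Reaction times T = A + Z1 for a serial two-process network.

    Conditional on C = c, A and Z1 are independent exponentials, but both
    means depend on c, so A and Z1 are dependent marginally.  `shift` is
    the extra mean duration of A at the higher level of Factor A.
    """
    times = []
    for _ in range(n):
        c = random.uniform(0.0, 100.0)                  # common source C
        a = random.expovariate(1.0 / (100.0 + shift + c))
        z1 = random.expovariate(1.0 / (50.0 + c))
        times.append(a + z1)
    return times

t_i = simulate(0.0)        # Factor A at level i
t_istar = simulate(100.0)  # level i*: conditional dominance holds for every c

def survivor(sample, t):
    return sum(x > t for x in sample) / len(sample)

# Marginal dominance S_Ti*(t) >= S_Ti(t), allowing for sampling noise
for t in (100, 200, 300, 400, 600):
    assert survivor(t_istar, t) >= survivor(t_i, t) - 0.01
```

Conditional dominance holds here because, for every c, an exponential with the larger mean stochastically dominates one with the smaller mean; the loop checks that the dominance survives averaging over C, as the integral in the proof requires.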

Concluding Remarks

Additive effects of factors on mean reaction time provide evidence that


the factors selectively influence processes in series, but the evidence is
not conclusive. In the Alternate Pathways Model of Roberts and
Sternberg (1993), factors selectively influencing processes have additive
effects on mean reaction times, but the processes follow a mixture
distribution and are not in series. Cumulative distribution functions
provide more evidence than means; processes in series and mixtures
make different predictions about cumulative distribution functions.
Ashby and Townsend (1980) showed that if factors selectively influence
processes in series, certain cumulative distribution functions are
predicted to be equal. Roberts and Sternberg (1993) developed an
elegant test of this equality, the Summation Test, and found experimental
support for it. They also developed a test for the Alternate Pathways
Model, the Mixture Test, which failed for the same experiments.
Simulations by Van Zandt and Ratcliff (1995) demonstrate that data
sampled from the Alternate Pathways Model can satisfy the Summation
Test statistically, even though failure is predicted for the population data.
Their simulated data passed the Mixture Test, as predicted. Good power
is needed to use the tests effectively. When processes are in series, their
durations are combined by addition to form the reaction time.
Commutative and associative operations other than addition can be tested

with the Decomposition Test of Dzhafarov and Schweickert (1995),


analogous to the Summation Test. Process arrangements that can satisfy
tests of distribution equality, such as the Summation Test, are greatly
constrained.
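The logic of the Summation Test can be sketched in a few lines. Under a serial model with selective influence, a reaction time decomposes as T(i, j) = Ai + Bj + R, so the sum of an RT sampled in condition (i, j) and one sampled in condition (i*, j*) has the same distribution as the sum of one from (i, j*) and one from (i*, j). The Python sketch below uses invented distributions and parameter values, compares quantiles of the two sums, and also checks that the mean interaction contrast is near zero.

```python
import random

random.seed(2)
N = 100_000

def rt(a_mean, b_mean):
    # two serial stages plus an unaffected residual stage, all independent
    return (random.expovariate(1.0 / a_mean)
            + random.expovariate(1.0 / b_mean)
            + random.gauss(300.0, 20.0))

t11 = [rt(100, 80) for _ in range(N)]    # both factors at their low levels
t12 = [rt(100, 160) for _ in range(N)]
t21 = [rt(150, 80) for _ in range(N)]
t22 = [rt(150, 160) for _ in range(N)]

# Summation Test: T11 + T22 and T12 + T21 should match in distribution
sums_a = sorted(x + y for x, y in zip(t11, t22))
sums_b = sorted(x + y for x, y in zip(t12, t21))
for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    assert abs(sums_a[int(q * N)] - sums_b[int(q * N)]) < 15

# Additive factors: the mean interaction contrast is near zero
mean = lambda xs: sum(xs) / len(xs)
mic = mean(t11) - mean(t12) - mean(t21) + mean(t22)
assert abs(mic) < 6
```

Comparing several quantiles rather than only the means is what gives the Summation Test its extra power over the additive factor method.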
The usual way of detecting that two factors interact is by examining
the interaction contrast of means. An analogous interaction contrast can
be formed with cumulative distribution functions, or, equivalently, with
survivor functions. For such interaction contrasts, Townsend and
Nozawa (1995) derived inequalities that distinguish processes in series
from processes in parallel (perhaps preceded or followed by another
process). For the latter arrangement, stopping when any of the parallel
processes is finished (an OR gate) can be distinguished from stopping
when all are finished (an AND gate). For processes in directed acyclic
task networks, such interaction contrast inequalities distinguish
sequential processes from concurrent ones, and networks with OR gates
from those with AND gates (Schweickert, Dzhafarov & Sung, 2004).
Distribution inequalities are based on weaker assumptions than
distribution equalities; however, when they are satisfied, less is learned
about process dependencies.
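These interaction contrast diagnostics can be sketched directly. In the Python fragment below (exponential durations and the particular means are arbitrary choices, not the models of this chapter), two channels X and Y are combined serially (X + Y), through an OR gate (min), or through an AND gate (max), and the mean interaction contrast MIC = RT_LL − RT_LH − RT_HL + RT_HH is computed, with L the fast and H the slow level of each factor:

```python
import random

random.seed(3)
N = 100_000

def mic(combine):
    """Mean interaction contrast for two selectively influenced channels."""
    means = {}
    for levels in [(100, 100), (100, 200), (200, 100), (200, 200)]:
        total = 0.0
        for _ in range(N):
            x = random.expovariate(1.0 / levels[0])
            y = random.expovariate(1.0 / levels[1])
            total += combine(x, y)
        means[levels] = total / N
    return (means[(100, 100)] - means[(100, 200)]
            - means[(200, 100)] + means[(200, 200)])

s_mic = mic(lambda x, y: x + y)   # serial
or_mic = mic(min)                 # OR gate: first-terminating
and_mic = mic(max)                # AND gate: exhaustive

assert abs(s_mic) < 6    # serial: additive, MIC ~ 0
assert or_mic > 8        # OR: overadditive (population value here +16.7)
assert and_mic < -8      # AND: underadditive (population value here -16.7)
```

For independent exponentials with means 100 and 200 the three population values work out to 0, +16.7, and −16.7 ms, so the sign of the contrast separates the three arrangements, in line with the Townsend and Nozawa results cited above.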
For derivations of the various diagnostic tests, it is convenient to
assume process durations are stochastically independent. There is
evidence that stochastic independence sometimes occurs. But, for many
reasons, it does not always occur. Processes share capacity, and they
communicate producing cross-talk, coactivation and insertions.
Fortunately, stochastic independence of process durations is not
necessary for factors to selectively influence processes. Independence is
not needed for factors selectively influencing processes in series to have
additive effects on mean reaction times (Sternberg, 1969). For
distribution function tests it is often sufficient that process durations are
conditionally independent (Townsend & Nozawa, 1995; Dzhafarov,
2003a). Simulations show that survivor function interaction contrasts
behave as predicted even with highly correlated process durations.
Further, interaction contrasts of means behave as predicted. Simulations
show it is feasible to estimate parameters; when factors selectively
influence two sequential processes, as factor levels increase mean

interaction contrasts approach the coupled slack. Considerations of


process dependence have led to a clearer and more powerful conception
of what it means for factors to selectively influence random variables in
the work of Dzhafarov and Kujala.
Given the complexity of the brain, it is not surprising that some
experimental factors have effects that ramify through the system. In the
experience of the authors, effects of cross-talk can be rather messy
(Dutta, Schweickert, Choi & Proctor, 1995; Lien, Schweickert, &
Proctor, 2003; Schweickert, Fortin & Sung, 2007). Unless cross-talk is
the object of investigation, it is desirable to minimize it. Useful facts are
emerging. Items in memory appear to produce less cross-talk than
displayed items (Dutta, et al., 1995). Cross-talk produced by similar
stimuli is more likely to occur when the tasks to be done with the stimuli
are the same than when different (Logan & Schulkind, 2000; see also
Lien, et al., 2003). It is useful to encourage the subject to make decision
criteria large, to promote responses based on relevant information.
If one factor in an experiment produces process dependencies, it does
not mean that other factors cannot selectively influence processes and
reveal process arrangement. An investigator can sometimes fix a factor
with ramifying effects at one particular level, or average response times
over different levels of the factor. Analysis of other factors can then
proceed. By analogy, noise from stray electrical fields is a nuisance in
experiments measuring evoked potentials, but investigations proceed
despite it.

Appendix

Details of task network simulations

Simulations of the Response Selection Bottleneck Model of Figure 5.2


and the Wheatstone bridge modification of it in Figure 3.5 are illustrated
in Figures 6.6 through 6.17. Simulations were done in MATLAB. For
each combination of factor levels, 5000 simulated trials were run.
In both models the duration of process SWa was fixed at 0 and
process SWb was omitted. The duration of each process consisted of its
unique component, its common component, and the common component
due to every process it is concurrent with. Durations of the SOA and of
SWa had no common components of their own or due to other processes.
For example, process A1 is concurrent with process A2. The duration of
process A1 was

unique component of A1 + common component of A1


+ common component of A2.

And the duration of process C1, which is concurrent with A2, B2 and C2,
was

unique component of C1 + common component of C1


+ common component of A2 + common component of B2
+ common component of C2.

The duration of each entire common component was added (this


corresponds in Equations (6.7) to λ = 1).
Each component was a gamma distributed random variable and these
were independent. A random variable with a gamma distribution has two
parameters, α and β. (When α is a positive integer, the random variable
can be considered to be the sum of α independent exponentially
distributed random variables, each with mean β.)
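The gamma sampling scheme and the common-component coupling just described can be sketched as follows. This is Python rather than the authors' MATLAB, and it is a simplified two-process fragment (only A1 and A2), not the full simulation; the parameter values are the ones quoted for the Response Selection Bottleneck Model.

```python
import random

random.seed(4)

def gamma_sum(alpha, beta):
    """Gamma(alpha, beta) sampled as a sum of alpha independent
    exponentials, each with mean beta (valid for integer alpha);
    the mean is alpha * beta."""
    return sum(random.expovariate(1.0 / beta) for _ in range(alpha))

N = 50_000
u_a1 = [gamma_sum(4, 4) for _ in range(N)]   # unique components (alpha = 4)
u_a2 = [gamma_sum(4, 4) for _ in range(N)]
c_a1 = [gamma_sum(8, 4) for _ in range(N)]   # common components (alpha = 8)
c_a2 = [gamma_sum(8, 4) for _ in range(N)]

# A1 and A2 are concurrent, so each duration adds the other's common part
d_a1 = [u + c1 + c2 for u, c1, c2 in zip(u_a1, c_a1, c_a2)]
d_a2 = [u + c2 + c1 for u, c2, c1 in zip(u_a2, c_a2, c_a1)]

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

r = corr(d_a1, d_a2)
assert abs(sum(d_a1) / N - 80) < 1   # mean = 4*4 + 8*4 + 8*4 = 80
assert r > 0.4                       # shared components induce correlation
```

Because the two common components appear in both durations, the correlation between A1 and A2 is large and positive, which is the mechanism behind the correlated process durations reported in Tables 6.A3 through 6.A7.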

The Response Selection Bottleneck Model

The SOA was 100. The duration of each common component was a
gamma random variable with α = 8 and β = 4. All unique components
when factors were at their lowest levels were gamma random variables
with α = 4 and β = 4. All the gamma distributed random variables were
independent. A sample of each was taken for each trial. In particular,
independent samples were taken for the duration of each common
component.
When the sequential processes A1 and B1 were selectively
influenced, the factor selectively influencing A1 changed α of the unique
component of A1. Likewise, the factor selectively influencing B1
changed α of the unique component of B1. Values of α are in the
margins of the tables with results. When concurrent processes B1 and A2
were selectively influenced, the situation was the same, except one factor
selectively influenced A2 instead of A1.

The Wheatstone bridge

Parameters when each factor was at its lowest level were the same as for
the Response Selection Bottleneck Model, except that α for process C1
was 28, to make C1 long enough to be important. The processes on
opposite sides of the Wheatstone bridge were B1 and B2; values of α for
them at various levels of the factors selectively influencing them are in
the tables with results. For the Wheatstone bridge with OR gates, the
SOA was 25, α for A2 was 1 and α for C1 was 16.

Table 6.A1
Means of Simulated Reaction Times
When Factors Selectively Influence
Concurrent Processes B1 and A2 of the
Response Selection Bottleneck Model
α for B1
α for A2 4 12 36 48 96
4 404 436 533 579 772
12 405 436 532 580 772
36 448 456 531 580 771
48 497 497 537 580 772
96 687 688 687 688 774
Interaction Contrasts
α for B1
α for A2 4 12 36 48 96
4
12 -1 -2 0 -2
36 -25 -46 -44 -46
48 -32 -89 -92 -94
96 -31 -129 -175 -282

Table 6.A2
Means of Simulated Reaction Times
When Factors Selectively Influence
Sequential Processes A1 and B1 of the
Response Selection Bottleneck Model
α for B1
α for A1 4 12 36 48 96
4 404 404 450 495 688
12 405 408 480 528 720
36 448 480 575 623 818
48 497 528 625 672 864
96 687 721 815 863 1055
Interaction Contrasts
From Simulations k(A1, B1) = 84
α for B1
α for A1 4 12 36 48 96
4
12 2 29 31 30
36 32 81 83 85
48 32 82 84 83
96 33 82 84 84

Table 6.A3
OR Wheatstone Bridge
Correlations Between Process Durations in Simulations
All Factors at Lowest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.3925 0.2473 0.6119 -0.0153 -0.0022
B1 0.3925 1.0000 0.2407 0.6164 -0.0272 -0.0143
C1 0.2473 0.2407 1.0000 0.3842 0.5129 0.5245
A2 0.6119 0.6164 0.3842 1.0000 0.2953 0.3037
B2 -0.0153 -0.0272 0.5129 0.2953 1.0000 0.3986
C2 -0.0022 -0.0143 0.5245 0.3037 0.3986 1.0000

Table 6.A4
OR Wheatstone Bridge
Correlations Between Process Durations in Simulations
Factors Selectively Influencing B1 and B2 at Highest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.1701 0.2530 0.6044 -0.0286 0.0006
B1 0.1701 1.0000 0.1169 0.2609 0.0075 0.0119
C1 0.2530 0.1169 1.0000 0.3989 0.2084 0.5216
A2 0.6044 0.2609 0.3989 1.0000 0.1251 0.3131
B2 -0.0286 0.0075 0.2084 0.1251 1.0000 0.1721
C2 -0.0006 0.0119 0.5216 0.3131 0.1721 1.0000

Table 6.A5
Means of Simulated Reaction Times
When Factors Selectively Influence
Sequential Processes B1 and B2
On Opposite Sides of an OR Wheatstone Bridge

α for B1
α for B2 4 8 12 36 48 96
4 309 323 333 352 352 352
8 315 328 339 366 368 368
12 317 330 344 382 384 384
36 316 333 349 438 464 481
48 317 334 349 444 485 527
96 317 333 348 445 493 672

Table 6.A5 (Continued)


Interaction Contrasts
From Simulations, k(B1, B2) = 323

α for B1
α for B2 4 8 12 36 48 96
4 312 309 304 227 179
8 301 299 295 225 179
12 287 285 283 225 179
36 190 190 192 184 162
48 144 145 145 143 137
96
Note. Interaction contrasts calculated with highest factor levels as baseline. In the upper
left corner they approach k(B1, B2).
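The interaction contrasts in these appendix tables are all computed the same way: each cell of the mean-RT table is combined with a baseline row and column. A minimal Python sketch (the 2 × 2 table here is made up, not taken from the simulations):

```python
def interaction_contrasts(M, r0, c0):
    """IC(r, c) = M[r][c] - M[r][c0] - M[r0][c] + M[r0][c0], with
    baseline cell (r0, c0); Table 6.A1 uses the lowest factor levels
    as baseline, Table 6.A5 the highest."""
    return [[M[r][c] - M[r][c0] - M[r0][c] + M[r0][c0]
             for c in range(len(M[0]))]
            for r in range(len(M))]

M = [[300, 340],    # hypothetical mean RTs for a 2 x 2 design
     [360, 420]]
low_base = interaction_contrasts(M, 0, 0)
high_base = interaction_contrasts(M, 1, 1)
assert low_base[1][1] == 20    # 420 - 360 - 340 + 300
assert high_base[0][0] == 20   # 300 - 340 - 360 + 420
```

With a 2 × 2 design the two baselines give the same contrast; in the larger tables above, changing the baseline changes which corner of the table carries the informative contrasts, which is why the note under Table 6.A5 flags its choice of baseline.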

Table 6.A6
AND Wheatstone Bridge
Correlations Between Process Durations in Simulations
All Factors at Lowest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.3925 0.2169 0.5798 -0.0178 -0.0183
B1 0.3925 1.0000 0.2156 0.5870 -0.0283 -0.0129
C1 0.2169 0.2156 1.0000 0.3283 0.4645 0.4784
A2 0.5798 0.5870 0.3283 1.0000 0.2841 0.2802
B2 -0.0178 -0.0283 0.4645 0.2841 1.0000 0.4034
C2 -0.0183 -0.0129 0.4784 0.2802 0.4034 1.0000

Table 6.A7
AND Wheatstone Bridge
Correlations Between Process Durations in Simulations
Factors Selectively Influencing B1 and B2 at Highest Levels

A1 B1 C1 A2 B2 C2
A1 1.0000 0.1777 0.2510 0.5940 -0.0134 -0.0122
B1 0.1777 1.0000 0.0972 0.2456 -0.0203 -0.0078
C1 0.2510 0.0972 1.0000 0.3591 0.1765 0.4722
A2 0.5940 0.2456 0.3591 1.0000 0.1200 0.3265
B2 -0.0134 -0.0203 0.1765 0.1200 1.0000 0.1723
C2 -0.0122 -0.0078 0.4722 0.3265 0.1723 1.0000

Table 6.A8
Means of Simulated Reaction Times
When Factors Selectively Influence
Sequential Processes B1 and B2
On Opposite Sides of an AND Wheatstone Bridge

α for B1
α for B2
4 12 36 48 96
4 419 441 533 579 772
12 439 452 533 581 773
36 528 529 578 624 816
48 577 577 627 673 864
96 767 769 816 864 1056

Interaction Contrasts
From Simulations, k(B1, B2) = -65

α for B1
α for B2
4 12 36 48 96
4
12 -10 -21 -19 -20
36 -20 -64 -63 -65
48 -21 -64 -64 -66
96 -20 -65 -63 -65

Random entities and measurable functions

Recall from Chapter 4 that a probability space is a triple <Ω', S, P>,
where Ω' is a set, S is a set of subsets of Ω' that form a σ-algebra, and P
is a probability measure on S. A measurable space is an ordered pair
<Ω, Σ>, where Ω is a set and Σ is a σ-algebra of subsets of Ω (e.g.,
Royden, 1968).
Now let <Ω', S, P> be a probability space and let <Ω, Σ> be a
measurable space. A measurable function from <Ω', S, P> into <Ω, Σ>
is a function f from Ω' into Ω such that for any E ∈ Σ,

{x: f(x) ∈ E} ∈ S.

In Chapter 10 we use the notion of a random entity, a measurable


function from a probability space to a measurable space. Note that a
random entity f from <Ω', S, P> into <Ω, Σ> induces a probability
measure M on <Ω, Σ>; namely, for every E ∈ Σ, M(E) = P({x: f(x) ∈ E}).
For more about these notions, see Dzhafarov (2003a) and Dzhafarov and
Kujala (2010).
A random variable is a special case of a random entity. Random
variables take on values that are real numbers. For the set of real
numbers, the smallest σ-algebra containing all the intervals of real
numbers is called the Borel σ-algebra. If Ω is the set of real numbers
and Σ is the Borel σ-algebra on the real numbers, then a random entity
defined as above is a random variable. A random vector is an ordered n-
tuple of random variables all from the same probability space, and
random vectors are random entities.
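A toy discrete example may make the induced-measure construction concrete. The die example below is ours, not the book's; with finite sets every collection of all subsets trivially forms a σ-algebra, so measurability is automatic.

```python
from fractions import Fraction

# Probability space: a fair die.  Omega' = {1,...,6}, P uniform, with the
# sigma-algebra taken to be all subsets of Omega'.
omega_prime = range(1, 7)
P = {x: Fraction(1, 6) for x in omega_prime}

# Measurable space: Omega = {"odd", "even"}; the random entity f maps
# each outcome to its parity.
def f(x):
    return "even" if x % 2 == 0 else "odd"

# Induced measure M(E) = P({x : f(x) in E})
def M(E):
    return sum(P[x] for x in omega_prime if f(x) in E)

assert M({"even"}) == Fraction(1, 2)
assert M({"odd", "even"}) == 1
assert M(set()) == 0
```

Here f is a random entity but not a random variable, since its values are labels rather than real numbers; replacing "odd"/"even" with 0/1 would make it a random variable in the sense of the paragraph above.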

Note

1. We thank Saul Sternberg for kindly providing figures and helpful


information, particularly for the simulations of Roberts and Sternberg
(1993, Note 20).
Chapter 7

Visual and Memory Search,


Time Reproduction, Perceptual
Classification, and Face Perception

Although reaction time distributions can be analyzed whenever means


are analyzed, distributions have been considered more often for some
tasks than others. Tasks in this chapter illustrate analyses of distribution
functions.

Visual Search

A visual search task typically includes one or more known (sometimes


unknown) target objects that observers must identify among non-target
objects, called distractors. Although the task is simple, the search itself
requires very different levels of information processing, ranging from
perception to cognition, from color discrimination to decision making.
Many different factors affect different processes during a visual search.
At the perceptual level, search performance can be affected by physical
properties of the target and the distractors. At the cognitive level, search
performance can be affected by feedback and amount of reward or
punishment observers receive. This section will focus on the two
important processing stages in visual search, where mean and cumulative
distribution function (or, equivalently, survivor function) interaction
contrast tests have been used to resolve certain theoretical issues.
Visual search has been modeled as having two qualitatively different
processing stages arranged sequentially: a parallel feature extraction


stage that operates on all stimuli simultaneously, followed by a serial


attentive processing stage that operates on one stimulus at a time. Feature
Integration Theory (FIT; Treisman & Sato, 1990; Treisman & Gelade,
1980) and Guided Search (GS; Wolfe, 1994) are two well-known models
based on this two-stage structure. According to these models, visual
search starts by extracting basic features (e.g., color, shape, orientation
of a line or shape, etc.) and building “feature maps” that carry the
location information of the features of the objects in the visual field. For
example, if red-colored objects are presented, there will be a red feature
map that stores the location information of the “red” colored objects in
the visual field (or visual memory). Similarly, if there are differently
oriented lines, a line-orientation feature map will be constructed by the
visual system to store the location information of the lines oriented in a
certain way. These feature maps, according to two-stage models, are
constructed without mental effort or, in the models’ term, attentional
resource. Also, since this process does not require attentional resource,
the models further assume that multiple feature maps can be constructed
simultaneously by the visual system, making the feature analysis stage a
parallel processing system. Upon construction of feature maps, visual
attention is required to integrate the individual feature maps into a single
master map that holds location information of object representations
defined by different features (e.g., a red line rotated by 45°). In this stage,
object representations in the master map are examined one by one for
target identification.
Support for a two-stage search structure is mainly based on the
different performances in two search conditions called “singleton” and
“conjunction” searches (e.g., Treisman & Gelade, 1980). In a singleton
search, a target can be effectively discriminated by a single feature (e.g.,
finding a red target object among green distracters). The target usually
pops out in this type of search and the response time (RT) to find a target
tends to remain constant as the number of distractors (i.e., green objects)
increases. Two-stage models explain this RT pattern in the following
way: since only one feature (e.g., the color “red”) is needed to
discriminate the target from distractors, the parallel feature extraction
stage already provides sufficient information about the location of the
Visual/Memory Search, Time Reproduction, Classification, and Face Perception 225

target from the color feature map, which in turn guides the visual
attention directly to that location (Wolfe, 1994; Wolfe, Cave, and
Franzel, 1989). Thus, the number of distractors does not affect RT
much, because features are extracted in parallel without any attentional
resource.
In contrast, in a conjunction search, a combination of multiple
features is needed to distinguish a target from distractors (e.g., searching for
a red letter “T” among green “T”s and red “L”s). This is because the
target shares some of its features with the distractors and cannot be
discriminated from the distractors by just one feature. Typically in a
conjunction search, the mean RT tends to increase significantly as the
number of distractors increases (i.e., a significant set-size effect). Two-
stage models explain the set-size effect in this type of search by
assuming serial engagement of attentive processes in the second stage.
Since the objects in a conjunction search are defined by multiple features
and the attentional system is assumed to work on one spatial location at a
time to integrate different features at that location, it follows that the RT
increases as the number of distractors increases. Although there is some
evidence against serial processing in some conjunction search conditions
(Nakayama & Silverman, 1986; McLeod, Driver, & Crisp, 1988;
Pashler, 1987; Wolfe et al., 1989), most recent versions of two-stage
models explain these exceptions fairly well with additional mechanisms
and assumptions (see Treisman & Sato, 1990). The following section
reviews two studies that explicitly test the two different processing stages
of two-stage models by using the method of selective influence and the
mean and cumulative distribution function interaction contrasts of RTs.

Testing of parallel preattentive stage

Egeth and Dagenbach (1991) adapted simple search tasks to investigate


whether two or more objects can be processed concurrently. Search
tasks were simple, in that only two stimuli were presented. The important
experimental manipulation, however, was the visual quality (brightness)
of the stimuli. In three experiments with different letter stimuli, Egeth
and Dagenbach manipulated the visual quality of each stimulus
independently. For example, when no target was presented, there were

four different conditions: both stimuli bright, only the left or right
stimulus bright, and neither of them bright. According to early studies
(e.g., Pashler & Badgio, 1985; Johnsen & Briggs, 1973), manipulation of
the visual quality of a stimulus should selectively influence the feature
extraction process for the stimulus. Since feature extraction processes are
assumed to be parallel according to two-stage models, the effect of visual
quality on overall RT would follow patterns expected from parallel
processing, if the two-stage structure is correct.
Thus, Egeth and Dagenbach looked for patterns of mean RTs
suggested by Schweickert (1978), Schweickert and Townsend (1989),
and Townsend and Schweickert (1989) in an extension of the additive
factor method (Sternberg, 1969). (Chapter 3 describes these patterns.)
Specifically, the time to process two stimuli in parallel should be the
maximum of the times to process each individually. Making the
simplifying assumption that times are constants, not random variables,
suppose the RT for processing a single high visual quality stimulus is t.
Then the RT for processing two high visual quality stimuli is max{t, t} =
t if the stimuli are processed in parallel, but t + t = 2t if processed
serially. When the visual quality of one of the two stimuli deteriorates,
it takes more time to extract the individual features, say ∆t. This is in the
pre-attentive stage; according to two-stage models the second stage is not
affected. The overall RT becomes max{t, t + ∆t} = t + ∆t if processing
is parallel and t + (t + ∆t) = 2t + ∆t if processing is serial. At this point,
RTs do not provide enough information to distinguish serial and parallel
processing, because the two processing systems give us the same change
in overall RT (∆t). The serial and parallel models generate different RT
patterns when we include the condition in which both stimuli are of low
visual quality. The overall RT becomes (t + ∆t) + (t + ∆t) for the serial
processing model and t + ∆t for the parallel processing model.
When two stimuli are of low visual quality, only serial processing
results in a different RT from when only one of the stimuli is of low
quality. Note that the diagnostic used in Egeth and Dagenbach (1991) is
valid only when the processing of both stimuli is guaranteed. If one of
the stimuli is a target and the search is serial, subjects may not process
the second stimulus (distractor) when the first stimulus processed has

been identified as the target (a serial self-terminating search, as in


Sternberg, 1969). Thus, the effect of the experimental manipulation
(visual quality) on the distractor would not be fully reflected in the
overall RTs. This is also true when processing is parallel. When the
search is parallel and the target is present, if the search stops whenever a
target is found the overall RT will only reflect the target processing time.
The diagnostic in Egeth and Dagenbach (1991) is not designed to test
different stopping rules of the processing system (self-terminating vs.
exhaustive search). Therefore, only the target-absent condition is of
interest.
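The constant-duration reasoning above can be written out directly. In this Python sketch, t and ∆t are arbitrary illustrative values, not estimates from the experiments:

```python
t, dt = 200, 50   # base extraction time and the cost of low visual quality

def predicted_rt(q1, q2, architecture):
    d1 = t + (dt if q1 == "low" else 0)
    d2 = t + (dt if q2 == "low" else 0)
    return d1 + d2 if architecture == "serial" else max(d1, d2)

# Parallel: once one stimulus is degraded, degrading the other adds nothing
assert (predicted_rt("low", "low", "parallel")
        == predicted_rt("high", "low", "parallel"))
# Serial: each degraded stimulus contributes its own dt
assert (predicted_rt("low", "low", "serial")
        == predicted_rt("high", "low", "serial") + dt)

def mean_ic(arch):
    return (predicted_rt("high", "high", arch)
            - predicted_rt("high", "low", arch)
            - predicted_rt("low", "high", arch)
            + predicted_rt("low", "low", arch))

assert mean_ic("serial") == 0      # additive
assert mean_ic("parallel") == -dt  # underadditive, like Table 7.1's negative IC
```

The last two assertions restate the diagnostic as an interaction contrast: serial processing of the two stimuli predicts additivity, while parallel exhaustive processing predicts an underadditive contrast of −∆t.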
Based on the outcomes of this diagnostic applied to three different
search tasks, Egeth and Dagenbach (1991) found that two stimuli are
processed in parallel when they are displayed simultaneously and when
the task is relatively easy, such as finding an X among Os or a T among
Ls. As shown in Table 7.1, in the target-absent condition, which ensures
exhaustive processing of both stimuli, subjects tend to respond faster
when the two stimuli are of high visual quality (507ms) than when at
least one of them is of low visual quality (e.g., 525ms in low-high
condition). Most importantly, with a change from one low quality
stimulus to two, the overall RT remains unchanged (Table 7.1), the pattern
expected for a parallel processing system. The mean interaction contrast
(IC) of −18 ms also confirms that the processing of two stimuli was
indeed parallel.
Within the framework of two-stage models, although it was not Egeth
and Dagenbach’s original aim, the results indicate that multiple feature
extraction processes are executed in parallel if the visual quality
manipulation selectively influences processes in the first stage. One
exceptional result of Egeth and Dagenbach (1991) is that when the
search task was fairly difficult and attentionally demanding (finding a
rotated L among rotated Ts or vice versa), serial processing was
confirmed. This exception may be because when the task is difficult the
visual quality manipulation does not selectively influence processes in
the first feature extraction stage, but in the second attentive processing
stage. Alternatively, the visual quality manipulation may still influence
the feature extraction stage but processing now becomes serial in this
stage due to task difficulty. The former seems to be a more plausible

explanation because parallel pre-attentive processing in the first stage is a


less debatable characterization of this processing stage (see Wolfe,
1998).

Table 7.1
Mean Response Times (ms) of Searching for
Upright Ts among Ls, or Ls among Ts, in the Target Absent Condition
(Modified from Table 5 in Egeth & Dagenbach, 1991)

Visual Quality Mean RTs


Stim. 1 Stim. 2
Low Low 524
High Low 524
Low High 525
High High 507
Mean IC: 507 − 524 − 525 + 524 = −18

Testing the serial attentive processing stage

One of the major targets of criticism for two-stage models is the serial
attentive processing assumption in the second stage. FIT explicitly
assumes that when attention is engaged, only one spatial location can be
attentively processed at a time to identify a target. As discussed before,
the set-size effect found in conjunction search tasks was initially thought
to be critical evidence for the serial processing assumption. Although it
is easy to draw this conclusion when one sees linearly increasing RTs as
the number of objects increases, it is now well known that linearity of RT
as a function of set size does not imply serial processing of stimuli
(Townsend, 1971, 1972; Townsend & Ashby, 1983). Linearity can be
explained by special parallel processing models, such as limited-capacity
parallel processing. These models assume that mental processes require a
limited capacity resource, such as attention. When multiple parallel
processes are executed, they share the resource. The consequence is that
as the number of parallel processes increases, the amount of resource
allocated to each process decreases. The set-size-effect is explained by
assuming that process durations are inversely related to the amount of
resource allocated to each process. For example, given a fixed

processing capacity, v, each of n parallel processes may be allocated


capacity v/n, which results in linearly increasing response times as the
number of processes, n, increases (e.g., Townsend & Ashby, 1983, pp.
85-91). The inverse relationship between amount of capacity allocated
and the duration of a process is not the only way of explaining the set-
size effect by parallel processing models (e.g., Fisher, 1982; Pashler,
1987), although the notion of limited capacity is fairly universally
accepted.
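The v/n capacity-sharing argument can be made concrete with a small deterministic sketch (Python; the capacity v and the work per comparison are arbitrary values of ours):

```python
v = 10.0   # fixed total processing capacity (arbitrary units)
s = 5.0    # work required by each comparison (arbitrary units)

def exhaustive_parallel_rt(n):
    """n comparisons run concurrently, each at rate v/n; each therefore
    takes s / (v/n) = s*n/v time units, and all finish together."""
    return s / (v / n)

rts = [exhaustive_parallel_rt(n) for n in (1, 2, 3, 4)]
diffs = [b - a for a, b in zip(rts, rts[1:])]
# constant slope: response time is linear in set size despite parallelism
assert all(abs(d - diffs[0]) < 1e-9 for d in diffs)
assert abs(exhaustive_parallel_rt(4) - 4 * exhaustive_parallel_rt(1)) < 1e-9
```

The point is only that a flat-versus-linear set-size function cannot by itself separate serial from parallel architectures: a limited-capacity parallel system produces exactly the linear pattern usually attributed to serial search.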
Despite this logical possibility, rejection of serial processing has
never been an easy task. The argument against serial attentive processing
(e.g., McLeod et al., 1988; Nakayama & Silverman, 1986; Wolfe et al.,
1989) is mainly that some conjunction searches produce non-significant
set-size effects, contrary to the prediction of the original FIT (Treisman
& Gelade, 1980). However, the non-significant set-size effect in those
conjunction searches does not necessarily reject the serial attentive
processing assumption. This is because, although the search tasks in
such cases are logically conjunction searches, they typically include a
highly salient target feature (e.g., 3-D depth with motion, color, etc.) that
makes targets easily distinguishable from distractors. The modified two-
stage models (Treisman & Sato, 1990; Wolfe, 1994) explain these
exceptional cases by assuming an additional mechanism of inhibition of
target-unrelated features (Treisman & Sato, 1990) or activation of target-
related features (Wolfe, 1994). According to these models, the inhibition
and activation mechanisms guide our visual attention to the target
stimulus first so that the number of distractors does not matter much in
this type of search task. Thus, those exceptional cases do not necessarily
reject serial attentive processing in two-stage models.
There have been attempts to resolve the serial/parallel issue in visual
search without relying on the set-size effect (e.g., Fific & Townsend,
2003; Sung, 2008; Thornton & Gilden, 2007). Among them, Sung (2008)
adapted the Cumulative Distribution Function Interaction Contrast (CDF
IC) (Townsend & Nozawa, 1995; Schweickert & Giorgini, 1999;
Schweickert, Giorgini, & Dzhafarov, 2000) to examine the assumption of
serial attentive processing. Details of the CDF IC are in Chapter 6,
where it is called C(t). Our focus here is on the reasoning underlying the

experimental manipulation in Sung’s (2008) work and how the


mandatory serial attentive processing assumption can be rejected using
the CDF IC test, without depending on the set-size effect.
Let us discuss the two network models in Figure 7.1 before we
describe the details of Sung’s (2008) experiments. According to the
original FIT (Treisman & Gelade, 1980), the visual system needs to
integrate the individual features in the second processing stage to form a
representation of an object, one spatial location at a time. Suppose four
different stimuli are presented in a display and there are four
corresponding mental processes in the second stage of a two-stage
system. If the object identification process can only be executed one at
a time, as FIT postulates, what happens at this stage can be modeled as in
Figure 7.1A, where all processes are connected in series. If the four
stimuli can be processed in parallel, the processes can be modeled as in
Figure 7.1B. If we can find an experimental factor that selectively
influences a process in the second stage processing, we will be able to
test the serial attentive processing assumption of FIT in a manner similar
to the way Egeth and Dagenbach (1991) tested parallel processing of
feature information.
To achieve this, Sung (2008) manipulated the similarity between a
target and distractors by making the individual features of the distractors

Fig. 7.1. Two possible networks for the second stage of two-stage models (e.g., Treisman
& Gelade, 1980). A. Four processes are connected in series (e.g., FIT). B. Four processes
are connected in parallel. Here, p and r stand for starting and ending of the networks,
respectively.
Visual/Memory Search, Time Reproduction, Classification, and Face Perception 231

similar to those of the target. According to Duncan and Humphreys (1989), searching for a target among distractors takes more time when
the distractors share more features with the target. Since individual
features should be analyzed in parallel and independently in the first
stage, to feed information to the next stage to generate complete object
representations, manipulating target-distractor similarity by varying
shared features should, according to FIT, affect the second stage in
which object identification processes take place. In other words, if the
distractors are more similar to the target, RT will increase because the
decision process will take more time, not because the feature analysis
processes take more time. In his first experiment, Sung (2008) asked
subjects to find a red T among green Ts and red Os (conjunction of color
and shape, Figure 7.2). Comparing two target-absent conditions (Figure
7.2A and 7.2B), we see that a new distractor in 7.2B has replaced one O
in 7.2A. Assuming this substitution does not affect processing of the
unchanged stimuli (two green Ts and the remaining O), if the RT for
display 7.2B is longer than for 7.2A, we reason that it takes more time to decide that the new distractor in 7.2B (a red “upper-left-corner” shape) is a non-target than the red O, because the new distractor is more similar to the target (e.g., Duncan & Humphreys, 1989). Thus, the substitution of the new distractor serves as a manipulation of an experimental factor that selectively influences processing of the position of the red O. The same reasoning applies to conditions A and C of Figure 7.2. Finally, condition D includes two new distractors (an “upper-left-corner” shape and an “upper-right-corner” shape) that replace one O and one T in condition A.
Sung investigated the CDF IC for the four conditions explained above,
in addition to examining the mean RTs (e.g., Egeth & Dagenbach, 1991). As shown in Table 7.2, the mean IC calculated from the four target-absent conditions (i.e., Figure 7.2) was −19 ms, which supports parallel processing of the two special distractors. Note that this result only supports the parallel processing of the two special distractors that replaced red Os.
It is, however, unlikely that these stimuli are processed in parallel and
others are not.

Fig. 7.2. Four different stimulus presentation conditions of Experiment 1 in Sung (2008).
Only the target-absent conditions are shown here. In these examples, the Ts are green and the rest of the stimuli are red. The target is a red T, which is not shown in these
examples. Adapted from Sung, K. (2008). Serial and parallel attentive visual searches:
Evidence from cumulative distribution functions of response times. Journal of
Experimental Psychology: Human Perception and Performance, 34, Fig. 2. Copyright
2008 by American Psychological Association. Adapted with permission.

Table 7.2
Mean Response Times (ms) of Four Target-Absent Conditions
(Modified from Table 1 of Sung, 2008)

Target-Absent Condition (Figure 7.2)    Mean RT
A                                       553
B                                       630
C                                       624
D                                       687

Mean IC = A − B − C + D = −19 ms (as reported by Sung, 2008;
the rounded means above give 553 − 630 − 624 + 687 = −14)
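The double difference defining the mean interaction contrast can be computed directly from the table. A minimal sketch in Python (note that the rounded means above give −14, close to the −19 ms reported by Sung, 2008, the small discrepancy presumably reflecting rounding of the table entries):

```python
def mean_ic(rt_a, rt_b, rt_c, rt_d):
    """Mean interaction contrast: baseline - single changes + double change.

    A is the baseline display, B and C each replace one distractor, and D
    replaces both (Figure 7.2). A negative value is the signature expected
    when the two replaced positions are processed concurrently.
    """
    return rt_a - rt_b - rt_c + rt_d

# Rounded means from Table 7.2 (ms).
print(mean_ic(553, 630, 624, 687))  # -14
```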

The patterns of CDF IC further bolster the conclusion that the stimuli
of this particular conjunction search task can be processed in parallel,
rejecting the serial processing assumption of FIT. As explained in
Chapter 6, the CDF IC remains positive for all times t if two processes
selectively influenced by two factors are indeed connected in parallel in a
critical-path network (e.g., Figure 7.1B). Sung observed this pattern in
his first experiment (i.e., finding a red T among green Ts, red Os, and
two special distractors). As shown in the bottom panel of Figure 7.3, the
CDF IC tended to remain positive for all times t. Also, the stochastic
dominance assumption for the test is clearly satisfied in all four cases
(top and middle panels in Figure 7.3).
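To illustrate why the CDF IC stays positive under parallel processing, the following simulation sketch (our construction, not Sung’s analysis) draws two independent, exponentially distributed process durations, takes their maximum as the RT because both processes must finish before the response, and computes the empirical CDF IC on a time grid:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
mean_rt = {"easy": 1.0, "hard": 2.0}  # hypothetical mean process durations

def parallel_rt(mx, my):
    # Both processes must finish before the response: RT = max(X, Y).
    return np.maximum(rng.exponential(mx, n), rng.exponential(my, n))

rts = {(a, b): parallel_rt(mean_rt[a], mean_rt[b])
       for a in ("easy", "hard") for b in ("easy", "hard")}

t = np.linspace(0.1, 8.0, 80)
F = {k: np.searchsorted(np.sort(v), t) / n for k, v in rts.items()}

# CDF IC: baseline (both easy) - each single change + double change.
cdf_ic = (F[("easy", "easy")] - F[("easy", "hard")]
          - F[("hard", "easy")] + F[("hard", "hard")])
print(cdf_ic.min())  # stays non-negative, up to sampling noise
```

With independent parallel processes the contrast factors as the product of two non-negative CDF differences, so it cannot go negative except through sampling error.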

Fig. 7.3. Top panel: CDFs from four target-absent conditions of Experiment 1 in Sung
(2008). Middle panel: stochastic dominance tests for the four CDFs. Bottom panel: the calculated CDF IC function. A, B, C, and D represent the corresponding target-absent conditions in
Figure 7.2. Adapted from Sung, K. (2008). Serial and parallel attentive visual searches:
Evidence from cumulative distribution functions of response times. Journal of
Experimental Psychology: Human Perception and Performance, 34, Fig. 4. Copyright
2008 by American Psychological Association. Adapted with permission.

Another conjunction search (finding an upright L among rotated Ts
and Os) was also confirmed to be a parallel search via mean and CDF IC
tests (Sung, 2008). However, the same kind of test confirmed a serial
search in an extremely difficult search task condition (Experiment 3;
finding a rotated T among T-shaped distractors and Os). Overall, these
results support claims by Fisher (1984) and Bundesen (1990) that stimuli
tend to be processed in groups of a small number of items and that only
extremely difficult search conditions involve serial processing. In order
to defend FIT, the only obvious argument against Sung’s (2008)
conclusions would be to show that the experimental manipulations did
not selectively influence processing as assumed. That is, one must
demonstrate that the substitutions of distractors somehow affected the
first, parallel, processing stage of FIT in Experiments 1 and 2 and the second, serial, stage in Experiment 3, given that the two separate processing stages indeed exist. But this would be a difficult case to make, since FIT would then have to explain why the same kind of experimental manipulation works differently for different search tasks.

Memory Scanning

In his seminal study on memory scanning, Sternberg (1966) asked
subjects to memorize a short list of items (e.g., digits). Subjects were
then presented with a probe item and asked to indicate whether it was
one of the items they memorized earlier (a positive response) or not (a
negative response). RT tended to increase linearly as the number of
memorized items increased. Also, the slope of the RT function of the
positive responses did not differ significantly from that of the negative
responses. In some experiments, the mean RTs for positive and negative
responses did not differ significantly. A simple explanation is that the
memorized items are mentally scanned in a serial and exhaustive fashion,
where exhaustive means that the memory scanning does not stop when
the probe is found in the memory set, but continues until all items are
scanned. The serial exhaustive explanation is intuitively appealing because, if the search were serial and self-terminating, we would instead expect the RT slope for the positive responses to be about ½ that for the
negative responses, assuming random position of a target in the memory
list. Also, the overall mean RTs for the positive responses would be
faster than those for negative responses.
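The comparison counts behind these slope predictions can be written out explicitly. In this sketch, each memorized item costs one comparison, so the positive-response slope under self-terminating search is half the exhaustive slope:

```python
def exhaustive_comparisons(n):
    # Exhaustive scan: all n memorized items are compared, target or not,
    # so positive and negative responses share the same slope.
    return n

def self_terminating_positive(n):
    # Self-terminating scan, probe equally likely at each position:
    # on average (1 + 2 + ... + n) / n = (n + 1) / 2 comparisons.
    return (n + 1) / 2

for n in (1, 2, 4, 6):
    print(n, exhaustive_comparisons(n), self_terminating_positive(n))
# Slope per added item: 1 for exhaustive scanning, but only 1/2 for
# positive responses under self-terminating scanning.
```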
This processing model for rapid memory scanning has been widely considered an established finding. However, as discussed in the visual
search section, linearity of RT as a function of the number of memory
items does not theoretically imply serial processing (Townsend, 1971,
1972; Townsend & Ashby, 1983). Moreover, a self-terminating search
can lead to equal slopes for positive and negative responses. The
thorough review by Van Zandt and Townsend (1993) finds that self-
terminating search models can easily predict most results predicted by
exhaustive search models, but not vice-versa. In particular, exhaustive
search models have difficulty predicting slope differences for positive
and negative responses, and serial position effects.
There is considerable evidence against serial processing in memory
scanning (e.g., Ashby, Tein, & Balakrishnan, 1993; McElree & Dosher,
1989; Townsend & Fific, 2004). We focus on Townsend and Fific
(2004), who explicitly utilized the method of selective influence and
survivor function interaction contrast (SIC), equivalent to CDF IC, to
investigate alternative models of the memory scanning process.
In Townsend and Fific’s (2004) experiment, subjects memorized a set
of two pronounceable Serbian pseudo-words presented sequentially for
1,200 ms. These formed the memory set. After an inter-stimulus
interval (ISI; 700 or 2,000 ms), subjects were given a probe, which was
either one of the pseudo-words they memorized or not. Each pseudo-
word stimulus consisted of two consonants and one vowel (“A”), with
the vowel always in the middle (e.g., MAL, LAM, NAM, FAS, and
SAV). The important factor manipulated was the amount of dissimilarity
between the probe and memorized items, when the probe was not one of
the memorized items. In the negative response condition, the probe was
constructed in the same way as items in the memory set: two consonants
with one vowel (“A”) between them. Dissimilarity was manipulated by
making the probe share consonants with items in the memory set. For
example, if the two memorized pseudo-words were MAL and NAM and
the probe for the negative response condition was NAL, the probe was
considered low-low dissimilarity, because it shared two consonants with
the memory set items. If the probe shared a consonant with the first
memory item only (e.g., the memory set was NAL, SAV and the probe
was NAM), the probe was considered low-high dissimilarity. If the
probe shared a consonant with the second item only (e.g., the memory set
was NAM, SAV and the probe was VAS), the probe was considered
high-low dissimilarity. Finally, if the probe shared neither of its
consonants with the memory set (e.g., the memory set was VAS, FAV
and the probe was NAL), the probe was considered high-high
dissimilarity.
These four dissimilarity conditions form a 2 by 2 factorial design, the
first factor being the dissimilarity between the probe and the first
memory item and the second factor being the dissimilarity between the
probe and the second memory item. Assuming that manipulation of the
dissimilarity between the probe and one memory item does not affect the
comparison of the probe and the other memory item, the two factors
selectively influence two different memory comparison processes. The 2
by 2 factorial design, with the assumption of selective influence, enables
us to investigate how two different mental comparison processes are
organized.
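The coding of negative-trial probes can be made concrete with a short sketch. The helper functions below are ours (not Townsend and Fific’s) and simply check for shared consonants between the probe and each memory item:

```python
def dissimilarity(probe, memory_item):
    """'low' if the probe shares a consonant with the item, else 'high'."""
    consonants = lambda word: {c for c in word if c != "A"}
    return "low" if consonants(probe) & consonants(memory_item) else "high"

def condition(probe, memory_set):
    # One factor per memory item: e.g., ('low', 'high') = low-high dissimilarity.
    return tuple(dissimilarity(probe, item) for item in memory_set)

print(condition("NAL", ["MAL", "NAM"]))  # ('low', 'low')
print(condition("NAM", ["NAL", "SAV"]))  # ('low', 'high')
print(condition("VAS", ["NAM", "SAV"]))  # ('high', 'low')
print(condition("NAL", ["VAS", "FAV"]))  # ('high', 'high')
```

The four outputs reproduce the four cells of the 2 by 2 factorial design described in the text.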
Townsend and Fific (2004) calculated the IC of survivor functions of
RTs from the four conditions of negative responses to ensure that both
comparison processes occurred. When the ISI was 700 ms, memory search was serial for all subjects except subject 3, for whom search was parallel (see Figure 7.4). Note that the survivor function is 1 – F(t), where F(t) is the cumulative distribution function; hence the survivor IC and the CDF IC lead to equivalent patterns, but of opposite sign (see Chapter 6).
Thus, for serial processing, the survivor IC would be negative in the
beginning and change its sign at some point, such that the total area
under the IC curve is statistically zero. Subjects 1, 2, 4 and 5 show this
pattern when ISI was 700 ms, as in Figure 7.4. Subject 3 shows the
pattern predicted by parallel processing, a negative survivor IC. Some
subjects (2, 4 and 5) showed a transition from serial processing to
parallel processing as the ISI changed from 700 ms to 2,000 ms. When
the ISI was 2,000 ms, all subjects except subject 1 demonstrated parallel

Fig. 7.4. Survivor ICs of five subjects performing memory search in Townsend and Fific
(2004). Thin solid lines in each plot represent the simple serial or parallel model that fit
best the observations represented by dotted dark lines. From Townsend, J. T., & Fific,
M. (2004). Parallel versus serial processing and individual differences in high-speed
search in human memory. Perception & Psychophysics, 66, Fig. 3(A). Copyright 2004
Springer. Reproduced with permission.
comparison of memory items. Thus, two subjects (subject 1 and subject
3) maintained the same processing form (serial and parallel,
respectively), but the others shifted from serial to parallel processing.
Note that these conclusions are based on the models fitted to the
observed data of each subject. The models fitted are simple parallel and
serial models with only two independent processes, whose durations are
exponentially distributed (see Townsend & Ashby, 1983, p. 50, where
the same models are discussed). The model with better fit (i.e., a better
r2) was chosen and represented as a solid line in each panel in Figure 7.4.
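These simple models are easy to simulate. The sketch below (parameter values are illustrative, not fitted) draws two exponentially distributed stage durations, sums them for the serial model and takes their maximum for the parallel exhaustive model, and recovers the two SIC signatures just described:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300_000
mean = {"fast": 0.5, "slow": 1.0}  # illustrative stage means (arbitrary units)
t = np.linspace(0.05, 6.0, 120)    # time grid, step 0.05

def survivor(rt):
    # Empirical survivor function S(t) = 1 - F(t) on the grid.
    return 1.0 - np.searchsorted(np.sort(rt), t) / len(rt)

def sic(combine):
    # Survivor interaction contrast for a 2 x 2 factorial design in which
    # each factor sets the mean of one of two independent stage durations.
    s = {(a, b): survivor(combine(rng.exponential(mean[a], n),
                                  rng.exponential(mean[b], n)))
         for a in ("fast", "slow") for b in ("fast", "slow")}
    return (s[("fast", "fast")] - s[("fast", "slow")]
            - s[("slow", "fast")] + s[("slow", "slow")])

serial = sic(lambda x, y: x + y)   # serial model: stage durations add
parallel = sic(np.maximum)         # parallel exhaustive: both must finish

# Serial: negative early, positive late, total area near zero.
# Parallel exhaustive: negative at all times t.
print(serial[5], serial[100], np.sum(serial) * 0.05,
      parallel.min(), parallel.max())
```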
An interesting point is that the ISI value in the original study by
Sternberg (1966) was 2 s and he concluded that memory scanning is serial and exhaustive, based on the linear and parallel RT functions of set
size. However, Townsend and Fific (2004) found that most subjects
showed parallel processing when the ISI was 2 s. One may object that
the two studies cannot be directly compared, because the numbers of items in the memory sets were quite different. In Townsend and Fific
(2004) only two items were in the memory set, whereas in Sternberg
(1966) the number of items varied randomly from one to six. It is
reasonable to assume that processing when the memory set is small and
constant is different from when the memory set is larger and variable.
Nonetheless, Townsend and Fific (2004) demonstrated that memory
comparison is possible in both serial and parallel forms. The
circumstances under which each form occurs remain to be specified.

Concurrent Time Reproduction and Visual Search

In a typical time reproduction task, subjects hear two tones separated by
a certain time interval (e.g., 2 s). Then they make two button presses
separated by a time interval that they think matches the time interval just
presented (2 s). What makes this simple task interesting is that the
estimated time is usually affected when a secondary task must be done
concurrently with the time reproduction. When the secondary task has
some temporal aspect such as motion, which is defined by time and
space, the estimated time tends to increase (Brown, 1995). Many studies
have shown that time estimation performance is also affected by non-
temporal tasks, such as memory search and perceptual discrimination.
(See Brown, 1997, for a review.) Attention is limited (e.g., Brown & West, 1990). Thus, if another process consumes attentional resources, the internal clock monitoring process will be interfered with. The result is a lengthened estimate of time.
Fortin, Rousseau, Bourque, and Kirouac (1993) reported an
exceptional case in which time reproduction was not affected by a
concurrent non-temporal secondary task requiring attention. In this case,
a secondary visual search task did not interfere with time reproduction;
in fact, there was a slight negative correlation between search and
reproduced times. In terms of process organization, this exceptional case
demonstrates that two processes (or two sets of processes), one each for
time reproduction and visual search, can co-occur without (or with small)
interference between them. Schweickert, Fortin and Sung (2007) retested
this exceptional case using the CDF IC. Schweickert et al. (2007)
manipulated experimental factors, each thought to selectively influence a
process involved in time reproduction and visual search. For time
reproduction, the subject was given three different time intervals to
estimate (2400, 3000, and 3600 ms); this manipulation was intended to
selectively influence the time reproduction process. For visual search,
the display size (1, 6, and 12 items) and whether the target was present or
absent were intended to selectively influence the search process.
Stimulus onset asynchrony (SOA) was another factor manipulated (2200
and 2400 ms). This was the interval between the presentation of a screen
at the start of the time reproduction and the onset of the display to be
searched. Figure 7.5 shows a critical path network representing this dual
task when the two tasks, time reproduction and visual search, go on
concurrently.
Subjects were asked to make one response for two tasks. Subjects
heard two tones indicating the target interval to be reproduced. Soon
after, they pressed the “2” key to start time reproduction; this is
represented at o, the start of the network in Figure 7.5. After an SOA
(2200 or 2400 ms), the visual search display was presented. Subjects
then pressed either the “1” or “3” key to indicate the presence (“1”) or

Fig. 7.5. A critical path network representation of the dual-task experiment in Schweickert, Fortin and Sung (2007) when two tasks, time reproduction and visual search, go on concurrently.

absence (“3”) of the target in the display and the end of time
reproduction. The response occurs at r, the end of the network.
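The critical path logic can be sketched numerically. With made-up durations (ours, for illustration only), the response at r occurs when both paths from the start are complete, so the RT is the longer of the reproduced interval and the SOA plus the search time; prolonging each of the two concurrent processes then yields a negative mean interaction contrast:

```python
def dual_task_rt(interval, search, soa=2200):
    # Two paths from the start to r: the reproduced interval, and
    # SOA + visual search. The response occurs when the longer path
    # (the critical path) finishes.
    return max(interval, soa + search)

# Illustrative durations (ms): two interval levels x two search levels.
ll = dual_task_rt(2400, 300)  # both short
lh = dual_task_rt(2400, 500)  # search prolonged
hl = dual_task_rt(3000, 300)  # interval prolonged
hh = dual_task_rt(3000, 500)  # both prolonged

ic = ll - lh - hl + hh
print(ll, lh, hl, hh, ic)  # 2500 2700 3000 3000 -200
```

Because each prolongation can hide in the slack of the other path, the combined effect is less than the sum of the single effects, which is exactly what a negative mean IC records.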
The visual search stimuli were presented after the time reproduction
started to ensure that the visual search terminated at approximately the
same time as the end of the time reproduction. If the visual search
process goes on concurrently with the time reproduction but always
terminates before the end of time reproduction, the resulting RT
distributions would not reflect the effects of the experimental factors
selectively influencing the visual search process. This SOA before visual
search was the only substantial difference between the experiments in
Schweickert et al. (2007) and in Fortin et al. (1993). In Fortin et al.
(1993), the visual search started at the same time as the time
reproduction.
There are six different combinations of experimental factors available
for the CDF IC test in Schweickert et al. (2007). Factors behaved as if they selectively influenced the processes they were expected to in the model of Figure 7.5; for example, manipulations of time interval and SOA behaved as predicted for manipulations of concurrent processes. There was one exception: one of the three interaction contrasts tested for display size and SOA was inexplicably not as predicted; details are in Schweickert et al. (2007). For the purpose of this chapter, we focus on
two important factors, display size (three levels) and the time interval to
be reproduced (three levels), to see if the factors selectively influence
two concurrent processes, time reproduction and visual search,
respectively. Mean interaction contrasts confirmed the concurrency of
time reproduction and visual search. (See Chapter 3 for the basis for
these tests.) As shown in Table 7.3, all nine possible mean interaction
contrasts for display size and time interval factors are negative,
supporting concurrency of the two tasks.

Table 7.3
Mean Response Times (ms) for 9 Different Conditions
of Display Size and Time Interval Factors
(From Table 1 of Schweickert, Fortin and Sung, 2007)

Display Size            Time Interval (ms)
(items)           2400        3000        3600
1               3489 (1)    3660 (2)    3855 (3)
6               3634 (4)    3760 (5)    3950 (6)
12              3798 (7)    3916 (8)    4080 (9)

Mean ICs
IC1 = (5) – (4) – (2) + (1) = −46
IC2 = (6) – (4) – (3) + (1) = −50
IC3 = (8) – (7) – (2) + (1) = −53
IC4 = (9) – (7) – (3) + (1) = −84
IC5 = (6) – (5) – (3) + (2) = −5
IC6 = (9) – (8) – (3) + (2) = −31
IC7 = (8) – (7) – (5) + (4) = −7
IC8 = (9) – (7) – (6) + (4) = −33
IC9 = (9) – (8) – (6) + (5) = −26
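The nine contrasts in Table 7.3 are just all 2 × 2 sub-tables of the 3 × 3 design. A short sketch computing them from the rounded table means (so individual values may differ by a millisecond from those above, which come from unrounded data):

```python
from itertools import combinations

# Mean RTs (ms) from Table 7.3: keys are (display size, time interval).
rt = {(1, 2400): 3489, (1, 3000): 3660, (1, 3600): 3855,
      (6, 2400): 3634, (6, 3000): 3760, (6, 3600): 3950,
      (12, 2400): 3798, (12, 3000): 3916, (12, 3600): 4080}

ics = []
for d1, d2 in combinations((1, 6, 12), 2):          # pairs of display sizes
    for t1, t2 in combinations((2400, 3000, 3600), 2):  # pairs of intervals
        ics.append(rt[(d2, t2)] - rt[(d2, t1)] - rt[(d1, t2)] + rt[(d1, t1)])

print(ics)
print(all(ic < 0 for ic in ics))  # all nine contrasts are negative
```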

The same conclusion is supported by the CDF ICs. As shown in Figure 7.6, the CDF ICs tend to be positive for all times t, except for small negative dips, which were not significant. Note that the mean and
the CDF ICs reported in Schweickert et al. (2007) were obtained by
pooling RTs over other irrelevant factors (SOA and target
presence/absence).
Although the overall conclusion of Schweickert et al. (2007) was
concurrency of the time reproduction and visual search as evidenced by
CDF ICs, there was possibly crosstalk between the two tasks. This is

Fig. 7.6. CDF ICs for time interval and display size factors. Adapted from Schweickert,
R., Fortin, C., & Sung, K., 2007, Concurrent visual search and time reproduction with
cross-talk. Journal of Mathematical Psychology, 51, Fig. 10. Copyright 2007 Elsevier.
Adapted with permission.

because some small violations of stochastic dominance were found when CDFs were calculated using RTs that were not pooled over other
irrelevant factors. For example, one of the violations of stochastic
dominance was found between display sizes 6 and 12, when the time
interval was 2400 ms, with an SOA of 2400 ms, and when the target was
present (see Table 7 of Schweickert et al., 2007). The two CDFs from these conditions crossed near the beginning of the time (x) axis. With simulations, Schweickert et al. (2007) showed such violations are to be expected when two concurrent processes are negatively dependent. This, in fact, fits very well with what Fortin et al. (1993) found in their fourth experiment, further suggesting that there is indeed crosstalk between the two tasks. The only trouble is that the
mean RT actually increased as the display size increased in Schweickert
et al. (2007), whereas it decreased a little in Fortin et al. (1993). It is not
clear what caused the increasing mean RT in Schweickert et al. (2007).
As noted earlier, the only difference between the two experiments was
that the visual search did not start right away in Schweickert et al.
(2007), whereas it did in Fortin et al. (1993). But the mechanism that
would generate different RT patterns in these two experiments is not
known.

Perceptual Classification

Subjects in a classification experiment are typically presented with an
object and asked to name the category to which it belongs. The
categories usually exist in nature, such as animal classes (mammal, fish,
bird, etc.), or are arbitrary ones learned during the experiment. When
subjects make speeded category judgments, it is assumed that they
primarily use perceptual properties (color, length, size, etc.) of the
stimulus to make quick decisions, so we call this a perceptual
classification. Although the task is straightforward, the underlying
psychological processes are not. Here we discuss how multiple feature
processes are organized in category decision making, as found by Fific,
Nosofsky and Townsend (2008), using selective influence and the
survivor function interaction contrast.
Dimensions (or properties) of an object that can be changed
independently, such as the length and width of a rectangle, are called
separable (or independent). But the perceived color of an object with a
fixed value of hue can be affected by two interdependent dimensions,
brightness and saturation. Usually, the perceived brightness of a color
patch is affected by the saturation and vice versa. Interdependent
dimensions of an object are called integral dimensions (Ashby &
Maddox, 1994; Garner, 1974). Fific, Nosofsky and Townsend (2008)
investigated the processing of separable and integral dimensions, by
selectively influencing processing of the dimensions.
In order to examine the organization of dimension processes, Fific et
al. (2008) tested five different models using the survivor IC. Depending
on the stopping rule (cf. Sternberg, 1966; Van Zandt & Townsend,
1993), serial and parallel models each have two sub-categories, leading
to serial self-terminating or exhaustive models and parallel self-
terminating or exhaustive models. As explained earlier, a self-
terminating system, either serial or parallel, generates a response as soon
as the goal is met (e.g., a target is found in a search). An exhaustive
system, either serial or parallel, does not produce a response until all its
processes are finished, even if the goal is met (e.g., the target is found)
before all processes have finished. In Fific et al. (2008), only exhaustive
processing models were tested, since their experiments were designed to
force the subjects to use an exhaustive rule for classification.
In addition to these four models, a special model was tested in Fific et
al. (2008), a coactivation model. The idea underlying a coactivation
model is that processes send activation to each other, increasing each
other’s rates (Miller, 1982). There is considerable evidence for
facilitation between processes (see, e.g., Diederich, 1995). For RT
cumulative distribution functions, one indicator of the presence of
coactivation is a violation of Boole’s Inequality (Miller, 1982) or more
stringent inequalities developed later (Colonius, 1990; Townsend &
Wenger, 2004). Another indicator is an RT survivor function contrast
that is negative for small values of t, and positive for large values, with a
positive mean interaction contrast (Townsend & Nozawa, 1995). These
indicators are discussed in Chapter 6. Neither indicator demonstrates
conclusively that coactivation is present, because other models can
produce them (Fific et al., 2008). Fific et al. (2008) used survivor
interaction contrasts to test for coactivation.
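Checking Boole’s inequality against estimated CDFs is mechanical. The sketch below uses illustrative exponential channel CDFs (our construction, not Fific et al.’s data); an independent race model never exceeds the bound, so a positive excess would point toward coactivation:

```python
import numpy as np

def miller_violation(f_ab, f_a, f_b):
    """Largest excess of F_AB(t) over F_A(t) + F_B(t); > 0 violates the bound."""
    return float(np.max(f_ab - (f_a + f_b)))

t = np.linspace(0.1, 5.0, 50)
f_a = 1 - np.exp(-t)            # illustrative single-channel CDFs
f_b = 1 - np.exp(-1.5 * t)
f_race = f_a + f_b - f_a * f_b  # independent race: first channel to finish

# The race model's CDF never exceeds the sum of the channel CDFs.
print(miller_violation(f_race, f_a, f_b) <= 0)
```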
In the first experiment, on separable dimensions, in Fific et al. (2008),
the stimulus had two separate parts: a color patch and a vertical bar.
Fific et al. manipulated the values of two separable dimensions: the
saturation of the color patch and the location of the vertical bar. There
were three different saturation values (s1, s2, s3; s1 being the most
saturated red) for the color patch and three different locations (v1, v2, v3;
v1 being farthest to the left) for a small vertical bar flanked by two surrounding lines (see Figure 7.7). When the stimulus with color patch
and vertical bar was presented to the subjects, they were required to
classify the stimulus as a member of category A if the saturation value of
the color was either s2 or s3 and the vertical position was either v2 or v3
(see Figure 7.7). If the stimulus possessed either v1 or s1 as its feature,
subjects were asked to classify the stimulus as category B. Fific et al. reasoned that it takes more time to classify a stimulus when its saturation is s2 than when it is s3, and similarly, more time when the vertical bar position is v2 than when it is v3. This reasoning is based on the observation that it takes more time for observers to make a category decision when a stimulus possesses a property value close to the decision boundary (e.g., s2) than when it does not (e.g., s3) (Ashby, Boynton, & Lee, 1994).
claimed that changes in saturation and spatial location selectively
influence two different processes, one for each of the two feature
dimensions.
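The conjunctive category rule can be written out explicitly; its “AND” structure is what forces exhaustive processing of both dimensions for category A responses. A minimal sketch using the level labels from the text:

```python
def classify(saturation, position):
    """Category A requires saturation in {s2, s3} AND position in {v2, v3};
    any stimulus with s1 or v1 belongs to category B."""
    if saturation in ("s2", "s3") and position in ("v2", "v3"):
        return "A"
    return "B"

# The four category-A cells form the 2 x 2 design used for the SIC.
print([classify(s, v) for s in ("s2", "s3") for v in ("v2", "v3")])
print(classify("s1", "v3"), classify("s3", "v1"))  # B B
```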

Fig. 7.7. The two-category structure and a sample stimulus, showing its two parts. To be
classified as category A, the color patch must have saturation value of s2 or s3 AND
middle line position of v2 or v3. H and L denote High (easy) and Low (hard) values.
Adapted from Fific, M., Nosofsky, R. M., & Townsend, J. T., 2008, Information-
processing architectures in multidimensional classification: A validation test of the
systems factorial technology. Journal of Experimental Psychology: Human Perception
and Performance, 34, Fig. 5. Copyright 2008 American Psychological Association.
Adapted with permission.

There were nine possible combinations of the two feature values, but Fific et al. examined only four conditions (i.e., the s2v2, s2v3, s3v2, and s3v3
combinations) to investigate the interaction contrast of the survivor
function of RTs. The reason for investigating these four conditions,
which constitute category A, is that the stimuli in this category require
the logical “AND” operation for decision making. Category B does not
enable this analysis; it is possible that subjects may not process both
features of the stimulus presented to them, since only one feature is
sufficient to classify a stimulus as category B. For category B, the levels
of the two factors are not crossed; each level of location does not occur
with each level of saturation, and vice versa. Note that the survivor
interaction contrast test is mathematically equivalent to the test of the
CDF IC discussed in chapter 6; see Townsend & Nozawa, 1995, for
more details about the survivor function interaction contrast (SIC).
Results are in Figure 7.8. Fific et al. (2008) found that different subjects
use different processing architectures for two-feature processing. For
subjects 4 and 6 (fourth and sixth rows in Figure 7.8), the SICs (right
column) were effectively negative for all times t, indicating parallel
exhaustive processing of two features (recall that the survivor function
at t is 1 – F(t), where F(t) is the cumulative distribution function). For
the rest of the subjects, SICs showed the pattern for serial exhaustive
processing of two features. SICs tend to be negative in the beginning
and change sign so that the total area under the function is zero.
This pattern for serial exhaustive processing is also confirmed by the
non-significant interaction of vertical line position and saturation factors
for subjects 1, 2, 3, 5, and 7 (see left column of Figure 7.8), which
suggests that the total areas under the IC functions are statistically zero.
The exhaustiveness of two-feature processing is not surprising, since the
task is designed to produce it (subjects had to process two features to
make a category judgment). But the different processing schemes, serial
and parallel, used by different subjects is an interesting finding, because
this indicates no single processing architecture is universally employed
for this classification task.
One thing to note is that the stimulus consisted of two distinct
physical objects (a color patch and a vertical bar, either overlapping or
not), thus there could reasonably be two distinct processes, one for each
of the two feature dimensions. One could argue that subjects were
processing two features, each from one of two different stimuli, rather
than two features of a single stimulus. A more compelling argument

Fig. 7.8. Mean RTs (left column), survivor functions (middle column), and SICs (right
column) of 7 subjects (rows) in Experiment 1. Adapted from Fific, M., Nosofsky, R. M.,
& Townsend, J. T., 2008, Information-processing architectures in multidimensional
classification: A validation test of the systems factorial technology. Journal of
Experimental Psychology: Human Perception and Performance, 34, Fig. 7. Copyright
2008 American Psychological Association. Adapted with permission.
248 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

could be made if the stimulus possessed, in itself, two separable features
(e.g., was a colored vertical bar; see also Fific, Little, & Nosofsky,
2010).
In their second experiment, on integral dimension processing, Fific et
al. (2008) found that the SIC supports the presence of coactivation. They
manipulated the brightness and saturation of a color patch with a fixed
value of hue (e.g., redness). Although the overall idea of the
experimental manipulation is the same as that of the first experiment,
selective influence for integral feature dimensions is a little tricky. The
problem is that the perceived color difference when saturation changes
from, for example, 60% to 80% with a brightness of 50% is not the same
as the difference when saturation changes in the same way (60% to 80%)
with a brightness of 80%. Fific et al. (2008) recognized this problem and
tried to resolve the issue by first conducting multi-dimensional scaling
(MDS) on different combinations of saturation and brightness values,
then selecting combinations equally spaced in the perceptual space as
defined by the two feature dimensions. The goal was to select
combinations of brightness (b1, b2, b3) and saturation (s1, s2, s3) such that
the change in the duration of processing of saturation produced by a
change, say, from (b1, s1) to (b1, s2) was the same as that produced by a
change, say, from (b2, s1) to (b2, s2). Likewise, the change in the
duration of processing of brightness produced by a change from (b1, s1)
to (b2, s1) was intended to be the same as that produced by a change from
(b1, s2) to (b2, s2). Subjects (different from those who participated in the
MDS) were then asked to classify the color patches into two different
categories, in a similar way as in the first experiment. That is, if a
stimulus possessed either b1 or s1 as its feature, subjects were asked to
judge it as category B. If a stimulus possessed one of b2 or b3 and one of
s2 or s3 as its features, subjects were to judge the stimulus as category A.
Fific et al. again examined all nine possible combinations of
saturation and brightness values, but analyzed four conditions that
required exhaustive processing of two features, (b2, s2), (b2, s3), (b3, s2),
and (b3, s3), for the same reason as before. They found that the patterns of
SICs from Experiment 2 supported the presence of coactivation. That is,
the SICs of most of subjects were negative in the beginning and became
positive at later times, as shown in Figure 7.9. Also, importantly, the
positive areas under these curves were significantly larger than the
negative areas, the pattern that a coactivation model would show
(Townsend & Nozawa, 1995; see Chapter 6), except for subject 1. For
subject 1, the survivor function IC showed the pattern for serial
exhaustive processing of two stimulus dimensions. This means that the
net area of the IC for this subject was statistically zero, that is, the
interaction of saturation and brightness manipulations was not
significant. Fific et al. suggested that this exceptional pattern for subject
1 may have been due to his or her unusually high error rate in one
condition (b2, s2). This may not be a satisfactory explanation since there
was another subject with a similar error rate, but with an SIC for
coactivation processing.
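Townsend and Nozawa's (1995) coactive prediction can be sketched the same way. In the simplest coactive (Poisson summation) model, two channels feed counts into a single accumulator and a response occurs when the pooled count reaches a criterion; the pooled stream is Poisson with the summed rate, so the finishing time is gamma distributed. The rates and criterion below are illustrative assumptions, not estimates from the experiment; the point is the net positive area the text describes.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200_000, 5              # k = assumed count criterion
rate = {"L": 1.0, "H": 2.0}    # assumed channel rates per factor level

def coactive_rt(lev_a, lev_b):
    # Pooled Poisson stream: time of the k-th count ~ Gamma(k, 1/(r_a + r_b)).
    return rng.gamma(k, 1.0 / (rate[lev_a] + rate[lev_b]), n)

rts = {(a, b): coactive_rt(a, b) for a in "LH" for b in "LH"}

# Mean interaction contrast: net area under the SIC.
mic = (rts["L", "L"].mean() - rts["L", "H"].mean()) \
    - (rts["H", "L"].mean() - rts["H", "H"].mean())
print(mic > 0)  # coactivation: net positive area, unlike serial processing
```

Analytically the means are k/(r_a + r_b), so with these assumed rates the contrast is k/12 > 0, in line with the coactive signature.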
The SIC pattern interpreted as evidence of coactivation by Fific et al.
(2008) is also predicted for a model without coactivation, namely for
factors selectively influencing two serial processes, with a third process
in parallel with the series of two (Figure 7.5). As explained in Chapter 6,
the latter model predicts an SIC that begins negative and then becomes
positive, with net positive area. The latter model also predicts that with
several levels of the factors the mean interaction contrasts would
approach a limit as the factor levels increase; the coactivation model does
not. The issue cannot be settled with the 2 × 2 design of Fific et al.
(2008), but the literature on integral dimensions encourages their
coactivation interpretation.
There is room for argument about how the experimental
manipulations work if the coactivation model is correct. If that model is
correct, then, as explained earlier, the experimental factors should have
their effects on the processes before the stage that sums the outputs of
these parallel processes. In other words,
changes in saturation and brightness should affect the durations of
processes for the corresponding features and not the duration of the
decision process that takes place after the merging of outputs from
the two processes.
The question is about where the particular manipulations (brightness
and saturation) in the second experiment of Fific et al. (2008) have their
effects. If manipulations of saturation and brightness only affect the
Fig. 7.9. Mean RTs (left column), survivor functions (middle column), and SICs (right
column) of 5 subjects (rows) in Experiment 2. The first 5 subjects of 8 are shown here.
Adapted from Fific, M., Nosofsky, R. M., & Townsend, J. T., 2008, Information-
processing architectures in multidimensional classification: A validation test of the
systems factorial technology. Journal of Experimental Psychology: Human Perception
and Performance, 34, Fig. 8. Copyright 2008 American Psychological Association.
Adapted with permission.
processing times of the features and not the decision making process that
sums the outputs of these two processes, it means that it takes more time
to detect (not to classify) a stimulus with low saturation and brightness
values than one with high saturation and brightness values, which could
be the case as we saw in Egeth and Dagenbach’s (1991) visual search
experiments. However, it is also possible that the classification of (or
decision making about) the stimulus would also take longer if the
identified features are close to the classification boundary (e.g., [b2, s2]
of experiment 2) than if they are far from the boundary (e.g., [b3, s3])
(e.g., Ashby, Boynton, & Lee, 1994). This means that brightness and
saturation affect not only the different processes that occur before the
summation process of the coactive system, but also the decision making
process that occurs after the summation of the two processes. If decision
making is also changed, why is the pattern for a coactivation model
observed? A possible answer is discussed in Schweickert, Fortin, and
Sung (2007). When a factor influences two processes, not one, and its
effect is much greater in one process than in the other, the experimental
factor may still behave as if it selectively influences the first process
only. In terms of classification, this means that changes in brightness and
saturation have major effects on separate feature processes, as well as
some minor effect on the decision making process. Although
Schweickert et al.’s (2007) observation is based on simulation, it is a
reasonable possibility. The issue is conceptual; better understanding of
coactive systems is needed, in particular of what it would mean for a
factor to selectively influence a process with coactivation present.
In summary, data of Fific et al. (2008) indicate that processing of two
separable feature dimensions is serial for most of their subjects, but
parallel for some; on the other hand processing of integral dimensions is
coactive. A study of processing of separable dimensions in the same
object would be informative (e.g., Fific, Little, & Nosofsky, 2010). A
clear specification of the models being tested may help us better
understand results of Experiment 2.
Face Perception

It is widely accepted that the visually presented human face receives
special treatment in the brains of humans and other primates (e.g.,
Haxby, Hoffman, & Gobbini, 2000; McKone, Crookes, & Kanwisher,
2009; Tsao, 2006), probably due to the social and behavioral importance
of face perception in primate evolution. Certain cells or discrete regions
in the temporal lobe respond exclusively to a human face or something
like it (e.g., a cartoonish face) but not to other visual stimuli (Bruce,
Desimone, & Gross, 1981; Tsao, 2006). Despite its special status, the
face is still a visual stimulus that can be understood in terms of individual
features (nose, eyes, etc.) and their configural relationship (e.g., how far
apart the facial features are from each other).
One dominant explanation for face perception, the dual-mode
hypothesis (e.g., Bartlett & Searcy, 1993), says there are two separate
and independent processing modes: one for featural information and the
other for configural information. As mentioned above, the featural
information is about individual facial features, such as the color or shape
of the nose, eyes, and mouth. The configural information is about the
spatial relationship between various facial features, such as the distance
between the eyes or the ratio of the distance between the eyes to the size
of the eyes (e.g., Searcy & Bartlett, 1996). According to the dual-mode
hypothesis, the features and the spatial relationship between features are
available simultaneously when a face stimulus is presented. This
hypothesis further says that the processes for the two different types of
facial information are independent and concurrent (also see Ingvalson &
Wenger, 2005, for a summary of this explanation).
Ingvalson and Wenger (2005) adapted the survivor interaction
contrast test (see the previous discussion on perceptual classification;
Fific et al., 2008) to test the hypothesis that the two types of facial
information processes are concurrent. This was achieved by
independently manipulating two types of facial information in a face
stimulus. Note that the actual experimental manipulations in Ingvalson
and Wenger (2005) included additional factors such as orientation of the
face (upright or inverted) and object type (human face or schematic
face). For the purpose of our discussion, we will focus on manipulations
of featural and configural information in Ingvalson and Wenger (2005).


In their experiment, they manipulated featural and configural
information of face stimuli independently (see Figure 7.10). For the
featural information, they used three different colors for lips (red, pink,
or a color between red and pink) and for irises (blue, green, or a color in
between blue and green). For the configural information, they
manipulated the position of the mouth (centered, tilted up-right, or a
position in between) and the location of the pupil within each eye
(centered, crossed, or a position in between). Here, changes to or from the
intermediate value (e.g., color of lips changed from pink to the color
between pink and red) were called ambiguous modifications because the
color difference was not easy to detect. Changes from one extreme to the
other extreme (e.g., red to pink lips) were called unambiguous
manipulations. Also, although there were two different featural
manipulations (colors of lips and irises), they always co-varied. In other
words, red lips always went with blue irises, pink with green, and so on.
The same is true for configural manipulations. The centered mouth
always went with centered eyes and so on. These variations give 9
different face stimuli. (The reason for co-variation of two featural or
configural changes is not clearly explained in Ingvalson and Wenger
(2005) but it seems it was done to maximize the effect size of each of the
featural or configural manipulations.) Subjects in Ingvalson and
Wenger’s (2005) experiment were required to identify differences
between two successively presented faces. The first stimulus, randomly
chosen from the 9 face stimuli, was presented for 500 ms followed by a
blank screen for about 400 ms. Then the second face stimulus, also
randomly chosen from the 9 stimuli, was presented for 75 ms. The second
stimulus was always of the same identity as the first one and it could
vary featurally, configurally, or not at all.
The overall result for human face stimuli was that subjects required
more time to judge when the second stimulus differed slightly (i.e.,
ambiguous changes) from the first, as expected. Response times were
slowest when featural and configural changes were both ambiguous,
fastest when the changes were unambiguous, and intermediate when
only one of the changes was ambiguous. This pattern remained the same
Fig. 7.10. Four stimulus examples used in Ingvalson & Wenger (2005). Top-left: human
face without configural changes (or with centered eyes and mouth). Top-right: human
face with unambiguous configural changes from the top-left. Bottom-left: schematic face
stimulus without any configural changes. Bottom-right: schematic face stimulus with
unambiguous configural changes from the bottom-left. From Ingvalson, E. M., &
Wenger, M. J., 2005. A strong test of the dual-mode hypothesis. Perception &
Psychophysics, 67, Fig. 2. Copyright 2005 Springer. Reproduced with permission.

for other conditions such as inverted faces and schematic faces. The
mean interaction contrasts were all positive for the face and face-like
stimuli, as predicted if the processes for featural and configural
information were concurrent and self-terminating (i.e., in an OR
network, see chapter 4). The survivor function interaction contrasts were
positive for all times t in all four subjects tested. This also is as predicted
if the processes were concurrent, and the response was made whenever
any of the concurrent processes detected a change (featural or configural)
between two face stimuli (i.e., the stopping rule of the parallel system is
self-terminating). Results support the dual-mode hypothesis, according
to which the processes are concurrent and self-terminating. With further
analyses, Ingvalson and Wenger (2005) also found support for unlimited
or super capacity, as predicted, but not for independence of rates of
featural and configural processes, contrary to prediction.
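The OR prediction tested here, an SIC that is positive at every t, can be verified numerically for the simplest parallel self-terminating model: a race between independent exponential featural and configural channels. The channel rates below are illustrative assumptions, not estimates from Ingvalson and Wenger's data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300_000
rate = {"L": 1.0, "H": 2.0}  # assumed channel completion rates per factor level

def or_rt(lev_f, lev_c):
    # Self-terminating parallel model: respond as soon as the first channel
    # (featural or configural) detects a change.
    feat = rng.exponential(1.0 / rate[lev_f], n)
    conf = rng.exponential(1.0 / rate[lev_c], n)
    return np.minimum(feat, conf)

rts = {(a, b): or_rt(a, b) for a in "LH" for b in "LH"}

def sic(t):
    s = {key: (sample > t).mean() for key, sample in rts.items()}
    return (s["L", "L"] - s["L", "H"]) - (s["H", "L"] - s["H", "H"])

print(all(sic(t) > 0 for t in (0.3, 0.7, 1.5)))  # positive at all sampled times
```

For this model the survivor function of the minimum factors into the product of the channel survivor functions, and the SIC reduces to a product of nonnegative terms, which is why it stays positive for all t.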
The results reported in Ingvalson and Wenger (2005) are very robust
in that the same parallel self-terminating system seems to work
commonly for the inverted and the schematic face stimuli in all four
subjects tested. One thing to point out, however, is that the evidence
from the survivor function interaction contrast only supports concurrent
processing of featural and configural information but not the concurrent
processing of two features (colors of lips and irises) or two
configurations. That is, assuming there are four processes for each face,
two featural and two configural, it is possible that the two processes for
color of lips and color of irises (i.e., two features) can be sequential
(since they are spatially separated in a face) but concurrent with the two
processes for configural information. Alternatively, all four processes
could be concurrent. Since two features (colors of lips and eyes) always
co-varied in Ingvalson and Wenger’s (2005) experiment (as well as two
configuration manipulations), their results cannot help us differentiate
these two possible alternatives.

Concluding Remarks

This chapter gives several examples of experimental factors that
selectively influence processes or nearly do so. Common findings are
that RT cumulative distribution functions satisfy stochastic dominance
(the usual stochastic order), and RT cumulative distribution function or
survivor function interaction contrasts display patterns predicted for
processing that is sequential or concurrent. But the data also reveal
difficulties. Not all subjects use the same
process organization, and the same subject may change from parallel to
serial processing as process difficulty increases. There is considerable
evidence that coactivation and cross-talk prevent factors from exerting
pure selective influence. Further work is needed on robustness: To what
extent can factors violate assumptions and yet reveal structure as if
selectively influencing processes?
Chapter 8

Modeling with Order of Processing Diagrams

In earlier chapters we showed how to represent cognitive tasks as
directed acyclic task networks and how to discover the structure of the
network by selectively influencing its processes. Now we show how to
estimate parameters of the network and moments of the response time.
To compute moments one needs to know not only the precedence
constraints among events, but also their actual realization as processing
evolves. The processes associated with two concurrent arcs in a directed
acyclic task network are allowed to be carried out simultaneously, but
they are not required to be, so one process might complete before the
other starts. Additionally, if capacity is limited, knowing the actual
rather than potential simultaneity of events is critical because the
resources available for each process may be reduced during simultaneous
execution (Townsend, 1972). We now describe the Order-of-Processing
(OP) diagram, a representation more informative than a directed acyclic
task network because it includes both precedence and actual
simultaneity.

AND Networks

OP diagrams have been proposed elsewhere to represent the evolution
over time of processing in PERT networks (Fisher and Goldstein, 1983;
Goldstein and Fisher, 1991, 1992; also see Kulkarni and Adlakha, 1986).
A PERT network is an AND network, that is, a directed acyclic task
network in which each process can start only when all its immediate
predecessors have finished. The time to complete the task is the sum of
the times to complete all the processes on the longest path through the

Modeling with Order of Processing Diagrams 257

network. We begin with AND networks because they are the most
common task networks. We show how to use OP diagrams to compute
the moments of response time for them. We then show modifications
required for directed acyclic task networks with OR gates.

Order-of-Processing Diagrams

Assume that a process X with starting event X′ and ending event X″ is
associated with each arc in a directed acyclic task network. A process
may be active, such as moving a limb. It may be more passive, such as a
communication delay. A process may indicate a precedence constraint.
The most common precedence constraint is that the ending of one
process must precede the start of another. Some other constraints can
easily be represented by arcs, for example, that the start of one process
must precede the start of another. Such precedence constraint arcs can
be considered processes of duration zero. We gain generality by
associating processes with these arcs as well as the arcs which represent
more standard processes (e.g., encoding, comparison and execution). If
an arc connects the start of process X to the start of process Y, then the
process corresponding to the arc may be identified as X′Y′; other
processes are denoted similarly. Let Z be the set of all processes.
Assume that there are |Z| such processes.

States

At each moment in the performance of a task it is possible to partition
the set Z of processes in a directed acyclic task network into three
sets: the completed, current and pre-active sets. Each distinct
partition will be referred to as a state, s. Let S be the set of all states.
The completed set K(s) of processes in state s is composed of all
processes which have terminated. The current set C(s) of processes in
state s is composed of all processes which are executing at the given
moment. And the pre-active set P(s) of processes is composed of all
other processes in Z. For a directed acyclic task network, there will be
one start state with an empty completed set and one end or finish state
with an empty current set.
Transitions

A transition is made from state s to some state, say t, as soon as a
process, say X, in the current set C(s) of state s completes. The
completed, current and pre-active sets of state t are then constructed as
follows. The completed set K(t) of state t consists of all processes in the
completed set K(s) of state s and process X, i.e., K(t) = K(s) ∪ {X}.
current set C(t) of state t is comprised of (a) the processes in the current
set C(s) of state s, except for the process X (which just completed) and
(b) possibly one or more processes in the pre-active set P(s) of state s.
The subset of pre-active processes that become current is determined by
the network, as follows. Suppose processes {Y1,...,Yi} in the pre-active
set P(s) are immediate successors of process X. If when process X in the
current set C(s) of state s completes, all processes that precede {Y1,...,Yi}
have completed, then processes {Y1,...,Yi} are added to the current set
C(t) of state t; otherwise no processes are added to the current set. The
pre-active set P(t) of state t is comprised of whatever processes remain.
Note that we assume here that two or more processes do not complete at
the same time (or equivalently, that the probability of two or more
processes finishing simultaneously is zero). We describe a method for
handling the special case where several processes finish simultaneously
later (also see Goldstein and Fisher, 1991).
Here is an example. Assume there are two processes in series, X and
Y, and suppose process X completes before process Y begins, i.e., event
X″ precedes event Y′. Then, the processing can readily be represented as
a directed acyclic task network (Figure 8.1a). The arc from X″ to Y′
represents a communication, and may have 0 duration. The OP diagram
representation of the network is in Figure 8.1b. The current set of
processes is displayed within each state. For example, C(s2) = {X″Y′}.
The process that completes when a transition is made between states
which are next to one another on a path is displayed on the arc
connecting the adjacent states. For example, process X completes when
a transition is made from state s1 to state s2. Process X″Y′ completes
when a transition is made from state s2 to state s3. Note that the
completed set K(si) can be constructed for a given state si by examining
Fig. 8.1. Two processes, X and Y, in series represented in (a) a task network, and (b) an
OP diagram. The start and end vertices of processes X and Y in the task network are
labeled (X′, X″) and (Y′, Y″), respectively.

all processes which complete along any one path to that state. Thus, for
example, the completed set of state s3 will contain processes X and X″Y′,
i.e., K(s3) = {X, X″Y′}.
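The state-construction rule just described is mechanical enough to automate. The sketch below is our illustration (with hypothetical process names, and with zero-duration communication arcs omitted for brevity): it enumerates the reachable states of the OP diagram for an AND network from each process's set of immediate predecessors.

```python
def op_states(preds):
    """Enumerate (completed, current) states of the OP diagram for an AND
    network.  preds maps each process to its set of immediate predecessors;
    the pre-active set is implicit (everything not completed or current)."""
    procs = set(preds)
    start = (frozenset(), frozenset(p for p in procs if not preds[p]))
    states, stack = {start}, [start]
    while stack:
        done, cur = stack.pop()
        for x in cur:  # transition: process x in the current set completes first
            done2 = done | {x}
            waiting = procs - done2 - (cur - {x})
            # Pre-active processes whose predecessors have all completed
            # move into the current set of the successor state.
            ready = {y for y in waiting if preds[y] <= done2}
            nxt = (frozenset(done2), frozenset((cur - {x}) | ready))
            if nxt not in states:
                states.add(nxt)
                stack.append(nxt)
    return states

# Two processes in series (Figure 8.1): start, {Y current}, finish.
print(len(op_states({"X": set(), "Y": {"X"}})))  # 3
# Two concurrent processes: either may finish first.
print(len(op_states({"X": set(), "Y": set()})))  # 4
```

Run on the double stimulation network discussed later (X2 preceded by both X1 and X3), the same function yields the five states of its OP diagram.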

Computing moments of the response time

Once the task network has been represented as an OP diagram, the
moments of the time to respond can be computed from the joint
distribution of the process durations. We start with the case where the
durations of the processes are independent and have exponential
distributions. Next, we describe the case where the durations are
independent and have general gamma distributions (McGill & Gibbon,
1965). Data analyses by Ashby and Townsend (1980) and Kohfeld,
Santee, and Wallace (1981) support these distributions (i.e., the
exponential and general gamma) for process durations, although
Sternberg (1964) has evidence to the contrary. See Luce (1986) for
further discussion of these distributions. We end this section by
describing some results in the case where both constraints (the
assumption of independence and the assumption that the distributions of
the durations of the processes are general gamma) are removed.

Process durations: Independent exponential

Suppose durations of the processes are independent exponential random
variables (the simplest case of general gamma random variables). The
algorithm is recursive and requires definition of several terms.
Specifically, let |S| be the number of states in the OP diagram. Define Si
as the set of indices of all states which are immediate successors of state
si. Label the last state s|S|. Label remaining states in such a fashion that if
some state, say sj, is a successor of some other state, say si, then the index
j of the successor state is larger than the index i of its predecessor. Let
the duration of process Xk be the random variable Xk (the same symbol
for both). A value of Xk will be denoted as xk. Define T(i) as the time to
complete all processes in the current set of state si or successors of state
si, given that state si is entered. Let qk represent the rate parameter for
process Xk, i.e., let the duration of process Xk have density function,

f(x_k) = q_k e^{-q_k x_k}, \quad 0 \le x_k,      (8.1)
f(x_k) = 0, \quad \text{otherwise}.

Set qij equal to the rate parameter of the process which completes
when a transition is made from state si to state sj. And set qii equal to the
sum of the rate parameters of the processes active in the current set C(si)
of state si. (Note that i and j are used to index states here; previously
they were used to index the levels of different factors.)
Then it can be shown that the mth response time moment E[T(1)^m] can
be computed recursively by first computing in order E[T(|S|)],
E[T(|S| − 1)], ..., E[T(1)], then computing in order E[T(|S|)^2],
E[T(|S| − 1)^2], ..., E[T(1)^2], and so on, where E[T(i)^m] is defined as
follows (e.g., see
Howard, 1971, Equation 11.11.13, page 735; Kulkarni & Adlakha, 1986,
Equation 1, page 774):

E[T(i)^m] = \frac{m\,E[T(i)^{m-1}] + \sum_{j \in S_i} q_{ij}\,E[T(j)^m]}{q_{ii}}, \quad 1 \le i < |S|,      (8.2)

E[T(|S|)^m] = 0.

And for m = 0, define E[T(i)^0] = 1 for all i.


For example, suppose that there are only two processes in series, X
and Y, with respective rate parameters α and β. Then, there are three
states in the OP diagram (assuming the transmission process, X″Y′, is a
dummy process): X is current in state s1, Y is current in state s2, and no
processes are current in state s3. The first moment is easily computed:

E[T(3)] = 0;

E[T(2)] = \frac{1 + q_{23} E[T(3)]}{q_{22}} = \frac{1 + 0}{\beta} = \frac{1}{\beta};

E[T(1)] = \frac{1 + q_{12} E[T(2)]}{q_{11}} = \frac{1 + \alpha/\beta}{\alpha} = \frac{1}{\alpha} + \frac{1}{\beta}.

The second moment follows in a similar fashion:

E[T(3)^2] = 0;

E[T(2)^2] = \frac{2 E[T(2)] + q_{23} E[T(3)^2]}{q_{22}} = \frac{2/\beta + 0}{\beta} = \frac{2}{\beta^2};

E[T(1)^2] = \frac{2 E[T(1)] + q_{12} E[T(2)^2]}{q_{11}} = \frac{2\left(\frac{1}{\alpha} + \frac{1}{\beta}\right) + \alpha \cdot \frac{2}{\beta^2}}{\alpha} = \frac{2}{\alpha^2} + \frac{2}{\alpha\beta} + \frac{2}{\beta^2}.

If α = β = 1, then E[T(1)] = 2, E[T(1)^2] = 6 and VAR[T(1)] = E[T(1)^2] −
E[T(1)]^2 = 2.
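The recursion of Equation (8.2) translates directly into code. The following is a minimal sketch, our illustration, for the two-process series just worked, using the state indexing and rate bookkeeping defined above.

```python
def op_moments(succ, q, M):
    """E[T(i)^m] for m = 1..M via the recursion of Equation (8.2).
    succ[i]   : immediate-successor states of state i
    q[(i, j)] : rate of the process that completes on the i -> j transition
    q[(i, i)] : sum of rates of the processes current in state i
    The final state carries the largest index and has E[T] = 0."""
    n = max(j for js in succ.values() for j in js)  # index of the final state
    E = {(i, 0): 1.0 for i in range(1, n + 1)}      # E[T(i)^0] = 1 for all i
    for m in range(1, M + 1):
        E[(n, m)] = 0.0
        for i in range(n - 1, 0, -1):               # work backward from the end
            E[(i, m)] = (m * E[(i, m - 1)]
                         + sum(q[(i, j)] * E[(j, m)] for j in succ[i])) / q[(i, i)]
    return E

# Two processes in series with rates alpha = beta = 1 (three states):
succ = {1: [2], 2: [3]}
q = {(1, 1): 1.0, (1, 2): 1.0, (2, 2): 1.0, (2, 3): 1.0}
E = op_moments(succ, q, 2)
print(E[1, 1], E[1, 2])  # 2.0 6.0, so VAR[T(1)] = 6 - 4 = 2, as in the text
```

With α = 1 and β = 2 the same function gives E[T(1)] = 1/α + 1/β = 1.5, matching the first-moment formula above.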

Process durations: Independent general gamma

We can use the above methods when the duration of any one or more
processes is a general gamma random variable by representing each
process X as a series of k subprocesses, X1, ..., Xk, where the durations of
all the subprocesses are independent and exponentially distributed.
While the assumption that one or more processes has a general gamma
distribution makes the OP diagram more complex, it does not change the
algorithm for computing the moments of response time. And still more
generally, we could assume that the duration of a single process X is
equal to a probability mixture of general gamma random variables,
where the probability mixture is generated by replacing a single process
X by a complex task network of concurrent and sequential processes.
Again, this requires no change in our methods. General gamma random
variables turn out to be quite useful because their distribution often more
closely approximates the distributions of cognitive process durations than
do exponentials.
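The subprocess construction is easy to check numerically: a series of k independent exponential subprocesses has exactly a general gamma (Erlang) distribution with mean k/rate and variance k/rate². The parameters below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
k, rate, n = 3, 2.0, 200_000  # assumed number of subprocesses and common rate

# Process X replaced by k exponential subprocesses executed in series:
chain = rng.exponential(1.0 / rate, size=(n, k)).sum(axis=1)

print(abs(chain.mean() - k / rate) < 0.02)      # mean     k/rate   = 1.5
print(abs(chain.var() - k / rate**2) < 0.02)    # variance k/rate^2 = 0.75
```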

Process durations: General case

We have shown that when durations of processes are independent,
general gamma random variables we can compute the moments of the
response time. However, the procedure cannot be used when the process
durations are dependent or when the distribution of these durations has
some form other than the general gamma. For example, in an application
to follow, an arc in a task network represents an interstimulus interval
with a constant duration. The constant duration does not have a general
gamma distribution, of course.
Briefly, define a path through an OP diagram as a sequence of states,
beginning at the start state and ending at the finish state, such that if state
t follows state s in the sequence, then state t is an immediate successor of
state s in the OP diagram. Let H be the complete set of paths. Suppose
there are |H| different paths through the OP diagram. Let each path
through the OP diagram be given an index h, 1 ≤ h ≤ |H|. Let P be a
random variable whose value is the index of the path taken on a given
trial. Let RT equal the time it takes to complete all processes in the OP
diagram (RT equals the response time T(1) defined above). Let E[RT] be
the mean response time. Let E[RT|P = h] be the mean time to complete
all processes, given that the hth path is taken. And let Prob(P = h) be the
probability that path h is taken. Then, if there are |H| paths in the OP
diagram, it follows immediately that the moments of the response time
can be written as follows.

E[RT^m] = \sum_{h=1}^{|H|} E[RT^m \mid P = h]\,\mathrm{Prob}(P = h).

This equation makes intuitively clear the general procedure one needs
to follow in order to obtain the expected completion time. This procedure
is described in detail in Goldstein and Fisher (1991). Here, we want only
to indicate the required computations when the durations are
independent.
To begin, let D represent the vector of process durations <X1,..., X|X|>.
Define fX(x1,..., x|x|) as the joint density function of the durations of the
processes in the task network. We assume that this density function is
known or can be estimated. Define Xhj as the process which completes
first in state shj, the jth state along path h. Define Thj as the duration of the
jth state along path h and set it equal to the duration of the first process to
complete in the jth state, i.e., Xhj, minus the duration of each of the
preceding states in which this process was current:

T_{h1} = X_{h1},
T_{hj} = X_{hj} - \sum_{i=1}^{j-1} a_{ij} T_{hi},      (8.3)

where aij = 1 if process Xhj is current in state shi; otherwise aij = 0.


The expected completion time can now be written as the quantity:

E[RT] = \sum_{h=1}^{|H|} \int \cdots \int_{(x_1, \ldots, x_{|X|}) \in R_h} g_h\, f_X(x_1, \ldots, x_{|X|})\, dx_1 \cdots dx_{|X|},      (8.4)

where gh = th1 + th2 + ... + thh', where h' is the index of the last state on
path h prior to the finish state, and where Rh is the region where each of
the state durations on path h is positive, i.e., Thj > 0, j = 1,..., h'. In order
to perform the integration, we need to rewrite the quantities thj and the
region Rh in terms of the Xi. First, the quantities thj can be rewritten in
terms of the Xi using Equation (8.3). For example, without loss of
generality, assume that the processes are numbered in such a way that
process Xi completes in the ith state along path h. Then, th1 = x1, th2 = x2 −
a12th1 = x2 − x1 (assuming that the second process is in the current set of
the first state), and so on. Second, note that given Equation (8.3), the
region Rh can be written as a system of inequalities where the duration of
each process must be greater than a linear function of the durations of the
other processes in X. For example, Th1 > 0 implies xh1 > 0; Th2
> 0 implies xh2 − a12Th1 = xh2 − a12xh1 > 0; and so on.
It may be instructive at this point to work through an example.
Consider a double stimulation task where some constant time (the
interstimulus interval) intervenes between the presentation of the first
and second stimulus. A very simple representation of processing in this
case as a directed acyclic task network is displayed in Figure 8.2a. The
corresponding OP diagram is in Figure 8.2b. In this figure, X1 represents
the processing of the first stimulus, X2 represents the processing of the
second stimulus, and X3 represents the “processing” or communication
delay of the interstimulus interval. The processing of the second
stimulus cannot be initiated until both the interstimulus interval has
elapsed (a logical requirement) and the first stimulus has completed
processing (the hypothesis). Assume that the durations X1 and X2 of,
respectively, processes X1 and X2 are independent uniform [300, 400]
random variables and the duration X3 of process X3 is a constant equal to
c, where 300 < c < 400.
Then, given the above assumptions we can write the joint density as
follows: fX(X1, X2, X3) = 1/10000 for 300 < X1, X2 < 400 and X3 = c; fX(X1,
X2, X3) = 0 otherwise. Let the top path in the OP diagram (Figure 8.2b)
be path 1. The states on this path all have a positive duration when X3 <
X1 < 400 and 300 < X2 < 400 (these assumptions define R1). Similarly,
consider the bottom path, path 2. The states on this path all have a
positive duration when X1 < X3 < 400 and 300 < X2 < 400 (these
assumptions define R2). From Equation (8.4) we compute the expected
Modeling with Order of Processing Diagrams 265

completion time as:

E[RT] = \int_{x_2=300}^{400} \int_{x_1=c}^{400} (t_{11} + t_{12} + t_{13}) f_X(x_1, x_2, x_3) \, dx_1 \, dx_2

      + \int_{x_2=300}^{400} \int_{x_1=300}^{c} (t_{21} + t_{22} + t_{23}) f_X(x_1, x_2, x_3) \, dx_1 \, dx_2 .

Note that for path 1 we have from Equation (8.3), t11 + t12 + t13 = x3 +
(x1 − x3) + x2 = x1 + x2, and for path 2, t21 + t22 + t23 = x1 + (x3 − x1) + x2 =
c + x2. For example, substituting into the above equations, setting c =
300 and integrating, we find E[RT] = 700.

Fig. 8.2. Two concurrent processes, X1 and X3, represented in (a) a task network and (b)
an OP diagram.

Of course, it is not always the case that the integration required in
Equation (8.4) will be easy to carry out, even numerically. In such cases,
simulation affords a straightforward alternative. For example, consider
again the model in Figure 8.2. Here one would choose a distribution for
X1, X2 and X3. The duration of X3 would be a constant, the length of the
interstimulus interval. The simulation of the response time to the second
stimulus can be determined directly from the OP diagram. Let RT2
denote the time from the onset of the first stimulus to the response to the
second stimulus. Then RT2 = X1 + X2 if X1 > X3; otherwise RT2 = X3 + X2.
This information is present in the task network as well, though perhaps
not as easily retrieved.
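A minimal Monte Carlo sketch of this simulation (the sample size and seed are arbitrary choices, not values from the text):

```python
import random

def simulate_mean_rt(c, n=200_000, seed=1):
    """Estimate E[RT] for Figure 8.2: X1, X2 independent uniform(300, 400),
    X3 = c constant; RT2 = X1 + X2 if X1 > X3, otherwise X3 + X2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x1 = rng.uniform(300, 400)
        x2 = rng.uniform(300, 400)
        total += (x1 + x2) if x1 > c else (c + x2)
    return total / n

print(simulate_mean_rt(300))  # close to the exact value 700
```

With c = 300 the estimate agrees with the exact value E[RT] = 700 computed above.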

OR Networks

An OR network is a directed acyclic task network in which each process
can start as soon as any of its immediate predecessors has finished. The
time to complete the task is the sum of the durations on the shortest path
through the network. OR networks can be represented as OP diagrams
embedded in the related representation of the AND network with the
same form. An example can make this embedding clear. Consider the
longest path task network in Figure 8.3a. Both processes X1 and X2 must
finish before process X3 begins. The related OP diagram is displayed in
Figure 8.3b.
Now, suppose the vertex at the start of process X3 were an OR gate
instead of an AND gate. The resulting OR network is in Figure 8.3c (the
OR gate is represented as an open rather than a filled circle). Process X3
now begins as soon as either process X1 or X2 completes. Note that we
can continue to use the same OP diagram (Figure 8.3b) to represent the
OR network with the proviso that when the top path is taken state s2 is
considered a dummy state (i.e., a state of zero duration) and when the
bottom path is taken state s3 is considered a dummy state. This follows
because if process X2 finishes before process X1 then the time it takes to
complete process X1 in state s2 is not relevant. And similarly, if process
X1 finishes before process X2, then the time it takes to complete process
X2 in state s3 is not relevant. More generally, if some path, say h, is
taken, then it must be determined for each state along the path whether it
is a dummy state.
Fig. 8.3. (a) Longest path task network with two concurrent processes, X1 and X2. (b)
The associated OP diagram. (c) A shortest path task network (the OR gate is indicated by
an open circle).

In order to compute the moments of the response time, we apply
Equation (8.4) suitably modified. By suitably modified, we mean that
the sum gh excludes the duration of all dummy states along path h. For
example, since states s2 and s3 are dummy states in the OP diagram for
the shortest path task network in Figure 8.3c, neither T12 nor T13 appears,
respectively, as part of g1 or g2. Thus, we want to evaluate the sum of
multiple integrals below:

E[RT] = \int_{x_3=0}^{\infty} \int_{x_1=0}^{\infty} \int_{x_2=0}^{x_1} (t_{11} + t_{14}) f_X(x_1, x_2, x_3) \, dx_2 \, dx_1 \, dx_3

      + \int_{x_3=0}^{\infty} \int_{x_2=0}^{\infty} \int_{x_1=0}^{x_2} (t_{21} + t_{24}) f_X(x_1, x_2, x_3) \, dx_1 \, dx_2 \, dx_3 ,

where for h = 1, t11 = x2 and t14 = x3, and for h = 2, t21 = x1 and t24 = x3.
The above computations are quite simple when the durations of the
processes are independent exponentials (or sums of independent
exponentials). Details are in Fisher and Goldstein (1983). Computations
become considerably more complex when distributions are other than
exponential.
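For instance, for the shortest path network of Figure 8.3c with independent exponential durations, the completion time is min(X1, X2) + X3, and simulation agrees with the closed form 1/(rate1 + rate2) + 1/rate3. The rates, sample size and seed below are arbitrary illustrative values:

```python
import random

def or_network_mean(rate1, rate2, rate3, n=200_000, seed=2):
    """Simulate the OR network of Figure 8.3c: X3 begins as soon as
    either X1 or X2 completes, so RT = min(X1, X2) + X3."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += (min(rng.expovariate(rate1), rng.expovariate(rate2))
                  + rng.expovariate(rate3))
    return total / n

# The minimum of independent exponentials is exponential with the summed
# rate, so E[RT] = 1/(rate1 + rate2) + 1/rate3 = 1/0.03 + 100 here.
print(or_network_mean(0.01, 0.02, 0.01))  # close to 133.3 + 100/... i.e. about 133.3
```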

Application: The Psychological Refractory Period

In this section we apply the integrative chronometric analysis described
above to two studies, Smith (1969) and Pashler and Johnston (1989).

Smith (1969)

In Chapter 5, we summarized qualitative evidence that processes in the
dual task experiment by Smith (1969) could be represented in an AND
network; see Table 5.2. Specifically, in the Single Central Bottleneck
model in Figure 5.2, increasing the number of alternatives for the first
stimulus increased the duration of process B1, response selection for
Stimulus 1, and increasing the interstimulus interval (ISI) increased the
duration of the “process” between presentation of s1 and presentation of
s2. We turn here to finding parameter values and predicting response
times. We assume durations of processes have independent exponential
distributions, and that the ISI is a constant as given in each condition.
The model has seven processes. With a mean for each, there are far
too many parameters to estimate, so we must make simplifying
assumptions. The experiment provides no data allowing separate
estimates of the duration of processes A1 and B1, so let us concatenate
these and consider them as a single process a. Likewise, let us
concatenate processes B2 and C2 into a single process d. Finally, let us
suppose the duration of process A2 is small enough to treat as 0. To
avoid subscripts and labels with double letters, let us denote process C1
as b and process SWa as c. Let the rate for process a be denoted qa, the
rate for process b, qb, and so on. The value of the ISI is denoted by I and
the number of alternatives is denoted by j. The rate parameter of process
a with j alternatives is denoted qa(j).
Set E[RT1(j)] equal to the time to respond to the first stimulus when
there are j alternatives. Set E[RT2(I, j)] equal to the time to respond to
the second stimulus when there are j alternatives and set the value of the
interstimulus interval equal to I. Then we obtain E[RT1(j)] directly from
the AND network in Figure 8.4a:

E[RT1(j)] = \frac{1}{q_a(j)} + \frac{1}{q_b} .

We obtain E[RT2(I, j)], measured from the onset of the first stimulus
(that is, the quantity RT2 + ISI reported in Table 8.1), by applying
Equation (8.4) to the OP diagram in Figure 8.4b (see the Appendix for
more details):

E[RT2(I, j)] = I + \frac{1}{q_c - q_a(j)} \left[ \frac{q_c}{q_a(j)} e^{-I q_a(j)} - \frac{q_a(j)}{q_c} e^{-I q_c} \right] + \frac{1}{q_d} .

A grid search was used to find the best-fitting parameters. (The
denominator qc − qa(j) was nonzero for each j = 2, 4, 8.) As a check, the
partial derivatives with respect to each parameter were produced with
MACSYMA, and parameter estimates minimizing the sum of the
squared error were found with the nonlinear regression program SAS
NLIN (SAS Institute, 1985). The parameter estimates from the grid
search were used as starting values in the nonlinear regression program.
The following reciprocals of the parameter estimates gave the best fit to
the observations in Table 5.2: 1/qa(2) = 25, 1/qa(4) = 125, 1/qa(8) = 200,
1/qb = 443, 1/qc = 278, 1/qd = 354. These are virtually identical to the
parameter estimates found by the grid search. With these estimates, the
model accounts for 99.4 percent of the variance.
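Under the model's assumptions (durations a and c exponential with rates qa(j) and qc, A2 = 0, d exponential with rate qd, and the ISI constant at I), the expected time from first stimulus onset to the second response reduces to I + E[max(a + c − I, 0)] + 1/qd. The sketch below evaluates this closed form at the reciprocal-rate estimates above; small discrepancies from Table 8.1 reflect rounding of the estimates.

```python
import math

def predicted_rt1(inv_qa, inv_qb=443):
    """E[RT1(j)] = 1/qa(j) + 1/qb, with 1/qb fixed at its estimate."""
    return inv_qa + inv_qb

def predicted_rt2_plus_isi(isi, inv_qa, inv_qc=278, inv_qd=354):
    """Expected time from first-stimulus onset to the second response:
    I + E[max(a + c - I, 0)] + 1/qd, with a ~ exp(qa(j)), c ~ exp(qc)."""
    qa, qc, qd = 1 / inv_qa, 1 / inv_qc, 1 / inv_qd
    excess = ((qc / qa) * math.exp(-isi * qa)
              - (qa / qc) * math.exp(-isi * qc)) / (qc - qa)
    return isi + excess + 1 / qd

print(round(predicted_rt1(25)))               # 468, as in Table 8.1
print(round(predicted_rt2_plus_isi(50, 25)))  # 659, as in Table 8.1
```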
Fig. 8.4. Processing in the double stimulation task run by Smith (1969) represented as: (a)
a PERT network and (b) an OP diagram. (Note that the response to the first stimulus is
emitted when process b completes and that the response to the second stimulus is emitted
when process d completes.)

Predicted values are quite close to observed values for RT2; see Table
8.1. For example, the differences between observed and predicted values
at ISIs of 50, 150, 300 and 500 ms are, respectively, 6, 2, 8 and 9 ms
when the number of stimulus 1 alternatives is 2. Errors are larger for
RT1. For example, the differences between the observed and predicted
values at ISIs of 50, 150, 300 and 500 ms are, respectively, 12, 9, 24 and
9 ms when the number of stimulus 1 alternatives is 2. In the model, the
processes associated with the first stimulus, a and b, were assumed to be
unaffected by the size of the interstimulus interval, since Smith (1969)
found no significant effect of the interstimulus interval on RT1. In fact,
the observed response times in each column are numerically different,
perhaps in a systematic way. The differences may be a modest effect of
expectancy; perhaps the subject is best prepared for the second stimulus
when the warning provided by the first is neither too short nor too long.
Preparation for the second stimulus may interfere with processing of the
first. But the errors in prediction are too small to pursue here.
Simplifying assumptions were made because there are far more
parameters than observations.

Table 8.1
Observed (obs.) and Predicted (pred.) Reaction Times: Smith (1969)

RT1                       Stimulus 1 Alternatives
                  2               4               8
  ISI        obs.  pred.     obs.  pred.     obs.  pred.
   50        480   468       594   568       666   643
  150        459   468       582   568       639   643
  300        444   468       561   568       635   643
  500        459   468       571   568       628   643

RT2+ISI                   Stimulus 1 Alternatives
                  2               4               8
  ISI        obs.  pred.     obs.  pred.     obs.  pred.
   50        665   659       766   757       835   832
  150        680   682       769   768       836   839
  300        750   758       809   816       889   876
  500        913   904       929   936       963   976

One path from s1 to B2 in the Single Central Bottleneck model has
duration A1 + B1 + SWa. We could have made the simplifying
assumption that the process SWa is a dummy process with duration 0.
But we did not, and its estimated value is not
small, 1/qc = 278 ms. Using a quite different estimation procedure,
drawing on work by Ulrich and Miller (1997), Schwarz and Ischebeck
(2001) in effect preset the duration of SWa to 0 and obtained a good fit to
the data. A consequence was a larger estimate of A1 + B1 than found
here. With their procedure, they estimate A1 + B1 − A2 has values 235,
352 and 431 ms for number of Task 1 alternatives 2, 4, and 8,
respectively. They assumed the sum (A1 + B1) and A2 have a bivariate
normal distribution.
The two procedures give somewhat close estimates for the expression
A1 + B1 + SWa − A2. With their procedure, SWa = 0, leading to above
estimates 235, 352 and 431 ms. With our procedure, A2 = 0, leading to
corresponding estimates 303, 403 and 478 ms.

Pashler and Johnston (1989)

As another example of how an OP diagram can be helpful, we use one to
represent the Response Selection Bottleneck model of Figure 5.2 and to
estimate the 18 means and variances (averaged across repetitions) in
Experiment 1 of Pashler and Johnston (1989), see Table 5.3. Details of
the modeling are in the Appendix. The major practical problem is that
since each process (except the SOA) has a mean and standard deviation
which must be estimated, the number of potentially free parameters is
large, even when data are available from two responses. To reduce this
number, we concatenated processes A1 and B1 into one process, A1B1,
and we concatenated processes B2 and C2 into one process, B2C2. We
assumed processes A1B1, C1, B2C2, and SWa were unaffected by
manipulations in intensity and SOA. We assumed the duration of
process A2 was affected by the manipulation of intensity (but was not
affected by the manipulation of the SOA). Finally, we assumed the
duration of process I was equal to the SOA. This left us with seven
process durations: A1B1, C1, A2(high intensity), A2(low intensity),
B2C2, SWa, and I. We assumed each of these processes except I has a
gamma distribution. With further reductions described in the Appendix,
the final fit was determined by specification of four parameters: the three
scale parameters for processes A1B1, A2(high intensity), and SWa, and
the shape parameter for process SWa; see the Appendix for details.
The 18 predicted and observed means and the 18 predicted and
observed standard deviations of the response times are in Table 8.2. The
6 predicted means and 6 predicted standard deviations of the response
time RT1 to the first stimulus in the double stimulation task were set
equal to, respectively, the mean response time and standard deviation of
RT1, averaged over SOA and intensity, because the SOA and intensity
levels had no significant effect on RT1. Similarly, the 3 predicted means
and 3 predicted standard deviations of the response time RT2* to the
second stimulus when it was presented at high intensity and responded to
alone were set equal to the corresponding mean and standard deviation of
RT2*, averaged over SOA. The same was done for the predicted mean
and variance of RT2* at low intensity. This was done because intensity
had an evident effect on RT2*, while SOA did not (Pashler and Johnston,
1989, Figure 3; no relevant direct statistical tests were reported).


It is clear from Table 8.2 that the model fits well on the whole. The
model explains 99.8% of the variance. The biggest discrepancy (in both
predicted means and standard deviations) occurs when the SOA is 400
ms. However, the absolute differences are moderate, a difference of 43
ms between the observed (1004) and predicted (961) mean response
times and a difference of 24 ms between the observed (160) and
predicted (136) standard deviations.
It is worth noting that this model can explain the large increase in the
variability in the time RT2 to respond to the second stimulus in the
double stimulation task compared with the single task. Consider
conditions where the second stimulus is presented at high intensity, after
a 50 ms SOA. The average observed standard deviation of the time
RT2* to respond to the second stimulus alone (i.e., when no response is
required to the first stimulus) is 67. The observed standard deviation of
the time RT2 to respond to the second stimulus when a response is
required to the first stimulus is nearly tripled, 172. The predicted
standard deviation increases likewise, from 67 to 177. Similar increases
occur in observed and predicted standard deviations of time to respond to
the second stimulus when presented at low intensity.

Generalization to Other Cognitive Networks

Three points of interest remain. First, we show how one can sometimes
represent two alternative kinds of networks, connectionist and queueing
networks, as directed acyclic task networks. Second, we show how to
incorporate resource constraints into response time modeling. Finally, we
talk briefly about cognitive behavior that cannot be represented as a
directed acyclic task network.

Connectionist networks

Table 8.2
Observed (obs.) and Predicted (pred.) Reaction Times: Pashler and Johnston (1989)

RT1: First Stimulus
                    Mean                     Standard Deviation
             High         Low             High         Low
  SOA    obs.  pred.  obs.  pred.     obs.  pred.  obs.  pred.
   50    590   589    591   589       139   136    149   136
  100    583   589    583   589       124   136    135   136
  400    589   589    596   589       131   136    134   136

RT2: Second Stimulus (With Response to First Stimulus Required)
                    Mean                     Standard Deviation
             High         Low             High         Low
  SOA    obs.  pred.  obs.  pred.     obs.  pred.  obs.  pred.
   50    907   899    911   909       172   177    189   183
  100    901   897    912   904       176   169    178   174
  400   1004   961   1038  1005       157   140    160   136

RT2*: Second Stimulus (Alone)
                    Mean                     Standard Deviation
             High         Low             High         Low
  SOA    obs.  pred.  obs.  pred.     obs.  pred.  obs.  pred.
   50    505   495    563   549        77    67     73    75
  100    493   495    547   549        69    67     84    75
  400    487   495    539   549        52    67     66    75

Many cognitive tasks are represented as connectionist networks, including
word recognition (McClelland and Rumelhart, 1986; Rumelhart and
McClelland, 1986), visual search (Phaf, Van der Heijden and Hudson,
1990) and categorization (Busemeyer & Myung, 1988; Gluck & Bower,
1988). Two criticisms of the use of such networks are considered here.
First, the network structure is usually presupposed, and it is not always
clear how to test that two processes in a network were arranged, say, in
parallel rather than in series. Second, only a few studies have used
connectionist networks to model reaction times in detail (e.g., Cohen,
Dunbar, & McClelland, 1990; Liu, Holmes & Cohen, 2008; Ratcliff, Van
Zandt & McKoon, 1999) and it is often not clear how to predict the time
it takes to perform a task represented as a connectionist network. We
address these two problems for a subset of connectionist networks, one
and two layer networks without feedback of the activation, and where the
threshold gates are logically equivalent to AND or OR gates. Although
learning in connectionist networks is one reason they have received so
much attention, we are interested here in the network only after learning
has occurred because our concern is with the final architecture, not the
architecture as it evolves.
A simple one layer net without feedback is constructed as follows.
The net consists of a set of input and output nodes. (The set of input
nodes customarily does not count toward the number of layers.) Each
input node ni (i = 1,..., I) is connected to all output nodes nj (j = 1,..., J).
Associated with the connection or arc between input node ni and output
node nj is a weight wij. Activation is initiated at the input nodes by one
of the K input patterns pk (k = 1,..., K). An input pattern is a vector of I
components, the ith component being the input to node ni. The activation
aj at an output node nj is a weighted sum of the inputs a1,..., aI to nj:

a_j = \sum_{i=1}^{I} w_{ij} a_i .

For example, a one layer net with three input nodes (n1, n2, n3) and
three output nodes (n4, n5, n6) is represented in Figure 8.5a. Each input
node is connected to every output node. If w14 = w24 = w34 = 1 and if a1 =
0, a2 = 1, and a3 = 0, then using the above equation the activation at node
n4 is a4 = 1. Assume that a response is made by output node nj when the
activation at the node exceeds some threshold dj.
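The activation rule is a single matrix–vector product. In the sketch below, the first column of weights encodes w14 = w24 = w34 = 1 from the example in the text; the remaining columns are arbitrary illustrative values:

```python
def activations(weights, inputs):
    """Activation at each output node: a_j = sum_i w[i][j] * a_i,
    where w[i][j] is the weight from input node i to output node j."""
    n_out = len(weights[0])
    return [sum(row[j] * a for row, a in zip(weights, inputs))
            for j in range(n_out)]

w = [[1, 1, 0],   # weights from n1 to n4, n5, n6
     [1, 0, 0],   # weights from n2
     [1, 0, 1]]   # weights from n3
print(activations(w, [0, 1, 0]))  # [1, 0, 0]: activation 1 arrives at n4
```

A response is then read off by comparing each entry against the corresponding threshold dj.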
Initially, the weights on the arcs are set randomly. The input patterns
are divided into subsets, a separate response being required for each
different subset. Assume for the sake of simplicity that there are J
different subsets of patterns, J output nodes and I = J input nodes. If the
input patterns satisfy certain assumptions, then the network can be
trained to make an appropriate response for each pattern in each subset.
Suppose that the network has been so trained. Then, if learning is
successful, for each input pattern pk in the set of trained patterns the
activation at exactly one of the output nodes, say node nj, will be above
threshold; the activation at all other nodes will be below threshold. Thus,
the time to complete processing in the neural net will depend only on the
time it takes to increase the activation at node nj above threshold.

Fig. 8.5. (a) Representation of a one layer connectionist net with three input and three
AND output nodes; (b) the corresponding functional net when the output is n4; (c) the task
network representation of the functional net; (d) the task network representation of the
functional net if n4 were an OR node.
The threshold gates used in connectionist models are sometimes
logically equivalent to AND gates (Williams, 1986). For example,
consider the one layer net in Figure 8.5a. Assume that each activation ai
(i = 1, 2, 3) is either a 0 or a 1. And assume that node n4 has a threshold
of 3. Then n4 is equivalent to an AND gate. If an input pattern is
presented which increases above threshold just the activation at node n4,
then only the connections to node n4 are functional for this input pattern.
The resulting functional network is much simpler than the original one
and is formed from the connections in Figure 8.5b. Given the above
assumptions, this functional network can be interpreted as a simple
parallel AND network. To see this, let xij represent the process associated
with the transmission of activation from input node ni to output node nj.
Then, node n4 will not respond until all of the processes x14, x24 and x34
have completed. As an AND network, this is graphed as in Figure 8.5c.
The distribution of the response time in this case requires computing the
maximum of the time to complete processes x14, x24 and x34. It is clear
that the results generalize to single layer nets with any number of input
and output nodes.
One or more of the nodes in the network under consideration could
just as easily have been an inclusive OR node. Such a node fires when
any one or more of its inputs are active. For example, if node n4 were an
inclusive OR node it would fire as soon as it received activation from
node n1, n2 or n3. This relation between an output node and its input
nodes can also be diagrammed as a directed acyclic network (Figure
8.5d). The duration of this process is simply the minimum of the time to
complete processes x14, x24 and x34. For more discussion of activation
functions realizing the Boolean AND and OR functions, see Williams
(1986).
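In terms of response times, the two gate types differ only in replacing a maximum by a minimum. A sketch, with three arbitrary illustrative transmission durations:

```python
def and_response_time(x14, x24, x34):
    """AND node n4 (Figure 8.5c) fires only when all three
    transmission processes have completed."""
    return max(x14, x24, x34)

def or_response_time(x14, x24, x34):
    """An inclusive OR node (Figure 8.5d) fires as soon as any
    one transmission process completes."""
    return min(x14, x24, x34)

print(and_response_time(120, 90, 150))  # 150
print(or_response_time(120, 90, 150))   # 90
```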
A major problem with one layer nets such as the perceptron is that
they cannot be used to represent an exclusive OR gate. Such gates are
frequently required in order successfully to categorize a set of patterns.
However, a two layer net, unlike a one layer net, can successfully
categorize patterns which require an exclusive OR gate. Briefly, a two
layer net consists of a set of input nodes, hidden nodes (the middle layer)
and output nodes. An example is given in Figure 8.6a. Nodes n1, n2, and n3
are the input nodes; nodes n4 and n5 are the hidden nodes; and nodes n6
and n7 are the output nodes. Each of the first two input nodes (n1 and n2)
has a connection to each of the two hidden nodes; each hidden node has a
connection to the two output nodes. In addition, the third input node (n3)
connects directly to the last output node (n7).
The activation ah at a hidden node nh (h = 1,..., H) is computed like
the activation at an output node in a single layer net:

a_h = \sum_{i=1}^{I} w_{ih} a_i .

Fig. 8.6. (a) Representation of a two layer exclusive-or connectionist network which
responds yes (n6) when exactly one of nodes n1 or n2 is activated and responds no (n7)
otherwise; (b) the corresponding functional network when exactly one feature is present;
(c) the PERT network representation of the functional network.

The activation at an output node is computed similarly. Assume that
each of the nodes is a threshold node, i.e., a node does not fire until
activation at the node is above some predetermined threshold.
For example, consider the net in the Figure 8.6a. Assume that it has
been trained to function as an exclusive OR gate (an XOR gate). And
assume that each node in the network will fire when all its inputs have
been received and their total exceeds the threshold value of 0. The
network performs the XOR operation on the inputs a1 and a2. The third
input a3 is used to distinguish the case when a1 and a2 are not presented
from the case when a1 and a2 are presented and both are 0. When no
stimulus is present, assume that the input vector is <0, 0, 0>. No
response is made since the activation at both output nodes is below
threshold. When a stimulus is presented, assume that the input vector is
<a1, a2, 1>, where a1 and a2 are each either 0 or 1 and where a3 is always
1. The correct response is “yes” when either a1 = 1 or a2 = 1, but both do
not equal 1. Otherwise the correct response is “no.”


The network in Figure 8.6a produces just this behavior if it is
assumed that when node n6 is above threshold a “yes” response is made
and when node n7 is above threshold a “no” response is made. To see
this, note that when the input vector is <1, 0, 1>, the output activation is
equal to 1 at the “yes” node and 0 at the “no” node, so a “yes” response
is made. Similarly, when the input vector is <0, 1, 1>, a “yes” response
is made. However, when the input vector is <0, 0, 1> or <1, 1, 1>, then
the output activation at the “yes” node is 0 and the output activation at
the “no” node is 1 and thus the response is “no.” Note that if the inputs
arrive at a node at different, random times, and the node fires as soon as
the instantaneous input exceeds 0, many erroneous responses would be
made.
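The text does not give the trained weights, but one weight assignment that produces exactly the behavior described (with every threshold equal to 0) sets hidden node n4 to detect a1 > a2 and hidden node n5 to detect a2 > a1. The weights in the sketch below are hypothetical, chosen here for illustration:

```python
def fires(total):
    # A node fires when the sum of its inputs exceeds the threshold 0.
    return 1 if total > 0 else 0

def xor_net(a1, a2, a3):
    """Figure 8.6a with one illustrative weight assignment (the book's
    trained weights are not given in the text)."""
    h4 = fires(a1 - a2)       # w14 = 1, w24 = -1
    h5 = fires(a2 - a1)       # w15 = -1, w25 = 1
    yes = fires(h4 + h5)      # weights from n4, n5 to n6 both 1
    no = fires(a3 - h4 - h5)  # w37 = 1; weights from n4, n5 to n7 both -1
    return yes, no

for pattern in [(1, 0, 1), (0, 1, 1), (0, 0, 1), (1, 1, 1), (0, 0, 0)]:
    print(pattern, xor_net(*pattern))
# "yes" fires only for <1, 0, 1> and <0, 1, 1>; neither node fires
# for <0, 0, 0>, when no stimulus is present.
```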
It remains to show that a two layer network which functions as an
exclusive OR gate can be represented as a directed acyclic task network.
Consider again the network in Figure 8.6a. Assume that the input vector
is <1, 0, 1> and the result is a “yes” response. The time to make this
response depends only on the time it takes activation to flow from node
n1 to node n4 and from there to node n6. The threshold for the “no”
response was not exceeded, so the “no” node, together with all the arcs
terminating at it and all nodes not involved in the “yes” response can be
ignored, since they do not contribute to the response time. The resulting
functional network is displayed in Figure 8.6b, and is easily seen to
be a directed acyclic task network (Figure 8.6c).
In short, although the complete connectionist network representing a
task may look complicated, only a subnetwork will be relevant for
determining the time to make a particular response given a particular
stimulus. The relevant subnetwork in many cases is a directed acyclic
task network. It can be analyzed by examining the response times for the
particular stimulus-response pair.

Queueing networks

Queueing networks are used in disciplines allied with cognitive science,
such as human factors engineering (see, e.g., Rouse, 1980), but have
received scant attention in psychology. This is surprising given their
obvious appeal as models of short term memory and resource constrained
processing. Only a few studies known to the authors make significant
use of queueing theory in psychology: the work by Liu (1996) and Wu
and Liu (2008) mentioned in Chapter 5; and Harris, Shaw and Bates
(1979); Hoffman (1978, 1979); Fisher (1982, 1984); Fisher, Duffy,
Young and Pollatsek (1988); and Miller (1993).
Briefly, a queueing network is composed of a set of customers
(stimuli, problems, distractors), nodes, and arcs connecting various
nodes. (Vertices are usually called nodes in queueing theory.) Each
node consists of one or more servers (processors, homunculi). The
number of customers processed per second is the service rate at a server.
The rate may depend on capacity constraints, for example, as more
servers in a node become active, the rate at each server might decrease.
Each node also contains a buffer (memory, queue), typically of finite
length. Once a customer completes processing on a node one of several
things can happen. The customer will be forwarded along an arc to the
next node, if the buffer at the next node is not full. If the buffer at the
next node is full, then the customer is either lost from the system or held
at the current server. Customers can enter a buffer from inside the
network (as above) or from a source outside the network. The priority of
the customer in the buffer can be varied, e.g., the customer could go to
the head of the buffer or could remain at the end. The time a customer
spends in the buffer can also be varied.
For example, in Figure 8.7a the queueing network consists of two
nodes, n1 and n2, with one server at each node (server s1 at node n1 and
server s2 at node n2). Up to 3 customers can be placed in the buffer at the
first node. One customer (c2) is resident there initially in this particular
example. When processing begins, customer c1 starts executing on
server s1. After completing service, customer c1 will be moved to the
server at the second node since this server is not occupied. If the server
were occupied, the customer would be placed in the buffer at the second
node. After completing service on the second node, the customer exits
from the system.

Fig. 8.7. (a) A queueing network with two servers, s1 and s2, and room for three
customers in the buffer for server s1 and two customers in the buffer for server s2
(customer c2 is currently in the buffer at node n1 and customer c1 is on server s1); (b) the
PERT network representation of the processing of two customers through the queueing
network; (c) the corresponding OP diagram.

Processing in a queueing network can be represented in an AND
network in a straightforward way. We show this with an example.
Specifically, we represent the queueing network displayed in Figure 8.7a
as the PERT network displayed in Figure 8.7b. Assume that there are
only two customers initially in the system, one in the buffer at the first
node and one on the server at the first node. Let sij be the processing of
customer ci on server sj. Then, customer c1 begins processing on server
s1 (process s11, Figure 8.7b). After customer c1 completes service on
server s1, it begins processing on server s2 (s12) at the same time customer
c2 begins processing on server s1 (s21). Finally, customer c2 can begin
processing on server s2 (s22) as soon as processes s12 and s21 have
completed. The resulting OP diagram is in Figure 8.7c.
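A simulation sketch of this two-customer example, with every service time exponential at a common rate (the rate, sample size and seed are illustrative assumptions, not values from the text):

```python
import random

def tandem_queue_completion(rate=0.01, n=100_000, seed=3):
    """Mean time for both customers to clear the network of Figure 8.7:
    s11 runs first; s12 and s21 then run concurrently; s22 starts only
    when both have finished (the AND vertex of Figure 8.7b)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        s11, s12, s21, s22 = (rng.expovariate(rate) for _ in range(4))
        total += s11 + max(s12, s21) + s22
    return total / n

# With equal rates, E[max(s12, s21)] = 1.5/rate, so the mean completion
# time is 3.5/rate.
print(tandem_queue_completion(0.01))  # close to 350
```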
Miller (1993) used a queue-series model to represent Discrete
Asynchronous Coding. In the model, a stimulus consists of components.
Each stage processes one component at a time. When the first stage has
finished processing the first component, that component is sent to the
second stage; meanwhile, the first stage begins processing the second
component, as in the queueing model illustrated here. If a stimulus
consists of only one component, when processing of that component is
finished, processing of the stimulus is finished, and the stages are
complete output processes in series. If a stimulus has more than one
component, the model is a partial output model. As the number of
components approaches infinity, the model approaches a continuous flow
model. Miller noted that the queueing network can be represented as an
AND network. He found, through simulations, that factors selectively
influencing different stages occasionally have additive effects on
response times even if the number of components is greater than 1, but
not if this number is much greater than 1. Miller’s work demonstrates
that partial output models are sometimes capable of explaining that
factors selectively influencing different processes have additive effects,
but additivity is not typically predicted by such models.

Generalization to resource constrained systems

Much early work in cognitive psychology focused on the nature of
resource constraints in processing (e.g., Broadbent, 1958; Treisman,
1960; Treisman and Geffen, 1967). Within any one sensory
modality, clear limitations appear under some conditions, for example, in
visual search tasks if the mapping is varied (Schneider and Shiffrin,
1977) or if targets and distractors are sufficiently similar (Duncan and
Humphreys, 1989; Treisman and Gelade, 1980). Limitations within
other modalities have also been demonstrated (Wickens, 1976).
Limitations generally take one of two forms. In the most typical form,
there is a limit on the total resources allocated to a given subset of processes.
For example, Rumelhart (1970) proposed that a constant limited capacity
gets divided equally among the distractors in a visual search task. This
limit does not affect the relation among events and thus need not be
incorporated into the task network. In the second, less frequent form,
there is a limit on the number of concurrent processes. For example,
Fisher (1982, 1984) suggested that at most four comparison processes
can execute simultaneously during the standard consistent mapping
visual search task (i.e., letter targets in digit distractors); for recent
discussion see Cowan (2005). Task networks with just AND gates or
just OR gates cannot capture this type of limitation (though OP diagrams
can; see following section).
The most systematic and sustained work on capacity started with
work by Townsend (1971, 1974) addressing capacity issues in
distinguishing parallel and serial processing. The key component is a
measure of capacity (Townsend & Ashby, 1983; Townsend & Nozawa,
1995) that can be used in conjunction with techniques such as selectively
influencing processes to learn whether and how processes share capacity
when working together (Townsend & Wenger, 2004; Wenger &
Townsend, 2000).
We focus here on the situation in which one has learned, through such
an analysis or in some other way, that there is a limit on the capacity
allocated to a subset of processes. Specifically, we want to show how
one can incorporate the limitation into the computation of the moments
of the response time. Full details are available in Goldstein and Fisher
(1992). Here, we assume the work Wx required to complete the
execution of a given process x varies from trial to trial. We assume that
the rate v(x, shi) at which work proceeds on a process x in the current
state shi is determined not only by the identity of the process and the
state, but also by the history of the system (i.e., the path) up until the
current state. We assume that no work is accomplished on a process that is
not in the current set, so that v(x, shi) = 0 if x is not in C(shi).
Finally, we assume that the following relation obtains between the rate at
which work is performed in each of the states, the duration of the states,
and the total work on a given trial where some path, say h, is followed:

Wx = Σi=1,…,h′ v(x, shi)Thi                                  (8.5)

where h' is the index of the last state on path h prior to the finish state.
In order to compute the moments of the response time we assume the
joint density of the work requirements fw(w1,...,w|X|) is known or
estimated (this is similar to the assumption above that the joint density of
the process durations was known). The required computations follow
from Equation (8.6) in the form below:

E[RT] = Σh=1,…,|H| ∫…∫(w1,…,w|X|)∈Rh gh fw(w1,…,w|X|) dw1…dw|X|,   (8.6)

where gh and Rh are as defined for Equation (8.4). To integrate, we need
to rewrite the quantities gh = (th1 + ... + thh') and the region Rh = (th1 > 0,...,
thh' > 0) in terms of the wi.
First, quantities thj in gh can be rewritten in terms of the wi using
Equation (8.5). For example, consider the AND network in Figure 8.3a
and corresponding OP diagram in Figure 8.3b. Let the index for the top
path be 1 and that for the bottom path be 2 so that t11 = t1, s11 = s1, t12 = t2,
s12 = s2, t13 = t4, s13 = s4, and so on. Then g1 = (t11 + t12 + t13) = (t1 + t2 +
t4). Similarly, g2 = (t1 + t3 + t4). Let time be in seconds and work
unitless. Assume that v(1, s1) = 1, v(1, s2) = 2, v(2, s1) = 2, v(2, s3) = 4,
and v(3, s4) = 3. Then, from Equation (8.5) we have for path 1 (the top
path): w2 = 2t1, w1 = t1 + 2t2, and w3 = 3t4. Solving for ti, we obtain: t1 =
w2/2, t2 = (w1 − w2/2)/2, and t4 = w3/3. Thus, we have g1. Similarly, for
the bottom path (path 2), we obtain: t1 = w1, t3 = (w2 − 2w1)/4, and t4 =
w3/3. We can now easily obtain the region Rh. Specifically, for the top
path we have t1, t2, t4 > 0 if and only if 0 < w2 < 2w1 and w3 > 0. And for
the bottom path we have t1, t3, t4 > 0 if and only if 0 < w1 < w2/2 and w3 >
0.
We now use Equation (8.6) to compute the expected response time
for a given density function. For example, assume W1, W2 and W3 are
independent, identically distributed uniform [0, 100] random variables.
Then we can write the expected response time as:

E[RT] = ∫w3=0..100 ∫w1=0..50 ∫w2=0..2w1 (w2/4 + w1/2 + w3/3) fw(w2, w1, w3) dw2 dw1 dw3

      + ∫w3=0..100 ∫w1=50..100 ∫w2=0..100 (w2/4 + w1/2 + w3/3) fw(w2, w1, w3) dw2 dw1 dw3

      + ∫w3=0..100 ∫w2=0..100 ∫w1=0..w2/2 (w1/2 + w2/4 + w3/3) fw(w1, w2, w3) dw1 dw2 dw3.
Completing the above, one obtains an expected response time of 54.17.

More general networks

We have shown how one can represent AND networks, OR networks and
certain connectionist and queueing networks as directed acyclic task
networks, and then as OP diagrams. For each OP diagram, it was the
case that: a) one path was followed through the network on each trial; b)
the response time on a given trial was equal to the sum of the times spent
in each of the states along the relevant path; and c) only the joint density
function of the process durations or work requirements (not the state
durations) was known. Given this information, we then showed that one
could compute the moments of the response time. But to compute the
moments of the response time, one need not begin with a directed acyclic
task network. What general characteristics of a representation make it
possible to carry out the above program?
Goldstein and Fisher (1991) showed that the set of networks which
are OP representable differs from the set of directed acyclic task
networks represented previously in OP diagrams in several important
ways: a) there can be several start states; b) there can be several finish
states; c) there can be several processes which complete simultaneously
when a transition is made from one state to another (simultaneous
completion is allowed in a task network, but was not allowed previously
in the OP representation of it); d) there can be processes which are
interrupted during their execution; e) the transition out of a state can be
determined by the history of the system up until the transition; f) there
can be processes that are not executed on a given trial; g) there can be
paths through the OP diagram of different lengths. This extension made
it possible for Goldstein and Fisher to incorporate into the OP
representation other important networks such as GERT networks and
Petri nets as well as more sophisticated associative, queueing and
connectionist networks (also see Fisher, 1985). We will not pursue these
representations here since neither can be considered a special case of a
directed acyclic network either with all AND gates or all OR gates.
Decomposition techniques have not yet been developed for the full
set of OP representable networks, and it is an open question whether
manipulation of factors selectively influencing processes will reveal the
processing architecture. As a start, we note that moments of the response
time distributions can be calculated using OP diagrams (Goldstein &
Fisher, 1991), and these can be used to test hypotheses about process
arrangement.

Appendix

Fitting the Smith (1969) and Pashler & Johnston (1989) data

Smith

To obtain E[RT2(I,j)] by applying Equation (8.4) to the OP diagram in
Figure 8.4b, we compute for each path the quantity listed in the equation.
For example, for the bottom path, let us call it path 1, the duration is
equal to the sum of the durations of the states s1, s3, s6, s9 and s11 prior
to the finish state s12, i.e., T1 + T3 + T6 + T9 + T11. By inspection of
Figure 8.4b, it is clear
that this is equal to the duration a of process a plus the duration b of
process b. Thus, in Equation (8.1), g1 = a + b. Since I is a constant and
the durations of processes a, b, c and d are independent exponentials,
the joint density function f(I, a, b, c, d) is simply λaλbλcλd exp(−λaa − λbb −
λcc − λdd). Finally, by inspection it is clear that we want to integrate a
between I and infinity, b from c + d to infinity, and both c and d from 0
to infinity.
Once we have a closed form expression for this multiple integral, we
repeat the process for all paths in Figure 8.4b. Gathering the terms
together, we obtain the expression in the text for E[RT2(I, j)].

Pashler and Johnston

Recall that for the data of Pashler and Johnston (1989), there are seven
processes in the reduced model: A1B1, C1, SWa, A2(low), A2(high),
B2C2 and I. The duration of the interstimulus interval is known. Each
of the first six processes is assumed to have a gamma distribution. A
gamma distribution is described by two parameters, one for shape (which
we shall designate by α) and one for scale (β). Then there are 12
parameters that we need to estimate.

Parameters

To begin, consider the processes associated with the response to the first
stimulus in the double stimulation task. There are two processes, A1B1

and C1, and thus four potential parameters. In order to reduce the
number of parameters, we assumed (arbitrarily) that the same shape
parameter, α, controlled the distribution of the durations of processes
A1B1 and C1. Thus, the duration A1B1 of process A1B1 has a gamma
distribution with parameters say, α11 and β11, and the duration C1 of
process C1 has a gamma distribution with parameters α12 and β12 where
α11 = α12. Now, note that if we require the mean and variance of our
prediction of the time RT1 to respond to the first stimulus in the double
stimulation task to equal respectively the mean and variance of the
observations, only one of the four parameters α11, α12, β11 and β12 remains
free. Specifically, recall that the predicted mean and variance are
obtained as follows:

E[RT1] = E[A1B1] + E[C1] = (α11)(β11) + (α12)(β12)

VAR[RT1] = VAR[A1B1] + VAR[C1] = (α11)(β11)² + (α12)(β12)²

If we set the above predictions equal to the sample mean, say M1, and
sample variance, say V1, we obtain:

M 1  (11 )(  11 )  (12 )(  12 )
V1  (11 )( 11 ) 2  (12 )(  12 ) 2

Once we set 11 to some value, say b11, then we can solve for 12 and α11
= α12. Specifically, we obtain:

V1  V12  4 M 1 (b112 M 1  b11V1 )


 12 
2M1

and

M1
11   12 .
b11  12
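These moment equations are easy to solve and check numerically. The sketch below generates M1 and V1 from assumed values α = 9.5, β11 = 34.0, β12 = 28.3 (the values appearing in the final parameter list) and then recovers the free parameters by the method of moments:

```python
from math import sqrt

def solve_gamma_pair(m1, v1, b11):
    """Given mean m1 and variance v1 of the sum of two gamma variables with
    a common shape alpha and scales b11 (fixed) and beta12 (free), solve for
    beta12 and alpha by the method of moments."""
    beta12 = (v1 + sqrt(v1**2 - 4 * m1 * (b11**2 * m1 - b11 * v1))) / (2 * m1)
    alpha = m1 / (b11 + beta12)
    return alpha, beta12

# Forward-generate the moments from assumed parameter values ...
alpha, b11, b12 = 9.5, 34.0, 28.3
m1 = alpha * (b11 + b12)            # E[RT1]
v1 = alpha * (b11**2 + b12**2)      # VAR[RT1]

# ... then recover the free parameters from the moments.
alpha_hat, b12_hat = solve_gamma_pair(m1, v1, b11)
print(alpha_hat, b12_hat)
```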
In a similar fashion we can use three (instead of four) parameters to
describe the high intensity processes A2(high) and B2C2(high).
Specifically, we assume that A2(high) has a gamma distribution with
parameters α21(high) and β21(high) and B2C2 has a gamma distribution
with parameters α22(high) and β22(high) where α21(high) = α22(high).
Once we assign some value, say b21(high), to β21(high), we can solve for
the remaining two parameters, β22(high) and α21(high), using the mean
M2*(high) and variance V2*(high) of the time it takes to respond to the
second stimulus when it is presented alone. Thus, again, we have one
free parameter.
Next, consider the four parameters for the low intensity processes.
Since, as noted above, we assume that the change in intensity does not
affect process B2C2(low), the shape α22(low) and scale β22(low)
parameters are given by, respectively, α22(high) and β22(high), both of
which were obtained above. The shape and scale parameters for process
A2(low) follow directly if we require that these processes be so chosen
that we fit exactly the mean M2*(low) and variance V2*(low) of the time
to respond to the second stimulus when it is responded to alone.
Specifically,

β21(low) = [V2*(low) − α22(low)β22(low)²] / [M2*(low) − α22(low)β22(low)]

α21(low) = [M2*(low) − α22(low)β22(low)] / β21(low).

In this case, note that there are no free parameters.


Finally, we have the two free parameters associated with the
switching process SWa to estimate, say α3 and β3. In short, in our final
model we begin with 14 parameters, but only four are free.

Simulation

Using the directed acyclic task network in Figure 8.4a, it is
straightforward to simulate the response times RT1(i,low), RT1(i,high),
RT2(i,low), RT2(i,high), RT2*(i,low), RT2*(i,high) at each SOA of i ms.
For example, consider just the low intensity responses:

RT1(i, low) = A1B1 + C1 for all i;


RT2(i, low) = max{A1B1 + SWa, i + A2(low)} + B2C2;
RT2*(i, low) = A2(low) + B2C2 for all i.

The responses for the high intensity processes were expressed similarly.
A total of 1000 trials were used in the grid search through the
parameter space to evaluate the goodness of the fit of each combination
of parameters. The parameter values selected were those that minimized
the sum of squared differences between the 36
predictions and observations, i.e., the 18 means and 18 standard
deviations (two levels of intensity, three interstimulus intervals, three
different responses). The parameter values were: α11 = α12 = 9.5, β11 =
34.0, β12 = 28.3, α21(high) = α22(high) = 45.8, β21(high) = 1.0, β22(high) =
9.9, α21(low) = 8.4, α22(low) = 45.5, β21(low) = 11.9, β22(low) = 9.9, α3 =
1.0, β3 = 134.0.
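A minimal version of such a simulation, for the low-intensity condition only, can be sketched with the fitted values above (random.gammavariate takes the shape parameter, then the scale):

```python
import random

rng = random.Random(42)

# Fitted gamma parameters (shape, scale) from the text, low-intensity condition.
PARAMS = {"A1B1": (9.5, 34.0), "C1": (9.5, 28.3), "SWa": (1.0, 134.0),
          "A2": (8.4, 11.9), "B2C2": (45.5, 9.9)}

def simulate_low(soa, n=20_000):
    """Mean simulated RT1, RT2 and RT2* at a given SOA of `soa` ms."""
    sums = [0.0, 0.0, 0.0]
    for _ in range(n):
        d = {p: rng.gammavariate(a, b) for p, (a, b) in PARAMS.items()}
        sums[0] += d["A1B1"] + d["C1"]                                    # RT1
        sums[1] += max(d["A1B1"] + d["SWa"], soa + d["A2"]) + d["B2C2"]   # RT2
        sums[2] += d["A2"] + d["B2C2"]                                    # RT2*
    return [s / n for s in sums]

print(simulate_low(100))
```

The high-intensity condition is the same sketch with the high-intensity parameters substituted.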
Chapter 9

Selective Influence with Accuracy, Rate,
and Physiological Measures

A dummy variable is a variable that is flung against the
wall at high speed to test the safety features. An exam
answer.

Evidence that factors selectively influence processes mainly comes from
reaction time. Here we finish our discussion of reaction time with a
couple of examples of factors changing model parameters in ways
expected from interpretations of the parameters. We then survey
evidence of selective influence based on other dependent variables. As
with earlier surveys, we do not attempt to summarize what is known
about the system; rather, the emphasis is on methodology. For more
discussion and examples, see Sternberg (2001).

Selectively Influencing Model Parameters

The first example is the Diffusion Model of Ratcliff (1978), which
explains the reaction time and accuracy of a subject deciding which
response to make to a stimulus. Typically one of two stimuli is presented
and one of two possible responses is correct. According to the model, at
any given instant after a stimulus is presented the net evidence accrued
favors one response or the other. For each response there is a threshold,
and at any given instant the net evidence is closer to one threshold or the
other. A particular response is selected when the net evidence reaches

the threshold corresponding to it. The selected response is made at
completion of a further nondecision process that involves motor
preparation and so on.
The decision component has several parameters. Evidence accrual
starts at a point z, and has a drift rate with mean v and standard deviation
η. The upper threshold is at a and the lower threshold is set to 0. The
nondecision process duration is uniformly distributed with mean t0 and
width St. In experiments of Voss, Rothermund and Voss (2004), four
factors were manipulated. Stimuli were squares in which orange was
dominant or blue was dominant. The subject indicated the dominant
color by pressing one button or another. Each factor changed the
parameter it was expected to, although some factors had minor effects on
other parameters as well. The drift rate was changed by discriminability.
Emphasizing accuracy increased the distance between the thresholds. An
asymmetric reward for one of the responses changed the starting point.
Finally, making the response more difficult increased the duration of the
nondecision component.
In the second example, Thomas (2006) investigated three models of
two-choice reaction time tasks. One model was Signal Detection Theory
(SDT). By itself, SDT does not predict reaction times, but it was
extended with the Reaction-Time Distance Hypothesis, which says that
decision time increases as the distance from the perceptual effect of a
stimulus (coded internally as some value x) to the decision criterion (a
value xc) decreases. The other two models were versions of Stochastic
Signal Detection Theory (Ashby, 2000) and of the Exemplar Based
Random Walk (EBRW) model of Nosofsky and Palmeri (1997).
For each model, Thomas (2006) made reasonable assumptions about
what parameter(s) would be changed by each of three factors, stimulus
quality, stimulus dissimilarity, and stimulus probability. For example, in
each model, the variance of the perceptual effect of a stimulus was
assumed to be greater with lower stimulus quality, and increasing the
dissimilarity between the two stimuli was assumed to increase the
distance between their mean perceptual effects. By examining the way
parameters combine in each model, Thomas determined the predicted
sign of the interaction contrast for each pair of the three factors.
Predictions were compared with experimental results in the literature.
As an example, additive effects on reaction time of stimulus dissimilarity
and stimulus quality were found by Shwartz, Pomerantz and Egeth
(1977), and (for three of four subjects) by Thomas and Gallogly (1996).
The additivity is consistent with Stochastic Signal Detection Theory,
which allows both negative and positive interactions, but contrary to the
predictions of SDT and of the version of EBRW tested.
Investigators using reaction time to find experimental factors that
selectively influence processes usually analyze the combined effect of
two or more factors, as pioneered by Sternberg (1969). Investigators
measuring accuracy are often able to manipulate a single factor and find
that it selectively influences a single parameter. For example, Chechile
(1977) found that changing the acoustic similarity of items in an
immediate recall experiment changed the probability of storage in a tree
model, leaving retrieval and guessing probabilities invariant.
Use of one dependent variable rather than another may seem
inconsequential, but additional information in models of accuracy comes
from several sources. Typically, more equations are used in models of
accuracy than in models of reaction time, allowing more parameters to be
estimated. With accuracy, investigators are willing to assume that a
moderate change in a task, e.g., from recall to recognition, will leave
certain parameters invariant. With reaction time, there is skepticism
about this, based on difficulties with Donders’ (1868) subtractive
method. Also, incorrect trials are typically discarded for reaction time
analysis, but used in accuracy analysis. Even the treatment of guessing is
different. There is no way to assign the duration of the process of
guessing a priori, but there may be a way to assign the probability a
guess is correct. Knowing a single parameter value or finding one
additional equation may make the difference between estimation of all
parameters and estimation of none.
Ideally one would investigate a task by analyzing both reaction time
and accuracy. The Diffusion Model of Ratcliff (1978) allows this.
Processing trees have been considered by Hu (2001) and processes in
series by Schweickert (1985). A difficulty arises with parallel processes.
Consider two parallel processes, and suppose an experimental factor
selectively influences one of them, prolonging its duration. Does the
other process stop at the same time as before? Would it not take
advantage of the longer duration of the influenced process to continue a
little longer, and thus increase its own accuracy? But then, doesn’t the
factor influence both process? Until this difficulty and others are
resolved, accuracy and reaction time will often continue to be treated
separately, however artificial that is.

Multiplicative Effects

The simplest way two factors can combine is through addition, next
simplest is through multiplication, to which we now turn.

Accuracy

Suppose the processes for performing a task are in series, output of one
being input to the next. The serial arrangement allows analysis of both
reaction time and accuracy. The reaction time will be the sum of the
durations of the individual processes, so two factors selectively
influencing two different processes will have additive effects on reaction
time (Sternberg, 1969). Now consider an individual process to be correct
if it produces the correct output for the input it was given. A simple
model is that the response is correct for the given stimulus if and only if
every process is correct, and that the probability all are correct is the
product of the probabilities that each is individually correct
(Schweickert, 1985). Then two factors that change the probability
different processes are correct will have multiplicative effects on the
probability of a correct response (sometimes called percent correct).
An assumption is made relating the duration of a process to its
probability of being correct. Suppose the probability a process A is
correct is a function πA(A) of its duration A (where we use the same
symbol for a process name and its duration). Then the probability of a
correct response for two processes A and B, with durations A and B,
respectively, is
π(correct) = πA(A)πB(B).

In this model two factors selectively influencing two processes have
additive effects on reaction time and multiplicative effects on percent
correct. Putting multiplication another way,

log π(correct) = log πA(A) + log πB(B),

so such factors are predicted to have additive effects on log percent
correct.
An issue arises in estimation. In many experiments percent correct is
fairly high, .90 or above. The natural log of a probability π in this range
is approximately equal to − (1 − π); for example, ln .90 = − .1053…
Then

π(error) ≈ [1 − πA(A)] + [1 − πB(B)];

that is, factors selectively influencing two processes have approximately
additive effects on the probability of an erroneous response, and
therefore approximately additive effects on the probability of a correct
response. For a fairly high probability of a correct response, additive and
multiplicative models make similar predictions.
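This closeness is easy to see numerically with made-up probabilities. The discrepancy between the multiplicative prediction and the additive approximation is exactly (1 − πA)(1 − πB), which is small when both accuracies are high:

```python
# Multiplicative model: P(correct) = pA * pB.
# Additive approximation (from ln p ≈ -(1 - p)): P(correct) ≈ pA + pB - 1.
# Their discrepancy is exactly (1 - pA)(1 - pB).
for p_a, p_b in [(0.95, 0.92), (0.90, 0.96), (0.70, 0.75)]:
    multiplicative = p_a * p_b
    additive = p_a + p_b - 1
    print(f"pA={p_a} pB={p_b}  mult={multiplicative:.4f}  "
          f"add={additive:.4f}  diff={multiplicative - additive:.4f}")
```

At accuracies of .95 and .92 the two predictions differ by only .004; at .70 and .75 the discrepancy grows to .075, so the models become distinguishable as accuracy falls.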
To test a multiplicative model, an additional assumption is needed.
Population parameters πA(A) and πB(B) are unknown. For a sample of
trials, let the corresponding estimators be PA(A) and PB(B). For the
sample, let the probability of a correct response be P(correct). Then

P(correct) = PA(A)PB(B).

Taking expected values,

E[P(correct)] = E[PA(A)PB(B)].

A problem arises because the expected value of a product does not
ordinarily equal the product of the expected values of the multiplicands.
A simple assumption is that the probability a process is correct is a
function of its duration, and the durations of the processes are
stochastically independent. This leads to

E[P(correct)] = E[PA(A)]E[PB(B)],

which can be tested.


Additive effects on reaction time with multiplicative effects on
percent correct are together good evidence that factors selectively
influence sequential processes. Examples of factors having such effects
include brightness, similarity and stimulus-response compatibility in the
choice reaction time experiment of Shwartz, Pomerantz and Egeth
(1977); stimulus quality, word frequency, and congruity of word with
context in a lexical decision task of Schuberth, Spoehr and Lane (1981);
and set size and probe type in a memory scanning experiment of Lively
(1972). For details, see Schweickert (1985).
By analyzing accuracy, selective influence can be tested in
experiments where reaction times would be difficult to measure, as in the
following example. If two visual stimuli are presented close together in
time, accuracy in reporting the second stimulus is smaller when the first
stimulus must be reported than when it is not. The phenomenon is called
the attentional blink. In an attentional blink paradigm, Jolicoeur &
Dell’Acqua (2000) displayed for the first stimulus an H or an S, which
required an identification response, or else an “&” or a blank, which
required no response. The second stimulus was a list of five letters,
which always required immediate recall. Each stimulus was followed
quickly by a mask. Exposure duration of the second stimulus was 50,
100, 150, 200 or 250 msec. For percent correct recall of letters in the
second stimulus, two factors, Stimulus 1 report-required-or-not and
Stimulus 2 exposure duration, had multiplicative effects.
The interpretation was that the two factors selectively influence two
sequential processes. Multiplicative effects do not determine the process
order, but from other considerations Jolicoeur and Dell’Acqua suggest
the effect of the requirement to report Stimulus 1 occurred after the effect
of Stimulus 2 exposure duration. One possibility is that Stimulus 2
exposure duration influenced the amount of information available for
processing and the requirement to report Stimulus 1 influenced transfer
of this information to short-term memory (Jolicoeur & Dell’Acqua,
2000).
Analysis of each of the five serial positions of the letters in Stimulus
2 showed that position and requirement to report Stimulus 1 had
multiplicative effects, but position and Stimulus 2 exposure duration did
not. It is instructive to consider what happens when percent correct is
averaged over positions in this situation. Let i denote a level of Stimulus
2 exposure duration, and let πi denote the probability that the process
selectively influenced by this factor is correct at level i. Let j denote a
level of Stimulus 1 report-required-or-not, and let k denote a serial
position. Let πjk denote the probability the process (or processes)
selectively influenced by the latter two factors is (or are) correct, when
their levels are j and k, respectively. (These factors did not have
multiplicative effects, so we use a single symbol to denote their
combined effect.) When the factors have levels i, j and k, let percent
correct be

πijk = πi πjk.

If we now average over the five serial positions

(1/5) Σk πijk = πi (1/5) Σk πjk.

Average percent correct is the product of an expression indexed by i
(exposure duration) and an expression indexed by j (report-required-or-
not). The upshot is that two factors can have multiplicative effects when
percent correct is averaged over the levels of another factor.
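The factorization can be confirmed with arbitrary made-up probabilities (any values of πi and πjk work; those below are hypothetical):

```python
import math

# Hypothetical probabilities: pi_i for exposure duration level i,
# pi_jk for the combined effect of report requirement j and position k.
pi_i = {1: 0.95, 2: 0.85}
pi_jk = {(j, k): 0.5 + 0.08 * j + 0.05 * k for j in (0, 1) for k in range(5)}

def avg_over_positions(i, j):
    """pi_ijk = pi_i * pi_jk, averaged over the five serial positions k."""
    return sum(pi_i[i] * pi_jk[j, k] for k in range(5)) / 5

# Because the average factors into a term in i times a term in j, the
# interaction contrast of log average percent correct is zero.
lc = {(i, j): math.log(avg_over_positions(i, j)) for i in (1, 2) for j in (0, 1)}
contrast = lc[2, 1] - lc[2, 0] - lc[1, 1] + lc[1, 0]
print(contrast)  # zero up to floating-point error
```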

Rates

In many experiments rates of bar presses, heart beats, and so on are
found to be affected by factors such as amount of food deprivation. The
Multiplicative-Factors Method of Roberts (1987) uses effects of factors
selectively influencing processes to model such rates. In Roberts’ basic
model, a generator produces pulses at a rate of a pulses per unit time.
These are sent to a filter that sends a proportion b of them to a system
that makes a response when each pulse is received. Response rate per
unit time is r = ab.
Suppose when Factor Α is at level i the generator produces pulses at
rate ai and when Factor Β is at level j the proportion of pulses leaving the
filter is bj. For these levels of the factors, the response rate is

rij = aibj.

The combined effect of changing the levels of both factors will be the
product of their separate effects; that is, the factors will have
multiplicative effects.
An example given by Roberts (1987) is an experiment by Clark
(1958). He manipulated reward schedule and food deprivation time for
rats that pressed a lever for food. Each group of rats had one of three
reward schedules, a Variable Interval of 1, 2 or 3 minutes. The groups
were tested after being fed, at delays of 1, 3, 5, 7, 10, 20, or 23 hours.
The effects of reward schedule and deprivation time on rate of lever
pressing were multiplicative. For an excellent review of many
experiments with multiplicative factors see Roberts (1987).
The basic model with a generator followed by a filter consists of two
sequential processes, but Roberts considered several arrangements of
processes. One, his Model 3, combines additive and multiplicative terms:

rij = ci + aibj + dj.

Factors in the model are called multiplicative because interactions are
multiplicative. For reference levels i = 1 and j = 1 an interaction contrast
has the form
rij − ri1 − r1j + r11 = (ai − a1)(bj − b1).

The processes are not in a sequence, but they can be represented in a
processing tree.
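The algebra of the interaction contrast for Model 3 can be checked by substituting arbitrary (hypothetical) parameter values:

```python
# Roberts' Model 3: r_ij = c_i + a_i * b_j + d_j.
a = {1: 2.0, 2: 3.5}     # generator rates (hypothetical values)
b = {1: 0.4, 2: 0.7}     # filter proportions
c = {1: 1.0, 2: 1.8}     # additive terms tied to factor A's levels
d = {1: 0.2, 2: 0.9}     # additive terms tied to factor B's levels

def rate(i, j):
    return c[i] + a[i] * b[j] + d[j]

# Interaction contrast relative to the reference levels i = 1, j = 1:
i, j = 2, 2
contrast = rate(i, j) - rate(i, 1) - rate(1, j) + rate(1, 1)
print(contrast, (a[i] - a[1]) * (b[j] - b[1]))  # the two quantities agree
```

The additive terms ci and dj cancel out of the contrast, leaving only the product of the factors' effects on ai and bj.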

Processing Trees

In a processing tree the processing starts at a single vertex, the root. A
process is executed at each vertex, and results in one and only one
outcome on each trial. An arc leaving a vertex represents an outcome of
the process. Associated with each arc is the probability the outcome it
represents occurs; we say the probability is on the arc. (Later we do not
require the parameter on an arc to be a probability.) An arc leaving a
vertex is sometimes called a child. When an outcome occurs, we say the
arc representing it is traversed, and the ending vertex of the arc is
reached. A vertex with no child is called a terminal vertex. Each
terminal vertex is associated with a class of responses. The classes are
mutually exclusive and jointly exhaustive. When a terminal vertex is
reached, a response in the class it represents is made. The probability of
a (directed) path from one vertex to another is the product of the
probabilities on the arcs of the path. The probability of a particular
response class is the sum of the probabilities of all paths starting at the
root and reaching a terminal vertex for that class.
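These definitions translate directly into code. The tree below is a made-up two-stage example (detect, then respond or guess), not a model from the literature:

```python
# A processing tree as nested tuples: each branch is (process name, arc
# probability, child), where a child is either a tuple of branches (an
# internal vertex) or a response-class label (a terminal vertex).
# Hypothetical example: detect with probability p, then respond correctly
# with probability q; otherwise guess, correct with probability g.
p, q, g = 0.7, 0.9, 0.5
tree = (
    ("detect", p, (("respond", q, "correct"), ("respond", 1 - q, "error"))),
    ("detect", 1 - p, (("guess", g, "correct"), ("guess", 1 - g, "error"))),
)

def class_probability(branches, target):
    """Sum, over all root-to-terminal paths reaching `target`, of the
    product of the probabilities on the arcs of each path."""
    total = 0.0
    for _name, prob, child in branches:
        if isinstance(child, str):
            total += prob if child == target else 0.0
        else:
            total += prob * class_probability(child, target)
    return total

print(class_probability(tree, "correct"))  # p*q + (1 - p)*g
```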
When a factor selectively influences a process in a processing tree, it
changes probabilities on some of the children of the vertex representing
the process. The sum of the probabilities on the children of a vertex is
always 1, because one and only one outcome occurs. Hence, a factor
cannot change the probability on only one child. For example, if a vertex
has two children, one with probability x, the probability of the other must
be 1 − x. A factor changing x changes one parameter, but changes
probabilities on two children. If a factor changes the level of one and
only one parameter, we sometimes say the factor selectively influences
the parameter. When only correct responses are analyzed, as is common
with reaction time experiments, one is typically considering only a single
path from the root to a correct response; this may be a subtree of a larger tree. Factors selectively influencing parameters on arcs of this path will have multiplicative effects.
We now turn to a few paradigms for which processing trees are
useful. There are too many for us to survey all. For an introduction to
multinomial modeling, see Riefer and Batchelder (1988). For reviews of
multinomial processing trees, see Batchelder and Riefer (1999) and
Erdfelder, Auer, Hilbig, Aßfalg, Moshagen and Nadarevic (2009).
Tables listing parameters influenced by factors are in Riefer and
Batchelder (1995) and Jacoby, Begg and Toth (1997), the latter reporting
selective influence.
As we said in Chapter 2, multinomial processing trees are not always
applicable. They assume the output of processes is categorical, rather
than continuous as in signal detection theory (Kinchla, 1994; Slotnick,
Klein, Dodson & Shimamura, 2000). Another criticism is that
performance of different processes may be correlated and lead to
problems such as bias in parameter estimation (Curran & Hintzman,
1995, 1997). For responses to critiques, see Jacoby (1998) and Jacoby,
Begg and Toth (1997).

Process dissociation and inclusion-exclusion tasks

The process dissociation procedure of Jacoby (1991) was introduced in Chapter 2. To briefly review, subjects study two separate lists of items.
Subjects are later presented with test items, to be classified as old or new.
In the inclusion condition, an item is considered old if it was studied in
either list. In the exclusion condition, an item is considered old if it was
in a specified one of the lists. In the model, with probability R the
subject recollects the test item and makes the correct response
(appropriate to the condition). With probability 1−R the subject does not
recollect the test item. Then with probability F the subject finds the test
item familiar and classifies it as old. In a typical experiment, with two
conditions and two parameters, there is just enough data to estimate the
parameters (see Chapter 2).
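Because the design yields two proportions and the model has two parameters, R and F can be solved for directly. The sketch below assumes one common reading of the design, in which an item from the to-be-excluded list is called "old" in the inclusion condition with probability R + (1 − R)F and, erroneously, in the exclusion condition with probability (1 − R)F; the observed proportions are invented for illustration:

```python
# Hypothetical observed proportions of "old" responses to items from
# the to-be-excluded list in each condition.
p_inclusion = 0.80
p_exclusion = 0.20

R = p_inclusion - p_exclusion     # R = inclusion - exclusion
F = p_exclusion / (1.0 - R)       # F = exclusion / (1 - R), assuming R < 1

# With two conditions and two parameters the model is saturated,
# so the estimates reproduce the observed proportions exactly.
assert abs(R + (1 - R) * F - p_inclusion) < 1e-12
assert abs((1 - R) * F - p_exclusion) < 1e-12
```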
The model is validated by testing that factors expected to selectively
influence recollection or familiarity do so. Many studies report factors that change R leaving the other parameter invariant. An example mentioned in Chapter 2 is the requirement that a secondary task be
performed during testing, another example is study duration (e. g., Hay
and Jacoby, 1996). For lists of factors selectively influencing processes
in the model see Jacoby, Begg and Toth (1997) and Kelley and Jacoby
(2000). For systematic, critical reviews of familiarity and recollection
see Yonelinas (2002) and Yonelinas, Aly, Wang and Koen (2010). The
reviews conclude that familiarity and recollection involve two different
and separably modifiable processes. In the dual-process signal detection
model of Yonelinas (1994), recollection is a process that either occurs
with probability R, or does not occur, so recollection can be considered a
process in a tree. But familiarity, rather than changing a process in a tree
model, changes the memory strength of an item (d' in signal detection
theory).

Source monitoring

In a source monitoring experiment, subjects study items from two sources, say a male voice and a female voice. At test, subjects attempt to
remember each item and its source, typically by recognition. Several
processing tree models have been proposed, see Batchelder and Riefer
(1990). The Two-High-Threshold Model (Batchelder & Riefer, 1990; Batchelder, Riefer & Hu, 1994; Riefer, Hu & Batchelder, 1994) has a
parameter d for source discrimination and another parameter D for item
detection. One would expect that increasing the similarity of the sources
would decrease the parameter for source discrimination, but leave the
parameter for item detection invariant. This was found by Bayen,
Murnane and Erdfelder (1996). Items were brief narratives. Each was
presented in a male voice accompanied by a drawing of a male face or in
a female voice with a drawing of a female face. Source similarity was
increased by making the drawings more similar, and had the predicted
effect, selectively influencing d.
In the process dissociation paradigm a subject is presented with two
different types of lists, and must remember which list an item came from.
Buchner, Erdfelder, Steffens and Martensen (1997) noted that the task can be considered as source monitoring. Their data showed that the same tree model could be applied to both paradigms. Further support is
from Yu and Bellezza (2000). They applied the Two-High-Threshold
Model for source monitoring to experiments on both source monitoring
and on process dissociation. They found that source discriminability
selectively influenced parameter d, for source discrimination; and
distractor similarity selectively influenced parameter D, for item
detection.

Prospective memory

Remembering to do an action at a future time is called prospective memory. In experiments by Smith and Bayen (2004), subjects did an
ongoing task in which they indicated whether or not the color of a word
matched that of a rectangle displayed a little earlier. The prospective
memory task was to press the tilde key whenever certain target words
were displayed. In the model proposed by Smith and Bayen, P denotes
the probability the subject carries out preparatory attentional processes,
such as monitoring the environment, needed before a target word is
presented if the prospective memory task is to be carried out. The
probability the subject will recognize a target word when it is presented
is denoted M. Smith and Bayen found that emphasizing the importance
of the prospective memory task increased P, leaving M invariant, and
that giving more time for encoding the target words increased M, leaving
P invariant.

Immediate memory

In an immediate serial recall experiment a subject is presented with a list of items. After a brief interval, typically about two seconds, the subject
attempts to recall the items in order. Here is a simple model for
immediate recall of a particular item (Schweickert, 1993). With
probability I the trace of the item is intact and the subject reports it
directly and correctly (cf. Estes, 1991). With probability 1 − I the item
is not intact. Then with probability R the subject reconstructs the
degraded trace (a process classically called redintegration) and reports the item correctly. If the item is neither intact nor redintegrated an error
is made. The probability of a correct response is

P(correct) = I + (1 − I)R. (9.1)

The model of Chechile and Meyer (1976) allows fractional storage of an item, similar to the possibility here that an item is not intact, but enough
remains for it to be reconstructed.
Buchner and Erdfelder (2005) pointed out that parameter values are
not unique. To see this, note that the probability of an error is (1 − I)(1 −
R). But if (1 − I) is divided by c and (1 − R) is multiplied by c, the result
is the same. Consequently, if values of I and R that fit the data are found,
they can be transformed to other values that fit as well. (Transformed
values must be between 0 and 1 to be probabilities).
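The non-uniqueness is easy to verify numerically; the values of I, R, and c below are arbitrary illustrations:

```python
# Buchner and Erdfelder's (2005) point in miniature: dividing (1 - I)
# by c and multiplying (1 - R) by c leaves the error probability, and
# hence the fit to the data, unchanged.
I, R = 0.70, 0.50
c = 1.2                      # any c keeping both new values in [0, 1]

I2 = 1 - (1 - I) / c         # transformed probability the trace is intact
R2 = 1 - (1 - R) * c         # transformed probability of redintegration

# Same error probability (1 - I)(1 - R), hence same P(correct).
assert abs((1 - I) * (1 - R) - (1 - I2) * (1 - R2)) < 1e-12
assert 0 <= I2 <= 1 and 0 <= R2 <= 1
```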
Behavior of factors often makes sense if I is interpreted as the
probability the trace of an item is intact and R as the probability of
redintegration. For example, in an experiment of Buchner and Erdfelder
(2005), subjects studied words for immediate recall either in silence or
with irrelevant spoken distractors. Distractors were presented during
study, so it is reasonable that their properties would influence whether
the trace is intact. Indeed, word frequency of the distractors changed I,
leaving R invariant. Li, Schweickert and Gandour (2000) found the
probability I that a trace was intact changed with serial position, while
the probability of redintegration, R, changed with phonological
similarity. (An unresolved issue is that with a different paradigm and
model, Chechile (1977) found phonological similarity to selectively
influence the probability of storage.)
With R interpreted as the probability of redintegration, long term
memory is not a source of items themselves, as in a model with an
equation of the same form by Waugh and Norman (1965). Rather, long
term memory contains knowledge about the language and at retrieval this
knowledge is used for trace reconstruction (Hulme, Maughan & Brown,
1991). As usual in cognitive psychology, matters are more complex than
originally thought. If the trace of an item is completely lost it cannot be
redintegrated, so Gathercole, Frankish, Pickering, and Peaker (1999)
304 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

proposed that the probability of an incorrect response is

L + (1 − L − I)(1 − R),

where L is the probability the trace is lost. On the other hand, if some
phonemes are present in a partially complete trace, then the item
produced at recall will not be completely incorrect. By scoring items as
incorrect, partially correct and correct, Thorn, Gathercole, and Frankish
(2005) showed that long term knowledge has effects at more than one
locus in the process arrangement. For more details and current thinking
about the relation between long term and short term memory, see Thorn
and Page (2009).
We gain considerable understanding of what a model means about
mental processing when we find an experimental manipulation that
selectively influences a particular component of a model. There is a
concrete connection: This factor changes that parameter. But the further
question of what mental process the factor modifies is not necessarily
easy to answer.
Despite the complications, examining a simple tree structure is
worthwhile. In a later section on Tree Inference, we argue that if a
subject uses a complex tree to produce responses, but the experimenter
uses only a couple of factors that selectively influence processes, the
experimenter will find that a simple tree accounts for the results. So, let
us turn to some further examples of factors selectively influencing
parameters in the simple model of Eq. (9.1).

Effects of Proactive Interference and Retention Interval1

When a subject studies a list of items and then studies a second list,
memory for the second list is worse than if the first list had not been
studied, a phenomenon called proactive interference (or proactive
inhibition). The phenomenon is well established in long term memory
tasks. For a while after rapid forgetting was demonstrated in immediate
serial recall by Brown (1958) and by Peterson and Peterson (1959), it
seemed proactive interference did not occur in short term memory tasks.
Then Keppel and Underwood (1962) showed it does occur. In their Experiment 2, a trigram such as KQF was presented visually for 2
seconds. A three digit number was then presented auditorily, and the
subject counted aloud backwards from it by 3s for a retention interval of
3, 9 or 18 seconds. The subject then attempted to recall the trigram by
speaking. There were 216 subjects; each was tested once in each of the
three retention intervals. Retention intervals were counterbalanced, so the
trigram studied by a subject at a particular retention interval could be the
first trigram the subject studied, the second, or the third. These are called
Trial 1, Trial 2, and Trial 3, respectively.
Results are in Table 9.1. There are 72 observations for each retention
interval and trial number. Keppel and Underwood say that on Trial 1
there is no measurable forgetting across retention intervals. The lower
number of correct responses on Trial 2, and still lower on Trial 3, are
evidence for proactive interference, worse performance due to earlier
trials. Keppel and Underwood explained the larger drop in performance
with larger retention intervals as due to more spontaneous recovery of
prior irrelevant associations over longer retention intervals.

Table 9.1
Keppel and Underwood (1962) Experiment 2

                 Frequency of Correct Recall    Frequency of Incorrect Recall
Retention               Trial Number                   Trial Number
Interval             1       2       3              1       2       3
 3     obs          71      62      58              1      10      14
       pred         71.2    61.9    57.9             .8     10.1    14.1
 9     obs          70      54      49              2      18      23
       pred         70.6    54.7    48.0            1.4     17.3    24.0
18     obs          71      51      41              1      21      31
       pred         70.2    50.4    41.9            1.8     21.6    30.0
Note: Correct recall frequencies read from Keppel and Underwood (1962), Figure 3.
Retention intervals in seconds. Observed and predicted values are labeled obs and pred,
respectively. Predicted values from Equation (9.1). G2 = .84, df = 4. For predicted and
observed correct frequencies, r2 = .999.
306 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

During the retention interval, the subject counts aloud, likely


degrading the verbal memory trace. So suppose in Eq. (9.1) increasing
the retention interval decreases I. Suppose reconstructing a degraded
trace is more difficult the more traces of prior items there are in memory.
Then increasing the trial number decreases R. There is a value Ii for each
retention interval i and a value Rj for each trial j.
We can quickly check two qualitative predictions of the model. They
are analogous to the qualitative predictions for reaction times when
factors selectively influence processes in a directed acyclic task network
(Chapter 3).
First, the probability of a correct response is predicted to strictly
monotonically increase in I and also in R. This is true in Table 9.1;
frequencies increase as one goes from right to left, and from bottom to top. Trial 1 is an exception; probability does not change much with delay.
Second, consider cells at the four corners of some imaginary
rectangle in the table of correct responses. Certain interaction contrasts
calculated using as baseline the cell with smallest frequency in this
rectangle are all predicted to be negative. To be specific, consider two
levels of I, I1 and I2, with I1 < I2, and two levels of R, R1 and R2 with R1 <
R2. From Eq. (9.1), if Pij denotes the probability of a correct response
when I is at level i and R is at level j,

P22 − P21 − P12 + P11 = − (I2 − I1)(R2 − R1) < 0.

Such interaction contrasts are in Table 9.2 for correct response frequencies using as baseline the cell in the last row, last column. For
example, for the first and last cell in row 1 compared with the first and
last cell in row 3, the interaction contrast is

71 − 58 − 71 + 41 = −17.

Interaction contrasts in the table are negative. One can quickly see that
some other interaction contrasts are negative by noting that the
interaction contrasts themselves are monotonic in the rows and columns.
A check shows that all other interaction contrasts are negative, with negligible exceptions.

Table 9.2
Keppel and Underwood (1962) Experiment 2
Interaction Contrasts, Lowest Cell as Baseline

Frequency of Correct Recall

Retention       Trial Number
Interval        1      2      3
 3            −17     −6
 9             −9     −5
18
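These contrasts can be computed mechanically from the Table 9.1 frequencies; the following Python sketch reproduces the four entries of Table 9.2:

```python
# Correct-recall frequencies from Table 9.1, keyed by
# (retention interval in seconds, trial number).
m = {(3, 1): 71, (3, 2): 62, (3, 3): 58,
     (9, 1): 70, (9, 2): 54, (9, 3): 49,
     (18, 1): 71, (18, 2): 51, (18, 3): 41}

def contrast(ri, trial, ri0=18, trial0=3):
    """Interaction contrast using the lowest cell (18 s, Trial 3) as baseline."""
    return m[(ri, trial)] - m[(ri, trial0)] - m[(ri0, trial)] + m[(ri0, trial0)]

# The four contrasts of Table 9.2 are all negative, as Eq. (9.1) predicts.
table_9_2 = {(ri, t): contrast(ri, t) for ri in (3, 9) for t in (1, 2)}
assert table_9_2 == {(3, 1): -17, (3, 2): -6, (9, 1): -9, (9, 2): -5}
```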
It is worth fitting the model. Parameters were estimated to minimize
G2; estimation details are in the Appendix. Estimated values were I1 =
.723, I2 = .526, I3 = .407, and R1 = .959, R2 = .494, R3 = .296. With these
values the predicted probability of correct recall for Trial 1, Retention
Interval 18 seconds is, for example,

I3 + (1 − I3) R1 = .407 + (1 − .407)(.959) = .976.

The predicted frequency of correct recall is 72 × .976 = 70.3. (This differs slightly from the predicted frequency in Table 9.1 because of
rounding.) Note that parameter values are not unique (Buchner &
Erdfelder, 2005). Clearly agreement between predicted and observed
values is good.
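As a check on the reported fit, G2 can be recomputed directly from Eq. (9.1). The sketch below uses the rounded parameter estimates given above, so it recovers the published G2 = .84 only approximately:

```python
from math import log

# Observed correct frequencies from Table 9.1 (72 trials per cell) and
# the parameter estimates quoted in the text.
obs = [[71, 62, 58],          # retention interval 3 s, trials 1-3
       [70, 54, 49],          # 9 s
       [71, 51, 41]]          # 18 s
I = [0.723, 0.526, 0.407]     # intactness, one value per retention interval
R = [0.959, 0.494, 0.296]     # redintegration, one value per trial
N = 72

G2 = 0.0
for i in range(3):
    for j in range(3):
        p = I[i] + (1 - I[i]) * R[j]     # Eq. (9.1)
        for o, e in ((obs[i][j], N * p), (N - obs[i][j], N * (1 - p))):
            G2 += 2 * o * log(o / e)

# G2 comes out close to the .84 reported under Table 9.1; the small
# discrepancy reflects rounding of the published estimates.
assert 0.8 < G2 < 0.9
```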
We can interpret a drop in I as a measure of forgetting, and a drop in
R as a measure of increase in proactive interference. Keppel and
Underwood say there is no measurable amount of forgetting across
retention intervals at Trial 1. However, in the model the amount of
forgetting is considerable at the retention interval of 18 seconds, and is
the same for every trial including the first. According to the model, there
appears to be no forgetting on the first trial only because the subject has
an excellent chance of reconstructing a degraded item on the first trial.

Effects of Serial Position, Delay, and Proactive Interference2

Further light on proactive interference comes from an unpublished experiment by Tehan and Turcotte (1997), for which some data are
available3. For related work, see Tehan and Turcotte (2002). Tehan and
Turcotte (1997) manipulated the amount of proactive interference in
immediate serial recall by changing the similarity between the list to be
recalled and a preceding studied list. On trials to be analyzed, two four-
item lists were presented. The second list was always words. The first
list was words on half the trials (more interference) and letters on the
other half (less interference).

Method
Subjects were 40 introductory psychology students, who received course
credit for participating. Twenty subjects recalled by speaking and 20 by
writing. There were 10 one-list trials, not analyzed, and 40 two-list trials.
Instructions were to recall the most recent list, so when only one list was
presented subjects recalled that list and when two lists were presented
subjects recalled only the second list. On two-list trials subjects studied
the first list because they did not know whether or not they would need to
recall it.
Items were presented on a screen, one per second. The letter pool
was the English consonants, except “y”. For word lists, there was an
open pool and a closed pool. The open pool was 240 one syllable English
words, mostly concrete nouns. The closed pool was 16 words randomly
chosen from the open pool. Items were chosen randomly from the
designated pool, without replacement, to make the lists. Each subject was
presented with one set of trials using the open pool and one set using the
closed pool. The order of the sets was counterbalanced over subjects.
Data are available for the closed pool, but not for the open pool.
On half the two-list trials, a four digit number was displayed after the
second list. The subject was to decide if it was greater than or less than
5000. This task provided a delay prior to recall. Subjects who recalled
by speaking responded to the digit task by pointing up to indicate
“greater” and pointing down to indicate “less”. Those who recalled by
writing wrote + to indicate “greater” and − to indicate “less”. Subjects
were asked to recall the items in order.

Results
Frequency of recall over all subjects for the closed pool is in Table 9.3.
For statistical analysis, Tehan and Turcotte summed over serial positions.
A mixed ANOVA was conducted, with Interference, Delay and Word
Pool as within subject factors and Response Modality as a between
subject factor. Recall was significantly better with less interference, with
no delay, and with the closed word pool. There was a nonsignificant
trend for written recall to be better than oral recall. There were no
significant interactions.

Table 9.3
Frequencies of Recall from Experiment 2
of Tehan and Turcotte (1997), Closed Word Pool

Written Recall
                       Frequency of Correct Recall      Frequency of Incorrect Recall
                                Position                         Position
Interference Delay        1       2       3       4        1       2       3       4
Less   No    obs        197     183     173     166        3      17      27      34
             pred       194.23  183.30  174.81  168.83     5.77   16.70   25.19   31.17
Less   Yes   obs        191     180     169     162        9      20      31      38
             pred       192.66  178.77  167.98  160.38     7.34   21.23   32.02   39.62
More   No    obs        196     181     167     163        4      19      33      37
             pred       192.98  179.69  169.37  162.10     7.02   20.31   30.63   37.90
More   Yes   obs        187     172     164     152       13      28      36      48
             pred       191.07  174.18  161.06  151.83     8.93   25.82   38.94   48.17

Spoken Recall
                                Position                         Position
Interference Delay        1       2       3       4        1       2       3       4
Less   No    obs        191     180     167     169        9      20      33      31
             pred       190.73  181.57  168.86  169.03     9.27   18.43   31.14   30.97
Less   Yes   obs        191     175     157     154        9      25      43      46
             pred       186.93  174.01  156.09  156.33    13.07   25.99   43.91   43.67
More   No    obs        183     166     156     165       17      34      44      35
             pred       186.43  173.01  154.40  154.65    13.57   26.99   45.60   45.35
More   Yes   obs        188     169     135     129       20      31      65      71
             pred       180.87  161.95  135.71  136.06    19.13   38.05   64.29   63.94

Discussion
The experiment provides clear evidence of proactive interference in
immediate serial recall. Tehan and Turcotte found the effect on errors of
more interference was almost entirely due to more intrusions from the
prior list. More interference produced no significant differences between
omissions, transpositions or phonemic errors.
Tehan and Turcotte explained their results through reasoning of
Nairne and Kelley (1999), who say immediate serial recall requires two
discriminations, a discrimination of the current list from previous lists
and a discrimination among items in the current list. Between list
discrimination is more difficult when the prior and current lists are both
of words. Discrimination among items in the current list is more difficult
when items are phonologically similar (Tehan & Humphreys, 1995).
Tehan and Turcotte say that although phonological similarity was not
directly manipulated in this experiment, it may be larger in the closed
pool.

Model: Qualitative Tests


There are four factors, Delay, Interference, Serial Position and Response
Mode. Qualitative tests for the Keppel and Underwood data were based
on the model in Equation (9.1), but suppose we do not know what form a
processing tree would take, or whether a processing tree is possible in
which the factors selectively influence processes. We sketch a procedure
here for finding a tree if one is possible; the later section on Tree
Inference has more details.
Suppose each factor selectively influences a different process
represented by a single vertex in a processing tree. That is, changing the
level of one of the factors changes parameter values on children of a
single vertex, perhaps on more than two of its children. (As already
noted, a factor cannot change the parameter value on exactly one arc
because the probabilities on all children of a vertex sum to 1.)
Then for a given pair of factors there are two possibilities. Either (1)
there is a path that starts at the root, goes through an arc whose parameter
is changed by the first factor, then goes through another arc whose
parameter is changed by the second factor, and ends at a terminal vertex for a correct response, or (2) there is no such path. If there is such a path
the selectively influenced processes are ordered, if not they are
unordered. With four factors, there are six pairs of factors, and if we
consider the possibility that each pair is either ordered or unordered there
are 64 possibilities to contemplate. Moreover, if two factors are ordered
on a path, they can be ordered in two different ways. We need a quick
way to reduce the number of trees to consider. Qualitative tests are
invaluable for this.
If a factor changes parameters on children of a single vertex, we say
the factor selectively influences the vertex. If two factors selectively
influence two unordered vertices, the factors have additive effects. (This
will be discussed later in the section on Tree Inference.) Let mij denote the observed number of correct responses when Factor A is at level i and Factor B is at level j. Choose any level i* of Factor A and any level j* of Factor B as reference levels. We can consider an interaction contrast for
considered for mean reaction times, namely

Δ2mij = mij − mij* − mi*j + mi*j*.

If the two factors are additive, then all such interaction contrasts are 0.
In the experiment of Tehan and Turcotte (1997), despite the
nonsignificant ANOVA interactions, no two factors seem to have
additive effects. In particular, the interaction between Interference and
Response Modality is − 58 and the interaction between Delay and
Response Modality is − 42. (The reference levels are oral response, more
interference, and long delay.)
If two factors are not additive, but selectively influence different
vertices in a processing tree, then the two vertices are on a path from the
root to a terminal vertex, and so the vertices are ordered. Suppose the
first vertex is selectively influenced by Factor A and the second by Factor B. Then the data must satisfy the following qualitative tests. These order the levels of Factor B (an order separate from the order of the vertices). Let mij denote the observed number of correct responses when Factor A is at level i and Factor B is at level j.

Condition (a). Levels of Factor B can be ordered so if j > j' then for
all i,

mij > mij'. (9.2)

Condition (b). Suppose levels of Factor B are ordered as in Condition (a). Consider interaction contrasts of the form

Δ2mij = mij − mij' − mi'j + mi'j'. (9.3)

Fix levels i and i'. Then for all j and j' with j > j' all such interaction
contrasts have the same sign, negative, zero or positive.
Note that Conditions (a) and (b) do not require the levels of Factor A to be ordered. The qualitative Conditions (a) and (b) must be satisfied for all combinations of levels of the factors other than A and B. (The ordering of the levels of Factor B and the sign of the interaction contrasts
may change from one combination to another.) With a plausible
additional assumption, Conditions (a) and (b) must also be satisfied when
frequencies are summed over all levels of the other factors; this usually
helps clarify the situation by increasing the sample size. For details, see
the later section on Tree Inference.
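Conditions (a) and (b) are simple enough to automate. The sketch below checks them for a table of correct-response frequencies, with rows indexed by levels of the first factor and columns by levels of the second factor already placed in the candidate order; the two small tables at the end are invented for illustration:

```python
def condition_a(m):
    """Condition (a): every row strictly increases across the columns."""
    return all(row[j] < row[j + 1] for row in m for j in range(len(row) - 1))

def sign(x):
    return (x > 0) - (x < 0)

def condition_b(m):
    """Condition (b): for each fixed pair of rows i, i', all contrasts
    m[i][j] - m[i][j'] - m[i'][j] + m[i'][j'] with j > j' share a sign
    (negative, zero, or positive)."""
    for i in range(len(m)):
        for i2 in range(len(m)):
            signs = {sign(m[i][j] - m[i][j2] - m[i2][j] + m[i2][j2])
                     for j in range(len(m[0])) for j2 in range(j)}
            if 1 in signs and -1 in signs:
                return False
    return True

# Made-up tables: the first passes both conditions; the second passes
# (a) but violates (b), its contrasts having mixed signs.
assert condition_a([[10, 20, 30], [5, 12, 18]])
assert condition_b([[10, 20, 30], [5, 12, 18]])
assert condition_a([[10, 20, 30], [5, 18, 19]])
assert not condition_b([[10, 20, 30], [5, 18, 19]])
```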
In the data of Tehan and Turcotte (1997), Condition (a) is easily seen
to be satisfied. With minor exceptions, the frequency of a correct
response decreases monotonically as Delay, Interference, and Serial
Position increase, and as Response Mode changes from written to oral.
Levels of each factor are ordered in the same way at all levels of the
other factors. For Condition (a) to be satisfied, only levels of the second
factor of every pair need be ordered; in fact, levels of all factors are
ordered.
Because Condition (a) is satisfied we turn to Condition (b). For a pair
of factors such as Delay and Interference, each with two levels, only one
interaction contrast is possible, so Condition (b) is automatically
satisfied. The only factor with more than two levels is Serial Position,
and we can test each other factor with it. Suppose in Condition (b)
Factor B, the factor that selectively influences the vertex that comes second on the path, has only two levels. Then Condition (b) is
automatically satisfied. Hence, we only need to test Condition (b) for
cases where Serial Position is in the role of Factor Β.
For Delay and Serial Position, using delayed-recall and serial position
4 as reference levels, the interaction contrasts for serial positions 1, 2 and
3 in that order are

− 50, − 58, − 28.

(Frequencies are summed over the levels of the factors irrelevant to the
contrast.) These particular interaction contrasts have the same sign, so
Condition (b) is passed for them. All such interaction contrasts must
have the same sign. A quick test that some of the others have a negative
sign is whether the above interaction contrasts are monotonic with serial
position. They are not exactly monotonic, but close. Further tests reveal
no serious violation of Condition (b), so we continue to consider these
two factors for the model.
For Interference and Serial Position, using more-interference and
serial position 4 as reference levels, the interaction contrasts for serial
positions 1, 2, and 3 in order are

− 16, − 10, 2.

Two of the three are negative and the third is not far off. The interaction
contrasts are monotonic with serial position, an immediate test that
certain other interaction contrasts are negative. Further checks show no
serious violation of Condition (b) so we continue considering these two
factors for the model.
For Response Mode and Serial Position, using oral-response and
serial position 4 as baseline, the analogous interaction contrasts in order
are

2, 2, 32.

These all have the same positive sign. However, using as reference
levels oral-response and serial position 3, the interaction contrasts for
serial positions 1 and 2 are

− 30, − 30.

The sign of the interaction contrasts changes depending on the reference level of Factor B, so Condition (b) is violated. If Serial Position and
Response Mode selectively influence two different vertices on a path
from the root to a terminal vertex for correct responses, the vertex for
Serial Position must come first.
We could pursue a processing tree in which Serial Position
selectively influences a vertex preceding a vertex selectively influenced
by Response Mode. But interactions between Response Mode and Serial
Position are behaving in a complex way. We cannot be confident that
pursuing the details will be fruitful because the data for the two different
levels of Response Mode are from two different groups of subjects.
Instead, we will continue modeling with the other three factors, and
make a separate model for each response mode.
A check of interaction contrasts for the three factors, for each
response mode separately, shows that the qualitative tests are satisfied,
with minor exceptions. It makes sense to attempt a tree of the same form
for each response modality. For oral responses, the single interaction
contrast for Delay and Interference is negative, − 27. For written
responses it is − 19. The sign is the same for both response modalities,
encouraging a tree of the same form for both.

Model: Quantitative Test


At this point, we are considering a processing tree in which the three
factors (Serial Position, Delay, and Interference) selectively influence
three different vertices. The vertices are on a path from the root to a
terminal vertex for a correct response. We have found no constraints on
the order in which the vertices occur on the path. A tree of the required
form is in Figure 9.1. Probability I decreases as serial position increases, probability F decreases when recall is delayed rather than immediate, and probability S decreases when there is more interference.

Fig. 9.1. A processing tree for data of Tehan and Turcotte (1997).

It is easy to write the probability of an error in recall for an item in serial position i with delay level j and interference level h. It is

pijh(error) = (1 − Ii)(1 − Fj)(1 − Sh).

Any order of the vertices selectively influenced by the factors would lead
to the same equation. We first discuss model fit, and then consider
interpretations.
Observed frequencies and those predicted by the model in Figure 9.1
are in Table 9.4. Clearly, agreement is good, and there is no need to
modify the model. The model was fit for each response modality
separately. One way to use the two free scaling parameters is to fix two
of the processing tree parameters ahead of time to arbitrary values. That
was done; S1 (for less interference) was set to .6 for each response mode,
as was F1 (for no delay). Other parameters were estimated to minimize
G2; see the Appendix.
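The error equation can be checked numerically. The sketch below (ours, not the authors' fitting code) computes the predicted error probability for one cell, using the oral-response estimates in Table 9.4 for serial position 1, no delay, and less interference:

```python
# Tree of Figure 9.1: an error occurs only if the branches selectively
# influenced by Serial Position, Delay, and Interference all fail, so
# p(error) = (1 - A_i)(1 - F_j)(1 - S_h) and p(correct) = 1 - p(error).
def p_error(a, f, s):
    return (1 - a) * (1 - f) * (1 - s)

# Oral-response estimates from Table 9.4: serial position 1 (.7103),
# no delay (fixed at .6), less interference (fixed at .6).
err = p_error(0.7103, 0.6, 0.6)
print(round(err, 4), round(1 - err, 4))   # 0.0464 0.9536
```

Because the three factors enter multiplicatively, any permutation of the three vertices yields the same prediction, which is why these data cannot fix the order of the vertices.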
Parameters not fixed are higher for written responses than for oral,
with one exception. At the last serial position A4 is higher for oral than
written responses. This is a slight recency effect, consistent with the
finding that in immediate serial recall a recency effect is commonly found
316 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

Table 9.4
Parameter Estimates and Model Fits for Experiment 2
of Tehan and Turcotte (1997)

Factor Level Parameter Oral Written


Position 1 A1 .7103 .8196
Position 2 A2 .4240 .4781
Position 3 A3 .0267 .2128
Position 4 A4 .0321 .0260
No Delay F1 .6000 .6000 fixed
Delay F2 .4360 .4915
Less Interference S1 .6000 .6000 fixed
More interference S2 .4143 .5136
G2 11.05 6.80
df 10 10
r2 .936 .972
Note: Correlation squared between observed and predicted correct
responses is r2.

with auditory presentation but not with visual presentation. This recency
effect is one reason that Serial Position and Response Modality behave in
a complex way.

Interpretation
A few ways to interpret processes in the tree are possible. It is one thing
to know a factor changes a process; it is another thing to know what that
process does. A difficulty in interpreting processes in this model is that
the order in which the factors have their effects is not established; the
model would fit just as well for any permutation of the three processes.
In the data of Keppel and Underwood (1962), we interpreted a longer
retention interval as increasing the probability an item is degraded
and proactive interference as decreasing the probability of redintegration.
Let us keep those interpretations, keeping in mind that other
interpretations are possible.
One way to think of serial positions is to note that because items must
be recalled in order, at the moment of recalling the item in position p, p −
1 items have already been recalled. Serial positions preceding an item

add an additional retention interval, a retention interval at a micro level,


so to speak. With this interpretation, serial position and retention interval
both change the probability an item is degraded (not intact).
We can split the probability an item is intact into two parts. Suppose
the trace of an item has several component codes, phonological, lexical,
semantic, and so on (e.g., Hulme, Maughan & Brown, 1991; Poirier &
Saint-Aubin, 1995; Thorn, Frankish & Gathercole, 2009). The codes
may function at different time scales; for example, a semantic code may
endure longer than a phonological code. Suppose with probability A an
item is intact in one code, and recalled correctly. Suppose probability A
decreases as serial position increases. If the item is not intact in this
code, with probability F it is intact in the other code, and recalled
correctly. Suppose probability F decreases as retention interval
increases. Then the probability an item is intact is

I = A + (1 − A)F.

Equivalently, the probability an item is degraded (not intact) is

1 − I = (1 − A)(1 − F).

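The two expressions are the same claim stated for success and for failure: an item is degraded only if both codes fail. A quick numerical check (ours, with arbitrary values for A and F):

```python
# I = A + (1 - A)F and 1 - I = (1 - A)(1 - F) are equivalent:
# an item is intact unless both component codes fail.
for a in (0.1, 0.5, 0.9):
    for f in (0.2, 0.7):
        intact = a + (1 - a) * f
        degraded = (1 - a) * (1 - f)
        assert abs((1 - intact) - degraded) < 1e-12
print("identity holds")
```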
Alternatively, the redintegration process could be split into two parts,


redintegration at a phonological level, less likely to succeed when
retention interval is long, and redintegration at a higher level, less likely
to succeed when there is interference from a prior similar list. Temporal
information may be worse in the higher level code than in the
phonological code, resulting in intrusion of prior list items when higher
level components are the basis for responding.
One could even consider items recalled earlier than a given item as
providing proactive interference. If we consider proactive interference as
decreasing the probability of redintegration, then serial position
influences redintegration, not degradation. The point is that a tree model
may be correct even if the parameter interpretation is wrong. A tree gives
a bone structure; additional knowledge is needed to flesh it out. Later, for
example, we discuss evidence of Hulme, Stuart, Brown and Morin

(2003) that Word Frequency influences association rather than


redintegration.
For analysis of reaction times, there are clear process classes of
perception, decision, and motor preparation. Processes are less easy to
classify in memory tasks. One nontrivial difference is that a typical
reaction time task requires about a second; an immediate serial recall
task is at least four times as long, providing ample time to quadruple the
number of processes.

Effects of Serial Position and List Length

Further information about serial position comes from Experiment 2 of


Poirier, Schweickert and Oliver (2005). There were four experimental
factors. Lists were made entirely of short words (one or two syllables) or
long words (five syllables). All list lengths from 2 to 7 were used. Items
in a list were visually displayed simultaneously. The subject read the list
silently in some blocks of trials and aloud in others, a difference in
presentation modality. After pressing the space bar to indicate reading
was finished, the subject immediately attempted serial recall aloud.
There were 32 subjects, and for each subject eight trials for each list
length, word length and presentation modality. Serial position was also a
factor. Results are in Table 9.5.

Model: Qualitative Tests


Consider a processing tree in which each factor changes parameter
values on children of a single vertex. Each pair of factors must either
have additive effects or satisfy the qualitative Conditions (a) and (b)
above, in Equations (9.2) and (9.3). Although it is not necessary that
levels of all factors be ordered, let us try. For a particular factor, we
choose a reference combination of levels of the other factors. For this
combination we order the levels of the particular factor so frequency of
a correct response increases monotonically with its levels. Then for
any other combination of levels of the other factors, we check whether
the frequency of a correct response is monotonically increasing or
decreasing with this ordering of levels of the particular factor. For three

Table 9.5
Data for Experiment 2 of Poirier, Schweickert & Oliver (2005)

Short Words Read Aloud: Frequency of Correct Recall


Serial Position
List
Length 1 2 3 4 5 6 7
2 obs 255 254

3 obs 254 254 255


pred 255.28 252.73 R3 0.8722
4 obs 253 252 240 241
pred 254.84 250.73 239.49 R4 0.7941
5 obs 254 244 219 198 227
pred 253.69 245.53 223.18 193.41 R5 0.5907
6 obs 252 239 193 146 117 172
pred 251.85 237.21 197.10 143.65 117.44 R6 0.2654
7 obs 253 228 183 100 69 56 117
pred 250.40 230.61 176.40 104.18 68.75 56.00 R7 0.0073
I1 I2 I3 I4 I5 I6
0.9780 0.9001 0.6868 0.4026 0.2632 0.2130
G2 = 9.32, df = 10, r2 = .998.

Short Words Read Silently: Frequency of Correct Recall

List Serial Position


Length 1 2 3 4 5 6 7
2 obs 252 253

3 obs 253 251 248


pred 253.63 250.37 R3 0.8356
4 obs 250 239 215 205
pred 249.84 241.34 213.07 R4 0.5716
5 obs 250 235 194 123 117
pred 247.26 235.23 195.17 123.51 R5 0.3930
6 obs 246 232 173 78 46 60
pred 244.15 227.82 173.46 76.24 48.24 R6 0.1764
7 obs 239 223 166 59 31 27 40
pred 243.09 225.29 166.07 60.14 29.63 27.00 R7 0.1026
I1 I2 I3 I4 I5 I6

0.9438 0.8663 0.6085 0.1474 0.0146 0.0032


G2 = 4.52, df = 10, r2 = .999.

[Table 9.5 continued]

Long Words Read Aloud: Frequency of Correct Recall

List Serial Position


Length 1 2 3 4 5 6 7
2 obs 255 254

3 obs 255 254 256


pred 255.51 253.49 R3 0.9500
4 obs 255 241 205 209
pred 252.88 239.96 207.64 R4 0.6802
5 obs 253 240 165 127 161
pred 250.61 228.25 172.37 139.36 R5 0.4470
6 obs 246 209 133 57 34 94
pred 247.46 212.05 123.53 55.41 37.16 R6 0.1240
7 obs 244 198 106 32 13 19 56
pred 246.47 206.99 108.29 32.34 11.98 19.00 R7 0.0233
I1 I2 I3 I4 I5 I6

0.9619 0.8040 0.4093 0.1055 0.0241 0.0522


G2 = 16.58, df = 10, r2 = .997.

Long Words Read Silently: Frequency of Correct Recall

List Serial Position


Length 1 2 3 4 5 6 7
2 obs 255 255

3 obs 251 232 218


pred 249.36 233.61 R3 0.6519
4 obs 251 227 148 108
pred 244.79 218.21 158.37 R4 0.4125
5 obs 244 204 130 66 46
pred 240.84 204.90 123.98 70.00 R5 0.2055
6 obs 240 199 111 46 14 24
pred 238.32 196.38 101.98 39.00 19.66 R6 0.0731
7 obs 225 185 87 24 9 6 13
pred 237.37 193.19 93.72 27.37 6.99 6.00 R7 0.0234
I1 I2 I3 I4 I5 I6

0.9255 0.7488 0.3509 0.0855 0.0040 0.0000


G2 = 26.73, df = 10, r2 = .995.

of the factors ordering is straightforward. Recall is better with short


words, with reading aloud, and with shorter list lengths. As for serial
position, with negligible violations recall decreases monotonically as
serial position increases, except at the last serial position, where recall is
sometimes considerably better.
This recency effect happens with longer list lengths, and is more
pronounced with reading aloud. When subjects read aloud, they give
themselves an auditory presentation, and in immediate serial recall, as
noted above, a recency effect is commonly found to be stronger with
auditory than visual presentation. A recency effect could be incorporated
in a tree model, but the effect is different at different list lengths.
Incorporating the recency effect seems to require a special parameter
value for each value of the effect, so there is no way to test this part of
the model. To keep the model testable, we consider all serial positions
except the last. But then, without the last serial position, list length 2 has
a single observed value in position 1. A model for this particular list and
position would have one parameter for one observation, and be
untestable. Hence, we consider a model for list lengths greater than 2.
The following analysis is for list lengths greater than 2 and serial
positions except the last.
Inspection of the data shows no pair of factors has a close
approximation to additive effects. If the factors selectively influence
different vertices in a processing tree, the vertices are all on a path
together. We turn to considering Conditions (a) and (b) for pairs of
factors. Because the levels of each factor can be ordered so frequency of
a correct response is monotonic with its factor levels for all combinations
of levels of the other factors, Condition (a) is satisfied. If ordering failed
for some factor, we would learn something about the possible order of
the vertices selectively influenced by the factors, but the test is not
informative.
Condition (b) is that for each pair of factors, certain interaction
contrasts all have the same sign, either negative, zero, or positive. Word
Length and Presentation Modality each have two levels. For these
factors there is one interaction contrast, for which Condition (b) is
automatically satisfied. We now consider other pairs of factors.

For Serial Position and List Length, the upper right cells of the data
matrix are empty, so interaction contrasts can only be examined
piecemeal in the lower cells. For Serial Positions 1 through 4 and List
Lengths 5 through 7, interaction contrasts are in Table 9.6. Interaction
contrasts in the table were calculated as follows. Let mij be the observed
frequency of correct responses for serial position i, list length j, summed
over Word Length and Presentation Modality. (Summing over levels of
irrelevant factors requires that a plausible assumption be met, as
explained in the section on Tree Inference.) For the levels of the factors
in the table, the lowest frequency is for serial position 4, list length 7.
The interaction contrast for serial position i and list length j is

mij − mi7 − m4j + m47.

Interaction contrasts in the table are all negative, so the qualitative test
Condition (b) is passed for these levels of the factors. Further, the
interaction contrasts are monotonic with Serial Position and List Length,
indicating that certain other interaction contrasts have the same negative
sign. All possible such interaction contrasts need to be tested; with
negligible exceptions, all are negative. We consider Serial Position and
List Length as candidate factors in a model. If these factors selectively
influence two different vertices, the test is not informative about their
order.

Table 9.6
Interaction Contrasts for Serial Positions 1 through 4 and List Lengths 5 through 7
Summed over Word Length and Presentation Modality
Data from Experiment 2 of Poirier, Schweickert & Oliver (2005)

Serial Position
List Length 1 2 3 4 5 6 7
2
3
4
5 −259 −210 −133
6 −89 −67 −44
7
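The entries in Table 9.6 can be recomputed from the observed frequencies in Table 9.5. The sketch below (ours, not the authors' code) sums the obs rows over the four Word Length × Presentation Modality conditions and forms the contrasts:

```python
# Table 9.5 obs frequencies, serial positions 1-4, list lengths 5-7,
# one dict per Word Length x Presentation Modality condition.
short_aloud  = {5: (254, 244, 219, 198), 6: (252, 239, 193, 146), 7: (253, 228, 183, 100)}
short_silent = {5: (250, 235, 194, 123), 6: (246, 232, 173, 78),  7: (239, 223, 166, 59)}
long_aloud   = {5: (253, 240, 165, 127), 6: (246, 209, 133, 57),  7: (244, 198, 106, 32)}
long_silent  = {5: (244, 204, 130, 66),  6: (240, 199, 111, 46),  7: (225, 185, 87, 24)}

def m(i, j):
    """Frequency for serial position i, list length j, summed over
    Word Length and Presentation Modality."""
    return sum(c[j][i - 1] for c in
               (short_aloud, short_silent, long_aloud, long_silent))

def contrast(i, j):
    """Interaction contrast relative to the reference cell, serial
    position 4 at list length 7: m_ij - m_i7 - m_4j + m_47."""
    return m(i, j) - m(i, 7) - m(4, j) + m(4, 7)

print({(i, j): contrast(i, j) for i in (1, 2, 3) for j in (5, 6)})
# All six contrasts are negative, as in Table 9.6.
```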
Selective Influence with Accuracy, Rate, and Physiological Measures 323

In a moment we will fit a model for Serial Position and List Length,
keeping separate each combination of Word Length and Presentation
Modality. Before doing so, we need to check for each combination
whether interaction contrasts analogous to those just considered all have
the same sign. It turns out they all do, negative, with negligible
exceptions. The combination that fares the worst is long words read
silently. Interaction contrasts for this combination are displayed in Table
9.7, where it can be seen that violations are slight.

Table 9.7
Interaction Contrasts for Serial Positions 1 through 4 and List Lengths 5 through 7
Long words, read silently
Data from Experiment 2 of Poirier, Schweickert & Oliver (2005)

Serial Position
List Length 1 2 3 4 5 6 7
2
3
4
5 −23 −23 1
6 −7 −8 2
7

Qualitative tests for the other pairs of factors do not eliminate any
factors as candidates for selectively influencing vertices in a processing
tree. However, we will see in a moment that results of fitting a separate
model to each combination of Word Length and Presentation Modality
suggest it would not be fruitful to incorporate these two factors in a
processing tree model. Before turning to model fitting we briefly
continue with the qualitative tests, which are informative about some
factor pairs.
For Word Length and Serial Position, interaction contrasts were
calculated as follows. Let mi,w,j be the observed frequency of correct
responses at serial position i with word length w and list length j,
summed over presentation modalities. For each list length the last serial
position is not considered, so the lowest observed frequency is in the
penultimate position. Label this position with i = p. Then a contrast for

serial position i and list length j is

mi,short,j − mi,long,j − mp,short,j + mp,long,j.

List length is irrelevant for this test. Not every position occurs in
every list length, so for a given position i the expression above was
averaged over the list lengths in which the position i occurred. (We are
interested in the sign of the interaction contrast, which would be the
same whether we add or average over list lengths. Averaging makes it
easier to compare the results for various serial positions.)
Results for serial positions 1 through 5 in that order are

−76, −54, 32, 26, 20.

Clearly, these do not have the same sign. This does not indicate that
these two factors do not selectively influence two different vertices. But
if the vertices are on a path that starts at the root, the vertex for Serial
Position cannot follow the vertex for Word Length.
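These averaged contrasts can likewise be recomputed from Table 9.5. In the sketch below (ours, not the authors' code), frequencies are first summed over Presentation Modality, each list's last serial position is dropped, and the penultimate position p = j − 1 serves as the reference; list lengths at which position i is itself the reference are treated as uninformative and skipped.

```python
# Table 9.5 obs frequencies summed over Presentation Modality
# (aloud + silent); keys are list lengths, tuples run over serial
# positions with each list's last position already dropped.
short_words = {3: (507, 505), 4: (503, 491, 455), 5: (504, 479, 413, 321),
               6: (498, 471, 366, 224, 163), 7: (492, 451, 349, 159, 100, 83)}
long_words  = {3: (506, 486), 4: (506, 468, 353), 5: (497, 444, 295, 193),
               6: (486, 408, 244, 103, 48), 7: (469, 383, 193, 56, 22, 25)}

def avg_contrast(i):
    """Word Length x Serial Position contrast for position i, averaged
    over the list lengths j in which i precedes the penultimate
    reference position p = j - 1."""
    vals = []
    for j in range(max(i + 2, 3), 8):
        p = j - 1
        vals.append(short_words[j][i - 1] - long_words[j][i - 1]
                    - short_words[j][p - 1] + long_words[j][p - 1])
    return round(sum(vals) / len(vals))

print([avg_contrast(i) for i in range(1, 6)])
# [-76, -54, 32, 26, 20], the values reported in the text.
```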
Now consider Presentation Modality and List Length. For each
position, the cell with silent reading and list length 7 was the reference
for the interaction contrasts. Let mi,h,j be the observed frequency of
correct responses at serial position i, with presentation modality h (aloud
or silent), and list length j, summed over word length. For each serial
position the lowest observed frequency is at list length 7. Then a contrast
for serial position i and list length j is

mi,aloud,j − mi,silent,j − mi,aloud,7 + mi,silent,7.

Because Serial Position is irrelevant for this test, for a given list length j
the above contrasts were averaged over the serial positions occurring at
that list length, to form an interaction contrast for list length j. Resulting
interaction contrasts for list lengths 3, 4, 5, and 6, in that order, are

−11, 10, 30, 2.



These do not have the same sign. Further, they are not monotonic, which
implies that other such interaction contrasts do not have the same sign.
We conclude that if Presentation Modality and List Length selectively
influence different vertices on a path that starts at the root, the vertex for
List Length is not the later one.
There are two more pairs of factors to consider, Word Length with
List Length, and Presentation Modality with Serial Position. It turns out
that there are only minor violations of Condition (b), i.e., Inequality
(9.3), for these pairs. If these factors selectively influence different
vertices on a path, Condition (b) is not informative about the order in
which the vertices occur.

Modeling: Quantitative Test


For each combination of Word Length and Presentation Modality, the
model in Equation (9.1) was fit to the frequencies of correct and
incorrect responses. The probability I that the trace of an item is intact
was assumed to be selectively influenced by the serial position of the
item, and the probability R that a degraded item is redintegrated was
assumed to be selectively influenced by the list length. Parameters were
estimated using Excel solver to minimize G2. Predicted and observed
frequencies are in Table 9.5. Agreement between predicted and observed
values is good, best with short words read aloud and worst with long
words read silently.
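As an illustration of the quantitative fit, the Table 9.5 predictions for one condition can be reproduced from Equation (9.1) and the printed parameter estimates. The sketch below (ours) does this for short words read aloud at list length 7, assuming 256 trials per cell (32 subjects × 8 trials, per the text); small discrepancies reflect rounding of the printed parameters.

```python
# Equation (9.1): p = I_i + (1 - I_i) R_j; an item is recalled if its
# trace is intact or, failing that, is successfully redintegrated.
I_pos = (0.9780, 0.9001, 0.6868, 0.4026, 0.2632, 0.2130)  # positions 1-6
R7 = 0.0073                                               # list length 7
N = 256                                                   # 32 subjects x 8 trials

pred = [N * (i + (1 - i) * R7) for i in I_pos]
print([round(p, 2) for p in pred])
# Table 9.5 lists 250.40, 230.61, 176.40, 104.18, 68.75, 56.00.
```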
The goodness of fit for long words read silently suggests that attempting
a single model for all conditions together would not be fruitful. For long
words read silently, predicted and observed values are close, but by the
chi square distribution, G2 = 26.73 is significant at the .01 level
with 10 df. The significance level may be somewhat inaccurate because
multiple observations come from each subject. Nonetheless one would
not want a model with a worse fit in this condition. Parameter values I
and R are optimal for this combination of Word Length and Presentation
Modality, and intuitively a model using the same parameter values for I
and R for all combinations would not do better here, even with additional
parameters. Word Length and Presentation Modality have complex
effects, for example, Nairne, Neath and Serra (1997) found that word

length effects do not occur on the first few trials. It would be difficult to
produce a model in which these factors change parameters in a simple
way.
The generally good fit of the model supports the interpretation that
serial position selectively influences degradation, also supported by the
data of Tehan and Turcotte (1997). The generally good fit also supports
the interpretation that list length selectively influences redintegration.
From the parameter values, one can see that an increase in list length has
a more harmful effect on redintegration for long words than short words.
But as noted earlier, a model may be correct although interpretation of its
parameters is false. We now turn to experiments that lead to a
reinterpretation of the parameter interpreted as redintegration probability.

Effects of Serial Position and Word Frequency

Hulme, Maughan and Brown (1991) found better immediate serial recall
for words than nonwords. They explained this by saying subjects have a
representation of the features of a word in long term memory, and this
knowledge supports recall. No such representation is available for
nonwords. A nonword can be considered an extreme type of rare word,
suggesting that the long term representation of an infrequent word would
be less useful at recall than that of a frequent word. If so, redintegration
would be better for high than low frequency words, producing better
recall for high frequency words.
Lists of high frequency words and of low frequency words were
tested in immediate serial recall by Hulme, Roodenrys, Schweickert,
Brown, Martin and Stuart (1997). Recall was indeed better for high
frequency words. Typical serial position curves were found. Recall
decreased as serial position increased, but rose sharply at the last two
serial positions, a recency effect. Serial position and word frequency
interacted, with a smaller effect of word frequency at early serial
positions than in the middle. The combined effect of an increase in recall
due to serial position change and an increase due to word frequency was
smaller than the sum of their individual effects, a negative interaction.
Suppose as serial position increases, an item becomes more degraded,

perhaps through output interference, and when word frequency increases,


redintegration is better. With the model in Equation 9.1, for serial
position i and word frequency level j, correct response probability is

pij = Ii + (1 − Ii)Rj.

The model accounted well for the data, except for the recency effect in
the last two serial positions; see Hulme et al. (1997) for details.
Further support for the redintegration interpretation comes from an
experiment by Hulme, Stuart, Brown and Morin (2003), in which lists of
alternating words and nonwords were tested in immediate serial recall.
As predicted, words were always recalled better than nonwords, even
when alternating in the same list.
But another experiment of Hulme et al. (2003) challenges the
interpretation of Word Frequency having its effect through
redintegration. In this experiment, lists of alternating high and low
frequency words were tested in immediate serial recall. Also tested were
pure lists of all high frequency and all low frequency words. Suppose
the frequency of a word produced its effect by making redintegration of
that particular word better. Then with a list of alternating high and low
frequency words, recall should alternate between high and low, in
correspondence with the frequency. Instead, recall for a word in
alternating lists did not depend on its frequency. In an alternating list,
recall for a word in a particular serial position, say position 4, was about
the same whether the word was high or low frequency. Recall for words
in alternating lists fell between recall for pure high frequency and pure
low frequency lists. Hulme et al. (2003) explained their results by noting
that inter-item associations are higher for high frequency than low
frequency words. The availability in memory of words in a list depends
on the associations between all of them, i.e., on their combined inter-item
associations. It follows that recall for words in alternating lists is
between that of high and low frequency words.
With this interpretation, R in Equation (9.1) is the probability of
retrieving a word through its associations with other list words. If so, it

is worth noting that in the experiment of Hulme et al. (1997) overall


associations do not appear to change with serial position, that is, R does
not change with serial position. Recent reviews of the redintegration
concept are given by Roodenrys (2009) and Stuart and Hulme (2009).
At first it may seem there is a contradiction to resolve; parameter R
must either index redintegration or association, but not both. But R need
not stand for the same quantity in different paradigms; R is simply the
label for a process sequential with a process labeled by I. Suppose the
probability of an error is the product of three parameters, A(ST), but the
model is fit in the form AR. A factor in one experiment may change S,
leaving A and T invariant and another factor in another experiment may
change T, leaving A and S invariant. Analysis would simply show that
both factors change R. Further experiments would be needed to split the
process labeled R into process S followed by process T.
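This relabeling point can be made concrete with a toy calculation (ours, with made-up numbers): if the true error probability is the product A·S·T but the model is fit in the form A·R, a factor that changes S and a factor that changes T both surface as changes in the single fitted parameter R = S·T.

```python
# True structure: p(error) = A * S * T; fitted structure: p(error) = A * R.
# The fit identifies only the product R = S * T, not S and T separately.
def fitted_R(s, t):
    return s * t

baseline = fitted_R(0.8, 0.6)   # R = 0.48
factor_1 = fitted_R(0.4, 0.6)   # a factor in one experiment halves S ...
factor_2 = fitted_R(0.8, 0.3)   # ... a factor in another halves T
print(round(baseline, 2), round(factor_1, 2), round(factor_2, 2))
# Both manipulations move R from 0.48 to 0.24; further experiments
# would be needed to split R into S and T.
```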

Effects of Sleep and Retroactive Interference

Two recent studies show a benefit of nocturnal sleep on recall when new
interfering associations were learned before testing of old ones
(Ellenbogen, Hulbert, Stickgold, Dinges & Thompson-Schill, 2006;
Ellenbogen, Hulbert, Jiang & Stickgold, 2009). A tree model allows
testing between two possibilities. One is that sleep decreases the
probability interference occurs. Another is that interference occurs with
the same probability with or without sleep, but given that interference
does occur, sleep decreases the probability it leads to a recall error.
In both experiments all subjects learned a list of A-B word pairs.
Some learned in the morning (the Wake group), some in the evening (the
Sleep group). All were tested 12 hours after learning, so those who
studied in the evening had a night of sleep before testing. Immediately
before testing, half the subjects learned a new list of A-C associations.
Learning new associations leads to poorer recall of old ones, attributed to
retroactive interference. Recall of B items was indeed worse when new
associations to C items were learned, whether subjects studied in the
morning or evening. But the decrement was less for subjects having a
night of sleep.

For both experiments, the authors report a significant interaction for


the effects of interference and sleep. Our analysis shows these effects on
errors are almost exactly multiplicative; that is, the combined effect on
errors is predicted well as the product of the separate effects (see Tables
9.8, 9.9).
In terms of correct responses rather than errors, not having
interference and having sleep both increase correct responses. These
factors have a negative interaction; the combined effect of both is less

Table 9.8
Data of Ellenbogen et al. (2006)

Frequency of Correct Recall Frequency of Incorrect Recall

Retroactive Sleep Retroactive Sleep


Interference Yes No Interference Yes No
obs 226 197 obs 14 43
No No
pred 225.2 197.7 pred 14.8 42.3

obs 182 77 obs 58 163


Yes Yes
pred 182.7 76.7 pred 57.3 163.3
Note: Parameters in Equation (9.4): PYes = .7565, PNo = .3064, QYes = .7461, QNo
= .0191. G2 = .079, df = 1.

Table 9.9
Data of Ellenbogen et al. (2009)

Frequency of Correct Recall Frequency of Incorrect Recall

Retroactive Sleep Retroactive Sleep


Interference Yes No Interference Yes No
obs 364 292 obs 86 158
No No
pred 365.8 290.6 pred 84.2 159.4
obs 319 198 obs 131 252
Yes Yes
pred 317.4 199.0 pred 132.6 251.0
Note: Parameters in Equation (9.4): PYes = .6040, PNo = .2501, QYes = .5275, QNo =
.2561. G2 = .102, df = 1.

than the sum of their separate effects. In a simple multinomial processing


tree model for the task, at recall two processes are carried out one after
the other. The data do not determine the order of the processes, but the
following interpretation is reasonable. Cued recall begins with
presentation of an A word. With probability P a single association leads
directly to the correct B word. This probability is larger when there is no
interference. Otherwise, several associations are candidates, and with
probability Q an association that leads to the correct B word is selected.
This probability is larger when subjects have sleep; that is, sleep benefits
source discrimination. The probability of correct recall of a B word is
then

P + (1 − P)Q. (9.4)

The model separates effects of sleep and of interference. The effect of


interference is the same with or without sleep. Sleep benefits recovery
from interference. The model has the form of that in Equation (9.1), but
interpretation of the parameters is different. See the Appendix for details
about the experiment and fitting the model. Note that parameter values
are not unique.
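The fit in Table 9.8 can be verified directly. The sketch below (ours, not the authors' code) computes predicted frequencies from Equation (9.4) and the tabled parameter values, then the likelihood-ratio statistic G2. Because the parameter values are not unique, the assignment used here, P varying with sleep and Q with interference, is one of the equivalent assignments that reproduces the tabled predictions.

```python
from math import log

# Equation (9.4): p(correct) = P + (1 - P)Q, with 240 trials per cell.
P = {"sleep": 0.7565, "wake": 0.3064}                     # Table 9.8 values
Q = {"no_interference": 0.7461, "interference": 0.0191}
N = 240

obs = {  # (correct, incorrect) frequencies from Table 9.8
    ("sleep", "no_interference"): (226, 14),
    ("wake",  "no_interference"): (197, 43),
    ("sleep", "interference"):    (182, 58),
    ("wake",  "interference"):    (77, 163),
}

G2 = 0.0
for (s, i), (c, e) in obs.items():
    p = P[s] + (1 - P[s]) * Q[i]
    G2 += 2 * (c * log(c / (N * p)) + e * log(e / (N * (1 - p))))
print(round(G2, 3))   # 0.079, the value reported in Table 9.8
```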

Tree Inference

We have seen examples of tasks in which processing trees fit data well.
But if the brain is complex, why are the trees simple? The answer is that
if a few factors are found that selectively influence processes in a tree,
the tree will be equivalent to a simple tree. An investigator often begins
with a processing tree based on intuition and tests it through goodness of
fit. But one can start with data from a factorial experiment and test
whether any processing tree exists that can account for the data, under
the assumption that the factors selectively influence processes in the tree.
If any such tree exists, the data impose a simple equivalent tree. The
approach is called Tree Inference (Schweickert & Chen, 2008;
Schweickert & Xi, 2011).
Consider an experiment with two response classes, say, correct and

wrong. Suppose responses are produced through a processing tree. A


process is represented by a vertex in the tree, so we say an experimental
factor selectively influences a process if it changes probabilities
associated with the children of a single vertex. Note that because the
sum of the probabilities associated with the children of a vertex sum to 1,
if changing the level of a factor increases the probability associated with
one child, it must decrease the probability associated with at least one
other child. (Later we will enlarge the notion of selective influence to
allow a factor to change probabilities at more than one vertex.) Two
processing trees are equivalent for a set of experimental factors if for
every combination of levels of the factors the trees predict the same
probability for every response class. Suppose two factors selectively
influence processes represented by two different vertices in a processing
tree t. No matter what the tree t may be, it is equivalent to one of the two
standard trees in Figure 9.2 (Schweickert & Chen, 2008).
There are exactly two standard trees because there are exactly two
ways the two selectively influenced vertices can be arranged in a tree.
Two vertices are ordered if there is a directed path from the root of the
tree to a terminal vertex that goes through both vertices. Two vertices
are unordered if they are not ordered. Two processes represented by two
vertices are ordered or unordered as the vertices are ordered or
unordered.
In the standard tree for two unordered processes, the probability of a
correct response when Factor Α is at level i and Factor Β is at level j is

pij = αxi + (1 − α)yj. (9.5)

This case is illustrated in the left panel of Figure 9.2. One sees
immediately that if this model is true, Factors Α and Β have additive
effects on the probability of a correct response. This turns out to be
necessary and sufficient for a standard tree for unordered processes to
predict the probability of a correct response. In other words, if two factors
have additive effects on the probability of a correct response, parameter
values in the standard tree for unordered processes can be found that
predict the probability of a correct response (and thereby, of course, the
332 Discovering Cognitive Architecture by Selectively Influencing Mental Processes

Fig. 9.2. Left: standard tree for unordered processes. Right: standard tree for ordered
processes.

probability of a wrong response). It may be that the subject actually uses
a processing tree different from the standard tree for two unordered
processes. No matter, if Factors Α and Β selectively influence two
unordered vertices in that processing tree, it is equivalent to the standard
tree. It may be that every subject uses a different processing tree. Again,
no matter. If Factors Α and Β selectively influence two unordered
vertices in each processing tree in the mixture, the mixture is equivalent
to the standard tree.
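The additivity predicted by Equation (9.5) can be checked numerically. A minimal sketch, with hypothetical parameter values chosen only to show that every 2 × 2 interaction contrast vanishes:

```python
alpha = 0.6               # branch probability at the root (hypothetical)
x = [0.9, 0.7, 0.5]       # x_i for levels i of Factor A
y = [0.8, 0.4]            # y_j for levels j of Factor B

def p_correct(i, j):
    # Equation (9.5): p_ij = alpha * x_i + (1 - alpha) * y_j
    return alpha * x[i] + (1 - alpha) * y[j]

# Every 2 x 2 interaction contrast is zero: the factors are additive.
contrast = (p_correct(0, 0) - p_correct(0, 1)
            - p_correct(1, 0) + p_correct(1, 1))
```

Any choice of levels gives the same zero contrast, since the i and j terms enter the sum separately.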
In the standard tree for two ordered processes, the probability of a
correct response when Factor Α is at level i and Factor Β is at level j is

pij = wi + xi yj. (9.6)

This case is illustrated in the right panel of Figure 9.2. The tree of
Gathercole, Frankish, Pickering and Peaker (1999) has this form for
incorrect responses. Note that parameter values in Equations (9.5) and
(9.6) are not unique, see Schweickert and Chen (2008).
Necessary and sufficient conditions for two factors to selectively
influence different processes in the standard tree for two ordered
processes are the following (Schweickert & Chen, 2008, Theorem 11).
Let pij be the probability of a correct response when Factor Α is at level i
and Factor Β is at level j. Matrix (pij) is produced by Factor Α and Factor
Β selectively influencing two vertices in the standard tree for ordered
processes, with the vertex indexed by i preceding the vertex indexed by j,
iff
Selective Influence with Accuracy, Rate, and Physiological Measures 333

1. The columns of (pij) can be numbered so j > j' implies for every i
that pij > pij'.
2. There exist levels i* and j* such that for every i there is a number
ri ≥ 0 with the property that for every j

pij − pij* = ri(pi*j − pi*j*).

We make two technical assumptions. First, in the equation pij = wi + xi yj there are at
least two levels i and i' of Factor Α with xi not equal to xi'. Otherwise, the
factors would be additive. Second, if in matrix (pij) two rows are equal,
one row is removed, and the same is done if two columns are equal.
The conditions treat differently levels i of Factor Α and levels j of
Factor Β. If the conditions hold as stated above, but do not hold when i
and j are interchanged, then process order is revealed: The process
selectively influenced by Factor Α precedes the process selectively
influenced by Factor Β. If the conditions hold when i and j are
interchanged, then two tree models account for the data, with different
orders of the processes.
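A small numerical sketch of these two conditions, with made-up parameter values rather than data: a pij matrix is generated from Equation (9.6) and both conditions are then verified.

```python
w = [0.10, 0.30]          # w_i (hypothetical)
x = [0.80, 0.40]          # x_i
y = [0.20, 0.50, 0.90]    # y_j, columns numbered so y_j increases with j

# p_ij from Equation (9.6)
p = [[w[i] + x[i] * y[j] for j in range(3)] for i in range(2)]

# Condition 1: j > j' implies p_ij > p_ij' for every i.
cond1 = all(p[i][j] > p[i][j - 1] for i in range(2) for j in range(1, 3))

# Condition 2: with reference levels i* = 0, j* = 0,
# p_ij - p_ij* = r_i (p_i*j - p_i*j*), here with r_i = x_i / x_i*.
r = [xi / x[0] for xi in x]
cond2 = all(abs((p[i][j] - p[i][0]) - r[i] * (p[0][j] - p[0][0])) < 1e-9
            for i in range(2) for j in range(3))
```

With real data the proportionality in Condition 2 would of course hold only approximately and call for a statistical test.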
The earlier qualitative tests of Condition (a) and Condition (b) in
Equations (9.2) and (9.3) follow immediately from the conditions above.
Condition 1 above is the same as Condition (a). The reason the levels of
Factor Α need not be ordered is that as level i changes wi might go up
while xi goes down. Consequently, the rows of (pij) need be in no special
order.
The qualitative test in Condition (b) follows immediately from
Condition 2 above. To see this, suppose levels of Factor Β are ordered as
in Condition (a). Suppose j > j'. For a given i consider the interaction
contrast

pij − pij' − pi'j + pi'j'
  = (pij − pij*) − (pij' − pij*) − (pi'j − pi'j*) + (pi'j' − pi'j*)
  = ri(pi*j − pi*j*) − ri(pi*j' − pi*j*) − ri'(pi*j − pi*j*) + ri'(pi*j' − pi*j*)
  = ri(pi*j − pi*j') − ri'(pi*j − pi*j') = (ri − ri')(pi*j − pi*j').

For any pair of levels for which j > j', (pi*j − pi*j') > 0. If ri − ri' < 0, all
interaction contrasts of the form above will be negative, and for fixed j'
decreasing in j. If ri − ri' = 0, all will be 0. If ri − ri' > 0, all will be
positive, and for fixed j' increasing in j. This establishes Condition (b).
If the qualitative tests in Conditions (a) and (b) are satisfied, it is
worth estimating parameters and fitting the model. Parameter values are
not unique. Suppose the probability of a correct response at factor levels
i and j is given by Equation (9.6), for probabilities wi, xi, and yj, that is

pij = wi + xi yj.

An equation of the same form holds for other parameters, w*i, x*i, and
y*j, that is,

pij = w*i + x*i y*j,

if and only if there are constants c and d such that

x*i = cxi,

w*i = wi − cdxi,

and

y*j = yj/c + d.

The scaling constants c and d must satisfy certain inequalities to assure


that the new parameters are between 0 and 1, see Schweickert and Chen
(2008) for details.
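The admissible transformation above can be verified numerically; c and d below are hypothetical values chosen so the new parameters stay in [0, 1].

```python
w = [0.10, 0.30]          # hypothetical original parameters
x = [0.80, 0.40]
y = [0.20, 0.50]
c, d = 0.8, 0.1           # scaling constants

# x*_i = c x_i,  w*_i = w_i - c d x_i,  y*_j = y_j / c + d
x_new = [c * xi for xi in x]
w_new = [wi - c * d * xi for wi, xi in zip(w, x)]
y_new = [yj / c + d for yj in y]

# The transformed parameters predict the same p_ij = w_i + x_i y_j.
same = all(abs((w[i] + x[i] * y[j]) - (w_new[i] + x_new[i] * y_new[j])) < 1e-9
           for i in range(2) for j in range(2))
in_range = all(0.0 <= v <= 1.0 for v in x_new + w_new + y_new)
```

Algebraically, w*_i + x*_i y*_j = w_i − cdx_i + cx_i(y_j/c + d) = w_i + x_i y_j, so the cancellation holds for any c and d; only the range constraints restrict them.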
It may be that the subject uses a processing tree different from the
standard tree for two ordered processes. Provided that Factor Α
selectively influences a vertex followed on a path by a vertex selectively
influenced by Factor Β, the tree the subject is using is equivalent to the
standard tree for two ordered processes.
Further, suppose every subject uses a processing tree equivalent to the

standard tree for ordered processes, with the vertex selectively influenced
by Factor Α preceding that selectively influenced by Factor Β. Suppose
each subject’s parameter values are different. The mixture of trees may
be equivalent to the standard tree for two ordered processes. The key
requirement is that for all trees in the mixture, the probabilities of
reaching the vertex selectively influenced by Factor Α are proportional,
the proportion not changing when the level i of Factor Α changes.
To see this, suppose for tree t when Factor Α is at level i and Factor Β
is at level j the probability of a correct response is

pijt = wit + xityjt.

Suppose the probability tree t is used is πt. Suppose further there is a
common value of xi such that in every tree t, xit = λt xi for some
constant λt. Then averaged over the mixture of trees, the probability of a
correct response is produced by the standard tree for ordered processes;
that is, Equation (9.6) applies with

pij = w̄i + xi ȳj,

where w̄i = Σt πt wit and ȳj = Σt πt λt yjt.

This is so because averaged over the mixture of trees the probability of a
correct response is

pij = Σt πt pijt = Σt πt wit + Σt πt xit yjt = Σt πt wit + xi Σt πt λt yjt.

One implication is about combining data over subjects. Suppose each
subject uses a different tree t, but each is equivalent to the standard tree
for ordered processes and the parameters xit are proportional. Then
subjects can be combined because the mixture is equivalent to the
standard tree for ordered processes. Another implication is about
combining data over levels of irrelevant factors. Suppose an experiment
has factors other than Α and Β, and for each combination of levels of the
other factors, the subject performs the task using a processing tree t
equivalent to the standard tree for ordered processes and the parameters

xit are proportional. Then the mixture is equivalent to the standard tree
for ordered processes. If combining data over subjects or over levels of
irrelevant factors does not lead to one of the standard trees, one would
consider, of course, trees for individual subjects or combinations of
factor levels.
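The mixture argument can be sketched with two hypothetical trees whose xit parameters are proportional to a common xi (xit = lam_t · xi); all values below are invented for illustration.

```python
pi_t = [0.3, 0.7]                   # probability each tree is used
lam = [0.5, 1.0]                    # proportionality constants
x = [0.6, 0.2]                      # common x_i
w_t = [[0.10, 0.20], [0.30, 0.10]]  # w_it, indexed [t][i]
y_t = [[0.40, 0.80], [0.20, 0.90]]  # y_jt, indexed [t][j]

def p_mix(i, j):
    # mixture probability, with x_it = lam[t] * x[i]
    return sum(pi_t[t] * (w_t[t][i] + lam[t] * x[i] * y_t[t][j])
               for t in range(2))

# Equivalent standard-tree parameters for the mixture.
w_bar = [sum(pi_t[t] * w_t[t][i] for t in range(2)) for i in range(2)]
y_bar = [sum(pi_t[t] * lam[t] * y_t[t][j] for t in range(2)) for j in range(2)]

same = all(abs(p_mix(i, j) - (w_bar[i] + x[i] * y_bar[j])) < 1e-9
           for i in range(2) for j in range(2))
```

The mixture thus has exactly the form of Equation (9.6), which is why data may be combined over subjects or over levels of irrelevant factors when the proportionality holds.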
An important point is that if Factors Α and Β are not additive and
Conditions 1 and 2 are not satisfied, then the factors do not selectively
influence two different vertices in any processing tree. The standard tree
for unordered processes and the standard tree for two ordered processes
are ruled out, but so are all others. For more details, see Schweickert and
Chen (2008).

Generalization to Rates, More Response Classes and More Influenced Vertices

So far we have assumed a tree with three restrictions, (a) parameters are
probabilities, (b) there are two response classes, and (c) Factors Α and Β
each selectively influence a single vertex. It would be useful to
overcome these restrictions. Instead of probabilities, Roberts (1987)
used rates. Two response classes are not enough for tasks such as source
monitoring. Finally, in some tree models the same parameter appears in
several places; an example is the probability of guessing the correct
response.
Removing the first two restrictions is straightforward, but analysis is
complicated if factors change parameters at multiple vertices. Results
have been derived for trees in which factors are either additive or interact
multiplicatively (Schweickert & Xi, 2011). Beyond this, interactions in
the form of sums of products can sometimes be treated with the matrix
factorization model (Bamber & van Santen, 1978, 1980; Ollman, 1990)
described later.
Suppose a tree has a single root and each terminal vertex is labeled
with one of K classes of responses. Suppose the parameters on the arcs
are nonnegative numbers, not necessarily probabilities. Suppose when
the level i of Factor Α changes, parameters on some set of arcs change,
and when the level j of Factor Β changes parameters on a different set of

arcs change. Suppose these sets are mutually exclusive. There may be
some arcs whose parameters do not change when i or j change. If the
parameter on an arc changes when the level i of Factor Α changes, we
say the arc is indexed by i. An arc indexed by j is defined similarly.
When Factor Α is at level i and Factor Β is at level j, let pij.k denote the
value of the dependent variable for class k. The dependent variable is a
nonnegative number such as the probability of a response in class k or
the rate at which responses of class k are made. It may be larger than 1.
The situation is straightforward if no path from the root to a terminal
vertex contains both an arc indexed by i and an arc indexed by j. Factors
Α and Β will have additive effects, and the tree is equivalent to the
standard tree for unordered processes, enlarged to allow more than two
response classes. There is a parameter α, and for every response class k
there are parameters wi.k and zj.k such that

pij.k = αwi.k + (1 − α)zj.k,

with Σk wi.k = Σk zj.k = 1 (sums over k = 1,..., K) if the dependent
variable is probability.

The standard tree for two multiplicatively interacting factors is in
Figure 9.3. Each response falls into exactly one response class k, k =
1,..., K. When Factor Α is at level i and Factor Β is at level j, the
dependent variable for class k is

pij.k = (1 − b)wi.k + (1 − b)xiyj.k + bzj.k,

with Σk wi.k = Σk yj.k = Σk zj.k = 1 (sums over k = 1,..., K) if the dependent variable is

probability. In this tree, Factor Α is allowed to have effects at more than
one vertex, and the same is true for Factor Β. Parameters in the above
two trees are not unique, see Schweickert and Xi (2011) for admissible
transformations.
Suppose the subject performs the task using an arbitrary tree.
Suppose when the level i of Factor Α changes, parameters on some set of
arcs change, and when the level j of Factor Β changes parameters on
another set of arcs change, and these sets are mutually exclusive. Finally,

Fig. 9.3. Standard tree for multiplicatively interacting factors.

suppose there is a vertex v such that a path from the root to v contains
arcs indexed by i and no arcs indexed by j, and one or more paths from v
to terminal vertices contain arcs indexed by j and no arcs indexed by i.
Suppose no other paths from the root to a terminal vertex contain both
arcs indexed by i and arcs indexed by j. Then the tree is equivalent to the
standard tree for multiplicatively interacting factors.
Under certain conditions, a tree will be equivalent to the standard tree
for multiplicatively interacting factors even if it has several paths from
the root to a terminal vertex containing arcs indexed by i and also arcs
indexed by j. Further, if every tree in a mixture of trees satisfies certain
conditions, the mixture of trees will be equivalent to the standard tree for
multiplicatively interacting factors. See Schweickert and Xi (2011) for
details.
The following are necessary and sufficient conditions for the standard
tree for multiplicatively interacting factors (Schweickert and Xi, 2011).
The key condition is that interaction contrasts can be written as a product
in which one multiplier depends only on i and the other multiplier
depends only on j. As before, interaction contrasts are calculated with
respect to reference levels. For Factor Β the reference level for one
response class may not be the same as that for another response class.
To emphasize this, when a level j of Factor Β is used as a reference level
for class k the level is written j(k). An expression such as pij*(k).k denotes
the value of the dependent variable for class k, when Factor Α is at level i

and Factor Β is at level j*(k).

1. There exists a level i* of Factor Α and for every k' there exists a
level j*(k') of Factor Β such that for every i, j and k there exist ri,
0 < ri < 1, and sj.k, 0 < sj.k < 1, such that

pij.k − pij*(k).k − pi*j.k + pi*j*(k).k = ri sj.k.

2. For two levels i and i', ri ≠ ri' and for some k, for two levels j and
j', sj.k ≠ sj'.k.


3. For every j, Σk sj.k = a, where a is a constant, 0 < a < 1, with the
sum over k = 1,..., K.

Condition 2 is needed, otherwise the factors would be additive. If
parameters are not bounded above by 1, as probabilities are, then
Condition 3 is not needed nor is it required that ri and sj.k be bounded
above by 1.
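With hypothetical parameters, one can verify that in the standard tree for multiplicatively interacting factors every interaction contrast factors into an i-dependent term times a (j, k)-dependent term, as Condition 1 requires.

```python
b = 0.2                                    # hypothetical parameters
x = [0.3, 0.6, 0.9]                        # x_i
w = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]   # w_i.k
y = [[0.4, 0.6], [0.8, 0.2]]               # y_j.k
z = [[0.5, 0.5], [0.1, 0.9]]               # z_j.k

def p(i, j, k):
    # standard tree for multiplicatively interacting factors
    return (1 - b) * w[i][k] + (1 - b) * x[i] * y[j][k] + b * z[j][k]

i_star, j_star = 0, 0                      # reference levels

def contrast(i, j, k):
    return (p(i, j, k) - p(i, j_star, k)
            - p(i_star, j, k) + p(i_star, j_star, k))

# Each contrast equals (x_i - x_i*) times (1 - b)(y_j.k - y_j*.k):
# one multiplier depends only on i, the other only on j and k.
factors = all(
    abs(contrast(i, j, k)
        - (x[i] - x[i_star]) * (1 - b) * (y[j][k] - y[j_star][k])) < 1e-9
    for i in range(3) for j in range(2) for k in range(2)
)
```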
Note that it may not be possible to order the levels of Factor Α in such
a way that i' < i implies for all j and k, pi'j.k < pij.k. The reason is that as i
increases wi and xi need not both increase nor both decrease. Likewise,
the analogous ordering of the levels of Factor Β may not be possible.
Nonetheless, Condition 1 leads to a qualitative condition for
interaction contrasts. Choose an arbitrary level i' of i, choose any class k
and for this class choose an arbitrary level of j, which we will denote as
j'(k). Using these as reference levels, calculate interaction contrasts

pij.k − pij'(k).k − pi'j.k + pi'j'(k).k
  = (pij.k − pij*(k).k − pi*j.k + pi*j*(k).k) − (pij'(k).k − pij*(k).k − pi*j'(k).k + pi*j*(k).k)
    − (pi'j.k − pi'j*(k).k − pi*j.k + pi*j*(k).k) + (pi'j'(k).k − pi'j*(k).k − pi*j'(k).k + pi*j*(k).k)
  = ri sj.k − ri sj'(k).k − ri' sj.k + ri' sj'(k).k
  = (ri − ri')(sj.k − sj'(k).k).

Levels i and j.k can be ordered so the interaction contrasts fall into four
quadrants, positive when ri < ri' and sj.k < sj'.k, negative when ri < ri' and

sj.k > sj'.k, and so on. To be more specific, choose a level i of Factor Α. In
the matrix of interaction contrasts, (pij.k − pij'(k).k − pi'j.k + pi'j'(k).k), order the
columns so interaction contrasts in row i are monotonically increasing.
Then the interaction contrasts in all other rows should be monotonically
increasing or monotonically decreasing. Likewise, choose a level j of
Factor Β. For class k call this level j.k. Now in the matrix of interaction
contrasts order the rows so the interaction contrasts in column j.k are
monotonically increasing. Then the interaction contrasts in all other
columns should be monotonically increasing or decreasing. When an
ordering of levels i is found for some response class k it should be
possible to use that same ordering for all classes. The analogous
statement need not be true for ordering the levels j of Factor Β.
A plan for finding a tree with Tree Inference is the following. For a
given response class consider two factors. Are both ordered? If so, and
they are additive, fit the standard tree for unordered processes. If both
factors are ordered but they are not additive, are they multiplicative? If
so, consider a simple tree in which the probability of the response class is
pq. Is one factor ordered, but not the other? Check Conditions (a) and
(b), in Equations (9.2) and (9.3). If these are satisfied, try fitting the
standard tree for ordered processes. Is neither factor ordered? Choose a
reference level of Factor Α and a reference level of Factor Β. Calculate
interaction contrasts with respect to these levels. Do they form a 2 × 2
checkerboard, with positive interaction contrasts in upper right and lower
left, negative in lower right and upper left? If so, try fitting the standard
tree for multiplicatively interacting factors.
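The plan above can be sketched as a rough classification routine for a single response class; the reference levels, tolerance, and matrices below are illustrative only, not a substitute for statistical model fitting.

```python
def classify(p, tol=1e-9):
    # Classify a p_ij matrix: additive, proportional (ordered tree),
    # or neither.  Reference levels i* = 0 and j* = 0 are arbitrary.
    I, J = len(p), len(p[0])
    contrasts = [p[i][j] - p[i][0] - p[0][j] + p[0][0]
                 for i in range(1, I) for j in range(1, J)]
    if all(abs(c) < tol for c in contrasts):
        return "additive: try standard tree for unordered processes"
    for i in range(I):
        base = [p[0][j] - p[0][0] for j in range(J)]
        diffs = [p[i][j] - p[i][0] for j in range(J)]
        ratios = [d / b for d, b in zip(diffs, base) if abs(b) > tol]
        if ratios and max(ratios) - min(ratios) > tol:
            return "check the multiplicative checkerboard pattern"
    return "proportional: try standard tree for ordered processes"

additive_p = [[0.2, 0.3, 0.4], [0.5, 0.6, 0.7]]       # p_ij = a_i + b_j
ordered_p = [[0.26, 0.50, 0.82], [0.38, 0.50, 0.66]]  # p_ij = w_i + x_i y_j
```

With noisy data the exact-zero and exact-proportionality checks would be replaced by significance tests on the interaction contrasts.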

Contingency Matrices

Consider a categorization experiment with a beginning biology student
as subject. The student is presented with a picture of an animal. Then
the student picks the name of the genus of the animal from a list. Then
another animal picture is presented, and so on.
Suppose there are S stimuli and R responses. Results can be put in an
SR matrix, with row g for stimulus sg. The entry in column k of this row
is the probability of response rk when stimulus sg was presented. Such a

matrix is a contingency or confusion matrix.


A matrix factorization model for contingency tables was developed
by Bamber and van Santen (1978, 1980) and later, independently, by
Ollman (1990). In the model, a stimulus produces a mental state and
then the mental state leads to a response. Suppose the subject has M
mental states. In the experiment just described, these correspond to
concepts of the genera with which the student is familiar. When stimulus
sg is presented, a process A produces as output a mental state mh with
probability pgh. The same mental state is not always produced by a
particular animal picture. When the subject is in mental state mh a
process B produces response rk with probability phk. The same response
is not always produced by a particular mental state. Then the probability
the subject makes response rk to stimulus sg is

pgk = Σh pgh phk, with the sum over h = 1,..., M.

The response is assumed conditionally independent of the stimulus given
the mental state; that is, P(response = rk | mental state = mh & stimulus =
sg) = P(response = rk | mental state = mh). Each process can be
considered as having its own contingency matrix. Let the matrix for
process A be

A = (pgh).

The entry in row g column h is the probability stimulus sg produces
mental state mh. For process B, let the matrix be

B = (phk).

The entry in row h column k is the probability mental state mh produces
response rk. Denote the stimulus-response contingency matrix as

C = (pgk).

Then
C = AB.

Each of A, B, and C is a probability matrix; that is, the entries are
nonnegative numbers and the sum of the entries in each row is 1.
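A numerical sketch of the model with hypothetical matrices (S = 2 stimuli, M = 2 mental states, R = 2 responses):

```python
import numpy as np

A = np.array([[0.7, 0.3],    # stimulus -> mental state probabilities
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],    # mental state -> response probabilities
              [0.4, 0.6]])
C = A @ B                    # stimulus-response contingency matrix

# C inherits the probability-matrix property: each row sums to 1.
rows_sum_to_one = bool(np.allclose(C.sum(axis=1), 1.0))
```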
Now suppose each of two experimental factors selectively influences
a different process. Factor Α changes the probabilities with which
pictures of animals lead to mental states. In the biology experiment, for
example, the animal pictures might be photographs at one level of Factor
Α and drawings at another. Factor Β changes the probabilities with
which mental states lead to responses. In the biology experiment, the list
of genera for responses might be alphabetized at one level of Factor Β
and randomized at another.
Let Factor Α have levels i = 1,..., I. At level i let the contingency
matrix for process A be Ai. Let Factor Β have levels j = 1,..., J, with
contingency matrix Bj for process B at level j. Then when Factor Α is at
level i and Factor Β is at level j the stimulus-response contingency matrix
is Cij = AiBj.
For simplicity, suppose there are two levels for Factor Α and two for
Factor Β. Consider the block matrix

C* = [C11 C12; C21 C22] = [A1B1 A1B2; A2B1 A2B2]
   = [A1; A2][B1 B2] = A*B*, (9.7)

where

A* = [A1; A2] and B* = [B1 B2],

and a semicolon separates the rows of a block matrix.

Equation (9.7) uses block multiplication. A block matrix is a matrix
whose cell entries, blocks, are themselves matrices. Multiplication of one
block matrix by another is done with the usual matrix multiplication
procedure applied to the blocks. When the symbols for blocks are
replaced with the matrices they stand for, and brackets within brackets
are removed, the result is the usual matrix multiplication.
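The block multiplication in Equation (9.7) can be checked directly with hypothetical Ai and Bj:

```python
import numpy as np

A1 = np.array([[0.7, 0.3], [0.2, 0.8]])   # hypothetical A_i
A2 = np.array([[0.5, 0.5], [0.1, 0.9]])
B1 = np.array([[0.9, 0.1], [0.4, 0.6]])   # hypothetical B_j
B2 = np.array([[0.6, 0.4], [0.3, 0.7]])

C_star = np.block([[A1 @ B1, A1 @ B2],    # the four C_ij blocks
                   [A2 @ B1, A2 @ B2]])

A_star = np.vstack([A1, A2])              # left factor of Equation (9.7)
B_star = np.hstack([B1, B2])              # right factor

factorizes = bool(np.allclose(C_star, A_star @ B_star))
```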

A direct test of the model is to factor matrix C* as indicated in
Equation (9.7). Each factor must be a matrix with nonnegative cell
entries; see Lee and Seung (2001) for algorithms. A further requirement
is that the left factor A* is itself a probability matrix so its entries in each
row add to 1. Such a factorization is called a canonical factorization,
and need not be possible.
If a canonical factorization exists, it can provide important
information about the order in which the processes are executed. Matrix
multiplication is not commutative, so the order of the multiplicands
matters in Equation (9.7). Suppose the blocks that form the matrix C*
are assembled differently, to form the matrix

C** = [C11 C21; C12 C22].

This matrix will not ordinarily have a canonical factorization. Although
the block arrangement in C* is transposed to obtain C**, the matrix in
each block is not transposed. Hence, C** is not the transpose of C*. If a
canonical factorization can be found for C* but not for C** the order of
the processes selectively influenced by Factors Α and Β is established.
An indirect test of the model in Equation (9.7) is based on the rank of
matrix C*. Consider a particular column of a matrix X, say column xk.
Column xk is linearly independent of a subset of the other columns, say,
{xm,..., xn}, if there do not exist numbers cm,..., cn, not all 0, such that

xk = cmxm + ... + cnxn.

The rank of a matrix is the largest number of columns that are linearly
independent of all other columns. If X is a matrix with r rows and c
columns, then rank(X) ≤ min{r, c}. For any level i of Factor Α, matrix
Ai has one row for each different stimulus and one column for each
mental state. Hence, rank(Ai) ≤ min{S, M}. (The entries in each row
sum to 1, but this does not prevent the columns from being linearly
independent.) Likewise, for any j, matrix Bj has one row for each mental

state and one column for each different response. Hence, rank(Bj) ≤
min{M, R}.
Going further, matrix

A* = [A1; A2]

has 2S rows and M columns. Its rank is less than or equal to min{2S, M}.
With I levels of Factor A the rank of this left factor is less than or equal
to min{IS, M}. Matrix

B* = [B1 B2]

has M rows and 2R columns. Its rank is more complicated. Each of B1
and B2 is a probability matrix, so in each the row entries sum to 1. The
last column of B2 equals 1 minus the sum of its other columns. But 1
equals the sum of all the columns of B1. Hence the last column of B2 is a
linear combination of the other columns of B2 and the columns of B1.
Hence, the rank of B* is less than or equal to min{M, 2R−1}. With J levels
of Factor Β the rank of B* is less than or equal to min{M, J(R−1) + 1}.
Finally, the rank of a product of two matrices is less than or equal to
the minimum rank of the multipliers. Hence, rank(C*) ≤ min{IS, M,
J(R−1)+1}. The upshot is that if M is considerably less than I times the
number of stimuli and J times the number of responses, the rank of
matrix C* will be M, considerably less than the number of its rows or
columns. Then C* may have a canonical factorization and behavior may
be described by the model in Equation (9.7).
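The rank bound can be illustrated numerically; the probability matrices below are randomly generated, not data, and the dimensions are chosen so the bound bites (M much smaller than IS and J(R−1)+1).

```python
import numpy as np

rng = np.random.default_rng(0)

def prob_matrix(rows, cols):
    m = rng.random((rows, cols))
    return m / m.sum(axis=1, keepdims=True)   # each row sums to 1

# I = J = 2 factor levels, S = R = 4, M = 2 mental states.
A = [prob_matrix(4, 2) for _ in range(2)]     # A_i
B = [prob_matrix(2, 4) for _ in range(2)]     # B_j

C_star = np.block([[A[0] @ B[0], A[0] @ B[1]],
                   [A[1] @ B[0], A[1] @ B[1]]])

# C* is 8 x 8, yet rank(C*) <= min{IS, M, J(R-1)+1} = min{8, 2, 7} = 2.
rank = int(np.linalg.matrix_rank(C_star))
```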
Another indirect test was proposed independently by Ollman (1990).
The determinant of the product of two matrices is the product of the
determinants of the multipliers. That is, for every level i of Factor Α and
every level j of Factor Β

det(Cij) = det(Ai) det(Bj).



Hence, if the model is true, Factors Α and Β will have multiplicative
effects on the determinant of the contingency matrix.
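The determinant test rests on det(AB) = det(A)det(B); a minimal numerical check with hypothetical matrices:

```python
import numpy as np

A1 = np.array([[0.7, 0.3], [0.2, 0.8]])   # hypothetical A_i
B1 = np.array([[0.9, 0.1], [0.4, 0.6]])   # hypothetical B_j
C11 = A1 @ B1

# det(C_ij) = det(A_i) det(B_j): multiplicative effects of the factors.
multiplicative = bool(np.isclose(np.linalg.det(C11),
                                 np.linalg.det(A1) * np.linalg.det(B1)))
```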
A statistical problem is that with noisy data, a contingency matrix
will have random error in every cell. An exact canonical factorization
may be impossible, rank may be large, and determinants will have error.
One way to proceed is to consider several canonical factorizations that
approximate matrix C* as the product of matrices A* and B* for various
values of m, the proposed number of columns of A* and number of rows
of B*. The model is supported if factorizations of C* for values of m
less than IS and J(R−1)+1 give good approximations to C*. Two
additional results would add support. First, by analogy with a scree plot
in factor analysis, there may be a value of m that gives a good
approximation to C* with still higher values making negligible
improvements. Second, for each value of m, factorization of matrix C**
may have worse goodness of fit than factorization of matrix C*.

Physiological Measures

Evoked Potentials and The Additive Amplitude Method

Electrical potential (voltage) at any point in space is the sum of potentials
at that point due to all sources. Potential at the point due to a source
depends on the value of the potential at the source’s location as well as
factors such as the distance from the source to the point, the medium
filling the space between the source and the point and so on. Suppose
while a subject performs a task a group of neurons generates a potential
whose value at time t is A(t). Suppose another group of neurons
generates a potential with value B(t) at time t. An electrode placed on
the scalp at a certain point will be some distance away from the groups of
neurons, which are sources, and will register a potential which is a
weighted sum of A(t), B(t), and contributions from other sources (see,
e.g., Mochs, 1988). That is, at time t the potential at the electrode is

v(t) = b1A(t) + b2B(t) + C(t), (9.8)



where C(t) is the total contribution from sources other than the two
groups of neurons.
Suppose manipulating one experimental factor changes activity of the
neurons producing signal A(t), changing that signal, but leaving B(t) and
C(t) invariant. Suppose manipulating another experimental factor
changes activity of the neurons producing signal B(t), changing that
signal but leaving A(t) and C(t) invariant. Finally, suppose manipulating
neither factor changes the coefficients b1 and b2. Then the combined
effect of manipulating the two factors will be the sum of their individual
effects. Because potentials are additive, the factors are additive. Factors
will be additive at every electrode location satisfying these assumptions;
each location will have its own coefficients b1 and b2, and its own
residual signal C(t). Note that if the factors both have effects at the same
time t the neurons that the factors are influencing are active at the same
time; that is, the factors are influencing simultaneous processes, not
sequential processes.
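The superposition argument can be sketched numerically. The signals, coefficients, and factor effects below are invented for illustration; the point is only that the interaction contrast of the two factors on v(t) is zero at every time t.

```python
import math

b1, b2 = 0.5, 0.3           # hypothetical mixing coefficients at one electrode

def A(t, level):            # Factor 1 changes only the A(t) source
    return (1.0 + 0.2 * level) * math.sin(t)

def B(t, level):            # Factor 2 changes only the B(t) source
    return (1.0 + 0.1 * level) * math.cos(t)

def C(t):                   # contribution of all other sources, unaffected
    return 0.4 * math.sin(2.0 * t)

def v(t, lev1, lev2):       # Equation (9.8)
    return b1 * A(t, lev1) + b2 * B(t, lev2) + C(t)

# Interaction contrast of the two factors at an arbitrary time point.
contrast = v(0.7, 1, 1) - v(0.7, 1, 0) - v(0.7, 0, 1) + v(0.7, 0, 0)
```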
Testing factors for such additivity using voltage as the dependent
variable is called the Additive Amplitude Method. Additive effects of
two factors were found by Holcomb and Kounios (1992) in an
experiment on sentence processing. Subjects were presented with a
sentence, such as “No dogs are animals,” and responded whether it was
true or false. The following two factors were among those manipulated:

Α. The subject of the sentence was related to the predicate or not.

Β. In a “superset” sentence, the subject was more general than the
predicate; in a “subset” sentence, the subject was less general than
the predicate.

Holcomb and Kounios examined a time interval enclosing the N400, a
negative peak in the evoked potential produced by a semantic anomaly
(Kutas & Hillyard, 1980). In this interval, they found each of the two
factors alone produced a significant negative effect on the Evoked
Response Potential (ERP). The ERP was more negative when the

subject and predicate were unrelated and for a superset sentence, but
there was no interaction. It is noteworthy that the factors had additive
effects not only at the peak of the N400, but over the time interval
surrounding it. Holcomb and Kounios concluded that the factors affected
processes executing in parallel, one for relation and one for generality.
Looking for additive and interactive effects of factors on ERP was
also a goal of Gondan and Röder (2006), who investigated integration of
multi-sensory information. At a particular point on the scalp at time t let
the ERP amplitude be V(t) when a visual stimulus alone is presented, A(t)
when an auditory stimulus alone is presented, and AV(t) when the
auditory and the visual stimulus are presented together. Barth, Goldberg,
Brett and Di (1995) proposed that sensory integration is occurring at
times t at which

AV(t) − A(t) − V(t)

differs from 0. If the sensory information were processed separately, the
potential for the combined stimuli would equal the sum of the potentials
due to individual stimuli, and the above expression would be 0.
Although seemingly straightforward, the reasoning was criticized by
Teder-Sälejärvi, McDonald, Russo and Hillyard (2002). There could be
brain activities common to all conditions, contributing a potential C(t) as
in Equation (9.8). If this common potential is included for each term in
the expression above it becomes

AV(t) + C(t) − A(t) − C(t) − V(t) − C(t).

If AV(t) is the sum of A(t) and V(t) the above expression equals − C(t),
not 0.
Gondan and Röder (2006) introduced a new procedure. Typical
experiments on sensory integration can be considered 2 × 2 factorial
designs; one factor is presence or absence of the auditory stimulus, the
other is presence or absence of the visual stimulus. One condition is the
baseline, with no visual or auditory stimulation. To ignore the baseline is
to implicitly assume there is no common potential. Gondan and Röder

(2006) note that simply including a condition in which no stimulus is
presented does not solve the problem, because an omitted stimulus may
elicit a special ERP, due perhaps to violation of expectation (Simson,
Vaughan & Ritter, 1976). Instead, they propose presenting a tactile
stimulus on every trial. Trials on which the tactile stimulus alone is
presented serve as the baseline condition. Let TAV(t) denote the evoked
potential at an electrode location at time t when the tactile, auditory and
visual stimuli are all presented; other notation is analogous. Gondan and
Röder (2006) propose examining the interaction contrast

TAV (t) − TA(t) − TV(t) + T(t). (9.9)

Because of superposition of potentials, if at a time t there is no sensory
integration, the combined effect of two or more stimuli is simply the sum
of their separate effects. For example,

TAV(t) = T(t) + A(t) + V(t),

and so on. At such times the expression in Equation (9.9) equals 0.
Hence sensory integration is occurring at times t when the expression in
Equation (9.9) is nonzero.
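A minimal sketch of why the contrast in Equation (9.9) is immune to a common potential C(t): under pure superposition it is exactly zero, whatever C(t) is. All signals below are hypothetical.

```python
def contrast_at(t, T, A, V, C):
    # Under superposition each condition's potential is the sum of the
    # stimulus-specific signals plus the common potential C(t).
    TAV = T(t) + A(t) + V(t) + C(t)
    TA = T(t) + A(t) + C(t)
    TV = T(t) + V(t) + C(t)
    T_only = T(t) + C(t)
    return TAV - TA - TV + T_only   # Equation (9.9)

value = contrast_at(
    0.1,
    T=lambda t: 2.0 * t,
    A=lambda t: 1.0 - t,
    V=lambda t: 0.5 * t * t,
    C=lambda t: 3.0,                # common potential cancels exactly
)
```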
In the experiment of Gondan and Röder (2006), stimuli were tactile,
visual, auditory or any combination of these. The stimuli in various
modalities were presented simultaneously. Stimuli were presented
simultaneously once on 90% of the trials (standard) and they were
presented simultaneously twice, separated by a short gap, on 10% of the
trials (target). Subjects were instructed to respond on target trials (an
oddball procedure).
A reaction time analysis was one source of evidence for sensory
integration. The race inequality was described in Chapter 6. Gondan
and Röder (2006) tested it in three forms: the one originally proposed by
Miller (1982) for two modalities; a form for three modalities, proposed
by Diederich (1992),
Selective Influence with Accuracy, Rate, and Physiological Measures 349

FTAV(t) < FT(t) + FA(t) + FV(t);

and a form developed by Gondan and Röder (2006),

FTAV(t) + FT(t) + FA(t) + FV(t) < FTV(t) + FTA(t) + FAV(t).

Here FTAV(t) is the cumulative distribution function for the reaction time
when tactile, auditory and visual stimuli are all presented; other notation
is similar. Violations of the race inequality showed evidence of sensory
integration (coactivation) between auditory and visual stimuli, but no
evidence of it between tactile and other stimuli.
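The logic of these bounds can be checked numerically. In the sketch below all CDFs are hypothetical exponentials: an independent race satisfies Miller's and Diederich's bounds at every time point, while a coactivation-style CDF violates Miller's bound at short times:

```python
import numpy as np

t = np.linspace(0.01, 1.0, 100)   # seconds

# Hypothetical unimodal reaction-time CDFs (for illustration only).
FT = 1.0 - np.exp(-2.0 * t)
FA = 1.0 - np.exp(-3.0 * t)
FV = 1.0 - np.exp(-4.0 * t)

# In an independent race the redundant-signals CDF is one minus the
# product of the survivor functions; such a model satisfies Miller's
# bound F_AV <= F_A + F_V and Diederich's bound
# F_TAV <= F_T + F_A + F_V at every t.
F_AV_race = 1.0 - (1.0 - FA) * (1.0 - FV)
F_TAV_race = 1.0 - (1.0 - FT) * (1.0 - FA) * (1.0 - FV)
miller_ok = bool(np.all(F_AV_race <= FA + FV))
diederich_ok = bool(np.all(F_TAV_race <= FT + FA + FV))

# A coactivation model pools the channels and can be faster than any
# race allows; an exponential with a pooled rate of 10 (hypothetical)
# violates Miller's bound at short times.
F_AV_coact = 1.0 - np.exp(-10.0 * t)
miller_violated = bool(np.any(F_AV_coact > FA + FV))
```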
Analysis of the ERP interaction contrast in Equation (9.9) showed
that prior to 84 msec, auditory and visual stimuli had additive effects on
ERP. Starting at 84 msec the interaction contrast was significantly
different from 0 at central electrode locations, evidence of sensory
integration. The time at which the interaction contrast differed from 0
and the shape of it depended on electrode location. In this experiment,
the expression proposed by Barth et al. (1995) and that proposed by
Gondan and Röder (2006) were similar for about the first 200 msec; in
this interval they lead to the same conclusions about whether sensory
integration is occurring. Later, after about 350 msec, the expressions
differ. That of Gondan and Röder (2006) is zero, interpreted as absence
of sensory integration, while that of Barth et al. (1995) is nonzero,
interpreted as an estimate of a potential common to all conditions.

Additive Areas of Evoked Potentials

Van Lankveld and Smulders (2008) found additive effects of two factors
on areas bounded by ERP curves. (Note that additive areas under the
ERP curves do not imply additive amplitudes of ERPs, or vice versa.)
Participants (all males) were asked to rate pictures in terms of felt
intensity (arousal) and experienced pleasure (valence). The main
question was whether erotic stimuli produce a specific ERP response. A
secondary question of the study is of primary interest here: whether
arousal and valence are processed independently
physiologically.
One special set of pictures was erotic (e.g., nude, heterosexual
behaviors); these were considered high arousal and positive valence.
Four other sets of pictures formed a 2 × 2 design of high and low arousal
crossed with positive and negative valence. Sports pictures (e.g., rafting
and parachute jumping) were considered high arousal and positive
valence. Pictures of traffic accidents, snakes, weapons, and horror were
considered high arousal and negative valence. Pictures depicting babies,
flowers, etc. were considered low arousal and positive valence
(low/positive). Finally, low arousal and negative valence pictures
(low/negative) showed crying people, garbage, cemeteries, etc. After
electrodes were attached, pictures were presented one by one. At offset
of each, the participant rated the picture on experienced pleasure and felt
intensity.
Van Lankveld and Smulders looked at the area under ERP curves in
two different time windows, which they called P300 (300-500ms) and
Positive Slow Wave (PSW; 500-700ms). The P300 ERP component is
thought to reflect the effect of perceptual saliency that attracts attention
(e.g., in the oddball paradigm) and PSW is thought to reflect the decision
or evaluation process; each component reflects other processes as well.
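The area measure itself is simple to compute. The sketch below integrates a synthetic waveform (hypothetical amplitude and latency) over the two time windows with the trapezoid rule:

```python
import numpy as np

# Synthetic ERP: a positive component peaking near 400 ms
# (amplitude in microvolts, time in seconds; hypothetical numbers).
t = np.linspace(0.0, 0.8, 801)
erp = 2.0 * np.exp(-((t - 0.40) / 0.03) ** 2)

def window_area(times, signal, lo, hi):
    """Area bounded by the curve between lo and hi (trapezoid rule)."""
    mask = (times >= lo) & (times <= hi)
    x, y = times[mask], signal[mask]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

p300_area = window_area(t, erp, 0.3, 0.5)  # 300-500 ms window
psw_area = window_area(t, erp, 0.5, 0.7)   # 500-700 ms window
```

Here nearly all of the component's area falls in the first window, so the second window's area is essentially zero.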
Figure 9.4 shows two critical dependent measures, the areas bounded
by ERP curves in intervals around P300 and PSW. Analyses showed
that the erotic pictures differed from the other high arousal positive
valence pictures, those of sports. Van Lankveld and Smulders interpret
this as indicating a specific ERP response to erotic stimuli.
In analyses of the nonerotic stimuli, in the P300 interval there was a
significant interaction between valence and arousal. However, in the
PSW interval, there were main effects of valence and arousal, but no
interaction. Van Lankveld and Smulders interpret the additivity as
evidence for independent processing of valence and arousal in this
interval.

fMRI and Additive BOLD Signals

To be active, neurons require oxygen from hemoglobin. When a specific


Fig. 9.4. Mean areas under ERP curves in five different stimulus sets. Clear additivity is
shown when the non-erotic (sports) images are used as the high arousal, positive valence
stimuli (bottom panel). From van Lankveld, J. J. D. M., & Smulders, F. T. Y., 2008,
The effect of visual sexual content on the event-related potential, Biological Psychology,
79, Fig. 4. Copyright 2008 Elsevier. Reproduced with permission.

brain area becomes active for information processing, blood around the
area has a ratio of oxyhemoglobin (hemoglobin with oxygen) to
deoxyhemoglobin (hemoglobin without oxygen) different from that of
brain areas not involved in the specific information processing. The
magnetic properties of hemoglobin depend on the level of oxygen in it.
The Magnetic Resonance Imaging (MRI) machine can detect the
different ratio of oxy- to deoxyhemoglobin, called the Blood-Oxygen-
Level-Dependent (BOLD) signal. Usually, a stronger BOLD signal
from an area indicates an increase in brain activity over some baseline;
neurons there fire more rapidly or more neurons there fire than before.
An additive effect (i.e., no interaction) of two factors on BOLD signals is


a strong indication that two brain processes are selectively influenced by
the two factors. It is plausible that the two processes overlap in time.
But BOLD signals are sampled at relatively low temporal resolution
(about 1 to 2 seconds between successive scans), so the simultaneity of
two processes is not established by additive BOLD signal changes.
Epstein, Parker and Feiler (2008) examined the BOLD signal
produced by the repetition suppression (RS) effect. The RS effect is that
the brain’s response to a stimulus is reduced when the same stimulus has
just been presented (e.g., Wiggs & Martin, 1998). Previous studies (e.g.,
Henson, Rylands, Ross, Vuilleumeir, & Rugg, 2004) showed that the
size of the RS effect depends on the time gap between two successive
presentations of stimuli, with a smaller RS effect at long intervals.
Epstein et al. (2008) investigated whether the effect is governed by the
same brain mechanism at long and short intervals.
Undergraduates at the University of Pennsylvania were asked to
judge if images presented were specific streets or buildings; all images
were obtained on campus. These judgments were made in phase 1,
before participants entered the MRI scanner. Participants were exposed
to 24 on-campus locations, each with two views, a total of 48 images for
phase 1. After about a 20 minute break, phase 2 started with participants
inside the fMRI scanner. There were 48 on-campus locations, each with
four viewpoints, and foil images taken at Temple University. During this
phase, two images were presented briefly in series (500ms SOA).
Participants were asked to judge if both images depicted on-campus
locations or not. To respond correctly, participants needed to examine
both images.
Two factors were manipulated, each with three levels. The short-
interval RS effect was manipulated in phase 2 by presenting two
identical images (no change), images of the same location with different
views (view change), or images of two different locations (place change).
The long-interval RS effect was manipulated by changing the image
relationship between phases 1 and 2. Two images in phase 2 could be
ones that participants had seen in phase 1 (old view), ones that depict the
same location as in phase 1 but with different viewpoints (new view), or
ones that participants had not seen before (new place). The long and
short-interval RS factors could be manipulated independently.
For example, two identical images in phase 2 (a no change condition
for the short-interval RS effect) could be an image seen in phase 1 (old
view condition for the long RS effect), an image that depicts a location
seen in phase 1 but from a different viewpoint (new view), or an image
not seen in phase 1 (new place).
Epstein et al. found that as the levels of long- and short-interval
repetition factors changed, the BOLD signal changed in the
parahippocampal region, which is partly responsible for memory
function, see Figure 9.5. Importantly, they found no interaction between
long and short-interval RS factors and concluded that two different RS
effects may be governed by two different brain functions that operate
independently (even though the functions are the results of neuronal
activities at the same physiological locations). Additivity of the two RS
factors can be clearly seen in Figure 9.5B.
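What "no interaction" means for a 3 × 3 table of BOLD responses can be made concrete. In the sketch below the cell values are invented to be perfectly additive, and the interaction, what remains after the grand mean and the two main effects are removed, is zero in every cell:

```python
import numpy as np

# Hypothetical % signal change in a 3 x 3 design: rows index the
# long-interval RS conditions, columns the short-interval RS
# conditions. Built to be perfectly additive:
# cell(i, j) = grand mean + row effect i + column effect j.
grand = 0.80
long_rs = np.array([-0.10, -0.05, 0.15])    # old view, new view, new place
short_rs = np.array([-0.20, 0.05, 0.15])    # no change, view change, place change
cells = grand + long_rs[:, None] + short_rs[None, :]

# Interaction = residual after removing grand mean and main effects;
# additivity means it is zero everywhere.
grand_hat = cells.mean()
row_hat = cells.mean(axis=1) - grand_hat
col_hat = cells.mean(axis=0) - grand_hat
interaction = cells - grand_hat - row_hat[:, None] - col_hat[None, :]
```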
Additional evidence for the conclusion of Epstein et al. (2008) was
that the long-interval RS effect was relatively invariant across the old
view and new view conditions and their activation changes were smaller
than those of the new place condition (see Figure 9.5B). This means that
the long-interval RS effect works as long as the images depict the same
location, regardless of the viewpoint. However, the short-interval RS
effect was much stronger in the no change condition compared to other
conditions (view change and place change). That is, the BOLD signal
change was smaller in the no change condition than in the place change
and view change conditions. This suggests that the short-interval RS
effect is mainly determined by the literal physical similarity of two
images presented successively and the long-interval RS effect by the
identity of the location that images show, even though in different
viewpoints.

Concluding Remarks

It is not necessary that factors selectively influencing processes have


effects that combine according to a simple rule. But for several measures
Fig. 9.5. A. Brain region (parahippocampal place area; PPA) that is selectively responsive
to scenes. The activated region is the result of region of interest analysis (ROI), which
identifies a region that is more responsive to experimental stimuli (scenes) than to others
(objects). B. fMRI response changes (%) in PPA for all nine conditions of RS factors.
The x-axis represents long-interval RS conditions. Different lines represent short-interval
RS conditions. (Adapted from Epstein, R. A., Parker, W. E., & Feiler, A. M., 2008,
Evidence for Dissociable Neural Mechanisms: Two Kinds of fMRI Repetition
Suppression? Journal of Neurophysiology, 99, Fig. 2. Copyright 2008 Society for
Neuroscience. Reproduced with permission.)

there are reasons to expect simple combination rules, addition for


reaction time, voltage, and the BOLD signal, multiplication for
probability and rate. Many experiments now demonstrate such simply
combining factors for one measure or another. Further progress will
come from finding more measures, deeper progress from more use of
multiple measures in the same experiment.
Appendix

Fitting the data of Keppel and Underwood (1962)

The model of Equation (9.1) was fit to the data using Excel solver.
Parameters were estimated to minimize G2. Let the observed frequency
in condition k be xk and the predicted frequency be mk. Then

G2 = − 2 Σ xk ln (mk/xk).

The sum is over both correct and incorrect frequencies in all conditions.
The goodness of fit statistic G2 is a log likelihood ratio statistic. For a
given data set, it is close in value to chi square, a commonly used alternative,

X2 = Σ(xk − mk)2/mk.
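Both statistics are straightforward to compute; the frequencies in the sketch below are hypothetical, not the Keppel and Underwood data:

```python
import math

def g_squared(observed, predicted):
    """G2 = -2 * sum over cells of x_k * ln(m_k / x_k)."""
    return -2.0 * sum(x * math.log(m / x)
                      for x, m in zip(observed, predicted) if x > 0)

def chi_squared(observed, predicted):
    """X2 = sum over cells of (x_k - m_k)^2 / m_k."""
    return sum((x - m) ** 2 / m for x, m in zip(observed, predicted))

# Hypothetical correct and error frequencies in two conditions.
observed = [80.0, 20.0, 55.0, 45.0]
predicted = [78.0, 22.0, 57.0, 43.0]

g2 = g_squared(observed, predicted)
x2 = chi_squared(observed, predicted)
```

With these numbers the two statistics agree to about two decimal places, illustrating why either can serve as the badness-of-fit measure to be minimized.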

The degrees of freedom for G2 and X2 are the same. To calculate


degrees of freedom, note there are 18 observations. Nine of these for
incorrect responses are determined by the 9 for correct responses, leaving
9 independent observations. Six parameters are estimated. However, as
pointed out by Buchner and Erdfelder (2005), the six parameters are not
completely determined. Consider a constant c and let

I' = 1 − (1 − I)/c
R' = 1 − c + cR.

A little algebra shows Equation (9.1) can be written as

P(correct) = I' + (1 − I')R' = I + (1 − I)R.

In other words, the values of I and R are not unique; one is free to
choose a parameter c for change of scale. Parameter c must be chosen so
the transformed parameters I' and R' are between 0 and 1. If some
observed probability is 0 or 1 such a change of scale is impossible, but
this is not the case here. The free scaling parameter c adds 1 to the
degrees of freedom. The degrees of freedom are 9 − 6 + 1 = 4.

Fitting the data of Tehan and Turcotte (1997)

For degrees of freedom for the experiment of Tehan and Turcotte, for
each response modality there are 32 observed frequencies. Because the
observed frequency of an error in a certain treatment is 200 minus the
frequency of a correct response, only 16 observed frequencies are
independent. The model has 8 parameters. But when calculating
degrees of freedom it is necessary to take into account that parameter
values are not unique; one set of valid parameters can be transformed
into another by multiplying by two scaling coefficients. In other words,
given x, y, and z that predict the probability of an error as a product, xyz,
transformed values x/a, y/b and abz for positive a and b will predict just
as well, provided a and b are within a range that leaves the transformed
values between 0 and 1. Another way to put it is that two parameters can
be fixed ahead of time (to values not too extreme for practical
computation). The number of parameters that need to be estimated is 6.
Hence the degrees of freedom are 16 − 6 = 10.
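The same check for the multiplicative case, with arbitrary parameter values and scaling coefficients:

```python
# Predicted error probability as a product of three parameters
# (hypothetical values).
x, y, z = 0.8, 0.5, 0.3

# Rescaling by coefficients a and b leaves every prediction intact,
# provided the transformed values stay between 0 and 1.
a, b = 2.0, 0.8
x2, y2, z2 = x / a, y / b, a * b * z
```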

Fitting the data of Ellenbogen et al. (2006) and Ellenbogen et al. (2009)

Each experiment had two crossed factors, Wake/Sleep and


Interference/No Interference, producing four treatment conditions. In
Ellenbogen et al. (2006) a different group of 12 subjects was in each
condition. Each subject learned 20 A-B pairs, those in the interference
conditions learned 20 new A-C pairs. There were 12 × 20 = 240 trials in
each condition. At testing, each subject was given a list of all A words.
Subjects were asked to recall all words paired with each A word, writing
the B words in one column and (for those learning C words), C words in
a different column.
Proportion of correct recall was reported in each condition, these
were multiplied by 240 and rounded to the nearest integer to obtain
frequencies of correct recall. Predicted frequency of correct recall was
predicted with Equation (9.4). For example, with sleep and no
interference, frequency of correct recall was predicted as

240(PYes + (1 − PYes)QNo) = 240(.7565 + (1 − .7565) × .7461) = 225.16.

Parameters were estimated with Excel solver to minimize G2.
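The arithmetic of this prediction can be checked directly, using the parameter estimates quoted above:

```python
# Predicted frequency of correct recall in the sleep, no-interference
# condition, Equation (9.4), with the quoted parameter estimates.
P_yes, Q_no = 0.7565, 0.7461
predicted = 240 * (P_yes + (1 - P_yes) * Q_no)
```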


In Ellenbogen et al. (2009), there were 45 subjects, randomly
assigned to the Wake group or the Sleep group. The actual number
assigned to each group was not reported, so to fit the model, we assumed
22.5 subjects in each group. All subjects learned 60 A-B word pairs.
For both the Wake and Sleep groups, twenty word pairs were tested 10
minutes after learning to check that the groups learned the pairs equally
well, which they did. When subjects returned 12 hours after their initial
session, they were tested on recall of 20 of the remaining A-B pairs. In
these tests, subjects were cued with an A word and asked to recall the
corresponding B word. Then for the 20 remaining A words, all subjects
learned a new list of 20 A-C pairs. They were then cued with A words
and asked to recall both the corresponding B and C words. (It is not
clear whether subjects indicated which recalled words were B and which
were C.) Performance on the B words was reported. For these, there
were 22.5  20 = 550 trials. Parameter estimation was as in the previous
case.

Notes

1. Material in this section was developed in discussions with James


Nairne and Ian Neath.
2. Material in this section was developed in discussions with Gerry
Tehan.
3. We thank Gerry Tehan for kindly providing information about the
experimental procedure.
4. Material in this section was developed in discussions with Marie
Poirier.
5. Material in this section was developed in discussions with Hye Joo
Han.
6. Material in this section was developed in discussions with


Zhuangzhuang Xi.
7. We thank Donald Bamber and Jan van Santen for kindly providing
material for this section.
Chapter 10

Selective Influence of Interdependent


Random Variables

Suppose random variables A and B are positively correlated. Consider


the claim that a factor selectively influences A, increasing its mean, say.
Because A and B are correlated, won’t B change? But if B changes, can
we say the factor selectively influences A? Is it possible that a factor
selectively influences one random variable, and a different factor
selectively influences a different random variable, yet the two random
variables are dependent?
Clearly dependence between random variables does not make it
impossible for factors to selectively influence them. The Additive Factor
Method (Sternberg, 1969) does not assume process durations are
stochastically independent. The expected value of a sum is the sum of
the expected values of its terms, whether the terms are dependent or not.
Consequently, additive effects of factors on mean reaction time are
evidence that the factors selectively influence processes in series,
whether the process durations are dependent or not. Predictions in
Chapter 3 about mean reaction times do not depend on independence of
the process durations. But other measures are less forgiving. For
example, predictions in Chapter 6 about reaction time cumulative
distribution functions were derived assuming independence or, less
strongly, conditional independence.
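A small simulation makes the first point concrete. The two stage durations below share a noise source, so they are strongly correlated, yet the 2 × 2 interaction contrast on mean reaction time is still zero; all numbers are hypothetical:

```python
import numpy as np

# Two stage durations sharing a noise source, so they are strongly
# dependent; each factor effect (hypothetical numbers) shifts only
# its own stage.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.05, size=100_000)

def mean_rt(effect_a, effect_b):
    A = 0.30 + effect_a + noise          # stage 1 duration (s)
    B = 0.20 + effect_b + 0.8 * noise    # stage 2, dependent on stage 1
    return np.mean(A + B)

# The durations are perfectly correlated ...
corr = np.corrcoef(0.30 + noise, 0.20 + 0.8 * noise)[0, 1]

# ... yet the interaction contrast on mean RT is zero, because the
# expectation of a sum is the sum of the expectations regardless of
# dependence.
contrast = (mean_rt(0.1, 0.1) - mean_rt(0.1, 0.0)
            - mean_rt(0.0, 0.1) + mean_rt(0.0, 0.0))
```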
The perplexing issue of factors selectively influencing dependent
processes was considered by Townsend and Ashby (1983). They
concluded that factor additivity is logically independent of stochastic
independence (Proposition 12.2). Townsend (1984) pursued the issue

further, and Townsend and Thomas (1994) demonstrated that subtle


problems can readily arise. Considerable progress on the problem has
stemmed from the notion of conditional independence (Dzhafarov,
2003a; Townsend and Nozawa, 1995). Many questions remain open, but
many are settled by the theory of Dzhafarov (2003a), Dzhafarov and
Gluhovsky (2006), Kujala and Dzhafarov (2008, 2010) and Dzhafarov
and Kujala (2010).
The theory has wide application, including psychophysics,
information processing, and mental testing, so it is formulated in a
general way. When a visual stimulus is presented in a psychophysical
task, the subject forms an image. The same stimulus presented on
different occasions produces slightly different images, so the images are
random. But it is not obvious that all important features of an image can
be represented by a real number, or a vector of real numbers. In other
words, images might not be satisfactorily modeled as random variables
or random vectors. The theory is formulated more generally, in terms of
arbitrary sets of random entities selectively influenced by arbitrary sets
of factors, each having arbitrary sets of values. Random entities are
defined in the Appendix of Chapter 6. When two stimuli are presented
in, say, a same-different judgment task, two images are produced. The
theory allows consideration of, e.g., whether a physical change in one
stimulus selectively influences only one image. Use of selective
influence in such psychophysical tasks is described in Dzhafarov (2003b,
2003c) and Dzhafarov and Colonius (2006).
For modeling mental architecture, we consider here finite sets of
random variables and random vectors. They are special cases of random
entities. In the exposition below we follow closely Dzhafarov and
Kujala (2010), focusing however on the special case when the random
entities are random variables and random vectors, and the factors have
finite numbers of levels. We assume here that random variables have
finite means and variances, and that correlations referred to exist. More
generality can be found in the original papers.
To continue with terminology in earlier chapters, a random vector
<X1,..., Xn> is an ordered list of random variables X1,..., Xn that have a
joint distribution. Marginal distributions of all nonempty subsets of the


random variables are defined. If we speak of a list of random variables
[Y1,..., Yn] , we do not assume Y1,..., Yn are defined on the same sample
space or have a joint distribution. Notation is analogous for lists of
random entities. We continue to write lists of elements x, y,... that are
not random variables or random entities as <x, y,... >.
Let  be a nonempty finite list of m factors <Α, Β,..., Z>. Factor Α
has levels i = 1,..., I ; levels of other factors are denoted similarly. A
treatment is a list  of levels, <i, j,..., m>, where i is a level of Factor Α, j
is a level of Factor Β,... and m is a level of Factor Z. A factor can be
considered to be the set of its levels. For example, if Factor Α is stimulus
intensity, with levels dim and bright, we can say Α = {dim, bright}.
Then a treatment is an element of the cross product Α × Β × ... × Z.
It is convenient to speak informally of the duration of a certain
process, P, as a random variable, A, without mentioning levels of factors.
We might say, for example, that the duration of visual search is a random
variable. But the duration of a visual search cannot take on a value until
factor levels have been assigned and a set of items displayed. We need
to make this informal way of speaking precise. What we mean more
precisely is that for each treatment  the duration of process P is a
random variable A. Similarly, if there are two processes, P and Q, and
two factors Α and Β, we might say informally that the durations of
processes P and Q form the random vector <A, B>. What we mean more
precisely is that when Factor Α is at level i and Factor Β is at level j the
duration of process P is a random variable Aij and the duration of process
Q is a random variable Bij; further, these two random variables are
defined on the same sample space and have a joint distribution. Because
the two random variables have a joint distribution, <Aij, Bij> is a random
vector. (When the list of levels <i, j> is written as a subscript, the
brackets are sometimes omitted to avoid clutter.)
When the experimenter presents the subject with a stimulus in a
particular treatment , a sample value for the random vector D is taken
by the subject. The components of the random vector D are all defined
on the same sample space and have a joint distribution. For each
treatment  there is a random vector D = < A, B,..., Z>. Note we are
assuming no levels of factors outside Φ need be specified. We say the
distribution of the random vector <A, B,..., Z> depends only on factors in
the list Φ.
However, we do not assume the sample spaces for different
treatments are the same. In one treatment the outcome of the experiment
might be a button press and in another the outcome might be a spoken
word, forming different sample spaces. If in one treatment a stimulus is
bright and the duration of a perceptual process takes on a value, no value
is taken for what the duration of that process would be if the stimulus
were dim instead of bright. The random vectors for two different
treatments are not sampled together; that is, random vectors for different
treatments are not necessarily defined on the same sample space and do
not have a joint distribution. We say they are unrelated.

Selective Influence

Recall from Chapter 4 that two random variables may be defined on


different probability spaces, yet have the same distribution. For example,
after a coin toss, suppose X is set to 0 if a head occurs, and to 1 if a tail
occurs. After a die is cast, suppose Y is set to 0 if the number of dots is
even, and to 1 otherwise. The probability spaces for X and Y are
different, but P[X = 0] = P[Y = 0] and P[X = 1] = P[Y = 1]. If random
variables X and Y have the same cumulative distribution function, we
write X  Y. If random vectors V and W have the same joint cumulative
distribution function, we write V  W.
The definition of selective influence in the theory expresses random
variables as functions of other random variables. As an example of
expressing one random variable as a function of another, consider a
random variable X with an arbitrary distribution. A random variable with
the same distribution as X can be defined as a function of a random
variable U with a uniform distribution. Suppose random variable X has
cumulative distribution F(x) = P[X ≤ x]. There may be intervals of x
values over which F(x) is constant, so F(x) may not have an inverse. We
define a function similar to an inverse as follows. For every p ∈ [0, 1]
let F⁻¹(p) = inf{x | F(x) ≥ p}. Now let U denote a random variable
uniformly distributed between 0 and 1. The random variable F⁻¹(U) has
the same distribution as X, that is, F⁻¹(U) ≈ X. This is so because given
any x, P[F⁻¹(U) ≤ x] = P[U ≤ F(x)] = F(x). For one random variable to
be expressed as a function of another random variable the latter need not
have a uniform distribution, of course.
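The construction can be illustrated with a discrete random variable; the distribution below is hypothetical:

```python
import numpy as np

# A discrete X with P[X = 0] = .7 and P[X = 1] = .3, so F(0) = .7
# and F(1) = 1. Its quantile function ("inverse" CDF):
def F_inverse(p):
    # inf{x : F(x) >= p}
    return 0.0 if p <= 0.7 else 1.0

# Feeding uniform variates through the quantile function reproduces
# the distribution of X.
rng = np.random.default_rng(1)
u = rng.uniform(0.0, 1.0, size=100_000)
sample = np.array([F_inverse(p) for p in u])
prop_zero = float(np.mean(sample == 0.0))  # estimates P[X = 0]
```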
The definition of selective influence uses the notions of random
entities and measurable functions, which are defined in the Appendix of
Chapter 6. To avoid a possible source of confusion, note that symbol C
in the definition and symbols C1,..., Cp in Chapter 4 are unrelated.
For the special case where all random entities are random variables or
random vectors, the definition says the n random variables in the list
[A, B,..., Z ] are selectively influenced by the n Factors <Α, Β,..., Ζ>,
respectively, if there exists a random vector C (defined on some
probability space), and for every level k of every factor α in <Α, Β,..., Ζ>
there exists a real valued function fα,k such that for every treatment φ =
<i, j,..., m>

<A, B,..., Z> ≈ <f1,i(C), f2,j(C),..., fn,m(C)>. (10.1)

For random variables, when the definition applies we have random


variables

Ai = f1,i(C), Bj = f2,j(C),..., Zm = fn,m(C),

all defined on the same sample space, with

<A, B,..., Z> ≈ <Ai, Bj,..., Zm>.

In more generality, the definition for random entities follows.


Definition 10.1 (Dzhafarov & Kujala, 2010) The n random entities


[A, B,..., Z] are selectively influenced by the n Factors <Α, Β,..., Ζ>,
respectively, if there exists a random entity C (defined on some
probability space), and for every level k of every factor α in <Α, Β,..., Ζ>
there exists a measurable function fα,k such that for every treatment φ =
<i, j,..., m>

<A, B,..., Z> ≈ <f1,i(C), f2,j(C),..., fn,m(C)>.

When the definition applies we sometimes write for short [A, B,..., Z]
↫ <Α, Β,..., Ζ>.
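A toy construction in the spirit of Definition 10.1 may help (the functions and distributions below are invented): both variables are functions of one common source C, so they are strongly dependent, yet each factor shifts only its own variable:

```python
import numpy as np

# A single common source C; each observable is a factor-level-indexed
# function of C alone, as in Definition 10.1.
rng = np.random.default_rng(2)
C = rng.normal(0.0, 1.0, size=100_000)

def A(i):            # depends only on level i of the first factor
    return 0.5 * i + np.abs(C)

def B(j):            # depends only on level j of the second factor
    return 0.3 * j + np.abs(C)

# A and B are dependent because they share C ...
corr = np.corrcoef(A(1), B(1))[0, 1]

# ... yet A(i) does not involve j at all, and a level change of its
# own factor shifts A's mean by exactly 0.5 per level.
shift_A = float(A(2).mean() - A(1).mean())
```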

Remark 1

The definition of selective influence in Chapter 6, based on Dzhafarov


(2003a), is equivalent to the one above. Recently, Dzhafarov and Kujala
(2011) showed that if the definition here is satisfied, C can always be
chosen to be a random vector. The gist of the equivalence is easy to see
when C is considered a random vector; for proof in more general terms,
see Dzhafarov and Gluhovsky (2006). Suppose random entity C exists
as specified in Definition 10.1 of selective influence here, and C is a
random vector. To see that the definition in Chapter 6 holds, choose any
random vectors SA,..., SZ defined on the same probability space as C is
defined on, so that C, SA,..., SZ are mutually independent. Clearly, for
every level i of Factor Α, the function f1,i(C) can be considered a function
of C and SA (the value of SA having no role in determining the value of
the function). The analogous statement is true for every level k of every
factor α  <Α, Β,..., Ζ>. On the other hand, suppose the definition of
Chapter 6 holds. Suppose C, SA,..., SZ are mutually independent random
vectors defined on the same probability space such that for every
treatment  = <i, j,..., m>

<A, B,..., Z> ≈ <f1,i(C, SA), f2,j(C, SB),..., fn,m(C, SZ)>.

Form a random vector C* whose component random variables are those


of C, SA,..., and SZ. Then

<A, B,..., Z> ≈ <f1,i(C*), f2,j(C*),..., fn,m(C*)>,

as required in Definition 10.1 here.

Remark 2

If random variables in the list [A, B,..., Z] are selectively influenced by


factors Φ = <Α, Β,..., Ζ>, respectively, then for any treatment φ = <i,
j,..., m>, Aφ only depends on level i of the first factor, so we can write Aφ
as Ai. Likewise, we can write Bφ as Bj and so on. Then if [A, B,..., Z] are
selectively influenced by the factors Φ, there exists a random entity C
such that given a value c of C, for any treatment φ = <i, j,..., m> random
variables Ai, Bj,..., Zm are mutually independent, i.e., they are mutually
conditionally independent given c.

Remark 3

Suppose random variables in the list [A, B,..., Z] are selectively


influenced by factors Φ = <Α, Β,..., Ζ>, respectively. Transformations of
the individual random variables are also selectively influenced by Φ. Let
hi(A) be a measurable function of A, and so on. Then, by composition
of functions, [hi(A),..., km(Z)] are selectively influenced by factors Φ =
<Α, Β,..., Ζ>, respectively. Note that the transformations are indexed by
the levels of the factors.

Remark 4

The definition is written for the case in which there is a one-to-one


correspondence between factors and random entities, but other cases can
be covered. Suppose some random variable of interest is not influenced
by any factor in an experiment. One can indicate this by adding a Factor
X to the list and saying the random variable is selectively influenced by
Factor X, which never changes, i.e., has one level. Suppose a random
variable is changed by both Factor Α with levels denoted by i and Factor


Β with levels denoted by j. One can replace Factors Α and Β with a new
Factor C having a level for every element <i, j> of the cross product of
levels of Factor Α with levels of Factor Β. A general and flexible
notation can be found in Dzhafarov and Kujala (2010).

Marginal selectivity

Intuitively, if a factor selectively influences a random variable A, but


does not influence random variable B, changing the level of the factor
should not change the mean of B, or other aspects of B considered alone.
The marginal distribution of B should be invariant with changes in the
level of a factor not influencing it (Townsend & Schweickert, 1989).
Here is an example of factors not changing marginal distributions,
from Dzhafarov and Kujala (2010). There are two factors, each with
levels 1 and 2, and two random variables, each with values 0 and 1. In
treatment <1, 1> the first factor is at level 1, as is the second. The upper
left cell in the subtable for treatment <1, 1> has the joint probability that
in this treatment A = 0 and B = 0; other cells are analogous.

<1, 1>  B=0  B=1        <1, 2>  B=0  B=1
A=0     .6   0          A=0     0    .6
A=1     0    .4         A=1     .4   0

<2, 1>  B=0  B=1        <2, 2>  B=0  B=1
A=0     .3   .2         A=0     .25  .25
A=1     .3   .2         A=1     .15  .35

In a fuller notation, in treatment <1, 1>, the first random variable is denoted A11 and the second is denoted B11; notation for other treatments
is similar. The marginal probability of the first random variable does not
depend on the level of the second factor. That is,

P[A11 = 0] = P[A12 = 0] = .6 and P[A11 = 1] = P[A12 = 1] = .4.



Further,

P[A21 = 0] = P[A22 = 0] = .5 and P[A21 = 1] = P[A22 = 1] = .5.

Likewise it can easily be checked that the marginal probability of the second random variable does not depend on the level of the first factor.
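These marginal probabilities can be checked mechanically. The following sketch (a minimal Python illustration; the treatment tables are hand-copied from the example above) computes P[A = 0] and P[B = 0] in each treatment and confirms that each marginal depends only on the level of its own factor.

```python
# Joint distributions P(A = a, B = b) for each treatment <i, j>,
# hand-copied from the example of Dzhafarov and Kujala (2010).
tables = {
    (1, 1): {(0, 0): .60, (0, 1): .00, (1, 0): .00, (1, 1): .40},
    (1, 2): {(0, 0): .00, (0, 1): .60, (1, 0): .40, (1, 1): .00},
    (2, 1): {(0, 0): .30, (0, 1): .20, (1, 0): .30, (1, 1): .20},
    (2, 2): {(0, 0): .25, (0, 1): .25, (1, 0): .15, (1, 1): .35},
}

def marginal_A(i, j):
    """P[A = 0] in treatment <i, j>."""
    return tables[(i, j)][(0, 0)] + tables[(i, j)][(0, 1)]

def marginal_B(i, j):
    """P[B = 0] in treatment <i, j>."""
    return tables[(i, j)][(0, 0)] + tables[(i, j)][(1, 0)]

# Marginal selectivity: A's distribution depends only on i, B's only on j.
for i in (1, 2):
    assert abs(marginal_A(i, 1) - marginal_A(i, 2)) < 1e-9
for j in (1, 2):
    assert abs(marginal_B(1, j) - marginal_B(2, j)) < 1e-9
```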
With more than two random variables this condition can be
generalized by saying that if random variables B and C are not influenced
by a factor then their joint distribution is invariant with changes in level
of the factor, and so on. The general condition is complete marginal
selectivity, defined in a moment, and, according to the lemma that
follows, it occurs whenever factors selectively influence random
variables in the sense of Definition 10.1. For the case of two random
variables a definition of marginal selectivity was proposed by Townsend
and Schweickert (1989). The following definition is more general.
Suppose random variables in the list [A, B,..., Z] depend only on a list
of factors Φ. Consider a one-to-one mapping from random variables to
factors, with random variable A associated with some Factor Α, random
variable B associated with some Factor Β, and so on. Denote the
random variable associated with an arbitrary Factor α as Xα.
A sublist of a list L is a list of some elements of L in the same order
as they appear in L. Consider a sublist Φ1 of the list of factors Φ. Let [Xα | α ∈ Φ1] be the sublist of [A, B,..., Z] containing those random variables in [A, B,..., Z] associated with factors in the sublist Φ1.

Definition 10.2 (Dzhafarov, 2003a; Dzhafarov & Kujala, 2010) Let Φ1 be a sublist of the list of factors Φ. Let φ1 be a list of levels, exactly one from each factor in Φ1. Suppose for every treatment φ containing φ1 as a sublist, <Xα | α ∈ Φ1> is a random vector, and for every such treatment the distribution of <Xα | α ∈ Φ1> is the same. If the preceding statements hold for every sublist Φ1 of Φ and every corresponding list φ1 of levels, one from each factor in Φ1, then the dependence of [A, B,..., Z] on Φ satisfies complete marginal selectivity.

Lemma 10.1 (Dzhafarov & Kujala, 2010) Suppose the random variables in the list of random variables [A, B,..., Z] are selectively influenced, respectively, by the factors in the list Φ. Then the dependence of [A, B,..., Z] on Φ satisfies complete marginal selectivity.

The lemma follows immediately from the definition.

Complete marginal selectivity is a strong condition, and might seem to imply selective influence as in Definition 10.1. But the example
above from Dzhafarov and Kujala (2010) shows it does not. Complete
marginal selectivity is satisfied in the example. From the treatment
subtables, the joint distributions must satisfy the following:

(a) A11 = B11
(b) A12 = 1 − B12
(c) A21 is stochastically independent of B21
(d) A22 is not stochastically independent of B22.

A contradiction arises when we try to assign the random variables Ai and Bj that follow if the definition of selective influence applies. If they
exist, each of these new random variables requires only one subscript.
From (a) and (b),

A1 = B1
A1 = 1 − B2

Hence

(e) B1 = 1 − B2.

And from (c) and (d)

A2 is stochastically independent of B1
A2 is not stochastically independent of B2.

But it is not possible for A2 to be stochastically independent of B1 without also being stochastically independent of B2, because of (e).
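Conditions (a)-(d) can be read off the four treatment tables numerically. The sketch below (Python; the tables are re-entered from the example) verifies each condition, which is all the contradiction argument requires.

```python
# Joint distributions for each treatment <i, j>, from the example above.
tables = {
    (1, 1): {(0, 0): .60, (0, 1): .00, (1, 0): .00, (1, 1): .40},
    (1, 2): {(0, 0): .00, (0, 1): .60, (1, 0): .40, (1, 1): .00},
    (2, 1): {(0, 0): .30, (0, 1): .20, (1, 0): .30, (1, 1): .20},
    (2, 2): {(0, 0): .25, (0, 1): .25, (1, 0): .15, (1, 1): .35},
}

def prob(i, j, pred):
    """Probability of the event pred(a, b) in treatment <i, j>."""
    return sum(p for (a, b), p in tables[(i, j)].items() if pred(a, b))

def independent(i, j):
    """Check P(A=a, B=b) = P(A=a) P(B=b) for all a, b in treatment <i, j>."""
    t = tables[(i, j)]
    pa = {a: prob(i, j, lambda x, y, a=a: x == a) for a in (0, 1)}
    pb = {b: prob(i, j, lambda x, y, b=b: y == b) for b in (0, 1)}
    return all(abs(t[(a, b)] - pa[a] * pb[b]) < 1e-9
               for a in (0, 1) for b in (0, 1))

assert abs(prob(1, 1, lambda a, b: a == b) - 1) < 1e-9      # (a) A11 = B11 a.s.
assert abs(prob(1, 2, lambda a, b: a == 1 - b) - 1) < 1e-9  # (b) A12 = 1 - B12 a.s.
assert independent(2, 1)                                    # (c)
assert not independent(2, 2)                                # (d)
```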

The Joint Distribution Criterion

The following key theorem gives a condition equivalent to selective influence. It is stated here for selectively influenced random variables,
and stated more generally in Dzhafarov and Kujala (2010).

Theorem 10.1 (Dzhafarov & Kujala, 2010) Random variables in the list [A, B,..., Z] are selectively influenced by factors in the list Φ, respectively, if and only if there is a random vector <H1,1,..., H1,I, H2,1,..., H2,J,..., Hn,1,..., Hn,M> such that for every treatment φ = <i, j,..., m>

<H1,i, H2,j,..., Hn,m> ≈ <A, B,..., Z>.

Proof: Suppose [A, B,..., Z] ↫ Φ. Then a random entity C exists, as described in Definition 10.1 of selective influence. For every level k of every factor α, let

Hα,k = fα,k(C).

Random entity C is defined on a probability space, so when an observation of C is taken, for every level k of every factor α, fα,k(C) takes
on a value. Then random variables H1,1,..., H1,I, H2,1,..., H2,J,..., Hn,1,...,
Hn,M have a joint distribution. Then <H1,1,..., H1,I, H2,1,..., H2,J,..., Hn,1,...,
Hn,M> is a random vector, and for every treatment φ = <i, j,..., m>,

<H1,i, H2,j,..., Hn,m> ≈ <A, B,..., Z>.

On the other hand, suppose the random vector <H1,1,..., H1,I, H2,1,...,
H2,J,..., Hn,1,..., Hn,M> exists as described in the statement of the theorem.
Let C = <H1,1,..., H1,I, H2,1,..., H2,J,..., Hn,1,..., Hn,M>. For every level k of every factor α, let

fα,k(C) = Hα,k.

Then for every treatment φ = <i, j,..., m>

<A, B,..., Z> ≈ <f1,i(C), f2,j(C),..., fn,m(C)>.

Hence, [A, B,..., Z] ↫ Φ. ∎
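The second half of the proof can be made concrete with a toy example. Everything below is hypothetical: C = (C1, C2) is uniform on a 3 × 3 grid, and f_A and f_B stand in for the functions fα,k of the theorem. Because all four variables HA,1, HA,2, HB,1, HB,2 are functions of the single entity C, they automatically have a joint distribution, and the pair observed in each treatment is simply a coordinate projection of that vector.

```python
from itertools import product
from collections import Counter

# Hypothetical random entity C = (C1, C2), uniform on a 3 x 3 grid.
sample_space = list(product(range(3), range(3)))  # each point has probability 1/9

def f_A(i, c):
    """H_{A,i} = f_{A,i}(C); depends on C only through C1."""
    return (c[0] + i) % 3

def f_B(j, c):
    """H_{B,j} = f_{B,j}(C); depends on C only through C2."""
    return c[1] * j

def pair_counts(i, j):
    """Counts (out of 9) for <H_{A,i}, H_{B,j}>, i.e. <A, B> in treatment <i, j>."""
    return Counter((f_A(i, c), f_B(j, c)) for c in sample_space)

def marginal(counts, coord):
    """Collapse a pair distribution onto one coordinate."""
    m = Counter()
    for ab, n in counts.items():
        m[ab[coord]] += n
    return m

# Because the whole H-vector has a joint distribution, complete marginal
# selectivity falls out: A's distribution depends only on i, B's only on j.
for i in (1, 2):
    assert marginal(pair_counts(i, 1), 0) == marginal(pair_counts(i, 2), 0)
for j in (1, 2):
    assert marginal(pair_counts(1, j), 1) == marginal(pair_counts(2, j), 1)
```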

The Cosphericity Test

Relationships among random variables having a joint multivariate normal distribution are determined by the correlations between each pair.
Factors selectively influence such random variables in the sense of
Definition 10.1 if and only if their correlations have a certain form. It
turns out that correlations of this form are needed, but not sufficient, for
factors to selectively influence random variables regardless of their
distributions. The required form of the correlation was stated by
Dzhafarov (2003a). The cosphericity test for it was developed by Kujala
and Dzhafarov (2008), whose discussion we summarize here.
To simplify discussion, consider two factors, Factor Α with levels i =
1, 2 and Factor Β with levels j = 1, 2. For each pair of levels <i, j>
consider bivariate normal random variables <Aij, Bij>. When Factor Α is
at level i and Factor Β is at level j let their correlation be ρ(ij).
It is convenient to consider standardized random variables, each with
mean 0 and variance 1. Suppose the random variables are not
standardized to begin with. Suppose [A, B] ↫ <Α, Β>. Complete
marginal selectivity must be satisfied, so the marginal distribution of Aij
depends only on level i and the marginal distribution of Bij depends only
on level j. We can write

E[Ai] = μA(i), V[Ai] = σA²(i) and E[Bj] = μB(j), V[Bj] = σB²(j),

where E denotes expected value and V denotes variance. The corresponding standardized random variables are

$$\frac{A_i - \mu_A(i)}{\sigma_A(i)} \quad\text{and}\quad \frac{B_j - \mu_B(j)}{\sigma_B(j)}.$$

This linear transformation does not change the correlation between Ai and Bj. Further, the transformed random variables are selectively
influenced by Factors Α and Β respectively. We suppose the
transformation has been done if necessary, so we are dealing with
standardized random variables having mean 0 and variance 1.
The form the correlation must have for random variables A and B
above to be selectively influenced by Factors Α and Β, respectively, is in
the following theorem.

Theorem 10.2 (Dzhafarov, 2003a; Kujala & Dzhafarov, 2008) Standard bivariate normal random variables <A, B> are selectively
influenced by <Α, Β > in the sense of Definition 10.1 if for every level i
of Factor Α there are numbers a1(i),..., an(i) and for every level j of Factor
Β there are numbers b1(j),..., bn(j) such that the correlation between Aij
and Bij has the form

$$\rho(ij) = \sum_{k=1}^{n} a_k(i)\, b_k(j) \qquad (10.2)$$

for some n > 1, with

$$\sum_{k=1}^{n} a_k^2(i) \le 1 \quad\text{and}\quad \sum_{k=1}^{n} b_k^2(j) \le 1. \qquad (10.3)$$

Proof: Suppose Equations (10.2) and (10.3) hold. Let C1,..., Cn, SA
and SB be independent standard normal random variables. Let

$$A_i = \sqrt{1 - \sum_{k=1}^{n} a_k^2(i)}\; S_A + \sum_{k=1}^{n} a_k(i)\, C_k = f_{1i}(C_1, \ldots, C_n, S_A)$$

$$B_j = \sqrt{1 - \sum_{k=1}^{n} b_k^2(j)}\; S_B + \sum_{k=1}^{n} b_k(j)\, C_k = f_{2j}(C_1, \ldots, C_n, S_B).$$

It is straightforward to check that E[Ai] = E[Bj] = 0. Also

$$V[A_i] = \left(1 - \sum_{k=1}^{n} a_k^2(i)\right) V[S_A] + \sum_{k=1}^{n} a_k^2(i)\, V[C_k] = 1.$$

Likewise, V[Bj] = 1.
Because Ai and Bj are standard, their correlation equals their covariance,

$$E[(A_i - E[A_i])(B_j - E[B_j])] = E[A_i B_j] = E\left[\sum_{k=1}^{n} a_k(i)\, b_k(j)\, C_k^2\right] = \sum_{k=1}^{n} a_k(i)\, b_k(j) = \rho(ij).$$

Bivariate normal random variables are completely determined by their two means, two variances and their correlation. Hence <Aij, Bij> ≈ <Ai, Bj> and thus can be expressed as in Equation (10.1). ∎
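The construction in the proof is easy to simulate. The sketch below uses hypothetical coefficients ak(i) and bk(j), chosen so each sum of squares is at most 1, and checks by Monte Carlo that the sample correlation of <Ai, Bj> lands near Σk ak(i)bk(j) in every treatment, as Equation (10.2) requires.

```python
import math
import random

random.seed(1)

# Hypothetical coefficients a_k(i), b_k(j); each sum of squares is <= 1,
# as constraints (10.3) require.
a = {1: (0.5, 0.5), 2: (0.9, 0.0)}
b = {1: (0.6, 0.0), 2: (0.3, 0.8)}

def draw_pair(i, j):
    """One draw of <A_i, B_j>, built from shared C_1, C_2 and private S_A, S_B."""
    c = [random.gauss(0, 1), random.gauss(0, 1)]
    s_a, s_b = random.gauss(0, 1), random.gauss(0, 1)
    A = (math.sqrt(1 - sum(x * x for x in a[i])) * s_a
         + sum(x * ck for x, ck in zip(a[i], c)))
    B = (math.sqrt(1 - sum(y * y for y in b[j])) * s_b
         + sum(y * ck for y, ck in zip(b[j], c)))
    return A, B

def corr(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / math.sqrt(vx * vy)

# Equation (10.2): the correlation in treatment <i, j> should be close
# to sum_k a_k(i) b_k(j).
for i in (1, 2):
    for j in (1, 2):
        predicted = sum(x * y for x, y in zip(a[i], b[j]))
        observed = corr([draw_pair(i, j) for _ in range(100_000)])
        assert abs(observed - predicted) < 0.03
```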

For bivariate normal random variables, the condition that correlations satisfy Equations (10.2) and (10.3) is necessary and sufficient for them to be selectively influenced by factors Φ, as discussed below.

Cosphericity

Testing Eq. (10.2) with constraints (10.3) requires estimating the intermediate quantities a1,..., bn. The equivalent cosphericity condition
can be tested directly with the correlations. (The condition tested is
called cosphericity because if it holds the list of quantities in Eq. (10.2)
can be extended to a1,..., an, an+1, an+2 and b1,..., bn, bn+1, bn+2 that form
coordinates of points on the surface of a unit hypersphere.)

Definition 10.3 (Kujala & Dzhafarov, 2008) Correlations ρ(i, j), i, j = 1, 2, satisfy cosphericity if

$$|\rho(11)\rho(21) - \rho(12)\rho(22)| \le \sqrt{(1-\rho(11)^2)(1-\rho(21)^2)} + \sqrt{(1-\rho(12)^2)(1-\rho(22)^2)}.$$

Theorem 10.3 (Kujala & Dzhafarov, 2008, Proposition 3) Correlations ρ(i, j), i, j = 1, 2 satisfy cosphericity if and only if they satisfy Eq. (10.2) with constraints (10.3).

For proof, see Kujala & Dzhafarov (2008).

For an arbitrary pair of random variables to be selectively influenced by a pair of factors, each factor with two levels, the correlations between
the random variables need the required form, or, equivalently, need to
satisfy cosphericity.

Theorem 10.4 (Kujala & Dzhafarov, 2008, Proposition 5) Suppose Factor Α has levels i = 1, 2; Factor Β has levels j = 1, 2; and [A, B] is a list of random variables. Then [A, B] ↫ <Α, Β> only if correlations ρ(i, j) satisfy cosphericity, where ρ(i, j) is the correlation between Aij and Bij, for i, j = 1, 2.

For proof, see Kujala & Dzhafarov (2008, Proposition 5).

The following examples from Kujala and Dzhafarov (2008) can be easily checked. The cosphericity test is passed for the following
correlations

ρ(11) = .7299 ρ(12) = .7299
ρ(21) = .7299 ρ(22) = −.6322

but not for

ρ(11) = .7743 ρ(12) = .7742
ρ(21) = .7742 ρ(22) = −.7742.

Hence, it is possible that random variables with the first array of correlations are selectively influenced by two factors, each with levels 1
and 2. But such is not possible for random variables with the second
array of correlations.
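The test itself is a one-line computation. The function below is an illustrative transcription of the cosphericity inequality of Definition 10.3, not the authors' code; it checks the two correlation arrays just quoted.

```python
import math

def cosphericity(r11, r12, r21, r22):
    """Cosphericity test of Kujala and Dzhafarov (2008) for a 2 x 2 design."""
    lhs = abs(r11 * r21 - r12 * r22)
    rhs = (math.sqrt((1 - r11 ** 2) * (1 - r21 ** 2))
           + math.sqrt((1 - r12 ** 2) * (1 - r22 ** 2)))
    return lhs <= rhs

# First array from Kujala and Dzhafarov (2008): the test is passed.
assert cosphericity(.7299, .7299, .7299, -.6322)
# Second array: the test fails, so selective influence is ruled out.
assert not cosphericity(.7743, .7742, .7742, -.7742)
```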
If [A, B] are selectively influenced by <Α, Β>, respectively, then
transformations of A and B that preserve complete marginal selectivity
must also satisfy the cosphericity test. Suppose for i = 1, 2, hi is a
measurable function of Ai and for j = 1, 2, kj is a measurable function of
Bj. Then from Definition 10.1 of selective influence, by considering
composition of functions, random variables hi(Ai) and kj(Bj) are
selectively influenced by <Α, Β >. But satisfaction of the cosphericity
test for A and B does not guarantee satisfaction of it for transformed A
and transformed B. Hence, an infinite number of tests are available,
failure of any one of which rejects selective influence of the factors on
[A, B], even if the test is passed for A and B themselves. For details
about appropriate transformations, see Kujala and Dzhafarov (2008) and
Dzhafarov and Kujala (2010).
Suppose random variables A and B can be transformed appropriately
to bivariate normal <Aij*, Bij*>. Then cosphericity for only this
transformed pair need be tested. Cosphericity is a necessary but not a
sufficient condition for factors to selectively influence arbitrary random
variables. But for the special case of bivariate normal random variables,
it is both necessary and sufficient, summarized as follows:
Suppose Factor Α has levels i = 1, 2 and Factor Β has levels j = 1, 2.
Suppose for every pair of levels i and j, random vector <Aij, Bij> has a bivariate normal distribution, with E[Aij] = μA(i), E[Bij] = μB(j), V[Aij] = σA²(i), V[Bij] = σB²(j). The following three statements are equivalent:

1. There are independent random variables C1,..., Cn and functions f1i and f2j such that

Aij ≈ Ai = f1i(C1,..., Cn) and Bij ≈ Bj = f2j(C1,..., Cn),

and <Ai, Bj, C1,..., Cn> have a multivariate distribution (which need not be multivariate normal).

2. Correlations between Aij and Bij, ρ(i, j), i, j = 1,2, have the form of
Equation (10.2) with constraints (10.3).

3. Correlations between Aij and Bij, ρ(i, j), i, j = 1, 2, satisfy cosphericity.

For equivalence of Conditions 1 and 2 see Dzhafarov (2003a). Equivalence of the three conditions is established in Kujala and
Dzhafarov (2008), Propositions 1, 2, and 7.

Remark 5

Results above about cosphericity are stated for a 2 × 2 design, that is,
each factor has two levels. In a design with more factor levels, for the
factors to selectively influence random variables it is necessary that
cosphericity hold in every 2 × 2 sub-design. This is not sufficient,
however; see Kujala & Dzhafarov (2008, p. 142). Definition 10.1 might
apply for <i, j> ∈ {1, 2} × {1, 2} with a particular random entity C1 and it might apply for <i, j> ∈ {2, 3} × {1, 2} with a different particular random entity C2. It is not known whether one can always find a single random entity C allowing Definition 10.1 to hold for the union of these levels, <i, j> ∈ {1, 2, 3} × {1, 2}.

The Distance Test

The hypothesis that certain factors selectively influence certain random variables can be empirically tested if observations of the random
variables can be made. The Distance Test was developed for this
purpose by Kujala and Dzhafarov (2008); discussion here follows theirs.
The Distance Test is a consequence of the Minkowski inequality (e.g.,
Royden, 1968). For two random variables F and G defined on the same
probability space with a probability measure λ, the inequality has the
form, for any p > 1,

$$\left(\int |F(c) + G(c)|^p \, d\lambda(c)\right)^{1/p} \le \left(\int |F(c)|^p \, d\lambda(c)\right)^{1/p} + \left(\int |G(c)|^p \, d\lambda(c)\right)^{1/p}.$$

The integral is over all values of c. It follows that for any p > 1,

$$\sqrt[p]{E[|F + G|^p]} \le \sqrt[p]{E[|F|^p]} + \sqrt[p]{E[|G|^p]}.$$

Because the Minkowski inequality holds if some integrals are infinite, the above inequality holds if some expected values are infinite.
For two random variables P and Q defined on the same probability
space,

$$\sqrt[p]{E[|P - Q|^p]}$$

is a distance: it is 0 if and only if P = Q almost everywhere, it is symmetric, and it satisfies the triangle inequality. (When the function is
0, P may not equal Q everywhere, so the formula provides a distance
between equivalence classes of random variables. To simplify
discussion, we assume here that when the function is 0, P = Q.)
To simplify notation, consider a simple situation. Consider a Factor
Α with two levels, i = 1, 2 and a Factor Β with two levels, j = 1, 2. For
every pair <i, j> of a level i of Factor Α and a level j of Factor Β consider
a pair of jointly distributed random variables Aij and Bij. Consider the
random vector <A, B> to be a member of the family of random vectors
{< Aij, Bij>| i = 1, 2; j = 1, 2}. For every i and j, there is a probability
space on which the pair < Aij, Bij> have a joint distribution. For any p > 1
we can let

$$D_{ij} = \sqrt[p]{E[|A_{ij} - B_{ij}|^p]}.$$

Now suppose <A, B> is selectively influenced by <Α, Β>, with A selectively influenced by Factor Α and B selectively influenced by Factor
Β. Then the following inequality must hold,

max{D11, D12, D21, D22} ≤ (D11 + D12 + D21 + D22)/2. (10.4)

This is an example of the distance test.

To see that Inequality (10.4) holds, start with the triangle inequality
for random variables P, Q and R,

$$\sqrt[p]{E[|P - Q|^p]} = \sqrt[p]{E[|(P - R) + (R - Q)|^p]} \le \sqrt[p]{E[|P - R|^p]} + \sqrt[p]{E[|R - Q|^p]}.$$

The triangle inequality leads to

$$\sqrt[p]{E[|A_{11} - B_{11}|^p]} \le \sqrt[p]{E[|A_{11} - A_{21}|^p]} + \sqrt[p]{E[|A_{21} - B_{11}|^p]}$$

and

$$\sqrt[p]{E[|A_{11} - A_{21}|^p]} \le \sqrt[p]{E[|A_{11} - B_{12}|^p]} + \sqrt[p]{E[|A_{21} - B_{12}|^p]},$$

so

$$\sqrt[p]{E[|A_{11} - B_{11}|^p]} \le \sqrt[p]{E[|A_{11} - B_{12}|^p]} + \sqrt[p]{E[|A_{21} - B_{12}|^p]} + \sqrt[p]{E[|A_{21} - B_{11}|^p]}. \qquad (10.5)$$

At this point, without the assumption that <A, B> is selectively influenced by <Α, Β>, the expressions on the right-hand side need not be distances as defined above, and Inequality (10.4) does not follow. But if
we assume <A, B> is selectively influenced by <Α, Β >, then there exists
a random entity C such that for every pair of levels <i, j>

<Aij, Bij> ≈ <f1i(C), f2j(C)>.

Subscript i is not needed for random variable Aij, so we can denote it Ai (for every j), and subscript j is not needed for random variable Bij, so we
can denote it Bj (for every i). Then we can write for every i and j

$$D_{ij} = \sqrt[p]{E[|A_{ij} - B_{ij}|^p]} = \sqrt[p]{E[|A_i - B_j|^p]}.$$

Inequality (10.5) becomes

$$\sqrt[p]{E[|A_1 - B_1|^p]} \le \sqrt[p]{E[|A_1 - B_2|^p]} + \sqrt[p]{E[|A_2 - B_2|^p]} + \sqrt[p]{E[|A_2 - B_1|^p]},$$

so

D11 ≤ D12 + D21 + D22.

Then

2 D11 ≤ D11 + D12 + D21 + D22.

Similar reasoning leads to

2 D12 ≤ D11 + D12 + D21 + D22,


2 D21 ≤ D11 + D12 + D21 + D22,
2 D22 ≤ D11 + D12 + D21 + D22,

and Inequality (10.4) follows (Kujala & Dzhafarov, 2008).
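Inequality (10.4) can be verified exactly on a small finite example. In the sketch below, C is uniform on {0, 1, 2, 3} and the Ai and Bj are hypothetical functions of C, so selective influence holds by construction; the Dij are then computed exactly for p = 2.

```python
# C is uniform on {0, 1, 2, 3}; A_i and B_j are hypothetical functions of C,
# so the selective-influence assumption holds by construction.
omega = [0, 1, 2, 3]
A = {1: lambda c: c, 2: lambda c: c * c}
B = {1: lambda c: 2 * c, 2: lambda c: 1 - c}
p = 2

def D(i, j):
    """p-th root of E[|A_i - B_j|^p], computed exactly on omega."""
    return (sum(abs(A[i](c) - B[j](c)) ** p for c in omega) / len(omega)) ** (1 / p)

ds = [D(i, j) for i in (1, 2) for j in (1, 2)]
# Inequality (10.4): no single distance exceeds half the total.
assert max(ds) <= sum(ds) / 2
```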


Inequality (10.4) holds more generally, for transformations of the
random variables A and B. Suppose as before <A, B> are selectively
influenced by <Α, Β>, with A selectively influenced by Factor Α (with its two levels i = 1, 2) and B selectively influenced by Factor Β (with its two
levels, j = 1, 2). Suppose for i = 1, 2, hi is a measurable function of Ai
and for j = 1, 2, kj is a measurable function of Bj. Then from Definition
10.1 of selective influence, by considering composition of functions,
random variables hi(Ai) and kj(Bj) are selectively influenced by <Α, Β >,
respectively.
Now for any p > 1, for every pair of levels i, j let

$$s_{ij} = \sqrt[p]{E[|h_i(A_i) - k_j(B_j)|^p]}.$$

Reasoning as before leads to The Distance Test (Kujala & Dzhafarov, 2008, Proposition 8),

max{s11, s12, s21, s22} ≤ (s11 + s12 + s21 + s22)/2.

The Distance Test is not a sufficient condition, so even if it is satisfied for all transformations of A and B for all p > 1, one cannot
conclude that <A, B> are selectively influenced by <Α, Β >. Nonetheless,
an infinite number of tests are provided, and because the test is a
necessary condition, a violation for any value of p for any
transformations of A and B leads to rejection of selective influence.
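The whole family of tests can be sketched the same way. The self-contained block below re-uses a toy construction of the same kind (C uniform on {0, 1, 2, 3}, hypothetical Ai and Bj) together with arbitrary measurable transformations hi and kj. Under selective influence the inequality must hold for every p and every choice of transformations, so a single violation would refute selective influence.

```python
import math

# Hypothetical A_i, B_j on a common finite space (C uniform on {0, 1, 2, 3}).
omega = [0, 1, 2, 3]
A = {1: lambda c: c, 2: lambda c: c * c}
B = {1: lambda c: 2 * c, 2: lambda c: 1 - c}

# Arbitrary measurable transformations, one per factor level (hypothetical).
h = {1: lambda x: x ** 3, 2: lambda x: math.exp(x)}
k = {1: lambda x: abs(x), 2: lambda x: -x}

def s(i, j, p):
    """p-th root of E[|h_i(A_i) - k_j(B_j)|^p] on the common space."""
    return (sum(abs(h[i](A[i](c)) - k[j](B[j](c))) ** p for c in omega)
            / len(omega)) ** (1 / p)

# Under selective influence the test must pass for every p > 1 and every
# choice of transformations h_i, k_j.
for p in (1.5, 2, 3):
    ss = [s(i, j, p) for i in (1, 2) for j in (1, 2)]
    assert max(ss) <= sum(ss) / 2
```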
The Distance Test is described here for two random variables and two
factors, each with two levels. It is extended to arbitrary sets of random
variables and factors by Dzhafarov and Kujala (2010). There it is shown that in a design with more factor levels, it is necessary, though not sufficient, for the factors to selectively influence the random variables that the distance test be passed for every 2 × 2 sub-design.

Concluding Remarks

The knotty problem of what it means for experimental factors to


selectively influence random variables that are dependent on one another
is disentangled in the work summarized here, a part of the theory of Dzhafarov (2003a), Dzhafarov and Gluhovsky (2006), Kujala and
Dzhafarov (2008) and Dzhafarov and Kujala (2010). The Joint
Distribution Criterion at the core of the theory is elegant yet powerful
and fundamental. The theory can be tested with observations of the
random variables themselves or their correlations. A natural result of the
theory is that if random variables are selectively influenced by factors, so
are individual transformations of the random variables. Hence, the
theory can be tested with transformed random variables. Because of its
generality the theory is widely applicable. For discovery of cognitive
architecture, ancillary assumptions must be made about the rules by
which the random variables are combined, and these lead to testable
predictions (Chapter 6).

Conclusion: Selectively Influencing Mental Processes

By focusing on a few carefully chosen factors an investigator can bring an outline of the entire system into view. If the factors selectively
influence processes, interactions of the factors reveal the arrangement of
the component processes. The function of a process is discerned by
examining the kinds of factors that selectively influence it.
For some theories about factors selectively influencing processes,
representation and uniqueness theorems have been proven. Details of
these are beyond the scope of this work, but likely to be important in the
future. The theorems state that data satisfy certain conditions if and only
if a mathematical structure of a particular class could have generated the
data. The theories are falsifiable; if the conditions are not satisfied, then
no structure of that class could have generated the data. Of course, if the
conditions are satisfied an investigator would want to know whether the
structure really exists and really was used to generate the data. These
questions cannot be settled by data.
At this time, the architectures that can be revealed by selectively
influencing processes are few. One can complain that with discrete
starting and finishing points, absence of feedback, and so on, the
processes investigated are so constrained as to be almost crystallized.

Indeed, it is hard to see how to selectively influence processes in the more fluid arrangements that surely exist. But the restriction to nearly
crystalline structures has its benefits; one is that they can be firmly
established.
References

Anderson, J. R., & Bower, G. H. (1974). Human associative memory. New York: Wiley.
Arbuckle, J., & Larimer, J. (1976). The number of two-way tables satisfying certain additivity axioms. Journal of Mathematical Psychology, 13, 89-100.
Ashby, F. G. (2000). A stochastic version of general recognition theory. Journal of
Mathematical Psychology, 44, 310-329.
Ashby, F. G., Boynton, G., & Lee, W. W. (1994). Categorization response time with
multidimensional stimuli. Perception & Psychophysics, 55, 11-27.
Ashby, F. G., & Maddox, W. T. (1994). A response time theory of separability and
integrality in speeded classification. Journal of Mathematical Psychology, 38,
423–466.
Ashby, F. G., Prinzmetal, W., Ivry, R., & Maddox, W. T. (1996). A formal theory of
feature binding in object perception. Psychological Review, 103, 165-192.
Ashby, F. G., Tein, J.-Y., & Balakrishnan, J. D. (1993). Response time distributions in
memory scanning. Journal of Mathematical Psychology, 37, 526-555.
Ashby, F. G., & Townsend, J. T. (1980). Decomposing the reaction time distribution:
Pure insertion and selective influence revisited. Journal of Mathematical
Psychology, 21, 93-123.
Backus, B. T., & Sternberg, S. (1988, November). Attentional tradeoff across space early
in visual processing: New evidence. Paper presented at the meeting of the
Psychonomic Society, Chicago.
Bamber, D., & van Santen, J. P. H. (1978). A general method for the analysis of
conditional response frequencies. Unpublished manuscript, V. A. Hospital, St.
Cloud, MN.
Bamber, D., & van Santen, J. P. H. (1980, August). Testing discrete state models using
conditional probability matrices. Paper presented at the Mathematical
Psychology Meeting, Madison, WI.
Barth, D. S., Goldberg, N., Brett, B., & Di, S. (1995). The spatiotemporal organization of
auditory, visual and auditory-visual evoked-potentials in rat cortex. Brain
Research, 678, 177-190.
Bartle, R. G. (1964). The elements of real analysis. New York: Wiley.
Bartlett, J. C., & Searcy, J. (1993). Inversion and configuration of faces. Cognitive
Psychology, 25, 281-316.
Batchelder, W. H., & Riefer, D. M. (1986). The statistical analysis of a model for storage
and retrieval processes in human memory. British Journal of Mathematical and
Statistical Psychology, 39, 120-149.


Batchelder, W. H., & Riefer, D. M. (1990). Multinomial processing models of source monitoring. Psychological Review, 97, 548-564.
Batchelder, W. H., & Riefer, D. M. (1999). Theoretical and empirical review of
multinomial process tree modeling. Psychonomic Bulletin & Review, 6, 57-86.
Batchelder, W. H., Riefer, D. M., & Hu, X. (1994). Measuring memory factors in source
monitoring: Reply to Kinchla. Psychological Review, 101, 172-176.
Bayen, U. J., Murnane, K., & Erdfelder, E. (1996). Source discrimination, item detection,
and multinomial models of source monitoring. Journal of Experimental
Psychology: Learning, Memory and Cognition, 22, 197-215.
Borger, R. (1963). The refractory period and serial choice reactions. Quarterly Journal of
Experimental Psychology, 15, 1-12.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon.
Brown, J. (1958). Some tests of the decay theory of immediate memory. Quarterly
Journal of Experimental Psychology, 10, 12-21.
Brown, S. W. (1995). Time, change, and motion: The effects of stimulus movement on
temporal perception. Perception & Psychophysics, 57, 105-116.
Brown, S. W. (1997). Attentional resources in timing: Interference effects in concurrent
temporal and nontemporal working memory tasks. Perception &
Psychophysics, 59, 1118-1140.
Brown, S. W., & West, A. N. (1990). Multiple timing and the allocation of attention. Acta
Psychologica, 75, 103-121.
Bruce, C., Desimone, R., & Gross, C. G. (1981). Visual properties of neurons in a
polysensory area in superior temporal sulcus of the macaque. Journal of
Neurophysiology, 46, 369-384.
Buchner, A., & Erdfelder, E. (2005). Word frequency of irrelevant speech distractors
affects serial recall. Memory & Cognition, 33, 86-97.
Buchner, A., Erdfelder, E., Steffens, M. C., & Martensen, H. (1997). The nature of
memory processes underlying recognition judgements in the process
dissociation procedure. Memory & Cognition, 25, 508-517.
Bundesen, C. (1990). Theory of visual attention. Psychological Review, 97, 523-547.
Busemeyer, J. R., & Myung, I. J. (1988). A new method for investigating prototype
learning. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 14, 3-11.
Card, S. K., Moran, T. P., & Newell, A. (1993). The psychology of human computer
interaction. Hillsdale, NJ: Erlbaum.
Cattell, J. M. (1947). Attention and reaction. In R. S. Woodworth (Ed. & Trans.) James
McKeen Cattell, Man of Science. (Vol. 1, pp. 252-255). Lancaster, PA: The
Science Press. (Original work published 1893.)
Chechile, R. (1977). Storage-retrieval analysis of acoustic similarity. Memory &
Cognition, 5, 535-540.

Chechile, R. A., & Meyer, D. L. (1976). A Bayesian procedure for separately estimating storage and retrieval components of forgetting. Journal of Mathematical
Psychology, 13, 269-295.
Chen, M.-S., & Chen, J.-Y. (2003). Scheduling of mental processes in the Stroop task: The critical path method approach. Chinese Journal of Psychology, 45, 379-400.
Cinlar, E. (1975). Introduction to stochastic processes. Englewood Cliffs, NJ: Prentice-
Hall.
Clark, F. C. (1958). The effect of deprivation and frequency of reinforcement on
variable-interval responding. Journal of the Experimental Analysis of Behavior,
1, 221-228.
Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic
processes: A parallel distributed processing account of the Stroop effect.
Psychological Review, 97, 332-361.
Coles, M. G. H. (1989). Modern mind-brain reading: Psychophysiology, physiology, and
cognition. Psychophysiology, 26, 251-269.
Colonius, H. (1990). Possibly dependent probability summation of reaction time. Journal
of Mathematical Psychology, 34, 253–275.
Colonius, H., & Diederich, A. (2006). The race model inequality: Interpreting a geometric
measure of the amount of violation. Psychological Review, 113, 148-154.
Colonius, H., & Diederich, A. (2009). Time-Window-of-Integration (TWIN) model for
saccadic reaction time: Effect of auditory masker level on visual-auditory
spatial interaction elevation. Brain Topography, 21, 177-184.
Colonius, H., & Vorberg, D. (1994). Distribution inequalities for parallel models with
unlimited capacity. Journal of Mathematical Psychology, 38, 35-58.
Conway, R. W., Maxwell, W. L., & Miller, L. W. (1967). Theory of scheduling. Reading,
MA: Addison-Wesley.
Cowan, N. (2005). Working memory capacity. New York, NY: Psychology Press, Taylor
& Francis Group.
Craik, K. J. W. (1948). Theory of the human operator in control systems, II. British
Journal of Psychology, 38, 142-148.
Curran, T., & Hintzman, D. L. (1995). Violations of the independence assumption in
process dissociation. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 21, 531-547.
Curran, T., & Hintzman, D. L. (1997). Consequences and causes of correlations in
process dissociation. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 23, 496-504.
Davis, R. (1957). The human operator as a single channel information system. Quarterly
Journal of Experimental Psychology, 9, 119-129.
de Jong, R. (1993). Multiple bottlenecks in overlapping task performance. Journal of
Experimental Psychology: Human Perception and Performance, 19, 965-980.

Dehaene, S. (1996). The organization of brain activations in number comparison: Event-related potentials and the additive factors method. Journal of Cognitive
Neuroscience, 8, 47-68.
Diederich, A. (1992). Probability inequalities for testing separate activation models of
divided attention. Perception & Psychophysics, 52, 714-716.
Diederich, A. (1995). Intersensory facilitation of reaction time: Evaluation of counter and
diffusion coactivation models. Journal of Mathematical Psychology, 39, 197-
215.
Diederich, A., & Colonius, H. (1991). A further test of the superposition model for the
redundant-signals effect in bimodal detection. Perception & Psychophysics, 50,
83-86.
Dodin, B. (1985). Reducibility of stochastic networks. Omega International Journal of
Management Science, 13, 223–232.
Dodson, C. S., Prinzmetal, W., & Shimamura, A. P. (1998). Using Excel to estimate
parameters from observed data: An example from source memory data.
Behavior Research Methods, Instruments & Computers, 30, 517-526.
Donders, F. C. (1868). Die Schnelligkeit Psychischer Processe. Archiv fur Anatomie und
Physiologie, 657-681. [On the speed of mental processes.] In W. G. Koster (Ed.
and Trans.), (1969), Attention and performance II (pp. 412-431). Amsterdam:
North Holland.
Duncan, J. (1979). Divided attention: The whole is more than the sum of its parts.
Journal of Experimental Psychology: Human Perception and Performance, 5,
216-228.
Duncan, J., & Humphreys, G. (1989). Visual search and stimulus similarity.
Psychological Review, 96, 433-458.
Dutta, A., Schweickert, R., Choi, S., & Proctor, R. (1995). Cross-task cross-talk in
memory and perception. Acta Psychologica, 90, 49-62.
Dzhafarov, E. N. (1992). The structure of simple reaction time to step-function signals.
Journal of Mathematical Psychology, 36, 235-268.
Dzhafarov, E. N. (1996, August). A canonical representation for selectively influenced
processes and component times. Paper presented at the Society for
Mathematical Psychology Meeting, Chapel Hill, NC.
Dzhafarov, E. N. (2003a). Selective influence through conditional independence.
Psychometrika, 68, 7-25.
Dzhafarov, E. N. (2003b). Thurstonian type representations for “same-different”
discriminations: Deterministic decisions and independent images. Journal of
Mathematical Psychology, 47, 208-228.
Dzhafarov, E. N. (2003c). Thurstonian type representations for “same-different”
discriminations: Probabilistic decisions and interdependent images. Journal of
Mathematical Psychology, 47, 229-243.
Dzhafarov, E. N., & Colonius, H. (2006). Regular minimality: A fundamental law of
discrimination. In H. Colonius & E. N. Dzhafarov (Eds.), Measurement and
representation of sensations (pp. 1-46). Mahwah, NJ: Erlbaum.
Dzhafarov, E. N., & Gluhovsky, I. (2006). Notes on selective influence, probabilistic
causality, and probabilistic dimensionality. Journal of Mathematical
Psychology, 50, 390-401.
Dzhafarov, E. N., & Kujala, J. V. (2010). The joint distribution criterion and the distance
tests for selective probabilistic causality. Frontiers in Quantitative Psychology
and Measurement, 1, 211. doi:10.3389/fpsyg.2010.0021.
Dzhafarov, E. N., & Kujala, J. V. (2011). Selectivity in probabilistic causality: Drawing
arrows from inputs to stochastic outputs. arXiv:1108.3074v2.
Dzhafarov, E. N., & Rouder, J. N. (1996). Empirical discriminability of two models for
stochastic relationship between additive components of reaction time. Journal
of Mathematical Psychology, 40, 48-63.
Dzhafarov, E. N., & Schweickert, R. (1995). Decompositions of response times: An
almost general theory. Journal of Mathematical Psychology, 39, 285-314.
Dzhafarov, E. N., Schweickert, R., & Sung, K. (2004). Mental architectures with
selectively influenced but stochastically interdependent components. Journal of
Mathematical Psychology, 48, 51-64.
Egeth, H., & Dagenbach, D. (1991). Parallel versus serial processing in visual search:
Further evidence from subadditive effects of visual quality. Journal of
Experimental Psychology: Human Perception and Performance, 17, 551-560.
Ehrenstein, A., Schweickert, R., Choi, S., & Proctor, R. W. (1997). Scheduling processes in
working memory: Instructions control the order of memory search and mental
arithmetic. The Quarterly Journal of Experimental Psychology, 50A, 766-802.
Ellenbogen, J. M., Hulbert, J. C., Jiang, Y., & Stickgold, R. (2009). The sleeping brain’s
influence on verbal memory: Boosting resistance to interference. PLoS One, 4,
1-4.
Ellenbogen, J. M., Hulbert, J. C., Stickgold, R., Dinges, D. F., & Thompson-Schill, S. L.
(2006). Interfering with theories of sleep and memory: Sleep, declarative
memory, and associative interference. Current Biology, 16, 1290-1294.
Elmaghraby, S. E. (1977). Activity networks: Project planning and control by network
models. NY: Wiley.
Epstein, R. A., Parker, W. E., & Feiler, A. M. (2008). Evidence for dissociable neural
mechanisms: Two kinds of fMRI repetition suppression? Journal of
Neurophysiology, 99, 2877-2886.
Erdfelder, E., Auer, T. S., Hilbig, B. E., Aßfalg, A., Moshagen, M., & Nadarevic, L.
(2009). Multinomial processing tree models. Zeitschrift für Psychologie/
Journal of Psychology, 217, 108-124.
Eriksen, C. W., & Schultz, D. W. (1979). Information processing in visual search: A
continuous flow conception and experimental results. Perception &
Psychophysics, 25, 249-263.
Estes, W. K. (1991). On types of item coding and source of recall in short-term memory.
In W. E. Hockley & S. Lewandowsky (Eds.), Relating theory and data: Essays
on human memory in honor of Bennet B. Murdock (pp. 155-174). Hillsdale, NJ:
Erlbaum.
Feller, W. (1971). An introduction to probability theory and its applications. Vol. II (2nd
Ed.). NY: John Wiley & Sons.
Fific, M., Little, D. R., & Nosofsky, R. M. (2010). Logical-rule models of classification
response times: A synthesis of mental-architecture, random-walk, and decision-
bound approaches. Psychological Review, 117, 309-348.
Fific, M., Nosofsky, R. M., & Townsend, J. T. (2008). Information-processing architectures
in multidimensional classification: A validation test of the systems factorial
technology. Journal of Experimental Psychology: Human Perception and
Performance, 34, 356-375.
Fific, M., & Townsend, J. T. (2003). Properties of visual search task on two items
revealed by the systems factorial methodology. Paper presented at the meeting
of the Society for Mathematical Psychology, Ogden, Utah.
Fill, J. A., & Machida, M. (2001). Stochastic monotonicity and realizable monotonicity.
The Annals of Probability, 29, 938-978.
Fisher, D. L. (1982). Limited channel models of automatic detection: Capacity and
scanning in visual search. Psychological Review, 89, 662-692.
Fisher, D. L. (1984). Central capacity limits in consistent mapping visual search tasks:
Four channels or more? Cognitive Psychology, 16, 449-484.
Fisher, D. L. (1985). Network models of reaction time: The generalized OP diagram. In
G. d’Ydewalle (Ed.), Cognition, Information Processing and Motivation
(Volume 3). Amsterdam: North-Holland Press, 229-254.
Fisher, D. L., Duffy, S. A., Young, C., & Pollatsek, A. (1988). Understanding the central
processing limit in consistent-mapping visual search tasks. Journal of
Experimental Psychology: Human Perception and Performance, 14, 253-266.
Fisher, D. L., & Glaser, R. A. (1996). Molar and latent models of cognitive slowing:
Implications for aging, dementia, depression, development, and intelligence.
Psychonomic Bulletin & Review, 4, 458-480.
Fisher, D. L., & Goldstein, W. M. (1983). Stochastic PERT networks as models of
cognition: Derivation of the mean, variance, and distribution of reaction time
using order-of-processing (OP) diagrams. Journal of Mathematical Psychology,
27, 121-151.
Fortin, C., Bedard, M. C., & Champagne, J. (2005). Timing during interruptions in
timing. Journal of Experimental Psychology: Human Perception and
Performance, 31, 276-288.
Fortin, C., Rousseau, R., Bourque, P., & Kirouac, E. (1993). Time estimation and
concurrent nontemporal processing: Specific interference from short-term-
memory demands. Perception & Psychophysics, 53, 536-548.
Garner, W. R. (1974). The processing of information and structure. NY: Wiley.
Gathercole, S. E., Frankish, C. R., Pickering, S. J., & Peaker, S. (1999). Phonotactic
influences on short-term memory. Journal of Experimental Psychology:
Learning, Memory and Cognition, 25, 84-95.
Gluck, M. A., & Bower, G. H. (1988). Evaluating an adaptive network model of human
learning. Journal of Memory and Language, 27, 166-195.
Goldstein, W. M., & Fisher, D. L. (1991). Stochastic networks as models of cognition:
Derivation of response time distributions using the order-of-processing method.
Journal of Mathematical Psychology, 35, 214-241.
Goldstein, W. M., & Fisher, D. L. (1992). Stochastic networks as models of cognition:
Deriving predictions for resource-constrained mental processing. Journal of
Mathematical Psychology, 36, 129-145.
Golumbic, M. C. (1980). Algorithmic graph theory and perfect graphs. NY: Academic
Press.
Gondan, M., & Röder, B. (2006). A new method for detecting interactions between the
senses in event-related potentials. Brain Research, 1073, 389-397.
Gray, W. D., John, B. E., & Atwood, M. E. (1993). Project Ernestine: Validating GOMS
for predicting and explaining real-world task performance. Human-Computer
Interaction, 8, 237-309.
Greenwald, A. G. (1972). Doing two things at once: Time-sharing as a function of
ideomotor compatibility. Journal of Experimental Psychology, 94, 52-57.
Harris, J. R., Shaw, M. W., & Bates, M. (1979). Visual search in multicharacter arrays
with and without gaps. Perception & Psychophysics, 26, 69-84.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural
system for face perception. Trends in Cognitive Sciences, 4, 223-233.
Hay, J. F., & Jacoby, L. L. (1996). Separating habit and recollection: Memory slips,
process dissociations, and probability matching. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 22, 1323-1335.
Hays, W. L. (1994). Statistics (5th ed.). Fort Worth: Harcourt Brace College Publishers.
Heathcote, A., Brown, S., Wagenmakers, E. J., & Eidels, A. (2010). Distribution-free
tests of stochastic dominance for small samples. Journal of Mathematical
Psychology, 54, 454-463.
Henson, R. N., Rylands, A., Ross, E., Vuilleumier, P., & Rugg, M. D. (2004). The effect
of repetition lag on electrophysiological and hemodynamic correlates of visual
object priming. Neuroimage, 21, 1674-1689.
Herman, L. M., & Kantowitz, B. H. (1970). The psychological refractory period: Only
half of the double stimulation story? Psychological Bulletin, 73, 74-88.
Hick, W. E., & Welford, A. T. (1956). Central inhibition: Some refractory observations.
Comment. Quarterly Journal of Experimental Psychology, 8, 39-41.
Hoffman, J. E. (1978). Search through a sequentially presented visual display. Perception
& Psychophysics, 23, 1-11.
Hoffman, J. E. (1979). A two-stage model of visual search. Perception &
Psychophysics, 25, 319-327.
Hommel, B. (1998). Automatic stimulus-response translation in dual-task performance.
Journal of Experimental Psychology: Human Perception and Performance, 24,
1368-1384.
Houpt, J. W., & Townsend, J. T. (2010). The statistical properties of the Survivor
Interaction Contrast. Journal of Mathematical Psychology, 54, 446-453.
Howard, R. A. (1971). Dynamic probabilistic systems. Volume I: Markov models. New
York: Wiley.
Hu, X. (1999). Multinomial processing tree models: An implementation. Behavior
Research Methods, Instruments & Computers, 31, 689-695.
Hu, X. (2001). Extending general processing tree models to analyze reaction time
experiments. Journal of Mathematical Psychology, 45, 603-634.
Hu, X., & Batchelder, W. H. (1994). The statistical analysis of general processing tree
models with the EM algorithm. Psychometrika, 59, 21-47.
Hulme, C., Maughan, S., & Brown, G. D. A. (1991). Memory for familiar and unfamiliar
words: Evidence for a long-term memory contribution to short-term memory
span. Journal of Memory and Language, 30, 685-701.
Hulme, C., Roodenrys, S., Schweickert, R., Brown, G. D. A., Martin, S., & Stuart, G.
(1997). Word-frequency effects on short-term memory tasks: Evidence for a
redintegration process in immediate serial recall. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 23, 1217-1232.
Hulme, C., Stuart, G., Brown, G. D. A., & Morin, C. (2003). High- and low-frequency
words are recalled equally well in alternating lists: Evidence for associative
effects in serial recall. Journal of Memory and Language, 49, 500-518.
Ingvalson, E. M., & Wenger, M. J. (2005). A strong test of the dual-mode hypothesis.
Perception & Psychophysics, 67, 14-35.
Jacoby, L. L. (1991). A process dissociation framework: Separating automatic from
intentional uses of memory. Journal of Memory and Language, 30, 513-541.
Jacoby, L. L. (1998). Invariance in automatic influences of memory: Toward a user’s
guide for the process-dissociation procedure. Journal of Experimental
Psychology: Learning, Memory and Cognition, 24, 2-26.
Jacoby, L. L., Begg, I. M., & Toth, J. P. (1997). In defense of functional independence:
Violations of assumptions underlying the process-dissociation procedure?
Journal of Experimental Psychology: Learning, Memory and Cognition, 23,
484-495.
Jacoby, L. L., Toth, J. P., & Yonelinas, A. P. (1993). Separating conscious and
unconscious influences of memory: Measuring recollection. Journal of
Experimental Psychology: General, 122, 139-154.
Jentzsch, I., Leuthold, H., & Ulrich, R. (2007). Decomposing sources of response
slowing in the PRP paradigm. Journal of Experimental Psychology: Human
Perception and Performance, 33, 610-626.
Johnsen, A. M., & Briggs, G. E. (1973). On the locus of display load effects in choice
reactions. Journal of Experimental Psychology, 99, 266-271.
Johnson, A., & Proctor, R. W. (2004). Attention: Theory and practice. Thousand Oaks,
CA: Sage.
Johnston, J. C., McCann, R. S., & Remington, R. W. (1995). Chronometric evidence for
two types of attention. Psychological Science, 6, 365-369.
Johnston, J. C., & McCann, R. S. (2006). On the locus of dual-task interference: Is there a
bottleneck at the stimulus classification stage? Quarterly Journal of
Experimental Psychology, 59, 694-719.
Jolicoeur, P., & Dell’Acqua, R. (2000). Selective influence of second target exposure
duration and Task(1) load effects in the attentional blink phenomenon.
Psychonomic Bulletin & Review, 7, 472-479.
Kaerkes, R., & Möhring, R. H. (1978). Vorlesungen über Ordnungen und Netzplantheorie
(Lectures on orders and a theory of networks). Aachen, Germany: Technischen
Universität Aachen.
Karlin, L., & Kestenbaum, R. (1968). Effects of number of alternatives on the
psychological refractory period. Quarterly Journal of Experimental
Psychology, 20, 167-178.
Keele, S. W. (1973). Attention and human performance. Pacific Palisades, CA:
Goodyear.
Kelley, C. M., & Jacoby, L. L. (2000). Recollection and familiarity: Process-dissociation.
In E. Tulving & F. I. M. Craik (Eds.), The Oxford handbook of memory (pp.
215-228). New York: Oxford University Press.
Kelley, J. E., & Walker, M. R. (1959). Critical path planning and scheduling. Proceedings
of the Eastern Joint Computer Conference (pp. 160-173). Boston, MA.
Keppel, G., & Underwood, B. J. (1962). Proactive inhibition in short-term retention of
single items. Journal of Verbal Learning and Verbal Behavior, 1, 153-161.
Kinchla, R. A. (1994). Comments on Batchelder and Riefer’s multinomial model for
source monitoring. Psychological Review, 101, 166-171.
Kirk, R. E. (1982). Experimental design: Procedures for the behavioral sciences (2nd
ed.). Monterey, CA: Brooks/Cole.
Klauer, K. C., & Wegener, I. (1998). Unraveling social categorization in the “Who said
what” paradigm. Journal of Personality and Social Psychology, 75, 1155-1178.
Knapp, B. R., & Batchelder, W. H. (2004). Representing parametric order constraints in
multi-trial applications of multinomial processing tree models. Journal of
Mathematical Psychology, 48, 215-229.
Kohfeld, D. L., Santee, J. L., & Wallace, N. D. (1981). Loudness and reaction time: II.
Identification of detection components at different intensities and frequencies.
Perception & Psychophysics, 29, 550-562.
Kornblum, S., Hasbroucq, T., & Osman, A. (1990). Dimensional overlap: A cognitive
basis for stimulus-response compatibility--A model and taxonomy.
Psychological Review, 97, 253-270.
Kotz, S., Balakrishnan, N., & Johnson, N. (2000). Continuous multivariate distributions:
Volume 1: Models and applications. NY: Wiley.
Kounios, J., & Holcomb, P. (1992). Structure and process in semantic memory: Evidence
from event-related brain potentials and reaction time. Journal of Experimental
Psychology: General, 121, 459-479.
Kujala, J. V., & Dzhafarov, E. N. (2008). Testing for selectivity in the dependence of
random variables on external factors. Journal of Mathematical Psychology, 52,
128-144.
Kujala, J. V., & Dzhafarov, E. N. (2010). Erratum to “Testing for selectivity in the
dependence of random variables on external factors” [J. Math. Psych. 52 (2008)
128-144]. Journal of Mathematical Psychology, 54, 400.
Kulkarni, V. G., & Adlakha, V. G. (1986). Markov and Markov-regenerative PERT
networks. Operations Research, 34, 769-781.
Külpe, O. (1895). Outlines of psychology: Based upon the results of experimental
investigation. (E. B. Titchener, Trans.) London: Swan Sonnenschein & Co.
(Original 1893).
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect
semantic incongruity. Science, 207, 203-205.
Lee, D. D., & Seung, H. S. (2001). Algorithms for nonnegative matrix factorization.
Advances in Neural Information Processing Systems 13: Proceedings of the
2000 Conference (pp. 556-562). Cambridge, MA: MIT Press.
Li, X., Schweickert, R., & Gandour, J. (2000). The phonological similarity effect in
immediate recall: Positions of shared phonemes. Memory & Cognition, 28,
1116-1125.
Lien, M.-C., & Proctor, R. W. (2002). Stimulus-response compatibility and psychological
refractory period effects: Implications for response selection. Psychonomic
Bulletin & Review, 9, 212-238.
Lien, M.-C., Schweickert, R., & Proctor, R. W. (2003). Task switching and response
correspondence in the psychological refractory period paradigm. Journal of
Experimental Psychology: Human Perception and Performance, 29, 692-712.
Lin, A. (1999). Time to collision with two dimensional motion: Effects of horizontal and
vertical velocity. Unpublished doctoral dissertation, Purdue University, West
Lafayette, IN.
Liu, Y. (1996). Queueing network modeling of elementary mental processes.
Psychological Review, 103, 116-136.
Liu, Y. (2008). Queueing network modeling of the Psychological Refractory Period
(PRP). Psychological Review, 115, 913-954.
Liu, Y., Feyen, R., & Tsimhoni, O. (2006). Queueing Network-Model Human Processor
(QN-MHP): A computational architecture for multitask performance in human-
machine systems. ACM Transactions on Computer-Human Interaction, 13, 37-
70.
Liu, Y. S., Holmes, P., & Cohen, J. D. (2008). A neural network model of the Eriksen
task: Reduction, analysis, and data fitting. Neural Computation, 20, 345-373.
Lively, B. L. (1972). Speed/accuracy trade off and practice as determinants of stage
durations in a memory-search task. Journal of Experimental Psychology, 96,
97-103.
Logan, G. (2002). Parallel and serial processing. In H. Pashler & J. Wixted (Eds.),
Stevens’ handbook of experimental psychology: Vol. 4. Methodology in
experimental psychology (pp. 271-300). NY: Wiley.
Logan, G. D., & Burkell, J. (1986). Dependence and independence in responding to
double stimulation: A comparison of stop, change, and dual task paradigms.
Journal of Experimental Psychology: Human Perception and Performance, 12,
549-563.
Logan, G. D., & Delheimer, J. A. (2001). Parallel memory retrieval in dual-task
situations: II. Episodic memory. Journal of Experimental Psychology:
Learning, Memory and Cognition, 27, 668-685.
Logan, G. D., & Gordon, R. D. (2001). Executive control of visual attention in dual-task
situations. Psychological Review, 108, 393-434.
Logan, G. D., & Schulkind, M. D. (2000). Parallel memory retrieval in dual-task
situations: I. Semantic memory. Journal of Experimental Psychology: Human
Perception and Performance, 26, 1072-1090.
Luce, R. D. (1986). Response times: Their role in inferring elementary mental
organization. New York: Oxford University Press.
Malcolm, D. G., Roseboom, J. H., Clark, C. E., & Fazar, W. (1959). Applications of a
technique for research and development program evaluation. Operations
Research, 7, 646-669.
McCann, R. S., & Johnston, J. C. (1992). Locus of the single-channel bottleneck in dual-
task performance. Journal of Experimental Psychology: Human Perception and
Performance, 18, 471-484.
McClelland, G. (1977). A note on Arbuckle and Larimer, “The Number of Two-Way
Tables Satisfying Certain Additivity Axioms.” Journal of Mathematical
Psychology, 15, 292-295.
McClelland, J. L. (1979). On the time relations of mental processes: An examination of
systems of processes in cascade. Psychological Review, 86, 287-330.
McClelland, J. L., & Rumelhart, D. E. (Eds.) (1986). Parallel distributed processing
(Vol. 2). Cambridge, MA: MIT Press.
McElree, B., & Dosher, B. A. (1989). Serial position and set size in short-term memory:
The time course of recognition. Journal of Experimental Psychology: General,
118, 346-373.
McGill, W. J., & Gibbon, J. (1965). The general-gamma distribution and reaction times.
Journal of Mathematical Psychology, 2, 1-18.
McKone, E., Crookes, K., & Kanwisher, N. (2009). The cognitive and neural
development of face recognition in humans. In M. S. Gazzaniga (Ed.), The cognitive
neurosciences (4th ed., pp. 467-482). Cambridge, MA: MIT Press.
McLeod, P., Driver, J., & Crisp, J. (1988). Visual search for conjunctions of movement
and form in parallel. Nature, 332, 154-155.
Meyer, D. E., & Kieras, D. E. (1997a). A computational theory of executive cognitive
processes and multiple-task performance. Part 1. Basic mechanisms.
Psychological Review, 104, 3-65.
Meyer, D. E., & Kieras, D. E. (1997b). A computational theory of executive cognitive
processes and multiple-task performance. Part 2. Accounts of psychological
refractory period phenomena. Psychological Review, 104, 749-791.
Micro Analysis and Design (1985). Micro SAINT [Computer program]. Concord, MA:
MGA, Inc.
Miller, J. O. (1982). Divided attention: Evidence for coactivation with redundant signals.
Cognitive Psychology, 14, 247-279.
Miller, J. (1982). Discrete versus continuous models of human information processing: In
search of partial output. Journal of Experimental Psychology: Human
Perception and Performance, 8, 273-296.
Miller, J. (1988). Discrete and continuous models of human information processing:
Theoretical distinctions and empirical results. Acta Psychologica, 67, 191-257.
Miller, J. O. (1993). A queue-series model for reaction time, with discrete-stage and
continuous flow models as special cases. Psychological Review, 100, 702-715.
Miller, J., Ulrich, R., & Rolke, B. (2009). On the optimality of serial and parallel
processing in the psychological refractory period paradigm: Effects of the
distribution of stimulus onset asynchronies. Cognitive Psychology, 58, 273-310.
Möcks, J. (1988). Decomposing event-related potentials: A new topographic components
model. Biological Psychology, 26, 119-215.
Molenaar, P. C. M., & van der Molen, M. W. (1986). Steps to a formal analysis of the
cognitive-energetic model of stress and human performance. Acta
Psychologica, 62, 237-261.
Müller, A., & Stoyan, D. (2002). Comparison methods for stochastic models and risks.
NY: Wiley.
Nairne, J. S., & Kelley, M. R. (1999). Reversing the phonological similarity effect.
Memory & Cognition, 27, 45-53.
Nairne, J. S., Neath, I., & Serra, M. (1997). Proactive interference plays a role in the
word-length effect. Psychonomic Bulletin & Review, 4, 541-545.
Nakayama, K., & Silverman, G. H. (1986). Serial and parallel processing of visual
feature conjunctions. Nature, 320, 264-265.
Navon, D., & Miller, J. (1987). Role of outcome conflict in dual task interference.
Journal of Experimental Psychology: Human Perception and Performance, 13,
435-448.
Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of
speeded classification. Psychological Review, 104, 266-300.
Ollman, R. T. (1968). Central refractoriness in simple reaction time: The deferred
processing model. Journal of Mathematical Psychology, 5, 49-60.
Ollman, R. (1990). The matrix product model and the method of additive factors.
Unpublished manuscript.
Oriet, C., & Jolicoeur, P. (2003). Absence of perceptual processing during reconfiguration
of task set. Journal of Experimental Psychology: Human Perception and
Performance, 29, 1036-1049.
Osman, A., & Moore, C. M. (1993). The locus of dual-task interference: Psychological
refractory effects on movement-related brain potentials. Journal of
Experimental Psychology: Human Perception and Performance, 19, 1292-
1312.
Pashler, H. (1984). Processing stages in overlapping tasks: Evidence for a central
bottleneck. Journal of Experimental Psychology: Human Perception and
Performance, 10, 358-377.
Pashler, H. (1987). Detecting conjunctions of color and form: Reassessing the serial
search hypothesis. Perception & Psychophysics, 41, 191-201.
Pashler, H. (1994). Dual-task interference in simple tasks: Data and theory.
Psychological Bulletin, 116, 220-244.
Pashler, H., & Badgio, P. (1985). Visual attention and stimulus identification. Journal of
Experimental Psychology: Human Perception and Performance, 11, 105-121.
Pashler, H., & Johnston, J. C. (1989). Chronometric evidence for central postponement in
temporally overlapping tasks. Quarterly Journal of Experimental Psychology,
41A, 19-45.
Peterson, L. R., & Peterson, M. J. (1959). Short-term retention of individual verbal items.
Journal of Experimental Psychology, 58, 193-198.
Phaf, R. H., Van der Heijden, A. H. C., & Hudson, P. T. W. (1990). SLAM: A
connectionist model for attention in visual selection tasks. Cognitive
Psychology, 22, 273-341.
Poirier, M., & Saint-Aubin, J. (1995). Memory for related and unrelated words: Further
evidence on the influence of semantic factors in immediate serial recall.
Quarterly Journal of Experimental Psychology, 48A, 384-404.
Poirier, M., Schweickert, R., & Oliver, J. (2005). Silent reading rate and memory span.
Memory, 13, 380-387.
Prinzmetal, W., Ivry, R. B., Beck, D., & Shimizu, N. (2002). A measurement theory of
illusory conjunctions. Journal of Experimental Psychology: Human Perception
and Performance, 28, 251-269.
Raab, D. (1962). Statistical facilitation of simple reaction times. Transactions of the New
York Academy of Sciences, 24, 574-590.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59-108.
Ratcliff, R., Van Zandt, T., & McKoon, G. (1999). Connectionist and diffusion models of
reaction time. Psychological Review, 106, 261-300.
Reynolds, D. (1964). Effects of double stimulation: Temporary inhibition of response.
Psychological Bulletin, 62, 333-347.
Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and the measurement
of cognitive processes. Psychological Review, 95, 318-339.
Riefer, D. M., & Batchelder, W. H. (1995). A multinomial modeling analysis of the
recognition-failure paradigm. Memory & Cognition, 23, 611-630.
Riefer, D. M., Hu, X. G., & Batchelder, W. H. (1994). Response strategies in source
monitoring. Journal of Experimental Psychology: Learning, Memory and
Cognition, 20, 680-693.
Rinkenauer, G., Ulrich, R., & Wing, A. M. (2001). Brief bimanual force pulses:
Correlations between the hands in force and time. Journal of Experimental
Psychology: Human Perception & Performance, 27, 1485-1497.
Roberts, S. (1987). Evidence for distinct serial processes in animals: The multiplicative-
factors method. Animal Learning & Behavior, 15, 135-173.
Roberts, S., & Sternberg, S. (1993). The meaning of additive reaction-time effects: Tests
of three alternatives. In S. Kornblum & D. E. Meyer (Eds.), Attention and
performance XIV: Synergies in experimental psychology, artificial intelligence,
and cognitive neuroscience--A silver jubilee (pp. 611-654). Cambridge, MA:
MIT Press.
Roodenrys, S. (2009). Explaining phonological neighborhood effects in short-term
memory. In A. Thorn & M. Page (Eds.), Interactions between short-term
and long-term memory in the verbal domain (pp. 177-198). Hove: Psychology
Press.
Rouse, W. B. (1980). Systems engineering models of human-machine interaction. New
York: North Holland.
Royden, H. L. (1968). Real analysis (2nd ed.). London: Macmillan.
Rumelhart, D. E. (1970). A multicomponent theory of the perception of briefly exposed
visual displays. Journal of Mathematical Psychology, 7, 191-218.
Rumelhart, D. E., & McClelland, J. L. (Eds.) (1986). Parallel distributed processing
(Vol. 1). Cambridge, MA: MIT Press.
Ruthruff, E., Miller, J., & Lachmann, T. (1995). Does mental rotation require central
mechanisms? Journal of Experimental Psychology: Human Perception and
Performance, 21, 552-570.
Sanders, A. F. (1990). Issues and trends in the debate on discrete versus continuous
processing of information. Acta Psychologica, 74, 123-167.
SAS Institute, Inc. (1985). User’s guide: Statistics. Cary, NC: SAS Institute, Inc.
Schmidt, R. A., Zelaznik, H., Hawkins, B., Frank, J. S., & Quinn, J. T. (1979). Motor-
output variability: A theory for the accuracy of rapid motor acts. Psychological
Review, 86, 415-451.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information
processing: I. Detection, search, and attention. Psychological Review, 84, 1-66.
Schuberth, R. E., Spoehr, K. T., & Lane, D. M. (1981). Effects of stimulus and contextual
information on the lexical decision process. Memory & Cognition, 9, 68-77.
Schvaneveldt, R. W. (1969). Effects of complexity in simultaneous reaction time tasks.
Journal of Experimental Psychology, 81, 289-296.
Schwarz, W. (1989). A new model to explain the redundant-signals effect. Perception &
Psychophysics, 46, 498-500.
Schwarz, W., & Ischebeck, A. (2001). On the interpretation of response time vs. onset
asynchrony functions: Applications to dual-task and precue-utilization
paradigms. Journal of Mathematical Psychology, 45, 452-479.
Schweickert, R. (1978). A critical path generalization of the additive factor method:
Analysis of a Stroop task. Journal of Mathematical Psychology, 18, 105-139.
Schweickert, R. (1982). The bias of an estimate of coupled slack in stochastic PERT
networks. Journal of Mathematical Psychology, 26, 1-12.
Schweickert, R. (1983a). Latent network theory: Scheduling of processes in sentence
verification and the Stroop effect. Journal of Experimental Psychology:
Learning, Memory and Cognition, 9, 353-383.
Schweickert, R. (1983b). Synthesizing partial orders given comparability information:
Partitive sets and slack in critical path networks. Journal of Mathematical
Psychology, 27, 261-276.
Schweickert, R. (1985). Separable effects of factors on speed and accuracy: Memory
scanning, lexical decision, and choice tasks. Psychological Bulletin, 97,
530-546.
Schweickert, R. (1993). A multinomial processing tree model for degradation and
redintegration in immediate recall. Memory & Cognition, 21, 168-175.
Schweickert, R. & Boggs, G. J. (1984). Models of central capacity and concurrency.
Journal of Mathematical Psychology, 28, 223-281.
Schweickert, R., & Chen, S. (2008). Tree inference with factors selectively influencing
processes in a processing tree. Journal of Mathematical Psychology, 52, 158-
183.
Schweickert, R., Fisher, D. L., & Goldstein, W. M. (2010). Additive factors and stages of
mental processes. Journal of Mathematical Psychology, 54, 405-414.
Schweickert, R., Fisher, D. L., & Proctor, R. W. (2003). Steps toward building
mathematical and computer models from cognitive task analyses. Human
Factors, 45, 77-103.
Schweickert, R., Fortin, C., & Sung, K. (2007). Concurrent visual search and time
reproduction with cross-talk. Journal of Mathematical Psychology, 51, 99-121.
Schweickert, R., & Giorgini, M. (1999). Response time distributions: Some simple
effects of factors selectively influencing mental processes. Psychonomic
Bulletin & Review, 6, 269-288.
Schweickert, R., Giorgini, M., & Dzhafarov, E. N. (2000). Selective influence and
response time cumulative distribution functions in serial–parallel task
networks. Journal of Mathematical Psychology, 44, 504–535.
Schweickert, R., & Townsend, J. T. (1989). A trichotomy: Interactions of factors
prolonging sequential and concurrent mental processes in stochastic discrete
mental (PERT) networks. Journal of Mathematical Psychology, 33, 328-347.
Schweickert, R., & Wang, Z. (1993). Effects on response time of factors selectively
influencing processes in acyclic task networks with OR gates. British Journal
of Mathematical and Statistical Psychology, 46, 1-40.
Schweickert, R., & Xi, Z. (2011). Multiplicatively interacting factors selectively
influencing parameters in multiple response class processing and rate trees.
Journal of Mathematical Psychology, 55, 348-364.
Searcy, J. H., & Bartlett, J. C. (1996). Inversion and processing of component and spatial-
relational information in faces. Journal of Experimental Psychology: Human
Perception & Performance, 22, 904-915.
Shaked, M., & Shanthikumar, J. G. (2007). Stochastic orders. NY: Springer.
Shwartz, S. P., Pomerantz, J. R., & Egeth, H. E. (1977). State and process limitations in
information-processing: Additive factors analysis. Journal of Experimental
Psychology: Human Perception and Performance, 3, 402-410.
Simson, R., Vaughan Jr., H. G., & Ritter, W. (1976). The scalp topography of potentials
associated with missing visual or auditory stimuli. Electroencephalography
and Clinical Neurophysiology, 40, 33-42.
Slotnick, S. D., Klein, S. A., Dodson, C. S., & Shimamura, A. P. (2000). An analysis of
signal detection and threshold models of source memory. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 26, 1499-1517.
Smith, M. C. (1969). The effect of varying information on the psychological refractory
period. In W. G. Koster (Ed.), Attention and performance II. Acta
Psychologica, 30, 220-231.
Smith, R. E., & Bayen, U. J. (2004). A multinomial model of event-based prospective
memory. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 30, 756-777.
Stahl, C., & Klauer, K. C. (2007). HMMTree: A computer program for latent-class
hierarchical multinomial processing tree models. Behavior Research Methods,
39, 267-273.
Sternberg, S. (1964). Estimating the distribution of additive reaction-time components.
Paper presented at the meeting of the Psychonomic Society, Niagara Falls,
Ontario, Canada.
Sternberg, S. (1966). High-speed scanning in human memory. Science, 153, 652-654.
Sternberg, S. (1967). Two operations in character recognition: Some evidence from
reaction-time measurements. Perception & Psychophysics, 2, 45-53.
Sternberg, S. (1969). The discovery of processing stages: Extensions of Donders’
method. In W. G. Koster (Ed.), Attention and performance II. Amsterdam:
North Holland.
Sternberg, S. (1998). Discovering mental processing stages: The method of additive
factors. In D. Scarborough & S. Sternberg (Eds.), An invitation to cognitive
science: Vol. 4. Methods, models and conceptual issues (pp. 703-864).
Cambridge, MA: MIT Press.
Sternberg, S. (2001). Separate modifiability, mental modules and the use of pure and
composite measures to reveal them. Acta Psychologica, 106, 147-246.
Stoyan, D. (1983). Comparison methods for queues and other stochastic models.
Chichester: Wiley.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of
Experimental Psychology, 18, 643-662.
Stuart, G. P., & Hulme, C. (2009). Lexical and semantic influences on immediate serial
recall: A role for redintegration. In A. Thorn & M. Page (Eds.),
Interactions between short-term and long-term memory in the verbal domain
(pp. 157-176). Hove: Psychology Press.
Sung, K. (2008). Serial and parallel attentive visual searches: Evidence from cumulative
distribution functions of response times. Journal of Experimental Psychology:
Human Perception and Performance, 34, 1372-1388.
Teder-Sälejärvi, W. A., McDonald, J. J., Russo, F., & Hillyard, S. A. (2002). An analysis
of audio-visual crossmodal integration by means of event-related potential
(ERP) recordings. Cognitive Brain Research, 14, 106-114.
Tehan, G., & Humphreys, M. S. (1995). Transient phonemic codes and immunity to
proactive interference. Memory & Cognition, 23, 181-191.
Tehan, G., & Turcotte, J. (1997). The role of cues and codes in proactive interference
effects in immediate serial recall. Unpublished manuscript, University of
Southern Queensland, Toowoomba, Australia.
Tehan, G., & Turcotte, J. (2002). Word length effects are not due to proactive
interference. Memory, 10, 139-149.
Telford, C. W. (1931). The refractory phase of voluntary and associative responses.
Journal of Experimental Psychology, 14, 1-36.
Thomas, R. D. (2006). Processing time predictions of current models of perception in the
classic additive factors paradigm. Journal of Mathematical Psychology, 50,
441-455.
Thomas, R. D., & Gallogly, D. (1996). Some consequences of the RT-distance
hypothesis on factorial additivity. Journal of Mathematical Psychology, 40,
353-353.
Thorn, A. S. C., Frankish, C. R., & Gathercole, S. (2009). The influence of long-term
knowledge on short-term memory: Evidence for multiple mechanisms. In A.
Thorn & M. Page (Eds.), Interactions between short-term and long-term
memory in the verbal domain (pp. 198-219). Hove: Psychology Press.
Thorn, A. S. C., Gathercole, S. E., & Frankish, C. R. (2005). Redintegration and the
benefits of long-term knowledge in verbal short-term memory: An evaluation
of Schweickert’s (1993) multinomial processing tree model. Cognitive
Psychology, 50, 133-158.
Thorn, A., & Page, M. (Eds.) (2009). Interactions between short-term and long-term
memory in the verbal domain. Hove: Psychology Press.
Thornton, T. L., & Gilden, D. L. (2007). Parallel and serial processes in visual search.
Psychological Review, 114, 71-103.
Tombu, M., & Jolicoeur, P. (2003). A central capacity sharing model of dual-task
performance. Journal of Experimental Psychology: Human Perception &
Performance, 29, 3-18.
Townsend, J. T. (1971). A note on the identification of parallel and serial processes.
Perception & Psychophysics, 10, 161-163.
Townsend, J. T. (1972). Some results concerning the identifiability of parallel and serial
processes. British Journal of Mathematical and Statistical Psychology, 25, 168-
197.
Townsend, J. T. (1974). Issues and models concerning the processing of a finite number
of inputs. In B. H. Kantowitz (Ed.), Human information processing: Tutorials
in performance and cognition. Hillsdale, NJ: Erlbaum, pp. 133-168.
Townsend, J. T. (1984). Uncovering mental processes with factorial experiments. Journal
of Mathematical Psychology, 28, 363-400.
Townsend, J. T. (1990). Truth and consequences of ordinal differences in statistical
distributions: Toward a theory of hierarchical inference. Psychological Bulletin,
108, 551-567.
Townsend, J. T., & Ashby, F. G. (1983). Stochastic modeling of elementary
psychological processes. Cambridge: Cambridge University Press.
Townsend, J. T., & Fific, M. (2004). Parallel versus serial processing and individual
differences in high-speed search in human memory. Perception &
Psychophysics, 66, 953-962.
Townsend, J. T., & Nozawa, G. (1995). Spatio-temporal properties of elementary
perception: An investigation of parallel, serial, and coactive theories. Journal of
Mathematical Psychology, 39, 321-359.
Townsend, J. T., & Schweickert, R. (1989). Toward the trichotomy method of reaction
times: Laying the foundation of stochastic mental networks. Journal of
Mathematical Psychology, 33, 309-327.
Townsend, J. T., & Thomas, R. D. (1994). Stochastic dependencies in parallel and serial
models: Effects on systems factorial interactions. Journal of Mathematical
Psychology, 38, 1-34.
Townsend, J. T., & Wenger, M. J. (2004). A theory of interactive parallel processing:
New capacity measures and predictions for a response time inequality series.
Psychological Review, 111, 1003-1035.
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention.
Cognitive Psychology, 12, 97-136.
Treisman, A., & Sato, S. (1990). Conjunction search revisited. Journal of Experimental
Psychology: Human Perception and Performance, 16, 459-478.
Tsao, D. (2006). A dedicated system for processing faces. Science, 314, 72-73.
Ulrich, R., Fernández, S. R., Jentzsch, I., Rolke, B., Schröter, H., & Leuthold, H. (2006).
Motor limitation in dual-task processing under ballistic movement conditions.
Psychological Science, 17, 788-793.
Ulrich, R., & Miller, J. (1997). Tests of race models for reaction time in experiments with
asynchronous redundant signals. Journal of Mathematical Psychology, 41, 367-
381.
Ulrich, R., & Miller, J. (2008). Response grouping in the psychological refractory period
(PRP) paradigm: Models and contamination effects. Cognitive Psychology, 57,
75-121.
Ulrich, R., & Wing, A. M. (1991). A recruitment theory of force-time relations in the
production of brief force pulses: The parallel force unit model. Psychological
Review, 98, 268-294.
Valls, V., Laguna, M., Lino, P., Pérez, A., & Quintanilla, S. (1998). Project scheduling
with stochastic activity interruptions. In J. Weglarz (Ed.), Recent advances in
project scheduling (pp. 333-354). Boston: Kluwer Academic Publishers.
van Lankveld, J. J. D. M., & Smulders, F. T. Y. (2008). The effect of visual sexual
content on the event-related potential. Biological Psychology, 79, 200-208.
Van Selst, M. & Jolicoeur, P. (1994). Can mental rotation occur before the dual-task
bottleneck? Journal of Experimental Psychology: Human Perception and
Performance, 20, 905-921.
Van Selst, M., & Jolicoeur, P. (1997). Decision and response in dual-task interference.
Cognitive Psychology, 33, 266-307.
Van Zandt, T. (2002). Analysis of response time distributions. In H. Pashler & J. T.
Wixted (Eds.), Stevens’ handbook of experimental psychology: Vol. 4.
Methodology in experimental psychology (3rd ed., pp. 461-516). NY: Wiley.
Van Zandt, T., & Ratcliff, R. (1995). Statistical mimicking of reaction-time data: Single
process models, parameter variability, and mixtures. Psychonomic Bulletin &
Review, 2, 20-54.
Van Zandt, T., & Townsend, J. T. (1993). Self-terminating versus exhaustive processes in
rapid visual and memory search: An evaluative review. Perception &
Psychophysics, 53, 563–580.
Vorberg, D. & Schwarz, W. (1988). Network models of reaction times. Paper presented at
the XXIV International Congress of Psychology, Sydney, Australia.
Voss, A., Rothermund, K., & Voss, J. (2004). Interpreting the parameters of the diffusion
model: An empirical validation. Memory & Cognition, 32, 1206-1220.
Ward, R., & McClelland, J. L. (1989). Conjunctive search for one and two identical
targets. Journal of Experimental Psychology: Human Perception and
Performance, 15, 664-672.
Waugh, N. C., & Norman, D. A. (1965). Primary memory. Psychological Review, 72, 89-
104.
Welford, A. T. (1952). The ‘psychological refractory period’ and the timing of high-speed
performance--a review and a theory. British Journal of Psychology, 43, 2-19.
Welford, A. T. (1959). Evidence of a single-channel decision mechanism limiting
performance in a serial reaction task. Quarterly Journal of Experimental
Psychology, 11, 193-210.
Welford, A. T. (1967). Single channel operation in the brain. Acta Psychologica, 27, 5-
22.
Wenger, M. J., & Townsend, J. T. (2000). Basic response time tools for studying general
processing capacity in attention, perception, and cognition. Journal of General
Psychology, 127, 67-99.
Wickens, C. D. (1976). The effects of divided attention in information processing in
tracking. Journal of Experimental Psychology: Human Perception and
Performance, 2, 1-13.
Wiggs, C. L., & Martin, A. (1998). Properties and mechanisms of perceptual priming.
Current Opinion in Neurobiology, 8, 227-233.
Williams, R. J. (1986). The logic of activation functions. In D. E. Rumelhart and J. L.
McClelland (Eds.) Parallel distributed processing (Vol. 1). Cambridge, MA:
MIT Press. pp. 423-443.
Wolfe, J. M. (1994). Guided Search 2.0: A revised model of visual search. Psychonomic
Bulletin & Review, 1, 202-238.
Wolfe, J. M. (1998). Visual search. In H. Pashler (Ed.), Attention (pp. 13-74). East
Sussex, UK: Psychology Press.
Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided Search: An alternative to the
Feature Integration model for visual search. Journal of Experimental
Psychology: Human Perception and Performance, 15, 419-433.
Wu, C. X., & Liu, Y. L. (2008). Queuing network modeling of the Psychological
Refractory Period (PRP). Psychological Review, 115, 913-954.
Yonelinas, A. P. (1994). Receiver operating characteristics in recognition memory:
Evidence for a dual-process model. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 20, 1341-1354.
Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years
of research. Journal of Memory and Language, 46, 441-517.
Yonelinas, A. P., Aly, M., Wang, W.-C., & Koen, J. D. (2010). Recollection and
familiarity: Examining controversial assumptions and new directions.
Hippocampus, 20, 1178-1194.
Yu, J. C., & Bellezza, F. S. (2000). Process dissociation as source monitoring. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 26, 1518-1533.
Author Index
Adlakha, V. G., 256, 260 Burkell, J., 97
Aly, M., 301 Busemeyer, J. R., 274
Anderson, J. R., 15 Card, S. K., 14
Arbuckle, J., 39 Cattell, J. M., 1
Ashby, F. G., 5, 9, 15, 34, 44, 84, 152, Cave, K., 225
153, 165, 190, 191, 213, 228, 229, Champagne, J., 202
235, 238, 243, 245, 251, 259, 283, Chechile, R., 293, 303
292, 359 Chechile, R. A., 18, 303
Atwood, M. E., 13 Chen, J.-Y., 125
Auer, T. S., 300 Chen, M.-S., 125
Aβfalg, A., 300 Chen, S., 330, 331, 332, 334, 336
Backus, B. T., 155 Choi, S., 8, 32, 121, 196, 215
Badgio, P., 226 Cinlar, E., 168, 201
Balakrishnan, J. D., 235 Clark, C. E., 11
Balakrishnan, N., 206 Clark, F. C., 298
Bamber, D., 336, 341, 358 Cohen, J. D., 274
Barth, D. S., 347, 349 Coles, M. G. H., 127
Bartle, R. G., 52, 55, 62 Colonius, H., 147, 195, 197, 198, 200,
Bartlett, J. C., 252 244, 360
Batchelder, W. H., 15, 16, 18, 19, 300, Conway, R. W., 97
301 Cowan, N., 283
Bates, M., 280 Craik, K. J. W., 141
Bayen, U. J., 301, 302 Crisp, J., 225
Beck, D., 15 Crookes, K., 252
Bedard, M.C., 202 Curran, T., 300
Begg, I. M., 300, 301 Dagenbach, D., 198, 225, 226, 227,
Bellezza, F. S., 302 230, 231, 251
Boggs, G. J., 95 Davis, R., 8, 96, 105, 107
Borger, R., 141, 142, 143, 144 de Jong, R., 8, 32, 97, 98, 112, 114,
Bourque, P., 202, 239 115, 116, 119, 133, 135, 138
Bower, G. H., 15, 274 Delheimer, J. A., 190, 196
Boynton, G., 245, 251 Dell'Acqua, R., 296, 297
Brett, B., 347 Desimone, R., 252
Briggs, G. E., 226 Di, S., 347
Broadbent, D. E., 95, 97, 282 Diederich, A., 147, 198, 200, 244, 348
Brown, G. D. A., 303, 317, 326, 327 Dinges, D. F., 328
Brown, J., 304 Dodin, B., 176
Brown, S., 170 Dodson, C. S., 19, 300
Brown, S. W., 238, 239 Donders, F. C., 1, 2, 8, 293
Bruce, C., 252 Dosher, B. A., 235
Buchner, A., 301, 303, 307, 355 Driver, J., 225
Bundesen, C., 234 Duffy, S. A., 280
Dunbar, K., 274 Gobbini, M. I., 252
Duncan, J., 146, 231, 282 Goldberg, N., 347
Dutta, A., 121, 196, 215 Goldstein, W. M., 9, 34, 44, 60, 61,
Dzhafarov, E. N., 5, 28, 52, 71, 72, 74, 256, 258, 263, 267, 268, 283, 285,
75, 166, 167, 176, 178, 187, 190, 286
191, 205, 208, 210, 212, 214, 215, Golumbic, M. C., 51, 58, 150
222, 229, 360, 364, 366, 367, 368, Gondan, M., 347, 348, 349
369, 370, 371, 373, 374, 375, 376, Gordon, R. D., 138, 147, 196
378, 379, 380 Gray, W. D., 13
Egeth, H., 189, 190, 198, 225, 226, Greenwald, A. G., 97, 123, 124, 125,
227, 230, 231, 251, 293, 296 126
Ehrenstein, A., 8, 32, 121 Gross, C. G., 252
Eidels, A., 170 Han, H. J., 357
Ellenbogen, J. M., 328, 356, 357 Harris, J. R., 280
Elmaghraby, S. E., 11 Hasbroucq, T., 32
Epstein, R. A., 352, 353 Hawkins, B., 140
Erdfelder, E., 300, 301, 303, 307, 355 Haxby, J. V., 252
Eriksen, C. W., 5 Hay, J. J., 301
Estes, W. K., 302 Hays, W. L., 66
Fazar, W., 11 Heathcote, A., 170
Feiler, A. M., 352 Henson, R. N., 352
Feller, W., 65, 172 Herman, L. M., 97
Fernández, S. R., 139 Hick, W. E., 127, 146
Feyen, R., 94 Hilbig, B. E., 300
Fific, M., 198, 229, 235, 236, 238, 243, Hillyard, S. A., 346, 347
244, 245, 246, 248, 249, 251, 252 Hintzman, D. L., 300
Fill, J. A., 74, 75 Hoffman, E. A., 252
Fisher, D. L., 8, 9, 14, 34, 44, 60, 61, Hoffman, J. E., 280
191, 229, 234, 256, 258, 263, 267, Holcomb, P. B., 346, 347
268, 280, 283, 285, 286 Holmes, P., 274
Fortin, C., 20, 196, 202, 215, 239, 240, Hommel, B., 147, 190, 195, 196, 197
242, 243, 251 Houpt, J. W., 169
Frank, J. S., 140 Howard, R. A., 260
Frankish, C. R., 303, 304, 317, 332 Hu, X., 16, 19, 293, 301
Franzel, S. L., 225 Hudson, P. T. W., 274
Gallogly, D., 293 Hulbert, J. C., 328
Gandour, J., 303 Hulme, C., 303, 317, 326, 327, 328
Garner, W. R., 243 Humphreys, G., 231, 282
Gathercole, S. E., 303, 304, 317, 332 Humphreys, M. S., 310
Geffen, G., 282 Ingvalson, E. M., 252, 253, 254, 255
Gelade, G., 224, 229, 230, 282 Ischebeck, A., 271
Gibbon, J., 259 Ivry, R., 15
Gilden, D. L., 229 Jacoby, L. L., 16, 17, 300, 301
Giorgini, M., 28, 176, 178, 187, 229 Jentzsch, I., 106, 111, 133, 134, 135,
Glaser, R. A., 8 136, 137, 138, 139
Gluck, M. A., 274 Jiang, Y., 328
Gluhovsky, I., 71, 191, 210, 360, 364, John, B., 14
380 John, B. E., 13
Johnsen, A. M., 226 Lin, A., 46
Johnson, A., 93 Lino, P., 202
Johnson, N., 206 Little, D. R., 248, 251
Johnston, J. C., 8, 96, 105, 106, 107, Liu, Y. L., 9, 94, 280
108, 109, 111, 114, 120, 121, 133, Liu, Y. S., 274
137, 141, 268, 272, 287 Lively, B. L., 190, 296
Jolicoeur, P., 8, 97, 101, 102, 111, 113, Logan, G. D., 31, 97, 138, 147, 190,
138, 139, 191, 296, 297 196, 203, 215
Kaerkes, R., 176 Luce, R. D., 9, 34, 37, 65, 95, 165, 259
Kantowitz, B. H., 97 Machida, M., 74, 75
Kanwisher, N., 252 Maddox, W. T., 15, 243
Karlin, L., 20, 100, 101, 102, 103, 105, Malcom, D. G., 11
111, 112, 113, 114, 120, 121 Martensen, H., 301
Keele, S. W., 97, 197 Martin, A., 352
Kelley, C. M., 301 Martin, S., 326
Kelley, J. E., 11 Maughan, S., 303, 317, 326
Kelley, M. R., 310 Maxwell, W. L., 97
Keppel, G., 305, 307, 310, 316, 355 McCann, R. S., 8, 106, 111, 114, 121,
Kestenbaum, R., 20, 101, 102, 103, 137
105, 111, 112, 113, 114, 120, 121 McClelland, G., 39
Kieras, D. E., 7, 15, 93, 94 McClelland, J. L., 5, 273, 274
Kinchla, R. A., 19, 300 McDonald, J. J., 347
Kirk, R. E., 39 McElree, B., 235
Kirouac, E., 202, 239 McGill, W. J., 259
Klauer, K. C., 15, 19 McKone, E., 252
Klein, S. A., 19, 300 McKoon, G., 274
Knapp, B. R., 18 McLeod, P., 225, 229
Koen, J. D., 301 Meyer, D. E., 7, 15, 93, 94
Kohfeld, D. L., 34, 259 Meyer, D. L., 18, 303
Kornblum, S., 32 MICROSAINT, 37, 41, 45
Kotz, S., 206 Miller, J., 8, 9, 97, 113, 121, 141, 142,
Kounios, J., 346, 347 143, 144, 145, 146, 147, 191, 195,
Kujala, J. V., 28, 71, 191, 210, 215, 197, 198, 244, 271, 280, 282, 348
222, 360, 364, 366, 367, 368, 369, Miller, L. W., 97
370, 371, 373, 374, 375, 376, 378, Mochs, J., 345
379, 380 Mohring, R. H., 176
Kulkarni, V. G., 256, 260 Molenaar, P.C.M., 33
Külpe, O., 1, 2 Moore, C. M., 8, 127, 128, 129, 130,
Kutas, M., 346 131, 132, 135
Lachmann, T., 8, 113 Moran, T. P., 14
Laguna, M., 202 Morin, C., 317, 327
Lane, D. M., 190, 296 Moshagen, M., 300
Larimer, J., 39 Müller, A., 27, 68, 69
Lee, D. D., 343 Murnane, K., 301
Lee, W. W., 245, 251 Myung, I. J., 274
Leuthold, H., 106, 133, 138, 139 Nadarevic, L., 300
Li, X., 303 Nairne, J. S., 310, 325, 357
Lien, M.-C., 123, 138, 146, 196, 215 Nakayama, K., 225, 229
Navon, D., 121, 195 Roseboom, J. H., 11
Neath, I., 325, 357 Ross, E., 352
Newell, A., 14 Rothermund, K., 292
Norman, D. A., 303 Rothkegel, R., 19
Nosofsky, R. M., 198, 243, 248, 251, Rouder, J. N., 190
292 Rouse, W. B., 279
Nozawa, G., 167, 168, 170, 172, 174, Rousseau, R., 202, 239
191, 195, 197, 200, 201, 214, 229, Royden, H. L., 221, 376
244, 246, 249, 283, 360 Rugg, M. D., 352
Oliver, J., 318 Rumelhart, D. E., 273, 282
Ollman, R. T., 95, 336, 341, 344 Russo, F., 347
Oriet, C., 138, 139 Ruthruff, E., 8, 113
Osman, A., 8, 32, 127, 128, 129, 130, Rylands, A., 352
131, 132, 135 Saint-Aubin, J., 317
Page, M., 304 Sanders, A. F., 4
Palmeri, T. J., 292 Santee, J. L., 34, 259
Parker, W. E., 352 SAS Institute, 269
Pashler, H., 8, 96, 105, 107, 108, 109, Sato, S., 224, 225, 229
111, 119, 120, 133, 141, 225, 226, Schmidt, R. A., 140
229, 268, 272, 287 Schneider, W., 282
Peaker, S., 303, 332 Schröter, H., 139
Pérez, A., 202 Schuberth, R. E., 190, 296
Peterson, L. R., 304 Schulkind, M. D., 138, 147, 190, 196,
Peterson, M. J., 304 203, 215
Phaf, R. H., 274 Schultz, D. W., 5
Pickering, S. J., 303, 332 Schvaneveldt, R. W., 6
Poirier, M., 317, 318, 357 Schwarz, W., 33, 200, 271
Pollatsek, A., 280 Schweickert, R., 5, 8, 14, 19, 20, 25,
Pomerantz, J. R., 189, 190, 293, 296 27, 28, 31, 32, 37, 38, 40, 42, 43,
Prinzmetal, W., 15, 19 45, 50, 51, 52, 54, 57, 58, 60, 61,
Proctor, R. W., 8, 14, 32, 93, 121, 123, 62, 75, 82, 83, 84, 86, 92, 95, 97,
138, 146, 196, 215 108, 121, 124, 125, 126, 138, 150,
Quinn, J. T., 140 166, 167, 176, 178, 187, 190, 191,
Quintanilla, S., 202 196, 212, 214, 215, 226, 229, 239,
Raab, D., 197 240, 241, 242, 243, 251, 293, 294,
Ratcliff, R., 164, 165, 213, 274, 291, 296, 302, 303, 318, 326, 330, 331,
293 332, 334, 336, 337, 338, 366, 367
Remington, R. W., 8 Searcy, J., 252
Reynolds, D., 97 Serra, M., 325
Riefer, D. M., 15, 18, 19, 300, 301 Seung, H. S., 343
Rinkenauer, G., 141 Shaked, M., 68, 69
Ritter, W., 348 Shanthikumar, J. G., 68, 69
Roberts, S., 5, 154, 155, 156, 157, 158, Shaw, M. W., 280
159, 160, 164, 165, 166, 167, 189, Shiffrin, R. M., 282
190, 213, 298, 336 Shimamura, A. P., 19, 300
Röder, B., 347, 348, 349 Shimizu, N., 15
Rolke, B., 97, 139, 191 Shwartz, S. P., 189, 190, 293, 296
Roodenrys, S., 326, 328 Silverman, G. H., 225, 229
Simson, R., 348 Ulrich, R., 97, 106, 111, 133, 135, 137,
Slotnick, S. D., 19, 300 138, 139, 140, 141, 142, 143, 144,
Smith, M. C., 101, 103, 105, 268, 269, 145, 146, 191, 198, 271
270, 271, 287 Underwood, B. J., 305, 307, 310, 316,
Smith, R. E., 302 355
Smulders, F. T. Y., 349, 350 Valls, V., 202
Spoehr, K. T., 190, 296 Van der Heijden, A. H. C., 274
Stahl, C., 19 van der Molen, M. W., 33
Steffens, M. C., 301 Van Lankveld, J. J. D. M., 349, 350
Sternberg, S., 3, 4, 5, 6, 8, 20, 27, 34, van Santen, J. P. H., 336, 341, 358
58, 153, 154, 155, 156, 157, 158, Van Selst, M., 8, 101, 102, 111, 113
159, 160, 164, 165, 166, 167, 185, Van Zandt, T., 20, 164, 165, 213, 235,
189, 190, 213, 214, 222, 226, 227, 243, 274
234, 238, 243, 259, 291, 293, 294, Vaughan Jr., H. G., 348
359 Vorberg, D., 33, 195, 198
Stickgold, R., 328 Voss, A., 292
Stoyan, D., 27, 68, 69 Voss, J., 292
Stroop, J. R., 123, 124 Vuilleumeir, P., 352
Stuart, G., 317, 326, 327 Wagenmakers, E. J., 170
Stuart, G. P., 328 Walker, M. R., 11
Sung, K., 20, 176, 178, 187, 191, 196, Wallace, N. D., 34, 259
212, 214, 215, 229, 230, 231, 232, Wang, W.-C., 301
234, 239, 251 Wang, Z., 28, 31, 38, 40, 42, 45, 52, 54,
Teder-Sälejärvi, W. A., 347 62, 92
Tehan, G., 308, 309, 310, 311, 312, Waugh, N. C., 303
326, 356, 357 Wegener, I., 15
Tein, J.-Y., 235 Welford, A. T., 7, 8, 20, 94, 95, 97,
Telford, C. W., 6, 93 127, 134, 146, 191
Thomas, R. D., 190, 292, 293, 360 Wenger, M. J., 191, 194, 195, 198, 244,
Thompson-Schill, S. L., 328 252, 253, 254, 255, 283
Thorn, A., 304 West, A. N., 239
Thorn, A. S. C., 304, 317 Wickens, C. D., 282
Thornton, T. L., 229 Wiggs, C. L., 352
Tombu, M., 97, 191 Williams, R. J., 276, 277
Toth, J. P., 17, 300, 301 Wing, A. M., 141
Townsend, J. T., 5, 8, 9, 26, 27, 28, 34, Wolfe, J. M., 224, 225, 228, 229
37, 42, 44, 45, 50, 51, 60, 68, 75, Wu, C. X., 9, 94, 280
84, 152, 153, 165, 167, 168, 169, Wundt, W., 1
170, 172, 174, 190, 191, 194, 195, Xi, Z., 330, 336, 337, 338, 358
197, 198, 200, 201, 213, 214, 226, Yonelinas, A. P., 17, 301
228, 229, 235, 236, 238, 243, 244, Young, C., 280
246, 249, 256, 259, 283, 359, 360, Yu, J. C., 302
366, 367 Zelaznik, H., 140
Treisman, A., 224, 225, 229, 230, 282
Tsao, D., 252
Tsimhoni, O., 94
Turcotte, J., 308, 309, 310, 311, 312,
326, 356
Subject Index
accuracy BOLD (Blood-Oxygen-Level-
models of, 293, 294 Dependence) signal, 351
multiplicative effects, 294, 295, Boole's Inequality, 197
296, 297 Borger model for response grouping,
active processes, 257 142, 143, 144
acyclic networks, 10 bottleneck process, 107
acyclic task networks canonical factorization, 343
directed, 9, 10, 11, 12, 13 capacity coefficient, 193
in Human Factors, 13, 14 capacity limits and process
systems not easily represented in, dependence, 191, 192, 193, 194,
14, 15 195
Additive Amplitude Method, 346, 347, cascade model, 5
348, 349 Cauchy criterion
additive areas of evoked potentials, for convergence, 52
349, 350 for double sequences, 54
additive BOLD signals, 350, 352, 353 CDF IC (Cumulative Distribution
Additive Factor Method, 3, 153, 213, Function Interaction Contrast)
226, 359 testing serial attentive processing,
additive factors, 3, 4, 5 231, 234
additivity testing serial preattentive stage, 229
monotonic interaction contrasts and, time reproduction and, 239, 240,
58, 59, 60, 61 242
Alternate Pathways Model, 157, 158, central processing bottleneck
160, 161, 164, 213 hypothesis
AND gates, 11 alternative to, 97
AND networks, 11 central refractoriness process, 107
OR networks compared to, 90, 91 Channel Summation Model, 200, 201,
Order-of-Processing diagrams and 202
overview of, 257 child (processing trees), 299
states, 257 children of vertex, 17
transitions, 258, 259 coactivation and process dependence
overview of, 256 failure of selective influence with
arc (processing trees), 299 coactivation, 200, 201, 202
arcs overview of, 197, 198
duration of, 12 selective influence with
traversed and reached, 18 coactivation, 198, 199
attentional blink paradigm, 296 coactivation models, 244, 248, 249,
bar charts, 7, 8, 9 251
Bartlett, J. C., 252 cognitive task analysis, 13, 14
block matrices, 342 commutative, associative operations
Blood-Oxygen-Level-Dependence
(BOLD) signal, 351
complete marginal selectivity, 367, coupled slack, 25, 51, 206
368, 369 CPM-GOMS method, 13, 14
computing moments of response time critical path models of dual tasks
Order-of-Processing diagrams and central and response limitations, 97,
overview of, 259 98
process durations, 259, 260, 261, central limitations, 94, 96
262, 263, 264, 265, 266 overview of, 93, 94
overview of, 256 response limitations, 97
concurrent processes, 11, 82, 84, 85 critical path networks, 11
limiting values of interaction critical paths, 13
contrasts, 52, 53, 54 cross-talk
monotonic interaction contrasts effects of, 215
distinguishing from sequential process dependence and, 195, 196,
processes, 50, 51 197
exponential distributions, 35, 36 cross-talk effects, 121
OR networks, 38 cumulative distribution function
overview of, 34 interaction contrast, 158
statistical considerations, 39 Cumulative Distribution Function
truncated normal distributions, Interaction Contrast (CDF IC)
36, 37 testing serial attentive processing,
selectively influencing, 185 231, 234
concurrent processing testing serial preattentive stage, 229
using sequential processing instead time reproduction and, 239, 240,
of, 149 242
concurrent time reproduction and visual cumulative distribution functions
search, 238, 239, 240, 241, 243 definition of and notation for, 67, 68
conditional density, 71 formula for, 152
conditional expectation, 70, 71 overview of, 214
conditional independence, 360 tests of, 153, 154
overview of, 214 cut vertex, 58, 59, 60
selective influence and, 206, 211 cycle, 10
example of proof not requiring data, fitting
independence, 212, 213 Ellenbogen et al., 356, 357
conditional stochastic dominance, 211 Keppel and Underwood, 355, 356
conjunction searches, 225, 228, 229 Tehan and Tecotte, 356
connected networks, 18 Decomposition Test, 166, 167, 214
connectionist networks degree of rotation, 113
generalization of OP diagram to, deoxyhemoglobin, 351
273, 275, 276, 277, 279 dependent processes
contingency or confusion matrices, factors selectively influencing, 359,
340, 342, 343, 344, 345 360
contralateral-ipsilateral difference in Diffusion Model, 291
movement preparation, 127 directed acyclic task networks, 8, 9, 10,
convergence 11, 12, 13
Cauchy criterion for, 52 Order-of-Processing diagram
convex arcs, 57 compared to, 256
cosphericity test, 370, 371, 372, 373, precedence constraint arcs in, 257
374, 375
representing connectionist networks 138, 139, 141, 142, 144, 145,
as, 273, 275, 276, 277, 279 146
representing queueing networks as, sensory and central Task 2
279, 280, 281, 282 processing, 119, 120
states and, 257 SOA and Task 2 central
discriminability processing, 111, 112, 113,
of Task 2, 114 114, 116, 117, 118, 119
psychological refractory period and, SOA and Task 2 sensory
121 processing, 107, 109, 111
Distance Test, 376, 377, 378, 379 SOA in
distractors, 223 monotonic response time means
distribution function interaction and, 30
contrasts dual-process signal detection model,
overview of, 167, 169, 170 301
processes in parallel or in series, dummy process, 10
170, 171, 172, 174, 175 duration
task networks of arcs, 12
overview of, 175, 176 of paths, 12
results for, 178, 185, 186, 187, duration of processes, 361
188, 189 symbols for, 76, 77, 78, 79
synopsis of results for, 177, 178 ending vertex of arc, 10
Donders, F. C., subtractive method of, equal distribution function tests
1, 2 cumulative distribution functions,
double bottleneck model, 97 153, 154
double factorial paradigm, 195 Decomposition Test, 166, 167
double sequences Mixture Test, 157, 158, 160, 161,
Cauchy criterion for, 54 164
double simulation task, and OP overview of, 152, 153
diagrams, 268, 269, 270, 271 statistical mimicking, 164, 165
dual task models, 7, 8 Summation Test, 154, 155, 156, 157
dual tasks, 6 Equations (6.3)
critical path models of, 93, 94, 96, simulations of, 205
97, 98 events
production and visual search, model definition of, 65
for, 20 evoked potentials
selective influence of processes in Additive Amplitude Method and,
central processing in Task 1 and 346, 347, 348, 349
SOA, 100, 101, 102, 103, additive areas of, 349, 350
105, 107 overview of, 345
central processing of Tasks 1 Evoked Response Potential (ERP), 346
and 2, 120, 121, 122, 123, Executive Process Interactive Control
124, 125, 126 (EPIC) model, 94
factors influencing, 99, 100 Exemplar Based Random Walk
overview of, 98, 99 (EBRW) model, 292, 293
post-central and response exhaustive search models, 235
processes, 126, 127, 128, exhaustive systems, 244
132, 133, 134, 135, 136, 137, expectancy hypothesis, 101
experimental factors, effects of on general gamma random variables, and
processes, 76, 77, 78, 79 OP diagrams, 261, 262
experiments general networks and OP diagrams,
definition of, 65 285, 286
exponential distributions generalization to other cognitive
concurrent processes, 35, 36 networks
exponential random variables, and OP OP diagrams
diagrams, 259, 260, 261 connectionist networks, 273, 275,
Extended Response Selection 276, 277, 279
Bottleneck model, 136 overview of, 273
Extended Selection Bottleneck model, queueing networks, 279, 280,
134, 137, 138 281, 282
face perception, 252, 253, 254, 255 generalization to resource constrained
factor additivity systems
stochastic independence and, 359 OP diagrams, 282, 283, 284, 285
factors geodesic paths, 13
additive, 3 Guided Search (GS), 224
additive effects of, 213 hazard function, 193
additivity of, 4, 5 homeomorphic graphs, 44
as sets of levels, 361 Human Factors
definition of, 3 acyclic task networks in, 13, 14
interaction of, 3 ideomotor compatibility, 123, 124
factors influencing processes in dual immediate memory and processing
tasks, 99 trees
failure of selective influence effects of proactive interference and
produced by dependencies, 202, retention interval, 304, 306, 307
203, 204 effects of serial position and list
with coactivation, 200, 201, 202 length
familiarity, recollection compared to, overview of, 318
300 qualitative tests, 318, 321, 322,
Feature Integration Theory (FIT) 323, 324, 325
as two-stage model of search, 224 quantitative tests, 325, 326
testing serial attentive processing, effects of serial position and word
228, 230, 231 frequency, 326, 327, 328
feature maps, 224 effects of serial position, delay, and
feedback cycles, 14 proactive interference
first order stochastic dominance, 68 discussion, 310
FIT (Feature Integration Theory) interpretation, 316, 317, 318
as two-stage model of search, 224 method, 308
testing serial attentive processing, overview of, 307
228, 230, 231 qualitative tests, 310, 311, 312,
flow model, 5 313, 314
fMRI, 350, 352, 353 quantitative tests, 314, 316
functions results, 309
monotonically increasing and effects of sleep and retroactive
monotonically decreasing, 65 interference, 328, 329, 330
Gantt charts, 7, 8, 9 overview of, 302, 303, 304
inclusion-exclusion tasks and processing trees, 300, 301
independence (property), 39
independent networks, 178
information theory, 105
inserting mental processes, 1, 2, 3
insertions and process dependence
    failure of selective influence produced by dependencies, 202, 203, 204
    overview of, 202
    selective influence and conditional independence
        example of proof not requiring independence, 212, 213
        overview of, 211
    successful selective influence, 204, 206
integral dimension processing, 248
integral dimensions, 243
integrated hazard function, 193
integrative chronometric analysis
    parameters, 287, 289
    Pashler and Johnston study application, 272, 273, 287
    simulation, 289, 290
    Smith study application, 268, 269, 270, 271, 287
Intensity Factor, 195
interaction contrasts, 80, 81, 82
interaction contrasts of means, 214
interaction of factors, 3
interresponse intervals, 144
interstimulus intervals, 101, 102, 103, 105
joint cumulative distribution, 68
joint density, 70
joint distribution criterion, 369, 370, 380
Lateralized Readiness Potential (LRP)
    description of, 127
    interval from stimulus to onset of movement-related brain potential, 127, 128, 132, 133
    residual PRP effect, 134, 135, 136
level of experimental factor
    changing, and selective influence, 26, 27, 28
limited capacity processing, 194
limiting values of interaction contrasts
    concurrent processes, 52, 53, 54
    overview of, 51, 52
    sequential processes, 54, 55, 56
limits of interaction contrasts, 62
list length and serial position, 318, 321, 322, 323, 324, 325, 326
locus of slack analysis, 108, 109, 111, 145, 146
LRP. See Lateralized Readiness Potential
Magnetic Resonance Imaging (MRI), 351
marginal cumulative distribution function, 68
marginal density, 70
marginal selectivity, 366, 367, 368, 369
matrix factorization model for contingency tables, 341
mean reaction times
    predictions about, 359
measurable functions and random entities, 221, 222
measurable space, 221
measurement of mental processes
    early attempts at, 1
memory
    immediate
        effects of proactive interference and retention interval, 304, 306, 307
        effects of serial position and list length, 318, 321, 322, 323, 324, 325, 326
        effects of serial position and word frequency, 326, 327, 328
        effects of serial position, delay, and proactive interference, 307, 308, 309, 310, 311, 312, 313, 314, 316, 317, 318
        effects of sleep and retroactive interference, 328, 329, 330
        overview of, 302, 303, 304
    prospective, 302
memory scanning, 234, 235, 236, 238
memory scanning or memory search task, 4
mental processes
    early attempts at measurement of, 1
    inserting, 1, 2, 3
    not in series and not parallel, 6
    stretching, 3, 4, 5, 6
MICROSAINT, 34
Miller's Inequality, 197
Minkowski inequality, 376
Mixture Test, 157, 158, 160, 161, 164, 213
mixtures of processes
    Mixture Test, 157, 158, 160, 161, 164
model parameters, selectively influencing, 291, 292, 294
moments of response time, computing. See computing moments of response time
monotonic interaction contrasts
    additive factors and stages, 58, 59, 60, 61
    calculations and simulations, 33, 34
    concurrent processes
        exponential distributions, 35, 36
        OR networks, 38
        overview of, 34
        statistical considerations, 39
        truncated normal distributions, 36, 37
    limiting values of
        concurrent processes, 52, 53, 54
        overview of, 51, 52
        sequential processes, 54, 55, 56
    limits of, 62
    overview of, 31, 33
    sequential processes
        complete Wheatstone bridge, 45, 46, 50
        distinguishing from concurrent processes, 50, 51
        incomplete Wheatstone bridge, 43, 44, 45
        not in Wheatstone bridge, 40, 42, 43
        overview of, 39, 40
    sets of processes, 56, 57
    superprocesses, 57, 58
monotonic reaction time means, 80
monotonic response time means
    OR networks, 30
    overview of, 28, 29
    SOA in dual tasks, 30
monotonically increasing and monotonically decreasing functions, 65
motor time, 139, 140, 141
movement preparation, 127
MRI (Magnetic Resonance Imaging), 351
multinomial processing trees, 18, 300
multiplicative effects
    accuracy, 294, 295, 296, 297
    rates, 297, 299
networks. See also task networks
    connected, 18
networks for second stage of two-stage models, 230
nonBoolean gates, 15
nondecision process, 292
nonwords, 326
OP (Order-of-Processing) diagrams, 9
OR gates, 11
OR networks, 11, 90, 91
    monotonic interaction contrasts and, 38
    monotonic response time means and, 30
    Order-of-Processing diagrams and, 266, 268
ordered pairs, 311
ordered processes, 11
ordering random variables, 68, 69, 70
Order-of-Processing (OP) diagrams, 94
Order-of-Processing (OP) diagram
    AND networks and
        overview of, 256, 257
        states, 257
        transitions, 258, 259
    application of integrative chronometric analysis
        parameters, 287, 289
        Pashler and Johnston study, 272, 273, 287
        simulation, 289, 290
        Smith study, 268, 269, 270, 271, 287
    computing moments of response time
        overview of, 259
        process durations, 259, 260, 261, 262, 263, 264, 265, 266
    description of, 256
    general networks and, 285, 286
    generalization to other cognitive networks
        connectionist networks, 273, 275, 276, 277, 279
        overview of, 273
        queueing networks, 279, 280, 281, 282
    generalization to resource constrained systems, 282, 283, 284, 285
    OR networks, 266, 268
Order-of-Processing (OP) diagrams, 9
oxyhemoglobin, 351
parallel models, 8
parallel preattentive stage, testing of, 225, 226, 227, 228
parallel processes
    accuracy and, 293
    distribution function interaction contrasts, 170, 171, 172, 174, 175
parallel processing in memory scanning, 238
partitive arcs, 57
passive processes, 257
paths
    critical, 13
    duration of, 12
    geodesic, 13
    probability of, 18
    simple, 18
    vertexes and, 10
perception
    definition of, 1
perceptual classification, 243, 244, 245, 246, 248, 249, 251
PERT (Program Evaluation and Review Technique) networks, 11
PERT networks, 256
physiological measures
    additive areas of evoked potentials, 349, 350
    evoked potentials and Additive Amplitude Method, 345, 346, 347, 348, 349
    fMRI and additive BOLD signals, 350, 352, 353
positive sets, 4
precedence constraint arcs, 257
Presence/Absence Factor, 195
proactive interference
    retention interval and, 304, 306, 307
    serial position, delay, and, 307, 308, 309, 310, 311, 312, 313, 314, 316, 317, 318
probability measures, 66
probability of paths, 18
probability space, 221, 222
probability spaces, 65, 66, 68
process dependence
    capacity limits and, 191, 192, 193, 194, 195
    coactivation
        failure of selective influence with, 200, 201, 202
        overview of, 197, 198
        selective influence with, 198, 199
    cross-talk and, 195, 196, 197
    insertions
        failure of selective influence produced by dependencies, 202, 203, 204
        overview of, 202
        successful selective influence, 206
    overview of, 189, 190, 191
    selective influence and conditional independence
        example of proof not requiring independence, 212, 213
        overview of, 206, 211
process dissociation procedure, 16, 17
process dissociation procedure and processing trees, 300, 301
process durations
    computing moments of response time and
        general case, 262, 263, 264, 265, 266
        independent exponential, 259, 260, 261
        independent general gamma, 261, 262
    conditional independence of, 214
    stochastic independence of, 199, 214
process schedules
    acyclic task networks in Human Factors, 13, 14
    directed acyclic task networks and, 8, 9, 10, 11, 12, 13
    Gantt charts and, 7, 8, 9
    processing trees, 15, 17, 18
    systems not easily represented as processing trees, 19
    systems not easily represented in acyclic task networks, 14, 15
processes. See also specific processes and processing
    concurrent, 82, 84, 85
    duration of, 361
    effects of experimental factors on, 76, 77, 78, 79
    order of, 149, 150, 151
    selective influence of in dual tasks
        central processing in Task 1 and SOA, 100, 101, 102, 103, 105, 107
        central processing of Tasks 1 and 2, 120, 121, 122, 123, 124, 125, 126
        factors influencing, 99, 100
        overview of, 98, 99
        post-central and response processes, 126, 127, 128, 132, 133, 134, 135, 136, 137, 138, 139, 141, 142, 144, 145, 146
        sensory and central Task 2 processing, 119, 120
        SOA and Task 2 central processing, 111, 112, 113, 114, 116, 117, 118, 119
        SOA and Task 2 sensory processing, 107, 109, 111
    sequential, 85, 86, 88
processes in series
    cumulative distribution functions, 153, 154
    Summation Test, 154, 155, 156, 157
processing trees
    immediate memory
        effects of proactive interference and retention interval, 304, 306, 307
        effects of serial position and list length, 318, 321, 322, 323, 324, 325, 326
        effects of serial position and word frequency, 326, 327, 328
        effects of serial position, delay, and proactive interference, 307, 308, 309, 310, 311, 312, 313, 314, 316, 317, 318
        effects of sleep and retroactive interference, 328, 329, 330
        overview of, 302, 303, 304
    multinomial, 300
    overview of, 15, 17, 18, 299, 300
    process dissociation and inclusion-exclusion tasks, 300, 301
    prospective memory, 302
    source monitoring, 301, 302
    standard tree
        for multiplicatively interacting factors, 337
        for ordered processes, 332
        for unordered processes, 331
    systems not easily represented as, 19
    task networks and, 19
    tree inference
        generalization to rates, more response classes, and more influenced vertices, 336, 337, 338, 339, 340
        overview of, 330, 331, 332, 333, 334, 336
production systems, 15
Program Evaluation and Review Technique (PERT) networks, 11
Project Ernestine, 13
prospective memory and processing trees, 302
PRP. See psychological refractory period
psychological refractory period (PRP)
    central process order, 121, 122, 123
    description of, 93
    discriminability, 121
    number of alternatives, 120
    residual PRP effect, 134, 135, 136
    Welford on, 94
queueing networks
    generalization of OP diagram to, 279, 280, 281, 282
queuing network model, 94
Race Model Inequality, 197
random entities
    definition of, 363
random entities and measurable functions, 221, 222
random variables
    dependence between, 359, 360
    notation for, 67
    ordering, 68, 69, 70
    random entities and, 222
    univariate, 66
random vectors
    definition of, 67, 208, 360
    notation for, 67
    selective influence on, 210
    unrelated, 362
rank of matrices, 343
rates
    multiplicative effects, 297, 299
reached arcs, 18
reaction time density function, 158
Reaction-Time Distance Hypothesis, 292
recall
    word frequency and, 326
recency effect and list length, 321
recollection, familiarity compared to, 300
redundant signal paradigm, 192
refractory delays
    central and response limitations and, 97, 98
    dual task model and, 96
    response processing of two tasks and, 97
    single channel model and, 94, 95
removing mental processes, 1, 2, 3
residual PRP effect, 134, 135, 136
resource constrained systems, generalization of OP diagram to, 282, 283, 284, 285
response grouping
    Borger model of, 141, 142, 143, 144
    correlations between response times, 145
    interresponse intervals, 144
    locus of slack and, 145, 146
    response time, 144
response interdiction model, 97
response modality, manipulation of, 114, 116, 117, 119
response movement and motor time, 139, 140, 141
response selection, 127
Response Selection Bottleneck Model, 107, 109, 111
    failure of selective influence and, 203
    simulations of, 205, 206, 216, 217, 221
retention interval and immediate memory, 304, 306, 307
retroactive interference and sleep, 328, 329, 330
reverse hazard function, 194
R-locked potentials, 133
roots, 17
sample spaces
    definition of, 65
sample spaces for treatments, 361, 362
selective influence, 3
    Alternate Pathways Model and, 213
    Chapter 6 definition of, 364, 365
    conditional independence and
        example of proof not requiring independence, 212, 213
        overview of, 206, 211
    cosphericity test, 370, 371, 372, 373, 374, 375
    definition of, 362, 363
    Distance Test, 376, 377, 378, 379
    examples of, 365, 366
    failure of
        produced by dependencies, 202, 203, 204
        with coactivation, 200, 201, 202
    joint distribution criterion, 369, 370, 380
    marginal selectivity, 366, 367, 368, 369
    on random entities, 363
    process dependence and, 190
    successful, with process dependence, 204, 206
    with coactivation, 198, 199
selective influence by increments, 210
selective influence of processes in dual tasks
    central processes
        in Task 1 and SOA, 100, 101, 102, 103, 105, 107
        of Tasks 1 and 2, 120, 121, 122, 123, 124, 125, 126
        SOA and Task 2 central processing, 111, 112, 113, 114, 116, 117, 118, 119
    factors influencing, 99, 100
    overview of, 98, 99
    post-central and response processes
        LRP, 127, 128, 132, 133, 134, 135, 136
        motor time, 139, 141
        overview of, 126
        response grouping, 141, 142, 144, 145, 146
        scheduling of Task 2 sensory processing, 137, 138
        task switching and sensory processing, 138, 139
    remarks, 146, 148
    sensory and central Task 2 processing, 119, 120
    sensory processes
        SOA and Task 2 sensory processing, 107, 109, 111
selectively influencing processes in task networks
    effects of
        overview, 20, 21, 23, 24
        selective influence, 26, 27, 28
        slack, 24, 25, 26
self-terminating search models, 235
self-terminating systems, 244
sensory processing
    scheduling of for Task 2, 137, 138
    task switching and, 138, 139
sequential processes, 11, 85, 86, 88
    all times, 186
    limiting values of interaction contrasts, 54, 55, 56
    monotonic interaction contrasts
        complete Wheatstone bridge, 45, 46, 50
        distinguishing from concurrent processes, 50, 51
        incomplete Wheatstone bridge, 43, 44, 45
        not in Wheatstone bridge, 40, 42, 43
        overview of, 39, 40
    selectively influencing, 185
    small times, 185
sequential processing, using when concurrent processing is possible, 149
serial attentive processing stage, testing of, 228, 229, 230, 231, 234
serial exhaustive processing
    perceptual classification and, 246
serial models, 8
serial position
    delay, proactive interference, and, 307, 308, 309, 310, 311, 312, 313, 314, 316, 317, 318
    list length and, 318, 321, 322, 323, 324, 325, 326
    word frequency and, 326, 327, 328
serial processes
    distinguishing from mixtures, 158, 160
    distribution function interaction contrasts, 170, 171, 172, 174, 175
serial processing in memory scanning, 234, 235, 236
serial-parallel networks, 176
sigma algebra, 66
Signal Detection Theory (SDT), 292, 293
simple paths, 18
simulations
    details of, 216
    of Equations, 205
    of Response Selection Bottleneck Model, 205, 206, 217, 221
    of Wheatstone bridge, 206, 217, 221
    satisfying Mixture Test, 160, 161, 164
Single Central Bottleneck Model, 107, 111
    cross-talk and, 197
    motor time, 139, 140, 141
    OP diagrams and, 268, 269, 270, 271
    predictions of, 146
    process dependence and, 190
single channel theory, 7, 95
singleton searches, 224
slack
    in task networks, 24, 25, 26
slack in OR networks, 91
sleep and retroactive interference, 328, 329, 330
S-locked potentials, 132
smoothing graphs, 44
SOA (stimulus onset asynchrony), 107
source monitoring and processing trees, 301, 302
stages
    definition of, 3
    monotonic interaction contrasts and, 58, 59, 60, 61
starting vertex of arc, 9
states
    AND networks and, 257
statistical considerations
    monotonic interaction contrasts and, 39, 40
statistical mimicking, 164, 165
Sternberg memory scanning task, 4
stimulus discriminability, manipulation of, 106
stimulus onset asynchrony (SOA), 107
    in dual tasks, 30
stochastic independence and factor additivity, 359
stochastic independence of process durations, 199, 214
Stochastic Signal Detection Theory, 292, 293
stretching mental processes, 3, 4, 5, 6
Stroop tasks, 123, 124, 125, 126
structures for containing processes
    overview of, 7
subdividing graphs, 44
sublists, 367
subprocesses, 56
subtractive method of Donders, 1, 2
Summation Test, 154, 155, 156, 157, 161, 213
supercapacity processing, 194
superprocesses, 56, 57, 58
survivor function, 167, 169, 170, 214
survivor function interaction contrast, 243, 246, 248, 249
survivor interaction contrast, 168
target present and target absent trials, 11
Task Network Inference
    definition of, 13
task networks
    acyclic, in Human Factors, 13, 14
    definition of, 11
    directed acyclic, 8, 9, 10, 11, 12, 13
    effects of selectively influencing processes in
        overview of, 20, 21, 23, 24
        selective influence, 26, 27, 28
        slack, 24, 25, 26
    overview of, 175, 176
    processing trees and, 19
    results for, 178, 185, 186, 187, 188, 189
    simulation details, 216, 217, 221
    synopsis of results for, 177, 178
    systems not easily represented in acyclic, 14, 15
task switching and sensory processing, 138, 139
terminal vertex, 18
terminal vertex (processing trees), 299
terminology, 360
testing
    parallel preattentive stage, 225, 226, 227, 228
    serial attentive processing stage, 228, 229, 230, 231, 234
tests of equal distribution functions
    cumulative distribution functions, 153, 154
    Decomposition Test, 166, 167
    Mixture Test, 157, 158, 160, 161, 164
    overview of, 152, 153
    statistical mimicking, 164, 165
    Summation Test, 154, 155, 156, 157
Theorem 6.1, 170, 171, 172
Theorem 6.2, 172, 174, 175
Theorem 6.3, 187, 188, 189
Theorem 6.4, 212, 213
total slack, 24
transitions
    AND networks and, 258, 259
Transitive Orientation Algorithm, 51
transitive precedence, 10
traversed arcs, 18
tree inference and processing trees
    generalization to rates, more response classes, and more influenced vertices, 336, 337, 338, 339, 340
    overview of, 330, 331, 332, 333, 334, 336
trees, 18
truncated normal distributions
    concurrent processes, 36, 37
Two-High-Threshold Model, 301, 302
two-stage visual search structure, 223, 224, 225
underadditive interactions with SOA, 108
univariate random variables, 66
unlimited processing, 194
unordered pairs, 311
unordered processes, 11
unrelated random vectors, 362
usual stochastic order, 68, 69, 70
values
    of interaction contrasts, limiting
        concurrent processes, 52, 53, 54
        overview of, 51, 52
        sequential processes, 54, 55, 56
vectors
    notation for, 64
vertex
    description of, 57
    of articulation, 58
vertex (processing trees), 299
visual search
    concurrent time reproduction and, 238, 239, 240, 241, 243
    testing
        parallel preattentive stage, 225, 226, 227, 228
        serial attentive processing stage, 228, 229, 230, 231, 234
    two-stage models of, 223, 224, 225
weakly connected superprocess, 57
Wheatstone bridge
    complete, sequential processes on opposite sides of, 45, 46, 50
    incomplete, sequential processes on opposite sides of, 43, 44, 45
    sequential processes not on opposite sides of, 40, 42, 43
    simulations of, 206, 216, 217, 221
Wheatstone bridges
    sequential processes and, 88
word frequency and serial position, 326, 327, 328
