Network Systems
Francesco Bullo
With contributions by
Jorge Cortés
Florian Dörfler
Sonia Martínez
Lectures on Network Systems
Francesco Bullo
Version v0.85(i) (6 Aug 2016).
With contributions by J. Cortés, F. Dörfler, and S. Martínez
This document is intended for personal use: you are allowed to print this pdf
file and/or photocopy it. All other rights are reserved, e.g., this document (in
whole or in part) may not be posted online or shared in any way without express
consent. Copyright © 2012-16.
Contents

I Linear Systems
Bibliography

Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
Preface
Books which try to digest, coordinate, get rid of the duplication, get rid of the less fruitful methods,
and present clearly the underlying ideas of what we know now, will be the things the future
generations will value.
Richard Hamming (1915–1998)
Topics These lecture notes are intended for first-year graduate students interested in network systems,
distributed algorithms, and cooperative control. The objective is to answer basic questions such as: What are
fundamental dynamical models of interconnected systems? What are the essential dynamical properties
of these models and how are they related to network properties? What are basic estimation, control, and
optimization problems for these dynamical models?
The book is organized in three parts: Linear Systems, Topics in Averaging Systems, and Nonlinear
Systems. The Linear Systems part, together with the part on Topics in Averaging Systems, includes
(i) several key motivating example systems drawn from social, sensor, and compartmental networks,
as well as additional ones from robotics,
(ii) basic concepts and results in matrix and graph theory, with an emphasis on Perron–Frobenius theory,
algebraic graph theory, and linear dynamical systems,
(iii) averaging systems in discrete and continuous time, described by static, time-varying, and random
matrices, and
(iv) positive and compartmental systems, described by Metzler matrices, with examples from ecology,
epidemiology, and chemical kinetics.
The Nonlinear Systems part includes
(v) formation control and coordination problems for relative sensing networks,
(vi) networks of phase oscillator systems with an emphasis on the Kuramoto model and models of power
networks, and
(vii) virus propagation models, including lumped and network models as well as stochastic and deterministic models, and
(viii) population dynamic models, describing mutualism, competition and cooperation in multi-species
systems.
Teaching instructions These lecture notes are meant to be taught over a quarter-long course with a
total of 35 to 40 hours of contact time. On average, each chapter should require approximately 2 hours of
lecture time. Indeed, these lecture notes are an outgrowth of an introductory graduate course that I taught
at UC Santa Barbara over the last several years.
The intended audience is first-year graduate students in Engineering, Sciences, and Applied Mathematics
programs. For the first part on Linear Systems, the required background includes competency in linear
algebra and only very basic notions of dynamical systems. For the second part on Nonlinear Systems
(including coupled oscillators and virus propagation), the required background includes a calculus course.
The treatment is self-contained and does not require a nonlinear systems course.
For the benefit of instructors, these lecture notes are supplemented by three documents:
The book, in its three formats, is available for download at: http://motion.me.ucsb.edu/book-lns.
I am extremely grateful to Jorge Cortés and Sonia Martínez for their fundamental contribution to my understanding and our joint work on distributed algorithms and robotic networks; their scientific contribution is
most obviously present in
I am extremely grateful to Alessandro Giua for detailed comments and insightful suggestions; his input
helped shape the early chapters. I am grateful to Noah Friedkin for instructive discussions about social
influence networks that influenced Chapter 5. I wish to thank Sandro Zampieri and Wenjun Mei for their
contribution to Chapters 16 and 17 and to Stacy Patterson for adopting an early version of these notes and
providing me with detailed feedback. I wish to thank Jason Marden and Lucy Pao for their invitation to
visit the University of Colorado at Boulder and deliver an early version of these lecture notes.
I also would like to acknowledge the generous support received from funding agencies. This book
is based on work supported in part by the Army Research Office through grants W911NF-11-1-0092 and
W911NF-15-1-0577, the Air Force Office of Scientific Research through grant FA9550-15-1-0138, and the
National Science Foundation through grants CPS-1035917 and CPS-1135819.
A special thank you goes to all students who took this course and all scientists who read these notes.
Particular thanks go to Alex Olshevsky, Ashish Cherukuri, Bala Kameshwar Poolla, Basilio Gentile, Catalin
Arghir, Deepti Kannapan, Fabio Pasqualetti, Francesca Parise, John W. Simpson-Porco, Pedro Cisneros-Velarde, Peng Jia, Saber Jafarpour, Sepehr Seifi, Shadi Mohagheghi, Tyler Summers, and Vaibhav Srivastava
for their contributions to these lecture notes and related homework.
Finally, I wish to thank Gabriella, Marcello, Lily and my whole family for their loving support.
Part I
Linear Systems
Chapter 1
Motivating Problems and Systems
In this introductory chapter, we introduce some example problems and systems from multiple disciplines.
The objective is to motivate our treatment of linear network systems in the following chapters. We look at
the following examples:
(i) In the context of social influence networks, we discuss a classic reference on how opinions evolve
and possibly reach a consensus in groups of individuals. Here, consensus means that the opinions of
the individuals are identical.
(ii) In the context of wireless sensor networks, we discuss distributed simple averaging algorithms and, in
the appendix, two advanced design problems in the context of parameter estimation and hypothesis
testing.
(iii) In the context of compartmental networks, we discuss dynamical flows among compartments, such
as arising in ecosystems.
(iv) Finally, in the context of robotic networks, we discuss simple robotic behaviors for cyclic pursuit and
balancing.
In all cases we are interested in presenting the basic models and motivating interest in understanding their
dynamic behaviors, such as the existence and attractivity of equilibria.
We present additional linear examples in later chapters and nonlinear examples in the second part.
For a similar valuable list of related and instructive examples, we refer to (Hendrickx 2008, Chapter 9)
and (Garin and Schenato 2010, Section 3.3). Other examples of multi-agent systems and applications can be
found in the following texts (Ren and Beard 2008; Bullo et al. 2009; Mesbahi and Egerstedt 2010; Cristiani
et al. 2014; Fuhrmann and Helmke 2015; Francis and Maggiore 2016).
$$F_i^+ = \sum_{j=1}^{n} a_{ij} F_j,$$

Figure 1.1: Interactions in a social influence network

where $a_{ij}$ denotes the weight that individual $i$ assigns to the distribution of individual $j$ when carrying out this revision. More precisely, the coefficient $a_{ii}$ describes the attachment of individual $i$ to its own opinion and $a_{ij}$, $j \neq i$, is an interpersonal influence weight that individual $i$ accords to individual $j$.
In the DeGroot model, the coefficients $a_{ij}$ satisfy the following constraints: they are nonnegative,
that is, $a_{ij} \ge 0$, and, for each individual, the sum of self-weight and accorded weights equals 1, that is,
$\sum_{j=1}^{n} a_{ij} = 1$ for all $i$. In mathematical terms, the matrix
$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}$$
has nonnegative entries and each of its rows has unit sum. Such matrices are said to be row-stochastic.
(iii) Under what conditions do the distributions converge to consensus? What is the consensus value?
(iv) What are more realistic, empirically-motivated models, possibly including stubborn individuals or
antagonistic interactions?
1.2 Wireless sensor networks: averaging algorithms
Figure 1.2: A wireless sensor network composed of a collection of spatially-distributed sensors in a field and a
gateway node to carry information to an operator. The nodes are meant to measure environmental variables, such as
temperature, sound, pressure, and cooperatively filter and transmit the information to an operator.
A wireless sensor network is a collection of spatially-distributed devices capable of measuring physical and
environmental variables (e.g., temperature, vibrations, sound, light, etc), performing local computations,
and transmitting information to neighboring devices and, in turn, throughout the network (including,
possibly, an external operator).
Suppose that each node in a wireless sensor network has measured a scalar environmental quantity,
say $x_i$. Consider the following simple distributed algorithm, based on the concept of linear averaging:
each node repeatedly executes
$$x_i^+ := \operatorname{average}\bigl( x_i, \{ x_j, \text{ for all neighbor nodes } j \} \bigr), \qquad (1.1)$$
where $x_i^+$ denotes the new value of $x_i$. For example, for the graph in Figure 1.3, one can easily write $x_1^+ := (x_1 + x_2)/2$, $x_2^+ := (x_1 + x_2 + x_3 + x_4)/4$, and so forth. In summary, the algorithm's behavior is described by
$$x^+ = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 0 & 1/3 & 1/3 & 1/3 \\ 0 & 1/3 & 1/3 & 1/3 \end{bmatrix} x = A_{\text{wsn}}\, x,$$

Figure 1.3: Example graph

where the matrix $A_{\text{wsn}}$ is again row-stochastic.
Questions of interest are:
(i) Does each node converge to a value? Is this value the same for all nodes?
(ii) Is this value equal to the average of the initial conditions?
(iii) What properties do the graph and the corresponding matrix need to have in order for the algorithm
to converge?
(iv) How quick is the convergence?
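These questions can be probed numerically. The following sketch iterates update (1.1) on the example graph, with made-up initial measurements:

```python
import numpy as np

# The row-stochastic matrix A_wsn for the example graph.
A = np.array([[1/2, 1/2, 0,   0  ],
              [1/4, 1/4, 1/4, 1/4],
              [0,   1/3, 1/3, 1/3],
              [0,   1/3, 1/3, 1/3]])

x0 = np.array([1.0, 2.0, 3.0, 4.0])   # made-up initial measurements
x = x0.copy()
for _ in range(200):                  # repeatedly apply update (1.1)
    x = A @ x

print(np.round(x, 4))   # → [2.5833 2.5833 2.5833 2.5833]: consensus ...
print(np.mean(x0))      # → 2.5: ... but not on the average of x0
```

The nodes agree on a common value, $31/12 \approx 2.5833$, which is a weighted average of the initial values and differs from their plain average $2.5$.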
Figure 1.4: Water flow model for a desert ecosystem. The blue line denotes an inflow from the outside environment.
The red lines denote outflows into the outside environment.
If we let $q_i$ denote the amount of material in compartment $i$, the mass balance equation for the $i$th compartment is written as:
$$\dot q_i = \sum_{j \neq i} (F_{ji} - F_{ij}) - F_{i0} + u_i,$$
where $u_i$ is the inflow from the environment and $F_{i0}$ is the outflow into the environment. We now assume linear flows, that is, we assume that the flow $F_{ij}$ from node $i$ to node $j$ (as well as to the environment) is proportional to the mass quantity at $i$, that is, $F_{ij} = f_{ij} q_i$ for a positive flow rate constant $f_{ij}$. Therefore we can write
$$\dot q_i = \sum_{j \neq i} (f_{ji} q_j - f_{ij} q_i) - f_{i0} q_i + u_i$$
and so, in vector notation, there exists an appropriate matrix $C$ such that
$$\dot q = C q + u.$$
For example, let us write down the compartmental matrix $C$ for the water flow model in Figure 1.4. We let $q_1, q_2, q_3$ denote the water mass in soil, plants, and animals, respectively. Moreover, as in the figure, we let $f_{\text{e-d-r}}, f_{\text{trnsp}}, f_{\text{evap}}, f_{\text{drnk}}, f_{\text{uptk}}, f_{\text{herb}}$ denote respectively the evaporation-drainage-runoff, transpiration, evaporation, drinking, uptake, and herbivory rates. With these notations, we can write
$$C = \begin{bmatrix} -f_{\text{e-d-r}} - f_{\text{uptk}} - f_{\text{drnk}} & 0 & 0 \\ f_{\text{uptk}} & -f_{\text{trnsp}} - f_{\text{herb}} & 0 \\ f_{\text{drnk}} & f_{\text{herb}} & -f_{\text{evap}} \end{bmatrix}.$$
Questions of interest are:
(i) for constant inflows u, does the total mass in the system remain bounded?
(ii) is there an asymptotic equilibrium? do all evolutions converge to it?
(iii) which compartments become empty asymptotically?
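These questions can be explored numerically. Here is a minimal sketch with made-up flow rates, using forward-Euler integration (an illustration, not a method prescribed by the text):

```python
import numpy as np

# Made-up flow rates (per day) for the soil / plants / animals water model.
f_edr, f_trnsp, f_evap = 0.3, 0.2, 0.1
f_drnk, f_uptk, f_herb = 0.05, 0.25, 0.15

C = np.array([[-(f_edr + f_uptk + f_drnk), 0.0,                 0.0    ],
              [  f_uptk,                  -(f_trnsp + f_herb),  0.0    ],
              [  f_drnk,                    f_herb,            -f_evap ]])

u = np.array([1.0, 0.0, 0.0])     # constant rain inflow into the soil
q = np.zeros(3)

dt = 0.05                         # forward-Euler integration of dq/dt = Cq + u
for _ in range(40_000):
    q = q + dt * (C @ q + u)

# The evolution converges to the equilibrium q* solving C q* + u = 0,
# so the total mass in the system remains bounded.
q_star = np.linalg.solve(C, -u)
print(np.allclose(q, q_star, atol=1e-6))  # → True
```

For this choice of rates $C$ is invertible with negative eigenvalues, so there is a unique, globally attractive equilibrium and no compartment empties asymptotically.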
1.4 Appendix: Robotic networks in cyclic pursuit and balancing
where $\operatorname{mod}(\theta, 2\pi)$ is the remainder of the division of $\theta$ by $2\pi$ and its introduction is required to ensure that $\theta_i(k+1)$ remains inside $[0, 2\pi)$.
The n-bugs problem is related to the study of pursuit curves and inquires about what the paths of n bugs
are, not aligned initially, when they chase one another. We refer to (Watton and Kydon 1969; Bruckstein
et al. 1991; Marshall et al. 2004; Smith et al. 2005) for surveys and recent results.
In short, given a control gain $\kappa \in [0, 1]$, we assume that the $i$th bug sets its control signal to either the cyclic pursuit law or the cyclic balancing law:
$$u_{\text{pursuit},i} = \kappa \operatorname{dist}_{cc}(\theta_i, \theta_{i+1}), \qquad u_{\text{balancing},i} = \kappa \bigl( \operatorname{dist}_{cc}(\theta_i, \theta_{i+1}) - \operatorname{dist}_{c}(\theta_i, \theta_{i-1}) \bigr).$$
A preliminary analysis
It is unrealistic (among other aspects of this setup) to assume that the bugs know the absolute position
of themselves and of their neighbors. Therefore, it is interesting to rewrite the dynamical system in terms
of pairwise distances between nearby bugs.
For $i \in \{1, \ldots, n\}$, we define the relative angular distances (the lengths of the counterclockwise arcs) $d_i = \operatorname{dist}_{cc}(\theta_i, \theta_{i+1}) \ge 0$. (We also adopt the usual convention that $d_{n+1} = d_1$ and that $d_0 = d_n$.) The change of coordinates from $(\theta_1, \ldots, \theta_n)$ to $(d_1, \ldots, d_n)$ leads us to rewrite the cyclic pursuit and the cyclic balancing laws as:
$$u_{\text{pursuit},i}(k) = \kappa\, d_i, \qquad u_{\text{balancing},i}(k) = \kappa\, (d_i - d_{i-1}).$$
In this new set of coordinates, one can show that the cyclic pursuit and cyclic balancing systems are,
respectively,
1.5 Appendix: Design problems in wireless sensor networks
These are two linear time-invariant dynamical systems with state $d = (d_1, \ldots, d_n)$ and governing equations described by the two $n \times n$ matrices:
$$A_{\text{pursuit}} = \begin{bmatrix} 1-\kappa & \kappa & 0 & \cdots & 0 \\ 0 & 1-\kappa & \kappa & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1-\kappa & \kappa \\ \kappa & 0 & \cdots & 0 & 1-\kappa \end{bmatrix}, \qquad
A_{\text{balancing}} = \begin{bmatrix} 1-2\kappa & \kappa & 0 & \cdots & \kappa \\ \kappa & 1-2\kappa & \kappa & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & \kappa & 1-2\kappa & \kappa \\ \kappa & 0 & \cdots & \kappa & 1-2\kappa \end{bmatrix}.$$
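Given the matrices above, a short numerical sketch follows (the placement of the gain $\kappa$ reflects the reconstruction of the control laws, so treat it as an assumption):

```python
import numpy as np

def pursuit_matrix(n, kappa):
    # 1 - kappa on the diagonal, kappa on the cyclic superdiagonal.
    return (1 - kappa) * np.eye(n) + kappa * np.roll(np.eye(n), 1, axis=1)

def balancing_matrix(n, kappa):
    # 1 - 2*kappa on the diagonal, kappa on both cyclic off-diagonals.
    return ((1 - 2 * kappa) * np.eye(n)
            + kappa * np.roll(np.eye(n), 1, axis=1)
            + kappa * np.roll(np.eye(n), -1, axis=1))

n, kappa = 5, 0.3
Ap, Ab = pursuit_matrix(n, kappa), balancing_matrix(n, kappa)

# Both matrices are row-stochastic: d(k+1) = A d(k) is an averaging system.
assert np.allclose(Ap.sum(axis=1), 1) and np.allclose(Ab.sum(axis=1), 1)

d = np.random.default_rng(0).uniform(0.1, 2.0, n)
d = d / d.sum() * 2 * np.pi            # the n arc lengths must sum to 2*pi
for _ in range(500):
    d = Ap @ d
print(np.allclose(d, 2 * np.pi / n))   # → True: pursuit equalizes the spacing
```

Since $A_{\text{pursuit}}$ is doubly-stochastic and primitive for $\kappa \in (0,1)$, the arc lengths converge to their preserved average $2\pi/n$, i.e., the bugs become equally spaced.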
$$y_i = B_i \theta + v_i,$$
where $y_i \in \mathbb{R}^{m_i}$, $B_i$ is a known matrix, and $v_i$ is random measurement noise. We assume that
(A1) the noise vectors $v_1, \ldots, v_n$ are independent jointly-Gaussian variables with zero mean $\mathbb{E}[v_i] = 0_{m_i}$ and positive-definite covariance $\mathbb{E}[v_i v_i^\top] = \Sigma_i = \Sigma_i^\top$, for $i \in \{1, \ldots, n\}$; and
(A2) the measurement parameters satisfy the following two properties: $\sum_i m_i \ge m$ and the stacked matrix $\begin{bmatrix} B_1 \\ \vdots \\ B_n \end{bmatrix}$ is full rank.
Given the measurements $y_1, \ldots, y_n$, it is of interest to compute a least-square estimate of $\theta$, that is, an estimate of $\theta$ that minimizes a least-square error. Specifically, we aim to minimize the following weighted least-square error:
$$\min_{\hat\theta} \; \sum_{i=1}^{n} \bigl\| y_i - B_i \hat\theta \bigr\|^2_{\Sigma_i^{-1}} = \sum_{i=1}^{n} \bigl( y_i - B_i \hat\theta \bigr)^{\!\top} \Sigma_i^{-1} \bigl( y_i - B_i \hat\theta \bigr).$$
In this weighted least-square error, individual errors are weighted by their corresponding inverse covariance
matrices so that an accurate (respectively, inaccurate) measurement corresponds to a high (respectively,
low) error weight. With this particular choice of weights, the least-square estimate coincides with the
so-called maximum-likelihood estimate; see (Poor 1998) for more details. Under assumptions (A1) and (A2),
the optimal solution is
$$\hat\theta^* = \Bigl( \sum_{i=1}^{n} B_i^\top \Sigma_i^{-1} B_i \Bigr)^{-1} \Bigl( \sum_{i=1}^{n} B_i^\top \Sigma_i^{-1} y_i \Bigr).$$
This formula is easy to implement by a single processor with all the information about the problem, i.e., the
parameters and the measurements.
To compute $\hat\theta^*$ in the sensor (and processor) network, we perform two steps:
Step 1: we run two distributed algorithms in parallel to compute the averages of the quantities $B_i^\top \Sigma_i^{-1} B_i$ and $B_i^\top \Sigma_i^{-1} y_i$.
Step 2: we compute the optimal estimate via
$$\hat\theta^* = \operatorname{average}\bigl( B_1^\top \Sigma_1^{-1} B_1, \ldots, B_n^\top \Sigma_n^{-1} B_n \bigr)^{-1} \operatorname{average}\bigl( B_1^\top \Sigma_1^{-1} y_1, \ldots, B_n^\top \Sigma_n^{-1} y_n \bigr).$$
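The two-step computation can be checked against the centralized formula; the sizes, matrices, and noise below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2                        # 4 sensors, 2 unknown parameters (made up)
theta = np.array([1.0, -2.0])      # "true" parameter, used to generate data

B = [rng.standard_normal((3, m)) for _ in range(n)]            # known B_i
Sigma = [np.diag(rng.uniform(0.5, 2.0, 3)) for _ in range(n)]  # covariances
y = [B[i] @ theta + rng.multivariate_normal(np.zeros(3), Sigma[i])
     for i in range(n)]

# Centralized solution: (sum_i B_i^T S_i^-1 B_i)^-1 (sum_i B_i^T S_i^-1 y_i).
M = sum(B[i].T @ np.linalg.inv(Sigma[i]) @ B[i] for i in range(n))
b = sum(B[i].T @ np.linalg.inv(Sigma[i]) @ y[i] for i in range(n))
theta_central = np.linalg.solve(M, b)

# Steps 1-2: replace the sums by averages; the two factors 1/n cancel.
theta_avg = np.linalg.solve(M / n, b / n)
print(np.allclose(theta_central, theta_avg))  # → True
```

The two estimates coincide because dividing both the matrix and the vector by $n$ leaves the solution of the linear system unchanged.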
Also, assume that each observation is conditionally independent of all other observations, given any
hypothesis.
(i) We wish to compute the maximum a posteriori estimate, that is, we want to identify which one is the most likely hypothesis, given the measurements. Note that, under the independence assumption, Bayes' Theorem implies that the a posteriori probabilities satisfy
$$p(h_\gamma \mid y_1, \ldots, y_n) = \frac{p(h_\gamma)}{p(y_1, \ldots, y_n)} \prod_{i=1}^{n} p(y_i \mid h_\gamma).$$
(ii) Observe that $p(h_\gamma)$ is known, and $p(y_1, \ldots, y_n)$ is a constant normalization factor scaling all a posteriori probabilities equally. Therefore, for each hypothesis $\gamma$, we need to compute
$$\prod_{i=1}^{n} p(y_i \mid h_\gamma),$$
or, equivalently, its logarithm $\sum_{i=1}^{n} \log p(y_i \mid h_\gamma)$.
(iii) In summary, even in this hypothesis testing problem, we need algorithms to compute the average of the $n$ numbers $\log p(y_1 \mid h_\gamma), \ldots, \log p(y_n \mid h_\gamma)$, for each hypothesis $\gamma$.
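A minimal numerical sketch of this computation (the Gaussian measurement model and the data are made up for illustration):

```python
import numpy as np

# Hypothetical binary test: under h0 the measurements are N(0, 1),
# under h1 they are N(1, 1); both hypotheses are equally likely a priori.
means = {"h0": 0.0, "h1": 1.0}
prior = {"h0": 0.5, "h1": 0.5}

# Made-up measurements collected by n = 10 nodes.
y = np.array([0.9, 1.2, 0.8, 1.5, 1.1, 0.7, 1.3, 0.95, 1.05, 1.2])

def log_lik(y, mu):
    """Log of the scalar Gaussian density N(mu, 1) at each measurement."""
    return -0.5 * (y - mu) ** 2 - 0.5 * np.log(2 * np.pi)

# The MAP decision needs, for each hypothesis, the sum of the per-node
# log-likelihoods, i.e., n times their average; that average is exactly
# the quantity a distributed averaging algorithm provides.
scores = {g: np.log(prior[g]) + len(y) * np.mean(log_lik(y, mu))
          for g, mu in means.items()}
print(max(scores, key=scores.get))  # → h1
```

Since the data cluster around 1, the averaged log-likelihoods correctly favor hypothesis h1.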
1.6 Exercises
E1.1 Simulating the averaging dynamics. Simulate in your favorite programming language and software pack-
age the linear averaging algorithm in equation (1.1). Set n = 5, select the initial state equal to (1, 1, 1, 1, 1),
and use the following undirected unweighted graphs, depicted in Figure E1.1:
(i) the complete graph,
(ii) the ring graph, and
(iii) the star graph with node 1 as center.
Which value do all nodes converge to? Is it equal to the average of the initial values? Turn in your code, a few
printouts (as few as possible), and your written responses.
Figure E1.1: Complete graph, ring graph and star graph with 5 nodes
E1.2 Computing the bugs dynamics. Consider the cyclic pursuit and balancing dynamics described in Section 1.4.
Verify
(i) the cyclic pursuit closed-loop equation (1.2),
(ii) the cyclic balancing closed-loop equation (1.3), and
(iii) the counterclockwise order of the bugs is never violated.
Hint: Recall the distributive property of modular addition: $\operatorname{mod}(a + b, n) = \operatorname{mod}(\operatorname{mod}(a, n) + \operatorname{mod}(b, n), n)$.
Chapter 2
Elements of Matrix Theory
We review here basic concepts from matrix theory. These concepts will be useful when analyzing graphs
and averaging algorithms defined over graphs.
In particular we are interested in understanding the convergence of the linear dynamical systems
discussed in Chapter 1. Some of those systems are described by matrices that have nonnegative entries and
have row-sums equal to 1.
Notation
It is useful to start with some basic notation from matrix theory and linear algebra. We let $f : X \to Y$ denote a function from set $X$ to set $Y$. We let $\mathbb{R}$, $\mathbb{N}$ and $\mathbb{Z}$ denote respectively the set of real, natural and integer numbers; also $\mathbb{R}_{\ge 0}$ and $\mathbb{Z}_{\ge 0}$ are the sets of nonnegative real numbers and nonnegative integer numbers. For real numbers $a < b$, we let
Given a complex number $z \in \mathbb{C}$, its norm (sometimes referred to as complex modulus) is denoted by $|z|$, its real part by $\Re(z)$ and its imaginary part by $\Im(z)$. We let $\mathrm{i}$ denote the imaginary unit $\sqrt{-1}$.
We let $1_n \in \mathbb{R}^n$ (respectively $0_n \in \mathbb{R}^n$) be the column vector with all entries equal to $+1$ (respectively $0$). Let $e_1, \ldots, e_n$ be the standard basis vectors of $\mathbb{R}^n$, that is, $e_i$ has all entries equal to zero except for the $i$th entry, which equals 1.
We let $I_n$ denote the $n$-dimensional identity matrix and $A \in \mathbb{R}^{n \times n}$ denote a square $n \times n$ matrix with real entries $\{a_{ij}\}$, $i, j \in \{1, \ldots, n\}$. The matrix $A$ is symmetric if $A^\top = A$.
For a matrix $A$, $\lambda \in \mathbb{C}$ is an eigenvalue and $v \in \mathbb{C}^n$ is a right eigenvector, or simply an eigenvector, if they together satisfy the eigenvalue equation $A v = \lambda v$. Sometimes it will be convenient to refer to $(\lambda, v)$ as an eigenpair. A left eigenvector of the eigenvalue $\lambda$ is a vector $w \in \mathbb{C}^n$ satisfying $w^\top A = \lambda w^\top$.
A symmetric matrix is positive definite (resp. positive semidefinite) if all its eigenvalues are positive (resp. nonnegative). The kernel of $A$ is the subspace $\operatorname{kernel}(A) = \{x \in \mathbb{R}^n \mid A x = 0_n\}$, the image of $A$ is $\operatorname{image}(A) = \{y \in \mathbb{R}^n \mid A x = y \text{ for some } x \in \mathbb{R}^n\}$, and the rank of $A$ is the dimension of its image. Given vectors $v_1, \ldots, v_j \in \mathbb{R}^n$, their span is $\operatorname{span}(v_1, \ldots, v_j) = \{a_1 v_1 + \cdots + a_j v_j \mid a_1, \ldots, a_j \in \mathbb{R}\} \subseteq \mathbb{R}^n$.
Definition 2.1 (Discrete-time linear system). A square matrix $A$ defines a discrete-time linear system by
$$x(k+1) = A\, x(k), \qquad x(0) = x_0, \qquad (2.1)$$
or, equivalently, by $x(k) = A^k x_0$, where the sequence $\{x(k)\}_{k \in \mathbb{Z}_{\ge 0}}$ is called the solution, trajectory or evolution of the system.
Sometimes it is convenient to adopt the shorthand x+ = f (x) to denote the system x(k + 1) = f (x(k)).
We are interested in understanding when a solution from an arbitrary initial condition has an asymptotic
limit as time diverges and to what value the solution converges. We formally define this property as follows.
$$\lim_{k \to +\infty} x(k) = A_\infty x_0.$$
Remark 2.3 (Modal decomposition for symmetric matrices). Before treating the general analysis method, we present the self-contained and instructive case of symmetric matrices. Recall that a symmetric matrix $A$ has real eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and corresponding orthonormal (i.e., orthogonal and unit-length) eigenvectors $v_1, \ldots, v_n$. Because the eigenvectors are an orthonormal basis for $\mathbb{R}^n$, we can write the modal decomposition
$$x(k) = y_1(k) v_1 + \cdots + y_n(k) v_n,$$
where the $i$th normal mode is defined by $y_i(k) = v_i^\top x(k)$. We then left-multiply the two equalities (2.1) by $v_i^\top$ and exploit $A v_i = \lambda_i v_i$ to obtain $y_i(k+1) = \lambda_i y_i(k)$, that is, $y_i(k) = \lambda_i^k y_i(0)$.
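A quick numerical check of the modal decomposition (the symmetric matrix is made up for illustration):

```python
import numpy as np

A = np.array([[0.5, 0.2, 0.0],
              [0.2, 0.4, 0.1],
              [0.0, 0.1, 0.3]])          # symmetric, made up for illustration
lam, V = np.linalg.eigh(A)               # columns of V: orthonormal eigenvectors

x0 = np.array([1.0, -1.0, 2.0])
k = 7

xk = np.linalg.matrix_power(A, k) @ x0   # direct evolution x(k) = A^k x0
y0 = V.T @ x0                            # initial normal modes y_i(0) = v_i^T x0
xk_modal = V @ (lam**k * y0)             # sum_i lambda_i^k y_i(0) v_i
print(np.allclose(xk, xk_modal))  # → True
```

The direct power iteration and the modal reconstruction agree, as the remark asserts.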
2.1 Linear systems and the Jordan normal form
(i) $\lim_{k\to\infty} x(k) = 0_n$ if and only if $|\lambda_i| < 1$ for all $i \in \{1, \ldots, n\}$, and
(ii) $\lim_{k\to\infty} x(k) = (v_1^\top x_0) v_1 + \cdots + (v_m^\top x_0) v_m$ if and only if $\lambda_1 = \cdots = \lambda_m = 1$ and $|\lambda_i| < 1$ for all $i \in \{m+1, \ldots, n\}$.
Theorem 2.4 (Jordan normal form). Each matrix $A \in \mathbb{C}^{n \times n}$ is similar to a block diagonal matrix $J \in \mathbb{C}^{n \times n}$, called the Jordan normal form of $A$, given by
$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_m \end{bmatrix} \in \mathbb{C}^{n \times n},$$
where each block $J_i$, called a Jordan block, is a square matrix of size $j_i$ and of the form
$$J_i = \begin{bmatrix} \lambda_i & 1 & & 0 \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_i \end{bmatrix} \in \mathbb{C}^{j_i \times j_i}. \qquad (2.2)$$
Clearly, $m \le n$ and $j_1 + \cdots + j_m = n$.
We refer to (Horn and Johnson 1985) for a standard proof of this theorem. In other words, Theorem 2.4 implies there exists an invertible matrix $T$ such that
$$A = T J T^{-1}, \qquad (2.3)$$
or, equivalently,
$$A T = T J, \qquad (2.4)$$
or, equivalently,
$$T^{-1} A = J T^{-1}. \qquad (2.5)$$
The matrix $J$ is unique, modulo a re-ordering of the Jordan blocks. The eigenvalues of $J$, and therefore also of $A$, are the (not necessarily distinct) numbers $\lambda_1, \ldots, \lambda_m$. Given an eigenvalue $\lambda$,
(i) the algebraic multiplicity of $\lambda$ is the sum of the sizes of all Jordan blocks with eigenvalue $\lambda$ (or, equivalently, the multiplicity of $\lambda$ as a root of the characteristic polynomial of $A$), and
(ii) the geometric multiplicity of $\lambda$ is the number of Jordan blocks with eigenvalue $\lambda$ (or, equivalently, the number of linearly-independent eigenvectors associated to $\lambda$).
An eigenvalue is
(i) simple if it has algebraic and geometric multiplicity equal precisely to 1, that is, a single Jordan block
of size 1, and
(ii) semisimple if all its Jordan blocks have size 1, so that its algebraic and geometric multiplicity are
equal.
Let $t_1, \ldots, t_n$ and $r_1, \ldots, r_n$ denote the columns and rows of $T$ and $T^{-1}$ respectively. If all eigenvalues of $A$ are semisimple, then the equations (2.4) and (2.5) imply, for all $i \in \{1, \ldots, n\}$,
$$A t_i = \lambda_i t_i \quad \text{and} \quad r_i A = \lambda_i r_i.$$
In other words, the $i$th column of $T$ is the right eigenvector (or simply eigenvector) of $A$ corresponding to the eigenvalue $\lambda_i$, and the $i$th row of $T^{-1}$ is the corresponding left eigenvector of $A$.
Finally, it is possible to have eigenvalues with larger algebraic than geometric multiplicity. In this case, the columns of the matrix $T$ are the right eigenvectors and the generalized right eigenvectors of $A$, whereas the rows of $T^{-1}$ are the left eigenvectors and the generalized left eigenvectors of $A$. For more details about generalized eigenvectors, we refer the reader to (Horn and Johnson 1985).
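A small numerical illustration of the gap between algebraic and geometric multiplicity, using a single Jordan block (made up for illustration):

```python
import numpy as np

lam = 2.0                          # a 2x2 Jordan block with eigenvalue 2
A = np.array([[lam, 1.0],
              [0.0, lam]])

# Geometric multiplicity = dim kernel(A - lam*I) = 2 - rank(A - lam*I).
geo_mult = 2 - np.linalg.matrix_rank(A - lam * np.eye(2))
print(geo_mult)  # → 1, while the algebraic multiplicity is 2

# e1 is an eigenvector; e2 is a generalized eigenvector: (A - lam*I) e2 = e1.
e1, e2 = np.eye(2)
assert np.allclose(A @ e1, lam * e1)
assert np.allclose((A - lam * np.eye(2)) @ e2, e1)
```

Here $T = I_2$: its first column is the eigenvector and its second column is the generalized eigenvector.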
Example 2.5 (Revisiting the wireless sensor network example). Next, as a numerical example, let us reconsider the wireless sensor network discussed in Section 1.2 and the 4-dimensional row-stochastic matrix $A_{\text{wsn}}$, which we report here for convenience:
$$A_{\text{wsn}} = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 0 & 1/3 & 1/3 & 1/3 \\ 0 & 1/3 & 1/3 & 1/3 \end{bmatrix}.$$
Therefore, the eigenvalues of $A_{\text{wsn}}$ are $1$, $0$, $\tfrac{1}{24}(5 - \sqrt{73}) \approx -0.15$, and $\tfrac{1}{24}(5 + \sqrt{73}) \approx 0.56$. Corresponding to the eigenvalue 1, the right and left eigenvector equations are:
$$A_{\text{wsn}} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1/6 \\ 1/3 \\ 1/4 \\ 1/4 \end{bmatrix}^{\!\top} A_{\text{wsn}} = \begin{bmatrix} 1/6 \\ 1/3 \\ 1/4 \\ 1/4 \end{bmatrix}^{\!\top}.$$
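These computations are easy to verify numerically:

```python
import numpy as np

A = np.array([[1/2, 1/2, 0,   0  ],
              [1/4, 1/4, 1/4, 1/4],
              [0,   1/3, 1/3, 1/3],
              [0,   1/3, 1/3, 1/3]])

# Eigenvalues: 1, 0, (5 - sqrt(73))/24, (5 + sqrt(73))/24.
expected = np.sort([1, 0, (5 - np.sqrt(73)) / 24, (5 + np.sqrt(73)) / 24])
assert np.allclose(np.sort(np.linalg.eigvals(A).real), expected)

ones = np.ones(4)                       # right eigenvector for eigenvalue 1
w = np.array([1/6, 1/3, 1/4, 1/4])      # left eigenvector for eigenvalue 1
print(np.allclose(A @ ones, ones), np.allclose(w @ A, w))  # → True True
```

Note that the left eigenvector is normalized so that its entries sum to 1, a convention used again in the Perron–Frobenius results below.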
so that, for a square matrix $A$ with Jordan blocks $J_i$, $i \in \{1, \ldots, m\}$, the following statements are equivalent:
(i) A is semi-convergent (resp. convergent),
(ii) J is semi-convergent (resp. convergent), and
(iii) each block Ji is semi-convergent (resp. convergent).
Next, we compute the $k$th power of the generic Jordan block $J_i$ with eigenvalue $\lambda_i$ as a function of the block size $1, 2, 3, \ldots, j_i$; they are, respectively,
$$\begin{bmatrix} \lambda_i^k \end{bmatrix}, \quad
\begin{bmatrix} \lambda_i^k & k \lambda_i^{k-1} \\ 0 & \lambda_i^k \end{bmatrix}, \quad
\begin{bmatrix} \lambda_i^k & k \lambda_i^{k-1} & \binom{k}{2} \lambda_i^{k-2} \\ 0 & \lambda_i^k & k \lambda_i^{k-1} \\ 0 & 0 & \lambda_i^k \end{bmatrix}, \quad \ldots, \quad
\begin{bmatrix} \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} & \binom{k}{2} \lambda_i^{k-2} & \cdots & \binom{k}{j_i-1} \lambda_i^{k-j_i+1} \\ 0 & \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} & \ddots & \vdots \\ \vdots & & \ddots & \ddots & \binom{k}{2} \lambda_i^{k-2} \\ 0 & 0 & \cdots & \lambda_i^k & \binom{k}{1} \lambda_i^{k-1} \\ 0 & 0 & \cdots & 0 & \lambda_i^k \end{bmatrix}, \qquad (2.6)$$
where the binomial coefficient $\binom{k}{m} = k!/(m!(k-m)!)$ satisfies $\binom{k}{m} \le k^m/m!$.
Note that, independently of the size of $J_i$, each entry of the $k$th power of $J_i$ is upper bounded by a constant times $k^h \lambda_i^k$ for some nonnegative integer $h$. Because exponentially-decaying factors dominate polynomially-growing terms, we know
$$\lim_{k\to\infty} k^h \lambda^k = \begin{cases} 0, & \text{if } |\lambda| < 1, \\ 1, & \text{if } \lambda = 1 \text{ and } h = 0, \\ \text{non-existent or unbounded}, & \text{if } (|\lambda| \ge 1 \text{ and } \lambda \neq 1) \text{ or } (\lambda = 1 \text{ and } h = 1, 2, \ldots). \end{cases} \qquad (2.7)$$
In summary, for each block $J_i$ with eigenvalue $\lambda_i$, we can infer that:
(i) a block $J_i$ of size 1 is convergent if and only if $|\lambda_i| < 1$,
(ii) a block $J_i$ of size 1 is semi-convergent and not convergent if and only if $\lambda_i = 1$, and
(iii) a block $J_i$ of size larger than 1 is semi-convergent and convergent if and only if $|\lambda_i| < 1$.
Based on this discussion, we are now ready to present necessary and sufficient conditions for semi-
convergence and convergence of an arbitrary square matrix.
We complete this discussion with two useful definitions and the main result of this section.
Figure 2.1: (a) the spectrum of a convergent matrix; (b) the spectrum of a semi-convergent matrix, provided the eigenvalue 1 is semisimple; (c) the spectrum of a matrix that is not semi-convergent.
Definition 2.6 (Spectrum and spectral radius of a matrix). Given a square matrix $A$, its spectrum $\operatorname{spec}(A)$ is the set of its eigenvalues, and its spectral radius is
$$\rho(A) = \max\{ |\lambda| \mid \lambda \in \operatorname{spec}(A) \},$$
or, equivalently, the radius of the smallest disk in $\mathbb{C}$ centered at the origin and containing the spectrum of $A$.
Theorem 2.7 (Convergence and spectral radius). For a square matrix $A$, the following statements hold:
(i) $A$ is convergent if and only if $\rho(A) < 1$, and
(ii) $A$ is semi-convergent if and only if $\rho(A) < 1$, or $\rho(A) = 1$, the number 1 is the only eigenvalue of unit magnitude, and 1 is semisimple.
A square matrix $A$ is said to be:
(i) nonnegative (respectively positive) if $a_{ij} \ge 0$ (respectively $a_{ij} > 0$) for all $i$ and $j$ in $\{1, \ldots, n\}$;
(ii) row-stochastic if nonnegative and $A 1_n = 1_n$;
(iii) column-stochastic if nonnegative and $A^\top 1_n = 1_n$; and
(iv) doubly-stochastic if it is row- and column-stochastic.
2.2 Row-stochastic matrices and their spectral radius
In the following, we write $A > 0$ and $v > 0$ (respectively $A \ge 0$ and $v \ge 0$) for a positive (respectively nonnegative) matrix $A$ and vector $v$.
Given a finite number of points $p_1, p_2, \ldots, p_n$ in $\mathbb{R}^n$, a convex combination of $p_1, p_2, \ldots, p_n$ is a point of the form
$$\lambda_1 p_1 + \lambda_2 p_2 + \cdots + \lambda_n p_n,$$
where the real numbers $\lambda_1, \ldots, \lambda_n$ satisfy $\lambda_1 + \cdots + \lambda_n = 1$ and $\lambda_i \ge 0$ for all $i \in \{1, \ldots, n\}$. (For example, on the plane $\mathbb{R}^2$, the set of convex combinations of two distinct points is the segment connecting them and the set of convex combinations of three distinct points is the triangle (including its interior) defined by them.) The numbers $\lambda_1, \ldots, \lambda_n$ are called convex combination coefficients and each row of a row-stochastic matrix consists of convex combination coefficients.
Theorem 2.8 (Geršgorin Disks Theorem). For any square matrix $A \in \mathbb{R}^{n \times n}$,
$$\operatorname{spec}(A) \subseteq \bigcup_{i \in \{1,\ldots,n\}} \underbrace{\Bigl\{ z \in \mathbb{C} \;\Bigm|\; |z - a_{ii}| \le \sum_{j=1, j\neq i}^{n} |a_{ij}| \Bigr\}}_{\text{disk in the complex plane centered at } a_{ii} \text{ with radius } \sum_{j=1, j\neq i}^{n} |a_{ij}|}.$$
Proof. Consider the eigenvalue equation $A x = \lambda x$ for the eigenpair $(\lambda, x)$, where $\lambda$ and $x \neq 0_n$ are in general complex. Choose the index $i \in \{1, \ldots, n\}$ so that $|x_i| = \max_{j \in \{1,\ldots,n\}} |x_j| > 0$. The $i$th component of the eigenvalue equation can be rewritten as $\lambda - a_{ii} = \sum_{j=1, j\neq i}^{n} a_{ij} x_j / x_i$. Now, take the complex magnitude of this equality and upper-bound its right-hand side:
$$|\lambda - a_{ii}| = \Bigl| \sum_{j=1, j\neq i}^{n} a_{ij} \frac{x_j}{x_i} \Bigr| \le \sum_{j=1, j\neq i}^{n} |a_{ij}| \frac{|x_j|}{|x_i|} \le \sum_{j=1, j\neq i}^{n} |a_{ij}|.$$
This inequality defines a set of possible locations for the arbitrary eigenvalue $\lambda$ of $A$. The statement follows by taking the union of such sets for each eigenvalue of $A$.
Each disk in the theorem statement is referred to as a Geršgorin disk, or more accurately, as a Geršgorin row disk; an analogous disk theorem can be stated for Geršgorin column disks. Exercise E2.16 showcases an instructive application to distributed computing of numerous topics covered so far, including convergence notions and the Geršgorin Disks Theorem.
Proof. First, recall that $A$ being row-stochastic is equivalent to two facts: $a_{ij} \ge 0$ for $i, j \in \{1, \ldots, n\}$, and $A 1_n = 1_n$. The second fact implies that $1_n$ is an eigenvector with eigenvalue 1. Therefore, by definition of spectral radius, $\rho(A) \ge 1$. Next, we prove that $\rho(A) \le 1$ by invoking the Geršgorin Disks Theorem 2.8 to show that $\operatorname{spec}(A)$ is contained in the unit disk centered at the origin. The Geršgorin disks of a row-stochastic matrix are illustrated in Figure 2.2.
Figure 2.2: All Geršgorin disks of a row-stochastic matrix are contained in the unit disk; the $i$th disk is centered at $a_{ii}$ and has radius $\sum_{j\neq i} a_{ij}$.
Note that $A$ being row-stochastic implies $a_{ii} \in [0, 1]$ and $a_{ii} + \sum_{j\neq i} a_{ij} = 1$. Hence, the center of the $i$th Geršgorin disk belongs to the real axis between 0 and 1, and the right-most point in the disk is at 1.
Note: because 1 is an eigenvalue of each row-stochastic matrix A, clearly A is not convergent. But it is
possible for A to be semi-convergent.
2.3 Perron–Frobenius theory
Based on these preliminary examples, we now introduce two sets of nonnegative matrices with certain
characteristic properties.
Note that $A_1$, $A_3$ and $A_5$ are reducible whereas $A_2$ and $A_4$ are irreducible. Moreover, note that $A_2$ is not
primitive whereas A4 is. Additionally note that a positive matrix is clearly primitive. Finally, note that, if
there is k N such that Ak is positive, then (one can show that) all subsequent powers Ak+1 , Ak+2 , . . .
are necessarily positive as well; see Exercise E2.5.
We now state a useful result and postpone its proof to Exercise E4.5.
Lemma 2.11 (A primitive matrix is irreducible). If a nonnegative matrix is primitive, then it is also
irreducible.
As a consequence of this lemma we can draw the set diagram in Figure 2.3 describing the set of
nonnegative square matrices and its subsets of irreducible, primitive and positive matrices. Note that the
inclusions in the diagram are strict in the sense that:
Figure 2.3: The set of nonnegative square matrices (A ≥ 0) and its subsets of irreducible (∑_{k=0}^{n−1} A^k > 0), primitive (there exists k such that A^k > 0) and positive (A > 0) matrices.
Remark 2.13 (Examples and counterexamples). The characterizations in the theorem are sharp in the
following sense:
(i) the matrix A_3 = [ 0, 1 ; 0, 0 ] is nonnegative and reducible and, indeed, its dominant eigenvalue is 0;
(ii) the matrix A_2 = [ 0, 1 ; 1, 0 ] is irreducible but not primitive and, indeed, its dominant eigenvalue +1 is not
strictly larger, in magnitude, than the other eigenvalue −1.
Proposition 2.14 (Powers of primitive matrices). For a primitive matrix A with dominant eigenvalue λ
and with dominant right and left eigenvectors v and w normalized so that v^T w = 1, we have
lim_{k→∞} A^k/λ^k = v w^T.
We now apply this result to row-stochastic matrices. Recall that A ≥ 0 is row-stochastic if A 1_n = 1_n.
Therefore, the right eigenvector of the eigenvalue 1 can be selected as 1_n.
Corollary 2.15 (Consensus for primitive row-stochastic). For a primitive row-stochastic matrix A,
(i) the simple eigenvalue ρ(A) = 1 is strictly larger than the magnitude of all other eigenvalues, hence A is
semi-convergent;
(ii) lim_{k→∞} A^k = 1_n w^T, where w is the positive left eigenvector of A with eigenvalue 1 satisfying w_1 +
⋯ + w_n = 1;
(iii) the solution to x(k + 1) = Ax(k) satisfies
lim_{k→∞} x(k) = (w^T x(0)) 1_n;
(iv) if additionally A is doubly-stochastic, then w = 1_n/n and
lim_{k→∞} x(k) = (1_n^T x(0)/n) 1_n = average(x(0)) 1_n.
In this case we say that the dynamical system achieves average consensus.
Note: 1_n w^T is the n × n matrix whose rows are all equal to w^T = [w_1 w_2 ⋯ w_n], and (1_n w^T)x(0) = (w^T x(0)) 1_n is the vector whose entries are all equal to w^T x(0).
Note: the limiting vector is therefore a weighted average of the initial conditions. The relative weights
of the initial conditions are the convex combination coefficients w1 , . . . , wn . In a social influence network,
the coefficient wi is regarded as the social influence of agent i. An early reference to average consensus
is (Harary 1959).
Example 2.16 (Revisiting the wireless sensor network example). Finally, as a numerical example, let
us reconsider the wireless sensor network discussed in Section 1.2 and the 4-dimensional row-stochastic matrix
A_wsn. First, note that A_wsn is primitive because A_wsn^2 is positive:
A_wsn =
  [ 1/2   1/2   0     0   ]
  [ 1/4   1/4   1/4   1/4 ]
  [ 0     1/3   1/3   1/3 ]
  [ 0     1/3   1/3   1/3 ]
so that
A_wsn^2 =
  [ 3/8    3/8    1/8    1/8   ]
  [ 3/16   17/48  11/48  11/48 ]
  [ 1/12   11/36  11/36  11/36 ]
  [ 1/12   11/36  11/36  11/36 ].
Therefore, the Perron–Frobenius Theorem 2.12 for primitive matrices applies to A_wsn. The four pairs of
eigenvalues and right eigenvectors of A_wsn (as computed in Example 2.5) are:
(1, 1_4),
( (5 + √73)/24, [ (−2 − 2√73)/8, (−11 + √73)/8, 1, 1 ]^T ),
( (5 − √73)/24, [ (−2 + 2√73)/8, (−11 − √73)/8, 1, 1 ]^T ),
( 0, [ 0, 0, 1, −1 ]^T ).
Moreover, we know that A_wsn is semi-convergent. To apply the convergence results in Corollary 2.15, we numerically compute its left dominant eigenvector, normalized to have unit sum, to be w = [1/6, 1/3, 1/4, 1/4]^T, so
that we have:
lim_{k→∞} A_wsn^k = 1_4 w^T =
  [ 1/6  1/3  1/4  1/4 ]
  [ 1/6  1/3  1/4  1/4 ]
  [ 1/6  1/3  1/4  1/4 ]
  [ 1/6  1/3  1/4  1/4 ].
Therefore, each solution to the averaging system x(k + 1) = A_wsn x(k) converges to a consensus vector
(w^T x(0)) 1_4, that is, the value at each node of the wireless sensor network converges to w^T x(0) = (1/6)x_1(0) +
(1/3)x_2(0) + (1/4)x_3(0) + (1/4)x_4(0). Note that A_wsn is not doubly-stochastic, so the averaging
algorithm does not achieve average consensus; note also that node 2 has more influence than the other nodes.
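The computations in this example are easy to reproduce; here is a sketch in NumPy (variable names are ours):

```python
import numpy as np

A_wsn = np.array([[1/2, 1/2, 0,   0  ],
                  [1/4, 1/4, 1/4, 1/4],
                  [0,   1/3, 1/3, 1/3],
                  [0,   1/3, 1/3, 1/3]])

# Primitivity: the square of A_wsn is entrywise positive.
assert np.all(np.linalg.matrix_power(A_wsn, 2) > 0)

# Left dominant eigenvector (eigenvector of A^T for its largest eigenvalue),
# normalized to unit sum.
vals, vecs = np.linalg.eig(A_wsn.T)
w = np.real(vecs[:, np.argmax(np.real(vals))])
w = w / w.sum()

# High powers of A_wsn converge to the rank-one matrix 1_4 w^T.
limit = np.linalg.matrix_power(A_wsn, 100)
print(np.round(w, 4))  # approximately [1/6, 1/3, 1/4, 1/4]
```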
Note: If A is reducible, then clearly it is not primitive. Yet, it is possible for an averaging algorithm
described by a reducible matrix to converge to consensus. In other words, Corollary 2.15 provides only
a sufficient condition for consensus. Here is a simple example of an averaging algorithm described by a
reducible matrix that converges to consensus:
x1 (k + 1) = x1 (k),
x2 (k + 1) = x1 (k).
To fully understand which phenomena are possible and what properties of A are necessary and sufficient
for convergence to consensus, we will study graph theory in the next two chapters.
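A two-line simulation confirms that the reducible example above reaches consensus (a sketch, with our choice of initial values):

```python
# Reducible averaging example: x1 is a fixed leader and x2 copies it, i.e.
# the update matrix A = [[1, 0], [1, 0]] is row-stochastic but reducible.
x1, x2 = 3.0, -7.0
for _ in range(5):
    x1, x2 = x1, x1  # x1(k+1) = x1(k), x2(k+1) = x1(k)
print(x1, x2)  # both equal 3.0: consensus on x1(0) after one step
```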
Proof of Theorem 2.12. We start by establishing that a primitive matrix A satisfies ρ(A) > 0. By
contradiction, if spec(A) = {0}, then the Jordan normal form J of A is nilpotent, that is, there is k' ∈ ℕ
so that J^k = 0, and hence A^k = 0, for all k ≥ k'. But this is a contradiction because A being primitive implies that there
is k'' ∈ ℕ so that A^k > 0 for all k ≥ k''.
Next, we prove that ρ(A) is a real positive eigenvalue with a positive right eigenvector v > 0. We first
focus on the case that A is a positive matrix, and later show how to generalize the proof to the case of
primitive matrices. Without loss of generality, assume ρ(A) = 1. If (λ, x) is an eigenpair for A such that
|λ| = ρ(A) = 1, then
|x| = |λ| |x| = |λx| = |Ax| ≤ |A| |x| = A|x|.    (2.8)
Here, we use the notation |x| = (|x_i|)_{i∈{1,...,n}}, |A| = (|a_ij|)_{i,j∈{1,...,n}}, and vector inequalities are
understood component-wise. In what follows, we show |x| = A|x|. With the shorthands z = A|x| and
y = z − |x|, equation (2.8) reads y ≥ 0, and we aim to show y = 0. By contradiction, assume y has a
non-zero component. Then, since A is positive, Ay > 0. Independently, we also know z = A|x| > 0. Thus, there must
exist ε > 0 such that Ay > εz. Eliminating the variable y in the latter inequality, we obtain A_ε z > z, where
we define A_ε = A/(1 + ε). The inequality A_ε z > z implies A_ε^k z > z for all k > 0. Now, observe that
ρ(A_ε) = 1/(1 + ε) < 1, so that lim_{k→∞} A_ε^k = 0_{n×n} and therefore 0 ≥ z. Since we also knew z > 0, we now have a
contradiction. Therefore, we know y = 0.
So far, we have established that |x| = A|x|, so that (1, |x|) is an eigenpair for A. Also note that A > 0
and x ≠ 0 together imply A|x| > 0. Therefore, we have established that 1 is an eigenvalue of A with
eigenvector |x| > 0. Next, observe that the above reasoning is correct also for primitive matrices if one
replaces the first equality in (2.8) by |x| = |λ^k| |x| and carries the exponent k throughout the proof.
In summary, we have established that there exists a real eigenvalue λ > 0 such that |μ| ≤ λ for all
other eigenvalues μ, and that each right (and therefore also left) eigenvector of λ can be selected positive
up to rescaling. It remains to prove that λ is simple and is strictly greater than the magnitude of all other
eigenvalues. For the proof of these two points, we refer to (Meyer 2001, Chapter 8).
The first column of the above matrix equation is Av_1 = λv_1, that is, v_1 is the dominant right eigenvector
of A. By analogous arguments, we find that w_1 is the dominant left eigenvector of A. Next, we recall that
A^k = T [ λ^k, 0 ; 0, B^k ] T^{-1},
so that
lim_{k→∞} A^k/λ^k = T [ 1, 0 ; 0, lim_{k→∞} (B/λ)^k ] T^{-1} = T [ 1, 0 ; 0, 0 ] T^{-1}.
Here we used the fact that Theorem 2.12 implies ρ(B/λ) < 1, which in turn implies lim_{k→∞} B^k/λ^k =
0_{(n−1)×(n−1)} by Theorem 2.7. Moreover,
lim_{k→∞} A^k/λ^k = [ v_1 v_2 v_3 . . . v_n ] diag(1, 0, . . . , 0) [ w_1^T ; w_2^T ; w_3^T ; . . . ; w_n^T ] = v_1 w_1^T.
Finally, the (1, 1) entry of the matrix equality T^{-1}T = I_n gives precisely the normalization w_1^T v_1 = 1.
This concludes the proof of Proposition 2.14.
2.4 Exercises
E2.1 Simple properties of stochastic matrices. Let A_1, A_2, . . . , A_k be n × n matrices, let A_1 A_2 ⋯ A_k be their
product and let λ_1 A_1 + ⋯ + λ_k A_k be their convex combination with arbitrary convex combination coefficients.
Show that
(i) if A_1, A_2, . . . , A_k are nonnegative, then their product and all their convex combinations are nonnegative,
(ii) if A_1, A_2, . . . , A_k are row-stochastic, then their product and all their convex combinations are row-stochastic, and
(iii) if A_1, A_2, . . . , A_k are doubly-stochastic, then their product and all their convex combinations are
doubly-stochastic.
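These closure properties can be sanity-checked numerically; a sketch (the two matrices and the convex-combination coefficients are arbitrary choices of ours):

```python
import numpy as np

A1 = np.array([[0.2, 0.8], [0.6, 0.4]])   # row-stochastic
A2 = np.array([[1.0, 0.0], [0.3, 0.7]])   # row-stochastic

product = A1 @ A2
convex = 0.25 * A1 + 0.75 * A2            # coefficients are nonnegative, sum to one

row_sums_product = product.sum(axis=1)
row_sums_convex = convex.sum(axis=1)
print(row_sums_product, row_sums_convex)  # all rows still sum to one
```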
E2.2 Semi-convergence and Jordan block decomposition. Consider a matrix A ∈ ℂ^{n×n}, n ≥ 2, with ρ(A) =
1. Show that the following statements are equivalent:
(i) A is semi-convergent,
(ii) either A = I_n or there exists a nonsingular matrix T ∈ ℂ^{n×n} and a number m ∈ {1, . . . , n − 1} such
that
A = T [ I_m, 0_{m×(n−m)} ; 0_{(n−m)×m}, B ] T^{-1},
where B ∈ ℂ^{(n−m)×(n−m)} is convergent, that is, ρ(B) < 1.
(Note that, if A is real, then it is possible to find real T and B in statement (ii) by using the notion of real
Jordan normal form (Hogben 2013).)
E2.3 Row-stochastic matrices after pairwise-difference similarity transform. For n ≥ 2, let A ∈ ℝ^{n×n} be
row-stochastic. Define T ∈ ℝ^{n×n} by
T =
  [ 1   −1              ]
  [     ⋱    ⋱          ]
  [          1    −1    ]
  [ 1/n  1/n  . . .  1/n ],
that is, each of the first n − 1 rows of T is a pairwise difference e_i^T − e_{i+1}^T and the last row is 1_n^T/n.
then the matrix A is primitive. (Here the symbol ? denotes a strictly positive entry. The absence of a symbol
denotes a positive or zero entry.)
E2.8 Reducibility fallacies. Consider the following statement:
Any nonnegative square matrix A ∈ ℝ^{n×n} with a zero entry is reducible, because the zero entry
can be moved to position (n, 1) via a permutation.
Is the statement true? If yes, explain why; if not, provide a counterexample.
E2.9 Symmetric doubly-stochastic matrix. Let A ∈ ℝ^{n×n} be doubly-stochastic. Show that:
(i) the matrix A^T A is doubly-stochastic and symmetric,
(ii) spec(A^T A) ⊆ [0, 1],
(iii) the eigenvalue 1 of A^T A is not necessarily simple even if A is irreducible.
E2.10 On some nonnegative matrices. How many 2 × 2 matrices exist that are simultaneously doubly-stochastic,
irreducible and not primitive? Justify your claim.
E2.11 Discrete-time affine systems. Given A ∈ ℝ^{n×n} and b ∈ ℝ^n, consider the discrete-time affine system
x(k + 1) = Ax(k) + b.
(ii) for each α ∈ ℝ, there exists a unique equilibrium point x* satisfying 1_n^T x* = α, and
(iii) all solutions with initial condition x(0) satisfying 1_n^T x(0) = α converge to x*.
Hint: This statement, written in the style of (Meyer 2001, Section 7.10), is an extension of Theorem 2.7 and a
generalization of the classic geometric series 1/(1 − x) = ∑_{k=0}^∞ x^k, convergent for all |x| < 1. For the proof, the hint
is to use the Jordan normal form.
E2.14 Orthogonal and permutation matrices. A set G with a binary operation mapping two elements of G into
another element of G, denoted by (a, b) ↦ a ⋆ b, is a group if:
a ⋆ (b ⋆ c) = (a ⋆ b) ⋆ c for all a, b, c ∈ G (associativity property);
there exists e ∈ G such that a ⋆ e = e ⋆ a = a for all a ∈ G (existence of an identity element); and
for each a ∈ G there exists a^{-1} ∈ G such that a ⋆ a^{-1} = a^{-1} ⋆ a = e (existence of inverse elements).
Recall that: an orthogonal matrix R is a square matrix whose columns and rows are orthonormal vectors,
i.e., RR^T = I_n; an orthogonal matrix acts on a vector like a rotation and/or reflection; let O(n) denote the set
of orthogonal matrices. Similarly, recall that: a permutation matrix is a square binary (i.e., entries equal to 0
and 1) matrix with precisely one entry equal to 1 in every row and every column; a permutation matrix acts
on a vector by permuting its entries; let P_n denote the set of permutation matrices. Prove that
(i) the set of orthogonal matrices O(n) with the operation of matrix multiplication is a group;
(ii) the set of permutation matrices P_n with the operation of matrix multiplication is a group; and
(iii) each permutation matrix is orthogonal.
E2.15 On doubly-stochastic and permutation matrices. The following result is known as the Birkhoff–von
Neumann Theorem. For a matrix A ∈ ℝ^{n×n}, the following statements are equivalent:
(i) A is doubly-stochastic; and
(ii) A is a convex combination of permutation matrices.
Do the following:
show that the set of doubly-stochastic matrices is convex (i.e., given any two doubly-stochastic matrices
A_1 and A_2, any matrix of the form λA_1 + (1 − λ)A_2, for λ ∈ [0, 1], is again doubly-stochastic);
show that (ii) ⟹ (i);
find in the literature a proof of (i) ⟹ (ii) and sketch it in one or two paragraphs.
E2.16 The Jacobi relaxation in parallel computation. Consider n distributed processors that aim to collectively
solve the linear equation Ax = b, where b ∈ ℝ^n and A ∈ ℝ^{n×n} is invertible and its diagonal elements a_ii
are nonzero. Each processor stores a variable x_i(k) as the discrete-time variable k evolves and applies the
following iterative strategy termed Jacobi relaxation. At time step k ∈ ℕ each processor performs the local
computation
x_i(k + 1) = (1/a_ii) ( b_i − ∑_{j=1, j≠i}^n a_ij x_j(k) ),   i ∈ {1, . . . , n}.
Next, each processor i ∈ {1, . . . , n} sends its value x_i(k + 1) to all other processors j ∈ {1, . . . , n} with
a_ji ≠ 0, and they iteratively repeat the previous computation. The initial values of the processors are arbitrary.
(i) Assume the Jacobi relaxation converges, i.e., assume lim_{k→∞} x(k) = x*. Show that Ax* = b.
(ii) Give a necessary and sufficient condition for the Jacobi relaxation to converge.
(iii) Use Gergorin Disks Theorem 2.8 to P show that the Jacobi relaxation converges if A is strictly row
n
diagonally dominant, that is, if |aii | > j=1,j6=i |aij | for all i {1, . . . , n}.
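The iteration in this exercise can be sketched as follows (the test system is our own strictly row diagonally dominant example):

```python
import numpy as np

def jacobi(A, b, x0, iters=100):
    """Jacobi relaxation: every processor i updates x_i from b_i and the
    off-diagonal terms a_ij x_j(k), exactly as in the displayed formula."""
    x = np.array(x0, dtype=float)
    D = np.diag(A)                 # diagonal entries a_ii
    R = A - np.diagflat(D)         # off-diagonal part of A
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

# Strictly row diagonally dominant, hence convergent by the Gersgorin argument.
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [0.0, 1.0, 3.0]])
b = np.array([6.0, 8.0, 4.0])

x = jacobi(A, b, x0=np.zeros(3))
print(np.round(x, 6))  # agrees with the exact solution of A x = b
```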
E2.17 The Jacobi over-relaxation in parallel computation. We now consider a more sophisticated version of the
Jacobi relaxation presented in Exercise E2.16. Consider again n distributed processors that aim to collectively
solve the linear equation Ax = b, where b ∈ ℝ^n and A ∈ ℝ^{n×n} is invertible and its diagonal elements a_ii
are nonzero. Each processor stores a variable x_i(k) as the discrete-time variable k evolves and applies the
following iterative strategy termed Jacobi over-relaxation. At time step k ∈ ℕ each processor performs the
local computation
x_i(k + 1) = (1 − ω)x_i(k) + (ω/a_ii) ( b_i − ∑_{j=1, j≠i}^n a_ij x_j(k) ),   i ∈ {1, . . . , n},
where ω ∈ ℝ is an adjustable parameter. Next, each processor i ∈ {1, . . . , n} sends its value x_i(k + 1) to all
other processors j ≠ i with a_ji ≠ 0, and they iteratively repeat the previous computation. The initial values
of the processors are arbitrary.
(i) Assume the Jacobi over-relaxation converges to x* and show that Ax* = b if ω ≠ 0.
(ii) Find the expression governing the dynamics of the error variable e(k) := x(k) − x*.
(iii) Suppose that A is strictly row diagonally dominant, that is, |a_ii| > ∑_{j≠i} |a_ij|. Use the Geršgorin Disks
Theorem 2.8 to discuss the convergence properties of the algorithm for all possible values of ω ∈ ℝ.
Hint: Consider different thresholds for ω.
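A sketch probing the role of ω numerically (the test matrix is ours; we assume the error recursion e(k+1) = ((1 − ω)I_n − ωD^{-1}R) e(k), which follows by subtracting x* from both sides of the update, with D the diagonal and R the off-diagonal part of A):

```python
import numpy as np

def jor_spectral_radius(A, omega):
    """Spectral radius of the JOR iteration matrix (1-omega) I - omega D^{-1} R;
    the error decays to zero iff this radius is strictly less than one."""
    n = A.shape[0]
    D_inv = np.diag(1.0 / np.diag(A))
    R = A - np.diagflat(np.diag(A))
    M = (1 - omega) * np.eye(n) - omega * (D_inv @ R)
    return max(abs(np.linalg.eigvals(M)))

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [0.0, 1.0, 3.0]])  # strictly row diagonally dominant

# omega = 0 freezes the state, moderate omega converges, large omega diverges.
radii = {omega: jor_spectral_radius(A, omega) for omega in (0.0, 0.5, 1.0, 3.0)}
print(radii)
```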
E2.18 Robotic coordination and geometric optimization on the real line. Consider n ≥ 3 robots with dynamics ṗ_i = u_i, where i ∈ {1, . . . , n} is an index labeling each robot, p_i ∈ ℝ is the position of robot i, and u_i ∈ ℝ
is a steering control input. For simplicity, assume that the robots are indexed according to their initial positions:
p_1(0) ≤ p_2(0) ≤ p_3(0) ≤ ⋯ ≤ p_n(0). We consider the following distributed control laws to achieve some
geometric configuration:
(i) Move towards the centroid of your neighbors: The robots i ∈ {2, . . . , n − 1} (each having two neighbors)
move to the centroid of the local subset {p_{i−1}, p_i, p_{i+1}}:
ṗ_i = (1/3)(p_{i−1} + p_i + p_{i+1}) − p_i,   i ∈ {2, . . . , n − 1}.
The robots {1, n} (each having one neighbor) move to the centroid of the local subsets {p_1, p_2} and
{p_{n−1}, p_n}, respectively:
ṗ_1 = (1/2)(p_1 + p_2) − p_1   and   ṗ_n = (1/2)(p_{n−1} + p_n) − p_n.
By using these coordination laws, the robots asymptotically rendezvous.
(ii) Move towards the centroid of your neighbors or walls: Consider two walls at the positions p_0 ≤ p_1
and p_{n+1} ≥ p_n so that all robots are contained between the walls. The walls are stationary, that is,
ṗ_0 = 0 and ṗ_{n+1} = 0. Again, the robots i ∈ {2, . . . , n − 1} (each having two neighbors) move to
the centroid of the local subset {p_{i−1}, p_i, p_{i+1}}. The robots {1, n} (each having one robotic neighbor
and one neighboring wall) move to the centroid of the local subsets {p_0, p_1, p_2} and {p_{n−1}, p_n, p_{n+1}},
respectively. Hence, the closed-loop robot dynamics are
ṗ_i = (1/3)(p_{i−1} + p_i + p_{i+1}) − p_i,   i ∈ {1, . . . , n}.
By using these coordination laws, the robots become uniformly spaced on the interval [p_0, p_{n+1}].
(iii) Move away from the centroid of your neighbors or walls: Again consider two stationary walls at p_0 ≤ p_1 and
p_{n+1} ≥ p_n containing the positions of all robots. We partition the interval [p_0, p_{n+1}] into areas of interest,
where each robot is assigned a territory that is closer to itself than to the other robots. Hence, robot i ∈
{2, . . . , n − 1} (having two neighbors) obtains the partition V_i = [(p_i + p_{i−1})/2, (p_{i+1} + p_i)/2], robot 1
obtains the partition V_1 = [p_0, (p_1 + p_2)/2], and robot n obtains the partition V_n = [(p_{n−1} + p_n)/2, p_{n+1}].
We want to design a distributed algorithm such that the robots have equally sized partitions. We consider
a simple coordination law, where each robot i heads for the midpoint c_i(V_i(p)) of its partition V_i:
ṗ_i = c_i(V_i(p)) − p_i.
By using these coordination laws, the robots' partitions asymptotically become equally large.
Consider n = 3 robots, take your favorite problem from above, and show that both the continuous-time and
discrete-time dynamics asymptotically lead to the desired geometric configurations.
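For a quick check of law (i) with n = 3, one can iterate the discrete-time version of the dynamics, in which each robot jumps exactly to the centroid; a sketch (the row-stochastic encoding of the law and the initial positions are our choices):

```python
import numpy as np

# Law (i) for n = 3, discrete-time version: the middle robot averages itself
# and both neighbors, the end robots average themselves and their one neighbor.
L = np.array([[1/2, 1/2, 0],
              [1/3, 1/3, 1/3],
              [0,   1/2, 1/2]])   # row-stochastic "move to centroid" map

p = np.array([0.0, 1.0, 5.0])     # initial positions, indexed in increasing order
for _ in range(200):
    p = L @ p

print(np.round(p, 6))  # all three entries agree: rendezvous
```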
E2.19 Continuous-time cyclic pursuit. Consider four mobile robotic vehicles, indexed by i ∈ {1, 2, 3, 4}. We
model each robot as a fully-actuated kinematic point mass, that is, we write ṗ_i = u_i, where p_i ∈ ℂ is the
position of robot i in the plane and u_i ∈ ℂ is its velocity command. The robots are equipped with onboard
cameras as sensors. The task of the robots is to rendezvous at a common point (while using only onboard sensors).
A simple strategy to achieve rendezvous is cyclic pursuit: each robot i picks another robot, say i + 1 (indices
modulo 4), and pursues it. This gives rise to the control u_i = p_{i+1} − p_i and the closed-loop system
[ ṗ_1 ; ṗ_2 ; ṗ_3 ; ṗ_4 ] =
  [ −1   1   0   0 ]
  [  0  −1   1   0 ]
  [  0   0  −1   1 ]
  [  1   0   0  −1 ]  [ p_1 ; p_2 ; p_3 ; p_4 ].
(iii) Prove that if the robots are initially arranged in a square formation, they remain in a square formation
under cyclic pursuit.
Hint: Recall that for a matrix A with semisimple eigenvalues, the solution to the equation ẋ = Ax is given by
the modal expansion x(t) = ∑_i e^{λ_i t} v_i w_i^T x(0), where λ_i is an eigenvalue and v_i and w_i are the associated right
and left eigenvectors, pairwise normalized to w_i^T v_i = 1.
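A sketch simulating cyclic pursuit by forward-Euler integration (step size and initial square formation are our choices):

```python
import numpy as np

# Cyclic-pursuit closed loop: dp_i/dt = p_{i+1} - p_i (indices mod 4).
A = np.array([[-1, 1, 0, 0],
              [0, -1, 1, 0],
              [0, 0, -1, 1],
              [1, 0, 0, -1]], dtype=float)

p = np.array([1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j])  # initial square formation
centroid0 = p.mean()                              # columns of A sum to zero,
                                                  # so the centroid is invariant
dt = 0.01
for _ in range(5000):                             # integrate up to t = 50
    p = p + dt * (A @ p)

print(np.round(p, 4))  # all four robots have (numerically) reached the centroid
```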
E2.20 Simulation (cont'd). This is a follow-up to Exercise E1.1. Consider the linear averaging algorithm in equation (1.1): set n = 5, select the initial state equal to (1, −1, 1, −1, 1), and use (a) the complete graph, (b) a ring
graph, and (c) a star graph with node 1 as center.
(i) To which value do all nodes converge?
(ii) Compute the dominant left eigenvector of the averaging matrix associated to each of the three graphs
and verify that the result in Corollary 2.15(iii) is correct.
E2.21 Continuous- and discrete-time control of mobile robots. Consider n robots moving on the line
with positions z_1, z_2, . . . , z_n ∈ ℝ. In order to gather at a common location (i.e., reach rendezvous), each robot
heads for the centroid of its neighbors, that is,
ż_i = (1/(n − 1)) ∑_{j=1, j≠i}^n z_j − z_i.
(ii) Consider the Euler discretization of the above closed-loop dynamics with sampling period T > 0:
z_i(k + 1) = z_i(k) + T ( (1/(n − 1)) ∑_{j=1, j≠i}^n z_j(k) − z_i(k) ).
For which values of the sampling period T will the robots rendezvous?
Hint: Use the modal decomposition in Remark 2.3.
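The sampling-period threshold can be probed numerically for n = 5; a sketch (the closed-loop matrix M below is our reconstruction of the dynamics above, whose nonzero eigenvalue is −n/(n − 1)):

```python
import numpy as np

n = 5
# Centroid-seeking matrix: dz/dt = M z with M_ii = -1 and M_ij = 1/(n-1).
M = (np.ones((n, n)) - np.eye(n)) / (n - 1) - np.eye(n)

def converges(T, steps=2000):
    """Iterate the Euler map z(k+1) = (I + T M) z(k) and test for rendezvous."""
    z = np.arange(n, dtype=float)          # arbitrary initial positions
    E = np.eye(n) + T * M
    for _ in range(steps):
        z = E @ z
    return bool(np.all(np.isfinite(z)) and np.allclose(z, z.mean(), atol=1e-6))

# The disagreement mode has eigenvalue 1 - T n/(n-1), so one expects
# convergence iff 0 < T < 2(n-1)/n, which equals 1.6 for n = 5.
print(converges(0.5), converges(1.5), converges(1.7))
```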
Chapter 3
Elements of Graph Theory
In this chapter we review some basic concepts from graph theory, as presented in standard books, e.g.,
see (Diestel 2000; Bollobás 1998). Graph theory provides key concepts to model, analyze and design network
systems and distributed algorithms; the language of graphs pervades modern science and technology and
is therefore essential.
[Graphs] An undirected graph (in short, a graph) consists of a set V of elements called vertices and of a set
E of unordered pairs of vertices, called edges. For u, v ∈ V and u ≠ v, the set {u, v} denotes an unordered
edge. We define and visualize some basic example graphs in Figure 3.1.
Figure 3.1: Example graphs. First row: the ring graph with 6 nodes, a star graph with 7 nodes, a tree (see definition
below), the complete graph with 6 nodes (usually denoted by K(6)). Second row: the complete bipartite graph with
3 + 3 nodes (usually denoted by K(3, 3)), a grid graph, and the Petersen graph.
[Neighbors and degrees in graphs] Two vertices u and v of a given graph are neighbors if {u, v} is an
undirected edge. Given a graph G, we let N_G(v) denote the set of neighbors of v.
The degree of v is the number of neighbors of v. A graph is regular if all the nodes have the same degree;
e.g., in Figure 3.1, the ring graph is regular with degree 2 whereas the complete bipartite graph K(3, 3) and
the Petersen graph are regular with degree 3.
[Digraphs and self-loops] A directed graph (in short, a digraph) of order n is a pair G = (V, E), where V is a
set with n elements called vertices (or nodes) and E is a set of ordered pairs of vertices called edges. In other
words, E ⊆ V × V. As for graphs, V and E are the vertex set and edge set, respectively. For u, v ∈ V, the
ordered pair (u, v) denotes an edge from u to v. A digraph is undirected if (v, u) ∈ E anytime (u, v) ∈ E.
In a digraph, a self-loop is an edge from a node to itself. Consistently with a customary convention, self-loops
are not allowed in graphs. We define and visualize some basic example digraphs in Figure 3.2.
Figure 3.2: Example digraphs: the ring digraph with 6 nodes, the complete graph with 6 nodes, and a directed acyclic
graph, i.e., a digraph with no directed cycles.
[In- and out-neighbors] In a digraph G with an edge (u, v) ∈ E, u is called an in-neighbor of v, and v is
called an out-neighbor of u. We let N^in(v) (resp., N^out(v)) denote the set of in-neighbors (resp., the set of
out-neighbors) of v. Given a digraph G = (V, E), an in-neighbor of a nonempty set of nodes U is a node
v ∈ V \ U for which there exists an edge (v, u) ∈ E for some u ∈ U.
[In- and out-degree] The in-degree d^in(v) and out-degree d^out(v) of v are the number of in-neighbors and
out-neighbors of v, respectively. Note that a self-loop at a node v makes v both an in-neighbor as well as an
out-neighbor of itself. A digraph is topologically balanced if each vertex has the same in- and out-degrees
(even if distinct vertices have distinct degrees).
[Paths] A path in a graph is an ordered sequence of vertices such that any pair of consecutive vertices in
the sequence is an edge of the graph. A path is simple if no vertex appears more than once in it, except
possibly for the initial and final vertex.
[Connectivity and connected components] A graph is connected if there exists a path between any two vertices.
If a graph is not connected, then it is composed of multiple connected components, that is, multiple connected
subgraphs.
[Cycles] A cycle is a simple path that starts and ends at the same vertex and has at least three distinct
vertices. A graph is acyclic if it contains no cycles. A connected acyclic graph is a tree.
Figure 3.3: This graph has two connected components. The leftmost connected component is a tree, while the
rightmost connected component is a cycle.
[Directed paths] A directed path in a digraph is an ordered sequence of vertices such that any pair of
consecutive vertices in the sequence is a directed edge of the digraph. A directed path is simple if no vertex
appears more than once in it, except possibly for the initial and final vertex.
[Cycles in digraphs] A cycle in a digraph is a simple directed path that starts and ends at the same vertex. It
is customary to accept as feasible cycles in digraphs also cycles of length 1 (that is, a self-loop) and cycles
of length 2 (that is, composed of just 2 nodes). The set of cycles of a directed graph is finite. A digraph is
acyclic if it contains no cycles.
[Sources and sinks] In a digraph, every vertex with in-degree 0 is called a source, and every vertex with
out-degree 0 is called a sink. Every acyclic digraph has at least one source and at least one sink; see
Exercise E3.1.
Figure 3.4: Acyclic digraph with one sink and two sources.
Figure 3.5: Directed cycle.
[Directed trees] A directed tree (sometimes called a rooted tree) is an acyclic digraph with the following
property: there exists a vertex, called the root, such that any other vertex of the digraph can be reached
by one and only one directed path starting at the root. A directed spanning tree of a digraph is a spanning
subgraph that is a directed tree.
(i) G is strongly connected if there exists a directed path from any node to any other node;
(ii) G is weakly connected if the undirected version of the digraph is connected;
(iii) G possesses a globally reachable node if one of its nodes can be reached from any other node by
traversing a directed path; and
(iv) G possesses a directed spanning tree if one of its nodes is the root of directed paths to every other
node.
An example of a strongly connected digraph is shown in Figure 3.6, and a weakly connected digraph with a
globally reachable node is illustrated in Figure 3.7.
Figure 3.6: A strongly connected digraph Figure 3.7: A weakly connected digraph with a globally
reachable node, node #2.
For a digraph G = (V, E), the reverse digraph G(rev) has vertex set V and edge set E(rev) composed
of all edges in E with reversed direction. Clearly, a digraph contains a directed spanning tree if and only if
the reverse digraph contains a globally reachable node.
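This equivalence is easy to check computationally: a node v is globally reachable in G if and only if a search from v in the reverse digraph visits every node. A pure-Python sketch (the example digraph is ours):

```python
def is_globally_reachable(edges, v, nodes):
    """Node v is globally reachable iff every node has a directed path to v,
    i.e., iff every node is reached from v in the reverse digraph."""
    reverse = {u: [] for u in nodes}
    for (a, b) in edges:
        reverse[b].append(a)          # reversed edge b -> a
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in reverse[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == set(nodes)

# A small weakly connected digraph in which node 2 is globally reachable.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (3, 2), (4, 3)]
print(is_globally_reachable(edges, 2, nodes),
      is_globally_reachable(edges, 1, nodes))
```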
[Periodic and aperiodic digraphs] A strongly-connected directed graph is periodic if there exists a k > 1,
called the period, that divides the length of every cycle of the graph. In other words, a digraph is periodic if
the greatest common divisor of the lengths of all its cycles is larger than one. A digraph is aperiodic if it is
not periodic.
Note: the definition of a periodic digraph is well-posed because a digraph has only a finite number of
cycles (because of the assumption that nodes are not repeated in simple paths). The notions of periodicity
and aperiodicity only apply to digraphs and not to undirected graphs (where the notion of a cycle is defined
differently). Any strongly-connected digraph with a self-loop is aperiodic.
Figure 3.8: (a) A periodic digraph with period 2. (b) An aperiodic digraph with cycles of length 1 and 2. (c) An
aperiodic digraph with cycles of length 2 and 3.
[Condensation digraph] The condensation digraph of a digraph G, denoted by C(G), is defined as follows:
the nodes of C(G) are the strongly connected components of G, and there exists a directed edge in C(G)
from node H1 to node H2 if and only if there exists a directed edge in G from a node of H1 to a node of H2 .
Figure 3.9: An example digraph, its strongly connected components and its condensation.
Lemma 3.1 (Properties of the condensation digraph). For a digraph G and its condensation digraph
C(G),
(i) C(G) is acyclic,
(ii) G is weakly connected if and only if C(G) is weakly connected, and
(iii) the following statements are equivalent:
a) G contains a globally reachable node,
b) C(G) contains a globally reachable node, and
c) C(G) contains a unique sink.
Proof. We prove statement (i) by contradiction. If there exists a cycle (H_1, H_2, . . . , H_m, H_1) in C(G), then
the set of vertices H_1, . . . , H_m is strongly connected in C(G). But this implies that the subgraph
of G containing all nodes of H_1, . . . , H_m is strongly connected in G. This is a contradiction with the
fact that any subgraph of G strictly containing any of the H_1, . . . , H_m cannot be strongly connected.
Statement (ii) is intuitive and simple to prove; we leave this task to the reader.
Regarding statement (iii), we start by proving that (iii)a ⟹ (iii)b. Let v̄ be a globally reachable node
in G and let H be an arbitrary node of C(G). Let H̄ denote the node of C(G) containing v̄, and pick a
node v in H. Since v̄ is globally reachable, there exists a directed path from v to v̄ in G. This directed path
naturally induces a directed path in C(G) from H to H̄. This shows that H̄ is a globally reachable node in
C(G).
Regarding (iii)b ⟹ (iii)a, let H̄ be a globally reachable node of C(G) and pick a node v̄ in H̄. We
claim v̄ is globally reachable in G. Indeed, pick any node v in G, belonging to a strongly connected
component U of G. Because H̄ is globally reachable in C(G), there exists a directed path of the form
U = H_0, H_1, . . . , H_k, H_{k+1} = H̄ in C(G). One can now piece together a directed path in G from v to v̄,
by walking inside each of the strongly connected components H_i and moving to the subsequent strongly
connected component H_{i+1}, for i ∈ {0, . . . , k}.
The final equivalence between statement (iii)b and statement (iii)c is an immediate consequence of
C(G) being acyclic.
Figure: a weighted digraph with 5 nodes and edge weights
a_12 = 3.7, a_13 = 3.7, a_21 = 8.9, a_24 = 1.2, a_34 = 3.7, a_35 = 2.3, a_51 = 4.4, a_54 = 2.3, a_55 = 4.4.
The weighted digraph G is weight-balanced if d^out(v_i) = d^in(v_i) for all v_i ∈ V.

3.5 Appendix: Database collections and software libraries
Useful software libraries for network analysis and visualization are freely available online; here are
some examples:
(i) Gephi, available at https://gephi.org, is an interactive visualization and exploration platform for
all kinds of networks and complex systems, dynamic and hierarchical graphs. Datasets are available
at https://wiki.gephi.org/index.php?title=Datasets.
(ii) NetworkX, available at http://networkx.github.io, is a Python library for network analysis. For
example, one feature is the ability to compute condensation digraphs. A second interesting feature
is the ability to generate numerous well-known model graphs; see http://networkx.lanl.gov/reference/generators.html.
Figure 3.10: Example networks from distinct domains: Figure 3.10a shows the standard IEEE 118 power grid testbed (118
nodes); Figure 3.10b shows the Klavzar bibliography network (86 nodes); Figure 3.10c shows the GD99c Pajek network
(105 nodes). Network parameters are available at http://www.cise.ufl.edu/research/sparse/matrices, and
their layout is obtained via the graph drawing algorithm proposed by Hu (2005).
3.6 Exercises
E3.1 Acyclic digraphs. Let G be an acyclic digraph with n nodes. Show that:
(i) G contains at least one sink, i.e., a vertex without out-neighbors, and at least one source, i.e., a vertex
without in-neighbors;
(ii) the vertices of G can be given labels in the set {1, . . . , n} in such a way that if (u, v) is an edge, then
label(u) > label(v). This labeling is called a topological sort of G. Provide an algorithm to compute this
labeling; and
(iii) after topologically sorting its vertices, the adjacency matrix of the digraph is lower-triangular, i.e., all its
entries above the main diagonal are equal to zero.
E3.2 Condensation digraphs. Draw the condensation digraph for each of the following digraphs.
E3.3 Characterizations of trees. For a graph G with n nodes and m edges, show that the following statements
are equivalent:
(i) G is a tree;
(ii) G is connected and m = n − 1; and
(iii) G is acyclic and m = n − 1.
E3.5 Connectivity in topologically balanced digraphs. Prove the following statement: If a digraph G is
topologically balanced and contains either a globally reachable vertex or a directed spanning tree, then G is
strongly connected.
E3.6 Globally reachable nodes and disjoint closed subsets (Lin et al. 2005; Moreau 2005). Consider a digraph
G = (V, E) with at least two nodes. Prove that the following statements are equivalent:
[Figure E3.1: a fictitious railroad map of Switzerland with nine numbered stations: Basel, St. Gallen, Zürich, Bern, Interlaken, Lausanne, Chur, Zermatt, and Lugano.]
Chapter 4
The Adjacency Matrix
We review here basic concepts from algebraic graph theory. Standard books on algebraic graph theory
are (Biggs 1994; Godsil and Royle 2001). One objective is to relate matrix properties with graph theoretical
properties. A second objective is to understand when a row-stochastic matrix is primitive.
The binary adjacency matrix A ∈ {0, 1}^{n×n} of a digraph G = (V = {1, . . . , n}, E) or of a weighted
digraph is defined by

    a_ij = 1, if (i, j) ∈ E,    and    a_ij = 0, otherwise.        (4.1)
Finally, in a weighted digraph, the weighted out-degree matrix Dout and the weighted in-degree matrix
Din are the diagonal matrices defined by
    Dout = diag(A 1_n) = diag(dout(1), . . . , dout(n)),    and    Din = diag(Aᵀ 1_n),
(i) G is undirected if and only if A is symmetric and its diagonal entries are equal to 0;
(ii) G is weight-balanced if and only if A 1_n = Aᵀ 1_n, i.e., Dout = Din;
(iii) in a digraph G without self-loops, the node i is a sink in G if and only if the ith row-sum of A is zero;
(iv) in a digraph G without self-loops, the node i is a source in G if and only if the ith column-sum of A is
zero;
(v) A is row-stochastic if and only if each node of G has weighted out-degree equal to 1 (so that
Dout = In ); and
(vi) A is doubly-stochastic if and only if each node of G has weighted out-degree and weighted in-degree
equal to 1 (so that Dout = Din = In and, in particular, G is weight-balanced).
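These equivalences are easy to check numerically. The sketch below uses an illustrative 3-node weight matrix (the entries are placeholders, not from the text) to verify properties (ii) and (v) via the degree matrices:

```python
import numpy as np

# Illustrative 3-node weighted digraph (entries are placeholders).
A = np.array([[0.0, 0.5, 0.5],
              [0.3, 0.0, 0.7],
              [1.0, 0.0, 0.0]])

ones = np.ones(3)
D_out = np.diag(A @ ones)      # weighted out-degree matrix diag(A 1_n)
D_in = np.diag(A.T @ ones)     # weighted in-degree matrix diag(A^T 1_n)

# (v): A is row-stochastic iff every weighted out-degree equals 1.
is_row_stochastic = np.allclose(A @ ones, ones)
# (ii): G is weight-balanced iff D_out = D_in.
is_weight_balanced = np.allclose(np.diag(D_out), np.diag(D_in))
print(is_row_stochastic, is_weight_balanced)  # True False
```

Here every row sums to 1 while the column sums differ, so the matrix is row-stochastic but not weight-balanced (and hence not doubly-stochastic).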
Next we relate the powers of the adjacency matrix with the existence of directed paths in the digraph.
We start with some simple observations. First, pick two nodes i and j and note that there exists a directed
path from i to j of length 1 (i.e., an edge) if and only if (A)_ij > 0. Next, consider the formula for the matrix
power:

    (A²)_ij = (ith row of A) · (jth column of A) = Σ_{h=1}^{n} A_ih A_hj.

A directed path from i to j of length 2 exists if and only if there exists a node k such that (i, k) and (k, j)
are edges of G. In turn, (i, k) and (k, j) are edges if and only if A_ik > 0 and A_kj > 0 and therefore
(A²)_ij > 0. In short, a directed path from i to j of length 2 exists if and only if (A²)_ij > 0.
These observations lead to the following result, whose proof we leave as Exercise E4.1.
Lemma 4.1 (Directed paths and powers of the adjacency matrix). Let G be a weighted digraph with n
nodes, with weighted adjacency matrix A, with unweighted adjacency matrix A_{0,1} ∈ {0, 1}^{n×n}, and possibly
with self-loops. For all i, j ∈ {1, . . . , n} and k ∈ N:
(i) the (i, j) entry of A_{0,1}^k equals the number of directed paths of length k (including paths with self-loops)
from node i to node j; and
(ii) the (i, j) entry of A^k is positive if and only if there exists a directed path of length k (including paths
with self-loops) from node i to node j.
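Lemma 4.1(i) can be checked numerically; the directed 3-cycle below is an illustrative example, not from the text:

```python
import numpy as np

# Unweighted adjacency matrix of a directed 3-cycle: 1 -> 2 -> 3 -> 1.
A01 = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 0, 0]])

# By Lemma 4.1(i), (A01^k)[i, j] counts directed paths of length k from i to j.
A3 = np.linalg.matrix_power(A01, 3)
print(A3[0, 0])  # 1: the unique length-3 path from node 1 back to itself
```

Since every length-3 walk in a 3-cycle returns to its starting node, A01³ is the identity matrix.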
Theorem 4.2 (Connectivity properties of the digraph and positive powers of the adjacency ma-
trix). Let G be a weighted digraph with n ≥ 2 nodes and weighted adjacency matrix A. The following
statements are equivalent:
(i) A is irreducible, that is, Σ_{k=0}^{n−1} A^k > 0;
(ii) there exists no permutation matrix P such that Pᵀ A P is block triangular;
(iii) G is strongly connected;
(iv) for all partitions {I, J} of the index set {1, . . . , n}, there exist i ∈ I and j ∈ J such that (i, j) is an
edge in G.
Note: as the theorem establishes, there are four equivalent characterizations of irreducibility. In the
literature, it is common to define irreducibility through property (ii) or (iv). We next see two simple
examples.
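Characterization (i) also gives a direct numerical test; the two matrices below are illustrative, not from the text:

```python
import numpy as np

def is_irreducible(A):
    """Theorem 4.2(i): A is irreducible iff sum_{k=0}^{n-1} A^k > 0 entrywise."""
    n = A.shape[0]
    S = np.zeros((n, n))
    P = np.eye(n)
    for _ in range(n):       # accumulate A^0 + A^1 + ... + A^(n-1)
        S += P
        P = P @ A
    return bool((S > 0).all())

# A directed 3-cycle is strongly connected, hence irreducible.
C = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
# A 2-node digraph with no edge from node 2 back to node 1 is reducible.
R = np.array([[0, 1], [0, 1]], dtype=float)
print(is_irreducible(C), is_irreducible(R))  # True False
```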
Proof of Theorem 4.2. Regarding (iii) ⟹ (iv), pick a partition {I, J} of the index set {1, . . . , n} and two
nodes i0 ∈ I and j0 ∈ J. By assumption, there exists a directed path from i0 to j0. Hence there must exist
an edge from a node in I to a node in J.
Regarding (iv) ⟹ (iii), pick a node i ∈ {1, . . . , n} and let Ri ⊆ {1, . . . , n} be the set of nodes
reachable from i, i.e., the set of nodes that belong to directed paths originating from node i. Denote the
unreachable nodes by Ui = {1, . . . , n} \ Ri and, by contradiction, assume Ui is not empty. Then
{Ri, Ui} is a partition of the index set {1, . . . , n} and statement (iv) implies the existence of a non-zero entry
a_jh with j ∈ Ri and h ∈ Ui. But then the node h is reachable from i, a contradiction. Therefore, Ui = ∅, and all nodes are
reachable from i.
Regarding (iii) ⟹ (i), because G is strongly connected, there exists a directed path of length k ≥ 0
connecting node i to node j, for all i and j. By removing any cycle from such a path (so that no
intermediate node is repeated), one obtains a path from i to j of length k < n. Hence, by Lemma 4.1(ii),
the entry (A^k)_ij is strictly positive and, in turn, so is the corresponding entry of the matrix sum Σ_{k=0}^{n−1} A^k.
Regarding (i) ⟹ (iii), pick two nodes i and j. Because Σ_{k=0}^{n−1} A^k > 0, there must exist k such that
(A^k)_ij > 0. Lemma 4.1(ii) implies the existence of a path of length k from i to j. Hence, G is strongly
connected.
Regarding (ii) ⟹ (iv), by contradiction, assume there exists a partition {I, J} of {1, . . . , n} such that
a_ij = 0 for all (i, j) ∈ I × J. Let π: {1, . . . , n} → {1, . . . , n} be the permutation that maps all entries of
I into the first |I| entries of {1, . . . , n}. Here we let |I| denote the number of elements of I. Let P be the
corresponding permutation matrix. We now compute P A Pᵀ and block partition it as

    P A Pᵀ = [ A_II   A_IJ
               A_JI   A_JJ ],

where A_II ∈ R^{|I|×|I|}, A_IJ ∈ R^{|I|×|J|}, A_JI ∈ R^{|J|×|I|}, and A_JJ ∈ R^{|J|×|J|}. By construction, A_IJ =
0_{|I|×|J|} so that P A Pᵀ is block triangular, which is in contradiction with the assumed statement (ii).
Regarding (iv) ⟹ (ii), by contradiction, assume there exists a permutation matrix P and a number
r < n such that

    Pᵀ A P = [ B              C
               0_{(n−r)×r}    D ],

where the matrices B ∈ R^{r×r}, C ∈ R^{r×(n−r)}, and D ∈ R^{(n−r)×(n−r)} are arbitrary. The permutation
matrix P defines a unique permutation π: {1, . . . , n} → {1, . . . , n} with the property that the columns of
P are e_{π(1)}, . . . , e_{π(n)}. Let J = {π(1), . . . , π(r)} and I = {1, . . . , n} \ J. Then, by construction, for any
pair (i, j) ∈ I × J, we know a_ij = 0, which is in contradiction with the assumed statement (iv).
Next we present two results, whose proof are analogous to those of the previous theorem and left to
the reader as an exercise.
Lemma 4.3 (Global reachability and powers of the adjacency matrix). Let G be a weighted digraph
with n ≥ 2 nodes and weighted adjacency matrix A. For any j ∈ {1, . . . , n}, the following
statements are equivalent:
Next, we notice that if node j is reachable from node i via a path of length k and at least one node
along that path has a self-loop, then node j is reachable from node i via paths of length k, k + 1, k + 2, and
so on. This observation and the last lemma lead to the following corollary.
Corollary 4.4 (Connectivity properties of the digraph and positive powers of the adjacency ma-
trix: contd). Let G be a weighted digraph with n nodes, weighted adjacency matrix A and a self-loop at
each node. The following statements are equivalent:
For any j {1, . . . , n}, the following two statements are equivalent:
Remark 4.5 (Similarity transformations defined by permutation matrices). Note that Pᵀ A P is the
similarity transformation of A defined by P, because the permutation matrix P satisfies P⁻¹ = Pᵀ; see
Exercise E2.14. Moreover, note that Pᵀ A P is simply a reordering of rows and columns. For example, consider

    P = [ 0 0 1          with   Pᵀ = [ 0 1 0
          1 0 0                        0 0 1
          0 1 0 ]                      1 0 0 ].

Note P e2 = e3 as well as Pᵀ e2 = e1, and compute

    A = [ a11 a12 a13               Pᵀ A P = [ a22 a23 a21
          a21 a22 a23      ⟹                  a32 a33 a31
          a31 a32 a33 ]                        a12 a13 a11 ],
so that the entries of the 1st, 2nd and 3rd rows of A are mapped respectively to the 3rd, 1st and 2nd rows of
P > AP and, at the same time, the entries of the 1st, 2nd and 3rd columns of A are mapped respectively to
the 3rd, 1st and 2nd columns of P > AP .
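The reordering in Remark 4.5 can be checked numerically, using integer placeholders (entry a_ij encoded as 10·i + j) for the symbolic entries:

```python
import numpy as np

# Integer placeholders: the symbolic entry a_ij is encoded as 10*i + j.
A = np.array([[11, 12, 13],
              [21, 22, 23],
              [31, 32, 33]])
P = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])

# The similarity transformation by a permutation matrix jointly reorders
# the rows and columns of A, as stated in Remark 4.5.
B = P.T @ A @ P
print(B)
# [[22 23 21]
#  [32 33 31]
#  [12 13 11]]
```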
Proposition 4.6 (Strongly connected and aperiodic digraph and primitive adjacency matrix). Let
G be a weighted digraph with weighted adjacency matrix A. The following two statements are equivalent:
Before proving Proposition 4.6, we introduce a useful fact from number theory, whose proof we leave
as Exercise E4.11. First, we recall a useful notion: a set of integers is coprime if its elements share no
common positive factor except 1, that is, their greatest common divisor is 1. Loosely, the following lemma
states that coprime numbers generate, via linear combinations with nonnegative integer coefficients, all
numbers larger than a given threshold.
Lemma 4.7 (Frobenius number). Given a finite set A = {a1, a2, . . . , an} of positive integers, an integer
M is said to be representable by A if there exist nonnegative integers {α1, α2, . . . , αn} such that M =
α1 a1 + · · · + αn an. The following statements are equivalent:
(i) there exists a finite largest unrepresentable integer, called the Frobenius number of A, and
(ii) the greatest common divisor of A is 1.
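Lemma 4.7 can be explored with a brute-force search; the routine below is an illustrative sketch, not an efficient algorithm, and assumes the generators have greatest common divisor 1:

```python
import math
from functools import reduce

def frobenius_number(gens):
    """Largest integer not representable as a nonnegative integer combination
    of the generators; finite iff gcd(gens) == 1 (Lemma 4.7). Brute force."""
    assert reduce(math.gcd, gens) == 1
    g0 = min(gens)
    reachable = {0}
    m, run, last_gap = 0, 0, 0
    # Once g0 consecutive integers are representable, every larger integer is
    # too (add multiples of g0), so the last recorded gap is the answer.
    while run < g0:
        m += 1
        if any(m >= g and (m - g) in reachable for g in gens):
            reachable.add(m)
            run += 1
        else:
            last_gap, run = m, 0
    return last_gap

print(frobenius_number([3, 5]))       # 7
print(frobenius_number([6, 10, 15]))  # 29
```

For the pair {3, 5} the classical formula a·b − a − b = 7 confirms the search; note that {6, 10, 15} has gcd 1 even though no pair of its elements is coprime.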
Finally, we provide a proof for Proposition 4.6 taken from (Bullo et al. 2009).
Proof of Proposition 4.6. Regarding (i) ⟹ (ii), pick any ordered pair (i, j). We claim that there exists
a number k(i, j) with the property that, for all m > k(i, j), we have (A^m)_ij > 0, that is, there exists a
directed path from i to j of length m for all m > k(i, j). If this claim is correct, then the statement (ii) is
proved with k = max{k(i, j) | i, j ∈ {1, . . . , n}}. To show this claim, let {c1, . . . , cN} be the set of the
cycles of G and let {k1, . . . , kN} be their lengths. Because G is aperiodic, the lengths {k1, . . . , kN} are
coprime and Lemma 4.7 implies the existence of a number h(k1, . . . , kN) such that any number larger than
h(k1, . . . , kN) is a linear combination of k1, . . . , kN with nonnegative integer coefficients. Because G is
strongly connected, there exists a path of some length ℓ(i, j) that starts at i, contains a vertex of each
of the cycles c1, . . . , cN, and terminates at j. Now, we claim that k(i, j) = ℓ(i, j) + h(k1, . . . , kN) has the
desired property. Indeed, pick any number m > k(i, j) and write it as m = ℓ(i, j) + β1 k1 + · · · + βN kN for
appropriate numbers β1, . . . , βN ∈ N. A directed path from i to j of length m is constructed by attaching
to the path the following cycles: β1 times the cycle c1, β2 times the cycle c2, . . . , βN times the cycle cN.
Regarding (ii) ⟹ (i), from Lemma 4.1 we know that A^k > 0 means that there are paths of length
k from every node to every other node. Hence, the digraph G is strongly connected. Next, we prove
aperiodicity. Because G is strongly connected, each node of G has at least one outgoing edge, that is, for
all i, there exists at least one index j such that a_ij > 0. This fact implies that the matrix A^{k+1} = A A^k is
positive via the following simple calculation: (A^{k+1})_il = Σ_{h=1}^{n} a_ih (A^k)_hl ≥ a_ij (A^k)_jl > 0. In summary, if
A^k is positive for some k, then A^m is positive for all subsequent m > k (see also Exercise E2.5). Therefore,
there are closed paths in G of any sufficiently large length. This fact implies that G is aperiodic; indeed,
by contradiction, if the cycle lengths were not coprime, then G would not possess such closed paths of
arbitrary sufficiently large length.
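Proposition 4.6 suggests a simple primitivity test: search for an exponent k with A^k > 0. The sketch below bounds the search with Wielandt's classical bound (n − 1)² + 1 on the required exponent; the two example matrices are illustrative:

```python
import numpy as np

def is_primitive(M):
    """Search for k with M^k > 0 entrywise; by Wielandt's theorem it suffices
    to check exponents up to (n-1)^2 + 1 for an n x n matrix."""
    n = M.shape[0]
    B = (M > 0).astype(float)            # zero/positive pattern of M
    P = np.eye(n)
    for _ in range((n - 1) ** 2 + 1):
        P = np.minimum(P @ B, 1.0)       # pattern of the next matrix power
        if (P > 0).all():
            return True
    return False

# Strongly connected digraph with a self-loop at node 1: cycle lengths
# {1, 3} are coprime, so the digraph is aperiodic and the matrix primitive.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
# Pure 3-cycle: strongly connected but periodic (every cycle has length 3).
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
print(is_primitive(A), is_primitive(C))  # True False
```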
Theorem 4.8 (Bounds on the spectral radius of a nonnegative matrix). For a nonnegative n × n
matrix A with associated digraph G, the following statements hold:
Before providing the proof, we introduce a useful notion and establish a corollary.
Note that a row-substochastic matrix satisfies min(A1n ) < max(A1n ) and that any irreducible row-
substochastic matrix satisfies condition (iii)a because the associated digraph is strongly connected. These
two observations lead immediately to the following corollary.
Proof of Theorem 4.8. Regarding statement (i), the Perron–Frobenius Theorem 2.12 applied to the nonnega-
tive matrix A implies the existence of a vector x ≥ 0_n, x ≠ 0_n, such that

    A x = ρ(A) x    ⟹    ρ(A) x_i = Σ_{j=1}^{n} a_ij x_j.

Let ℓ ∈ argmax_{i∈{1,...,n}} {x_i} be the index (or one of the indices) satisfying x_ℓ = max{x1, . . . , xn} > 0
and compute

    ρ(A) = Σ_{j=1}^{n} a_ℓj (x_j / x_ℓ) ≤ Σ_{j=1}^{n} a_ℓj ≤ max(A 1_n).
Regarding statement (ii), note that 1_n is an eigenvector with eigenvalue max(A 1_n), so that we know
ρ(A) ≥ max(A 1_n). But we also know from statement (i) that ρ(A) ≤ max(A 1_n).
Next, we establish that the condition (iii)a implies the bound (iii)b. It suffices to focus on row-
substochastic matrices (if max(A 1_n) ≠ 1, we consider the row-substochastic matrix A / max(A 1_n)). We now
claim that:

(1) if e_iᵀ A 1_n < 1, then e_iᵀ A² 1_n < 1;
(2) if i has an out-neighbor j (that is, A_ij > 0) with e_jᵀ A 1_n < 1, then e_iᵀ A² 1_n < 1;
(3) there exists k such that A^k 1_n < 1_n; and
(4) ρ(A) < 1.
Regarding statement (1), for a node i satisfying e_iᵀ A 1_n < 1, we compute

    e_iᵀ A² 1_n = e_iᵀ A (A 1_n) ≤ e_iᵀ A 1_n < 1,

where we used the implication: if 0_n ≤ v ≤ 1_n and w ≥ 0_n, then wᵀ v ≤ wᵀ 1_n. Next, for a node i
satisfying A_ij > 0 with e_jᵀ A 1_n < 1, note that e_jᵀ A 1_n < 1 and A 1_n ≤ 1_n together imply

    A 1_n ≤ 1_n − (1 − e_jᵀ A 1_n) e_j,    where 1 − e_jᵀ A 1_n > 0.

Therefore, we compute

    e_iᵀ A² 1_n = (e_iᵀ A)(A 1_n) ≤ (e_iᵀ A)(1_n − (1 − e_jᵀ A 1_n) e_j)
                = e_iᵀ A 1_n − (1 − e_jᵀ A 1_n) e_iᵀ A e_j ≤ 1 − (1 − e_jᵀ A 1_n) A_ij < 1.
Statement (3) follows by iterating statements (1) and (2) along the directed paths guaranteed by condition (iii)a.
Regarding statement (4), let k be as in statement (3) and set γ = max(A^k 1_n) < 1. Given any natural
number m, we can write m = a k + b with a a nonnegative integer and b ∈ {0, . . . , k − 1}. Note that

    A^m 1_n ≤ A^{ak} 1_n ≤ γ^a 1_n.

The last inequality implies that, as m → ∞ and therefore a → ∞, the sequence A^m converges to 0. This
fact proves statement (4) and, in turn, that the condition (iii)a implies the bound (iii)b.
Finally, we sketch the proof that the bound (iii)b implies the condition (iii)a. By contradiction, if
condition (iii)a does not hold, then the condensation of G contains a sink whose corresponding row-sums
in A are all equal to max(A1n ). But to that sink corresponds an eigenvector of A whose eigenvalue is
therefore max(A 1_n). We refer to Theorem 5.3 for a brief review of the properties of reducible nonnegative
matrices and leave the details of the proof to the reader.
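As a numerical sanity check of the implication (iii)a ⟹ (iii)b, consider an illustrative irreducible row-substochastic matrix (the entries are placeholders, not from the text):

```python
import numpy as np

# Illustrative irreducible row-substochastic matrix: its digraph
# (1 -> 2, 2 -> 3, 3 -> 1, 3 -> 2) is strongly connected, all row-sums
# are at most 1, and the first row-sum is strictly below 1.
A = np.array([[0.0, 0.9, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])

rho = max(abs(np.linalg.eigvals(A)))
print(rho < 1.0)  # True
```

The spectral radius comes out strictly below 1, as Corollary 4.10 (referenced in the proof of Theorem 5.2 below) requires for irreducible row-substochastic matrices.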
4.6 Exercises
E4.1 Directed paths and powers of the adjacency matrix. Prove Lemma 4.1.
E4.2 Edges and triangles in an undirected graph. Let A be the binary adjacency matrix for an undirected graph
G without self-loops. Recall that the trace of A is trace(A) = Σ_{i=1}^{n} a_ii.
(i) Show trace(A) = 0.
(ii) Show trace(A2 ) = 2|E|, where |E| is the number of edges of G.
(iii) Show trace(A3 ) = 6|T |, where |T | is the number of triangles of G. (A triangle is a complete subgraph
with three vertices.)
(iv) Verify results (i)–(iii) on the matrix

    A = [ 0 1 1
          1 0 1
          1 1 0 ].
E4.3 A sufficient condition for primitivity. Assume the square matrix A is nonnegative and irreducible. Show
that
(i) if A has a positive diagonal element, then A is primitive,
(ii) if A is primitive, then it is false that A must have a positive diagonal element.
E4.4 Example row-stochastic matrices and associated digraphs. Consider the row-stochastic matrices

    A1 = (1/2) [ 0 0 1 1      A2 = (1/2) [ 1 0 1 0      A3 = (1/2) [ 1 0 1 0
                 1 0 1 0                   1 0 1 0                   1 1 0 0
                 0 1 0 1                   0 1 0 1                   0 0 1 1
                 1 1 0 0 ],                0 1 0 1 ],                0 1 0 1 ].
Draw the digraphs G1 , G2 and G3 associated with these three matrices. Using only the original definitions
and without relying on the characterizations in Theorem 4.2 and Proposition 4.6, show that:
(i) the matrices A1 , A2 and A3 are irreducible and primitive,
(ii) the digraphs G1 , G2 and G3 are strongly connected and aperiodic, and
(iii) the averaging algorithm defined by A2 converges in a finite number of steps.
E4.5 Primitive matrices are irreducible. Prove Lemma 2.11, that is, show that a primitive matrix is irreducible.
Hint: You are allowed to use Theorem 4.2.
E4.6 Yet another equivalent definition of irreducibility. Consider a nonnegative matrix A of dimension n.
From Theorem 4.2, we know that A is irreducible if and only if
(i) there does not exist a permutation matrix P ∈ {0, 1}^{n×n} and 1 ≤ r ≤ n − 1 such that

    Pᵀ A P = [ B_{r×r}          C_{r×(n−r)}
               0_{(n−r)×r}      D_{(n−r)×(n−r)} ].
E4.9 Eigenvalue shifting for stochastic matrices. Let A Rnn be an irreducible row-stochastic matrix. Let
E be a diagonal matrix with diagonal elements Eii {0, 1}, with at least one diagonal element equal to zero.
Show that AE is convergent.
E4.10 Normalization of nonnegative irreducible matrices. Consider a strongly connected weighted digraph
G with n nodes and with an irreducible adjacency matrix A Rnn . The matrix A is not necessarily
row-stochastic. Find a positive vector v ∈ Rⁿ so that the normalized matrix

    A_normalized = (1/ρ(A)) (diag(v))⁻¹ A diag(v)

is nonnegative, irreducible, and row-stochastic.
E4.11 The Frobenius number. Prove Lemma 4.7.
Hint: Read up on the Frobenius number in (Owens 2003).
E4.12 Leslie population model. The Leslie model is used in population ecology to model the changes in a
population of organisms over a period of time; see the original reference (Leslie 1945) and a comprehensive
text (Caswell 2006). In this model, the population is divided into n groups based on age classes; the indices i are
ordered increasingly with the age, so that i = 1 is the class of the newborns. The variable xi (k), i {1, . . . , n},
denotes the number of individuals in the age class i at time k; at every time step k the xi (k) individuals
produce a number i xi (k) of offsprings (i.e., individuals belonging to the first age class), where i 0
is a fecundity rate, and
progress to the next age class with a survival rate i [0, 1].
If x(k) denotes the vector of individuals at time k, the Leslie population model reads
1 2 . . . n1 n
1 0 . . . 0 0
.. ..
x(k + 1) = Ax(k) = 0 . . 0 x(k), (E4.1)
2
. . . . .
.. .. .. .. ..
0 0 . . . n1 0
where A is referred to as the Leslie matrix. Consider the following two independent sets of questions. First,
assume α_i > 0 for all i ∈ {1, . . . , n} and 0 < σ_i ≤ 1 for all i ∈ {1, . . . , n − 1}.
(i) Prove that the matrix A is primitive.
(ii) Let p_i(k) = x_i(k) / (Σ_{j=1}^{n} x_j(k)) denote the percentage of the total population in class i at time k. Call p(k) the
population distribution at time k. Compute lim_{k→+∞} p(k) as a function of the spectral radius ρ(A) and
the parameters (α_i, σ_i), i ∈ {1, . . . , n}.
(iii) Assume α_i = α > 0 and σ_i = σ for all i ∈ {1, . . . , n − 1}. What percentage of the total population belongs to
the eldest class asymptotically, that is, what is lim_{k→∞} p_n(k)?
(iv) Find a sufficient condition on the parameters (α_i, σ_i), i ∈ {1, . . . , n}, so that the population will
eventually become extinct.
Second, assume α_i ≥ 0 for i ∈ {1, . . . , n} and 0 ≤ σ_i ≤ 1 for all i ∈ {1, . . . , n − 1}.
(v) Find a necessary and sufficient condition on the parameters (α_i, σ_i), i ∈ {1, . . . , n}, so that the Leslie
matrix A is irreducible.
(vi) For an irreducible Leslie matrix (as in the previous point (v)), find a sufficient condition on the parameters
(α_i, σ_i), i ∈ {1, . . . , n}, that ensures that the population will not go extinct.
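The Leslie dynamics (E4.1) can be simulated directly; in the sketch below, the fecundity and survival numbers are illustrative placeholders, not part of the exercise:

```python
import numpy as np

# Illustrative 3-class Leslie matrix: fecundities alpha = (0, 1, 2) and
# survival rates sigma = (0.5, 0.4). These numbers are placeholders.
alpha = np.array([0.0, 1.0, 2.0])
sigma = np.array([0.5, 0.4])

A = np.zeros((3, 3))
A[0, :] = alpha            # first row: fecundity rates
A[1, 0] = sigma[0]         # subdiagonal: survival rates
A[2, 1] = sigma[1]

x = np.array([100.0, 0.0, 0.0])   # start with newborns only
for _ in range(50):
    x = A @ x

p = x / x.sum()                   # population distribution
print(p)  # approaches the normalized Perron eigenvector of A
```

Because the cycle lengths of this matrix's digraph (2 and 3) are coprime, the matrix is primitive and the population distribution converges to the Perron eigenvector even though the total population here slowly declines (the spectral radius is just below 1).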
E4.13 Swiss railroads: continued. From Exercise E3.7, consider the fictitious railroad map of Switzerland given in
Figure E3.1. Write the unweighted adjacency matrix A of this transportation network and, relying upon A
and its powers, answer the following questions:
(i) what is the number of links of the shortest path connecting St. Gallen to Zermatt?
(ii) is it possible to go from Bern to Chur using 4 links? And 5?
(iii) how many different routes, with strictly fewer than 9 links and possibly visiting the same station more
than once, start from Zürich and end in Lausanne?
Chapter 5
Discrete-time Averaging Systems
After our discussions about matrix and graph theory, we are finally ready to go back to the examples
introduced in Chapter 1. Namely, we recall from Chapter 1 the study of (i) opinion dynamics in social
influence networks (given an arbitrary stochastic matrix, what do its powers converge to?) and (ii) averaging
algorithms in wireless sensor networks (design an algorithm to compute the average of a collection of
numbers located at distinct nodes). Other related examples were given in the appendices of Chapter 1,
including the study of robotic networks in cyclic pursuit and balancing and of more general design problems
in wireless sensor networks.

[Figure 5.1: Interactions in a social influence network.]

This chapter discusses two topics. First, we present some analysis results and, specifically, some
convergence results for averaging algorithms defined by stochastic matrices; we discuss primitive matrices
and reducible matrices with a single or multiple sinks. Our
treatment is related to the discussion in (Jackson 2010, Chapter 8) and (DeMarzo et al. 2003, Appendix C and,
specifically, Theorem 10). Second, we show some design results and, specifically, how to design optimal
matrices; we discuss the equal-neighbor model and the MetropolisHastings model. The computation of
optimal averaging algorithms (doubly-stochastic matrices) is discussed in Boyd et al. (2004).
Corollary 5.1 (Consensus for row-stochastic matrices with strongly connected and aperiodic
graph). If a row-stochastic matrix A has an associated digraph that is strongly connected and aperiodic
(hence A is primitive), then
(i) lim_{k→∞} A^k = 1_n wᵀ, where w > 0 is the left eigenvector of A with eigenvalue 1 satisfying w1 + · · · +
wn = 1;
    lim_{k→∞} x(k) = ((1_nᵀ x(0)) / n) 1_n = average(x(0)) 1_n.
Figure 5.2: First panel: An example digraph with a set of globally reachable nodes. Second panel: its strongly
connected components (in red and blue). Third panel: its condensation digraph with a sink. For this digraph, the
subgraph induced by the globally reachable nodes is aperiodic.
We are now ready to establish the semiconvergence of adjacency matrices of digraphs with globally
reachable nodes. The following result amounts to an extension of the Perron–Frobenius Theorem 2.12 to a
class of reducible matrices.
Theorem 5.2 (Consensus for row-stochastic matrices with a globally-reachable aperiodic
strongly-connected component). Let A be a row-stochastic matrix and let G be its associated digraph. Assume
that G has a globally reachable node and that the subgraph induced by the set of globally reachable nodes is
aperiodic. Then
(i) the eigenvalue 1 of A is simple and strictly larger than the magnitude of all other eigenvalues, hence A is
semi-convergent;
(ii) lim_{k→∞} A^k = 1_n wᵀ, where w ≥ 0 is the left eigenvector of A with eigenvalue 1 satisfying w1 + · · · +
wn = 1;
(iii) the eigenvector w ≥ 0 has positive entries corresponding to each globally reachable node and has zero
entries for all other nodes;
(iv) the solution to x(k + 1) = A x(k) satisfies

    lim_{k→∞} x(k) = (wᵀ x(0)) 1_n.
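Theorem 5.2 is easy to observe in simulation. In the illustrative sketch below, nodes 1 and 2 form the globally reachable aperiodic component and node 3 only listens; all weights are placeholders, not from the text:

```python
import numpy as np

# Nodes 1 and 2 form the globally reachable, aperiodic component;
# node 3 only listens. The weights below are illustrative placeholders.
A = np.array([[0.6, 0.4, 0.0],
              [0.3, 0.7, 0.0],
              [0.2, 0.3, 0.5]])

x = np.array([1.0, 5.0, 100.0])   # node 3's initial value should not matter
for _ in range(200):
    x = A @ x
print(x)  # all three entries agree
```

Here the left eigenvector for eigenvalue 1 is w = (3/7, 4/7, 0), so every entry converges to (3/7)·1 + (4/7)·5 = 23/7 ≈ 3.286, independently of x3(0), exactly as statements (iii) and (iv) predict.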
Note that, for all nodes j which are not globally reachable, the initial values x_j(0) have no effect on the
final convergence value.
Note: as we discussed in Section 2.3, the limiting vector is a weighted average of the initial conditions.
The relative weights of the initial conditions are the convex combination coefficients w1 , . . . , wn . In a social
influence network, the coefficient w_i is regarded as the social influence of agent i. We illustrate this concept
by computing the social influence coefficients for Krackhardt's famous advice network (Krackhardt
1987); see Figure 5.3.
Note: adjacency matrices of digraphs with globally reachable nodes are sometimes called indecomposable;
see (Wolfowitz 1963).
Figure 5.3: Krackhardt's advice network with 21 nodes. The social influence of each node is illustrated by its gray
level.
Proof of Theorem 5.2. By assumption, the condensation digraph of A contains a sink that is globally reachable,
hence it is unique. Assuming 0 < n1 < n nodes are globally reachable, a permutation of rows and columns
(see Exercise E3.1) brings the matrix A into the form

    A = [ A11   0
          A21   A22 ],    (block lower-triangular),        (5.1)
where A11 ∈ R^{n1×n1}, A22 ∈ R^{n2×n2}, with n1 + n2 = n. The state vector x is correspondingly partitioned
into x1 ∈ R^{n1} and x2 ∈ R^{n2} so that
In other words, x1 and A11 are the variables and the matrix corresponding to the sink. Because the sink,
as a subgraph of G, is strongly connected and aperiodic, A11 is primitive and row-stochastic and, by
Corollary 5.1,
    lim_{k→∞} A11^k = 1_{n1} w1ᵀ,

where w1 > 0 is the left eigenvector with eigenvalue 1 for A11, normalized so that 1_{n1}ᵀ w1 = 1.
The matrix A22 is analyzed as follows. Recall from Corollary 4.10 that an irreducible row-substochastic
matrix has spectral radius less than 1. Now, because A21 cannot be zero (otherwise the sink would not
be globally reachable), the matrix A22 is row-substochastic. Moreover (after appropriately permuting
rows and columns of A22), it can be observed that A22 is a block lower-triangular matrix such that each diagonal
block is row-substochastic and irreducible (corresponding to each node in the condensation digraph).
Therefore, we know ρ(A22) < 1 and, in turn, I_{n2} − A22 is invertible. Because A11 is primitive and
ρ(A22) < 1, A is semiconvergent and lim_{k→∞} x2(k) exists. Taking the limit as k → ∞ in equation (5.3),
some straightforward algebra shows that
    lim_{k→∞} x2(k) = (I_{n2} − A22)⁻¹ A21 lim_{k→∞} x1(k) = (I_{n2} − A22)⁻¹ A21 (1_{n1} w1ᵀ) x1(0).
From the row-stochasticity of A, we know A21 1_{n1} + A22 1_{n2} = 1_{n2} and hence (I_{n2} − A22)⁻¹ A21 1_{n1} = 1_{n2}.
Collecting these results, we write
    lim_{k→∞} [ A11   0      ]^k  =  [ 1_{n1} w1ᵀ   0 ]  =  1_n [ w1ᵀ   0ᵀ ].
              [ A21   A22 ]          [ 1_{n2} w1ᵀ   0 ]
Theorem 5.3 (Convergence for row-stochastic matrices with multiple aperiodic sinks). Let A be
a row-stochastic matrix and let G be its associated digraph. Assume the condensation digraph C(G) contains
M ≥ 2 sinks and assume all of them are aperiodic. Then
(i) the semi-simple eigenvalue 1 of A has multiplicity equal to M and is strictly larger than the magnitude
of all other eigenvalues, hence A is semi-convergent;
(ii) there exist M left eigenvectors of A, denoted by w^m ∈ Rⁿ, for m ∈ {1, . . . , M}, with the properties
that: w^m ≥ 0, w^m_1 + · · · + w^m_n = 1, and w^m_i is positive if and only if node i belongs to the m-th sink;
(iii) the solution to x(k + 1) = A x(k) with initial condition x(0) satisfies

    lim_{k→∞} x_i(k) = (w^m)ᵀ x(0),                        if node i belongs to the m-th sink,
    lim_{k→∞} x_i(k) = (w^m)ᵀ x(0),                        if node i is connected with the m-th sink and no other sink,
    lim_{k→∞} x_i(k) = Σ_{m=1}^{M} z_{i,m} (w^m)ᵀ x(0),    if node i is connected to more than one sink,

where, for each node i connected to more than one sink, the coefficients z_{i,m}, m ∈ {1, . . . , M}, are convex
combination coefficients and z_{i,m} is strictly positive if and only if there exists a directed path from node i to
the sink m.
Proof. Rather than treating the general case with its heavy notation, we work out an example and refer the
reader to (DeMarzo et al. 2003, Theorem 10) for the general proof. Assume the condensation digraph of A
is composed of three nodes, two of which are sinks, as in the side figure.

[Figure: condensation digraph with two sinks x1 and x2 and a third node x3 pointing to both.]

Therefore, after a permutation of rows and columns (see Exercise E3.1), A can be written as

    A = [ A11   0     0
          0     A22   0
          A31   A32   A33 ]
and the state vector x is correspondingly partitioned into the vectors x1 , x2 and x3 . The state equations are:
By the properties of the condensation digraph and the assumption of aperiodicity of the sinks, the
digraphs associated to the row-stochastic matrices A11 and A22 are strongly connected and aperiodic.
Therefore, we immediately conclude that
    lim_{k→∞} x1(k) = (w1ᵀ x1(0)) 1_{n1}    and    lim_{k→∞} x2(k) = (w2ᵀ x2(0)) 1_{n2},

where w1 (resp. w2) is the left eigenvector of the eigenvalue 1 for matrix A11 (resp. A22) with the usual
normalization 1_{n1}ᵀ w1 = 1_{n2}ᵀ w2 = 1.
Regarding the matrix A33, the same discussion as in the previous proof leads to ρ(A33) < 1 and, in
turn, to the statement that I_{n3} − A33 is nonsingular. By taking the limit as k → ∞ in equation (5.6), some
straightforward algebra shows that

    lim_{k→∞} x3(k) = (I_{n3} − A33)⁻¹ ( A31 lim_{k→∞} x1(k) + A32 lim_{k→∞} x2(k) )
                    = (w1ᵀ x1(0)) (I_{n3} − A33)⁻¹ A31 1_{n1} + (w2ᵀ x2(0)) (I_{n3} − A33)⁻¹ A32 1_{n2}.
This concludes our proof of Theorem 5.3 for the simplified case C(G) having three nodes and two sinks.
Note that convergence does not occur to consensus (not all components of the state are equal) and that the final value of all nodes is independent of the initial values at nodes which are not in the sinks of the condensation digraph.
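As a numerical sanity check of Theorem 5.3, the following sketch simulates a hypothetical three-node example (the weights and initial condition are illustrative, not from the text): nodes 1 and 2 are singleton aperiodic sinks, and node 3 is connected to both.

```python
import numpy as np

# Hypothetical example: nodes 1 and 2 are singleton aperiodic sinks
# (self-loop of weight 1); node 3 is connected to both sinks.
A = np.array([[1.0,  0.0,  0.0],
              [0.0,  1.0,  0.0],
              [0.25, 0.25, 0.5]])
x0 = np.array([10.0, -2.0, 7.0])

x = x0.copy()
for _ in range(200):          # iterate x(k+1) = A x(k)
    x = A @ x

# convex-combination coefficients z = (I - A33)^{-1} [A31, A32]
z = np.linalg.solve(np.eye(1) - A[2:, 2:], A[2:, :2]).flatten()
```

The sink nodes keep their initial values, while node 3 converges to the convex combination $z_1 x_1(0) + z_2 x_2(0) = 4$, as the theorem predicts; its limit does not depend on $x_3(0)$.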
We conclude this section with a figure providing a summary of the asymptotic behavior of discrete-time
averaging systems and its relationships with properties of matrices and graphs; see Figure 5.4.
Figure 5.4: Corresponding properties for the discrete-time averaging dynamical system x(k+1) = Ax(k), the row-stochastic matrix A, and the associated weighted digraph. [The figure matches: primitive matrices with strongly connected and aperiodic digraphs, which converge to consensus depending on all nodes; matrices whose digraph has one aperiodic sink component, which converge to consensus that does not depend on all the nodes; and matrices whose digraph has multiple aperiodic sink components, which converge, but not to consensus.]
[Figure 5.5: the four-node undirected graph from Section 1.2 and the corresponding equal-neighbor weighted digraph, with edge weights 1/4, 1/3 and 1/2.]
From Section 1.2 let us consider an undirected graph as in Figure 5.5 and the following simplest
distributed algorithm, based on the concepts of linear averaging. Each node contains a value xi and
Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
5.4. Appendix: Design of graphs weights 63
repeatedly executes:
$$x_i^+ := \operatorname{average}\big(x_i, \{x_j, \text{ for all neighbor nodes } j\}\big). \tag{5.7}$$
Let us make a few simple observations. The algorithm (5.7) can be written in matrix format as:
$$x(k+1) = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 0 & 1/3 & 1/3 & 1/3 \\ 0 & 1/3 & 1/3 & 1/3 \end{bmatrix} x(k) =: A_{\text{wsn}} x(k).$$
The binary symmetric adjacency matrix and the degree matrix of the undirected graph are
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}, \qquad D = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix},$$
so that
$$A_{\text{wsn}} = (D + I_4)^{-1} (A + I_4), \qquad \text{or, entrywise,} \qquad x_i(k+1) = \frac{1}{1 + d(i)} \Big( x_i(k) + \sum_{j \in N(i)} x_j(k) \Big).$$
Recall that $A + I_4$ is the adjacency matrix of a graph that is equal to the graph in figure with the addition of a self-loop at each node; this new graph has degree matrix $D + I_4$.
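The construction above can be checked numerically; a minimal sketch (the initial condition is arbitrary):

```python
import numpy as np

# adjacency and degree information for the four-node graph in Figure 5.5
A01 = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)
d = A01.sum(axis=1)                                  # degrees (1, 3, 2, 2)
Awsn = np.linalg.inv(np.diag(d) + np.eye(4)) @ (A01 + np.eye(4))

x0 = np.array([1.0, 2.0, 3.0, 4.0])                  # arbitrary initial values
x = x0.copy()
for _ in range(100):
    x = Awsn @ x                                     # equal-neighbor averaging

# dominant left eigenvector w_i = (1 + d_i)/(2|E| + n); here 2|E| + n = 12
w = (1 + d) / (A01.sum() + 4)
```

The iteration converges to consensus at the value $w^\top x(0)$, with $w = [1/6, 1/3, 1/4, 1/4]^\top$ as computed in Exercise E5.3.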
Now, it is also quite easy to verify (see also Exercise E5.3) that
$$\lim_{k\to\infty} x(k) = \Big( \tfrac{1}{6} x_1(0) + \tfrac{1}{3} x_2(0) + \tfrac{1}{4} x_3(0) + \tfrac{1}{4} x_4(0) \Big) 1_4.$$
We summarize this discussion and state a more general result, in arbitrary dimensions and for arbitrary graphs.
Lemma 5.4 (The equal-neighbor row-stochastic matrix). Let G be a weighted digraph with n nodes, weighted adjacency matrix A and weighted out-degree matrix $D_{\text{out}}$. Define
$$A_{\text{equal-neighbor}} := (D_{\text{out}} + I_n)^{-1} (A + I_n).$$
Note that the weighted digraph associated to $(A + I_n)$ is G with the addition of a self-loop at each node with unit weight. Then
(i) $A_{\text{equal-neighbor}}$ is well-defined and row-stochastic,
(ii) $A_{\text{equal-neighbor}}$ is primitive if and only if G is strongly connected, and
(iii) $A_{\text{equal-neighbor}}$ is doubly-stochastic if $D_{\text{out}} = D_{\text{in}} = d I_n$ for some $d \in \mathbb{R}_{>0}$.
Proof. First, for any $v \in \mathbb{R}^n$ with non-zero entries, it is easy to see that $\operatorname{diag}(v)^{-1} v = 1_n$. Recalling the definition $D_{\text{out}} + I_n = \operatorname{diag}((A + I_n) 1_n)$, we compute
$$\big( (D_{\text{out}} + I_n)^{-1} (A + I_n) \big) 1_n = \operatorname{diag}\big((A + I_n) 1_n\big)^{-1} (A + I_n) 1_n = 1_n,$$
which proves statement (i). To prove statement (ii), note that, besides self-loops, G and the weighted digraph associated with $A_{\text{equal-neighbor}}$ have the same edges. Also note that the weighted digraph associated with $A_{\text{equal-neighbor}}$ is aperiodic by design. Finally, if $D_{\text{out}} = D_{\text{in}} = d I_n$ for some $d \in \mathbb{R}_{>0}$, then
$$\big( (D_{\text{out}} + I_n)^{-1} (A + I_n) \big)^\top 1_n = \frac{1}{d+1} (A + I_n)^\top 1_n = (D_{\text{in}} + I_n)^{-1} (A + I_n)^\top 1_n = \operatorname{diag}\big((A + I_n)^\top 1_n\big)^{-1} (A + I_n)^\top 1_n = 1_n.$$
This concludes the proof of statement (iii).
One can verify that the Metropolis-Hastings weights have the following properties:
(i) $(A_{\text{Metropolis-Hastings}})_{ij} > 0$ if $\{i,j\} \in E$, $(A_{\text{Metropolis-Hastings}})_{ii} > 0$ for all $i \in \{1, \dots, n\}$, and $(A_{\text{Metropolis-Hastings}})_{ij} = 0$ otherwise;
(ii) $A_{\text{Metropolis-Hastings}}$ is symmetric and doubly-stochastic; and
(iii) $A_{\text{Metropolis-Hastings}}$ is primitive if and only if G is connected.
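The construction of the Metropolis-Hastings weights is not restated in this excerpt; the sketch below assumes the standard formula from the literature, $a_{ij} = 1/(1 + \max\{d_i, d_j\})$ for $\{i,j\} \in E$, with the diagonal absorbing the slack, and checks properties (i)-(iii) on a small line graph:

```python
import numpy as np

def metropolis_hastings(A01):
    """Standard Metropolis-Hastings weights for an undirected unweighted graph:
    a_ij = 1/(1 + max(d_i, d_j)) on edges, diagonal chosen so rows sum to one.
    (Assumption: this formula is taken from the literature, not from the excerpt.)"""
    n = A01.shape[0]
    d = A01.sum(axis=1)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and A01[i, j] > 0:
                A[i, j] = 1.0 / (1.0 + max(d[i], d[j]))
    A += np.diag(1.0 - A.sum(axis=1))
    return A

# line graph 1 - 2 - 3
A01 = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
AMH = metropolis_hastings(A01)
```

For this connected graph, AMH is symmetric and doubly-stochastic with strictly positive diagonal, consistent with properties (i)-(iii).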
Before proceeding, recall two basic facts about a nonnegative matrix A and its associated digraph G:
(i) if G is strongly connected, then the spectral radius $\rho(A)$ is an eigenvalue of maximum magnitude and its corresponding left eigenvector can be selected to be strictly positive and with unit sum (see Theorem 2.12); and
(ii) if G contains a globally reachable node, then the spectral radius $\rho(A)$ is an eigenvalue of maximum magnitude and its corresponding left eigenvector is nonnegative and has positive entries corresponding to each globally reachable node (see Theorem 5.2).
Degree centrality For an arbitrary weighted digraph G, the degree centrality $c_{\text{degree}}(i)$ of node i is its in-degree:
$$c_{\text{degree}}(i) = d_{\text{in}}(i) = \sum_{j=1}^n a_{ji}, \tag{5.8}$$
that is, the number of in-neighbors (if G is unweighted) or the sum of the weights of the incoming edges.
Degree centrality is relevant, for example, in (typically unweighted) citation networks whereby articles are
ranked on the basis of their citation records. (Warning: the notion that a high citation count is an indicator
of quality is clearly a fallacy.)
Eigenvector centrality One problem with degree centrality is that each in-edge has unit count, even
if the in-neighbor has negligible importance. To remedy this potential drawback, one could define the
importance of a node to be proportional to the weighted sum of the importance of its in-neighbors
(see (Bonacich 1972b) for an early reference). This line of reasoning leads to the following definition.
For a weighted digraph G with globally reachable nodes (or for an undirected graph that is connected), define the eigenvector centrality vector, denoted by $c_{\text{ev}}$, to be the left dominant eigenvector of the adjacency matrix A associated with the dominant eigenvalue and normalized to satisfy $1_n^\top c_{\text{ev}} = 1$.
Note that the eigenvector centrality satisfies
$$A^\top c_{\text{ev}} = \lambda\, c_{\text{ev}} \qquad \Longleftrightarrow \qquad c_{\text{ev}}(i) = \frac{1}{\lambda} \sum_{j=1}^n a_{ji}\, c_{\text{ev}}(j), \tag{5.9}$$
where $\lambda = \rho(A)$ is the only possible choice of scalar coefficient in equation (5.9) ensuring that there exists a unique solution and that the solution, denoted $c_{\text{ev}}$, is strictly positive in a strongly connected digraph and nonnegative in a digraph with globally reachable nodes. Note that this connectivity property may be restrictive in some cases. We refer to Exercise E5.13 for a generalization of eigenvector centrality.
Figure 5.7: Comparing degree centrality versus eigenvector centrality: the node with maximum in-degree has zero eigenvector centrality in this graph.
Katz centrality For a weighted digraph G, pick an attenuation factor $\alpha < 1/\rho(A)$ and define the Katz centrality vector (see (Katz 1953)), denoted by $c_{\text{K}}$, by the following equivalent formulations:
$$c_{\text{K}}(i) = \alpha \sum_{j=1}^n a_{ji} \big( c_{\text{K}}(j) + 1 \big), \tag{5.10}$$
or
$$c_{\text{K}}(i) = \sum_{k=1}^{\infty} \sum_{j=1}^n \alpha^k (A^k)_{ji}. \tag{5.11}$$
In other words:
(i) the importance of a node is an attenuated sum of the importance and of the number of its in-neighbors (note indeed how equation (5.10) is a combination of equations (5.8) and (5.9)), and
(ii) the importance of a node i is $\alpha$ times the number of length-1 paths into i (i.e., the in-degree) plus $\alpha^2$ times the number of length-2 paths into i, etc. (From Lemma 4.1, recall that, for an unweighted digraph, $(A^k)_{ji}$ is equal to the number of directed paths of length k from j to i.)
Note how, for $\alpha < 1/\rho(A)$, equation (5.10) is well-posed and equivalent to
$$\begin{aligned} c_{\text{K}} &= \alpha A^\top (c_{\text{K}} + 1_n) \\ \Longleftrightarrow \quad c_{\text{K}} + 1_n &= \alpha A^\top (c_{\text{K}} + 1_n) + 1_n \\ \Longleftrightarrow \quad (I_n - \alpha A^\top)(c_{\text{K}} + 1_n) &= 1_n \\ \Longleftrightarrow \quad c_{\text{K}} &= (I_n - \alpha A^\top)^{-1} 1_n - 1_n \qquad (5.12) \\ \Longleftrightarrow \quad c_{\text{K}} &= \sum_{k=1}^{\infty} \alpha^k (A^\top)^k 1_n, \end{aligned}$$
where we used the identity $(I_n - A)^{-1} = \sum_{k=0}^{\infty} A^k$, valid for any matrix A with $\rho(A) < 1$; see Exercise E2.13.
There are two simple ways to compute the Katz centrality. According to equation (5.12), for limited-size problems, one can invert the matrix $(I_n - \alpha A^\top)$. Alternatively, one can show that the following iteration converges to the correct value: $c_{\text{K}}^+ := \alpha A^\top (c_{\text{K}} + 1_n)$.
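Both computations can be sketched as follows (the four-node digraph is a made-up example):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]
alpha = 1.0 / (2.0 * np.max(np.abs(np.linalg.eigvals(A))))  # alpha < 1/rho(A)

# direct computation via equation (5.12)
cK_direct = np.linalg.solve(np.eye(n) - alpha * A.T, np.ones(n)) - np.ones(n)

# fixed-point iteration c_K^+ := alpha A^T (c_K + 1_n)
cK = np.zeros(n)
for _ in range(300):
    cK = alpha * A.T @ (cK + np.ones(n))
```

The iteration is a contraction for $\alpha < 1/\rho(A)$ (here $\alpha\rho(A) = 1/2$), so both computations return the same vector.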
Figure 5.8: Image taken temporarily without permission from (Ishii and Tempo 2014). The pattern in the figure displays the so-called hyperlink matrix, i.e., the transpose of the adjacency matrix, for a collection of websites at Lincoln University in New Zealand from the year 2006. Blue points are nonzero entries of the adjacency matrix; red points are outgoing links toward dangling nodes. Each empty column corresponds to a webpage without any outgoing link, that is, to a so-called dangling node. This Web has 3756 nodes with 31,718 links. A fairly large portion of the nodes are dangling nodes: in this example, there are 3255 dangling nodes, which is over 85% of the total.
Pagerank centrality For a weighted digraph G with row-stochastic adjacency matrix (i.e., unit out-degree for each node), pick a convex combination coefficient $\alpha \in \mathopen]0,1\mathclose[$ and define the pagerank centrality vector, denoted by $c_{\text{pr}}$, as the unique positive solution to
$$c_{\text{pr}}(i) = \alpha \sum_{j=1}^n a_{ji}\, c_{\text{pr}}(j) + \frac{1-\alpha}{n}, \tag{5.13}$$
or, equivalently, to
$$c_{\text{pr}} = M c_{\text{pr}}, \quad 1_n^\top c_{\text{pr}} = 1, \qquad \text{where } M = \alpha A^\top + \frac{1-\alpha}{n}\, 1_n 1_n^\top. \tag{5.14}$$
(To establish the equivalence between these two definitions, the only non-trivial step is to notice that if $c_{\text{pr}}$ solves equation (5.13), then it must satisfy $1_n^\top c_{\text{pr}} = 1$.)
Note that, for arbitrary unweighted digraphs and binary adjacency matrices $A_{0,1}$, it is natural to compute the pagerank vector with $A = D_{\text{out}}^{-1} A_{0,1}$. We refer to (Brin and Page 1998; Page 2001; Ishii and Tempo 2014) for the important interpretation of the pagerank score as the stationary distribution of the so-called random surfer of a hyperlinked document network; it is under this disguise that the pagerank score was conceived by the Google co-founders, and a corresponding algorithm led to the establishment of the Google search engine. In the Google problem it is customary to set $\alpha = 0.85$.
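A minimal sketch of the pagerank computation on a made-up four-page digraph (every node has at least one outgoing link, so that A is row-stochastic):

```python
import numpy as np

# binary digraph in which every node has an outgoing edge
A01 = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 0]], dtype=float)
A = A01 / A01.sum(axis=1, keepdims=True)     # A = Dout^{-1} A01, row-stochastic

n, alpha = 4, 0.85
M = alpha * A.T + (1 - alpha) / n * np.ones((n, n))

c = np.ones(n) / n                           # start from the uniform vector
for _ in range(200):                         # power iteration: c <- M c
    c = M @ c
```

Since M is column-stochastic with strictly positive entries, the power iteration preserves $1_n^\top c = 1$ and converges to the unique positive fixed point $c_{\text{pr}}$.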
Closeness and betweenness centrality (based on shortest paths) Degree, eigenvector, Katz and
Pagerank centrality are presented using the adjacency matrix. Next we present two centrality measures
based on the notions of shortest path and geodesic distance; these two notions belong to the class of radial
and medial centrality measures (Borgatti and Everett 2006).
We start by introducing some additional graph theory. For a weighted digraph with n nodes, the length of a directed path is the sum of the weights of the edges in the directed path. For $i, j \in \{1, \dots, n\}$, a shortest path from a node i to a node j is a directed path of smallest length. Note: it is easy to construct examples with multiple shortest paths, so that the shortest path is not unique. The geodesic distance $d_{ij}$ from node i to node j is the length of a shortest path from node i to node j; we also stipulate that the geodesic distance $d_{ij}$ takes the value zero if $i = j$ and is infinite if there is no path from i to j. Note: in general $d_{ij} \neq d_{ji}$.
Finally, for $i, j, k \in \{1, \dots, n\}$, we let $g_{ikj}$ denote the number of shortest paths from a node i to a node j that pass through node k.
For a strongly-connected weighted digraph, the closeness of node $i \in \{1, \dots, n\}$ is the inverse of the sum of the geodesic distances $d_{ij}$ from node i to all other nodes $j \in \{1, \dots, n\}$, that is:
$$c_{\text{closeness}}(i) = \frac{1}{\sum_{j=1}^n d_{ij}}. \tag{5.15}$$
For a strongly-connected weighted digraph, the betweenness of node $i \in \{1, \dots, n\}$ is the fraction of all shortest paths $g_{kij}$ from any node k to any other node j passing through node i, that is:
$$c_{\text{betweenness}}(i) = \frac{\sum_{j,k=1}^n g_{kij}}{\sum_{h=1}^n \sum_{j,k=1}^n g_{khj}}. \tag{5.16}$$
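The geodesic distances underlying both measures are easy to compute; a sketch using the Floyd-Warshall algorithm on a small made-up strongly connected digraph, followed by the closeness formula (5.15):

```python
import numpy as np

def geodesic_distances(W):
    """All-pairs geodesic distances via Floyd-Warshall.
    W[i, j] is the edge weight from i to j (np.inf if no edge), W[i, i] = 0."""
    D = W.copy()
    for k in range(D.shape[0]):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

inf = np.inf
W = np.array([[0.0, 1.0, inf],
              [inf, 0.0, 2.0],
              [1.0, inf, 0.0]])
D = geodesic_distances(W)
closeness = 1.0 / D.sum(axis=1)     # equation (5.15)
```

In this example $d_{12} = 1$ while $d_{21} = 3$, illustrating that in general $d_{ij} \neq d_{ji}$.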
Summary To conclude this section, in Table 5.1, we summarize the various centrality definitions for a
weighted directed graph.
Table 5.1: Definitions of centrality measures for a weighted digraph G with adjacency matrix A
Figure 5.9 illustrates some centrality notions on a small instructive example due to Brandes (2006).
Note that a different node is the most central one in each metric; this variability is naturally expected and
highlights the need to select a centrality notion relevant to the specific application of interest.
Figure 5.9: Degree, eigenvector, closeness, and betweenness centrality for an undirected unweighted graph. The dark
node is the most central node in the respective metric; a different node is the most central one in each metric.
5.6 Exercises
E5.1 A sample DeGroot panel. A conversation between 5 panelists is modeled according to the DeGroot model
by an averaging algorithm x+ = Apanel x, where
$$A_{\text{panel}} = \begin{bmatrix} 0.15 & 0.15 & 0.1 & 0.2 & 0.4 \\ 0 & 0.55 & 0 & 0 & 0.45 \\ 0.3 & 0.05 & 0.05 & 0 & 0.6 \\ 0 & 0.4 & 0.1 & 0.5 & 0 \\ 0 & 0.3 & 0 & 0 & 0.7 \end{bmatrix}.$$
Assuming that the panel has sufficiently long deliberations, answer the following:
(i) Based on the associated digraph, do the panelists finally agree on a common decision?
(ii) In the event of agreement, does the initial opinion of any panelist get rejected? If so, which ones?
(iii) If the panelists' initial opinions are their self-appraisals (i.e., the self-weights $a_{11}, \dots, a_{55}$), what is the final opinion?
E5.2 Three DeGroot panels. Recall the DeGroot model introduced in Chapter 1. Denote by xi (0) the initial
opinion of each individual, and xi (k) its updated opinion after k communications with its neighbors. Then
the vector of opinions evolves over time according to $x(k+1) = A x(k)$, where the coefficient $a_{ij} \in [0,1]$ is the influence of the opinion of individual j on the update of the opinion of agent i, subject to the constraint $\sum_j a_{ij} = 1$. Consider the following three scenarios:
(i) Everybody gives the same weight to the opinion of everybody else.
(ii) There is a distinct agent (suppose the agent with index i = 1) that weights equally the opinion of all the others, and the remaining agents compute the mean between their opinion and the one of the first agent.
(iii) All the agents compute the mean between their opinion and the one of the first agent. Agent 1 does not
change her opinion.
In each case, derive the averaging matrix A, show that the opinions converge asymptotically to a final opinion
vector, and characterize this final opinion vector.
E5.3 Left dominant eigenvector for equal-neighbor row-stochastic matrices. Let $A_{01}$ be the binary (i.e., each entry is either 0 or 1) adjacency matrix for an unweighted undirected graph. Assume the graph is connected. Let $D = \operatorname{diag}(d_1, \dots, d_n)$ be the degree matrix, let $|E|$ be the number of edges of the graph, and define $A = D^{-1} A_{01}$. Show that
(i) the definition of A is well-posed and A is row-stochastic, and
(ii) the left dominant eigenvector of A associated to the eigenvalue 1 and normalized so that $1_n^\top w = 1$ is
$$w = \frac{1}{2|E|} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}.$$
Next, consider the equal-neighbor averaging algorithm in equation (5.7) with associated row-stochastic matrix $A_{\text{equal-neighbor}} = (D + I_n)^{-1} (A_{01} + I_n)$.
(iii) Show that
$$\lim_{k\to\infty} x(k) = \frac{1}{2|E| + n} \Big( \sum_{i=1}^n (1 + d_i)\, x_i(0) \Big) 1_n.$$
(iv) Verify that the left dominant eigenvector of the matrix $A_{\text{wsn}} = A_{\text{equal-neighbor}}$ defined in Section 1.2 is $[1/6, 1/3, 1/4, 1/4]^\top$, as seen in Example 2.5.
E5.4 A stubborn agent. Pick $\alpha \in \mathopen]0,1\mathclose[$, and consider the discrete-time consensus algorithm
$$x_1(k+1) = x_1(k), \qquad x_2(k+1) = \alpha x_1(k) + (1-\alpha) x_2(k).$$
Perform the following tasks:
(i) compute the matrix A representing this algorithm and verify it is row-stochastic,
(ii) compute the eigenvalues and eigenvectors of A,
(iii) draw the directed graph G representing this algorithm and discuss its connectivity properties,
(iv) compute the condensation digraph of G,
(v) compute the final value of this algorithm as a function of the initial values in two alternate ways: with and without invoking Theorem 5.2.
E5.5 Agents with self-confidence levels. Consider 2 agents, labeled +1 and $-1$, described by the self-confidence levels $s_{+1}$ and $s_{-1}$. Assume $s_{+1} \geq 0$, $s_{-1} \geq 0$, and $s_{+1} + s_{-1} = 1$. For $i \in \{+1, -1\}$, define
$$x_i^+ := s_i x_i + (1 - s_i) x_{-i}.$$
(Note: Friedkin and Johnsen (1999, 2011) make the additional assumption that $\lambda_i = 1 - w_{ii}$, for $i \in \{1, \dots, n\}$; this assumption is not needed here. This model is also referred to as the opinion dynamics model with stubborn agents. See (Ravazzi et al. 2015) for an extension of this model.)
E5.7 Necessary and sufficient conditions for consensus. Let A be a row-stochastic matrix. Prove that the
following statements are equivalent:
(i) the eigenvalue 1 is simple and all other eigenvalues have magnitude strictly smaller than 1,
(ii) $\lim_{k\to\infty} A^k = 1_n w^\top$, for some $w \in \mathbb{R}^n$, $w \geq 0$, and $1_n^\top w = 1$,
(iii) the digraph associated to A contains a globally reachable node and the subgraph of globally reachable nodes is aperiodic.
Hint: Use the Jordan normal form to show that (i) $\Longrightarrow$ (ii).
E5.8 Computing centrality. Write in your favorite programming language algorithms to compute degree, eigen-
vector, Katz and pagerank centralities. Compute these four centralities for the following undirected unweighted
graphs (without self-loops):
(i) the ring graph with 5 nodes;
(ii) the star graph with 5 nodes;
(iii) the line graph with 5 nodes; and
(iv) the Zachary karate club network dataset. This dataset can be downloaded for example from: http://konect.uni-koblenz.de/networks/ucidata-zachary
To compute the Katz centrality of a matrix A, select $\alpha = 1/(2\rho(A))$. For pagerank, use $\alpha = 1/2$.
Hint: Recall that pagerank centrality is well-defined for a row-stochastic matrix.
E5.9 Central nodes in example graph. For the unweighted undirected graph in Figure 5.9, verify (possibly with
the aid of a computational package) that the dark nodes have indeed the largest degree, eigenvector, closeness
and betweenness centrality as stated in the figure caption.
E5.10 Iterative computation of Katz centrality. Given a graph with adjacency matrix A, show that the solution to the iteration $x(k+1) := \alpha A^\top (x(k) + 1_n)$ with $\alpha < 1/\rho(A)$ converges to the Katz centrality vector $c_{\text{K}}$, for all initial conditions $x(0)$.
E5.11 Move away from your nearest neighbor and reducible averaging. Consider $n \geq 3$ robots with positions $p_i \in \mathbb{R}$, $i \in \{1, \dots, n\}$, dynamics $p_i(t+1) = u_i(t)$, where $u_i \in \mathbb{R}$ is a steering control input. For simplicity, assume that the robots are indexed according to their initial position: $p_1(0) \leq p_2(0) \leq p_3(0) \leq \dots \leq p_n(0)$. Consider two walls at the positions $p_0 \leq p_1(0)$ and $p_{n+1} \geq p_n(0)$ so that all robots are contained between the walls. The walls are stationary, that is, $p_0(t+1) = p_0(t) = p_0$ and $p_{n+1}(t+1) = p_{n+1}(t) = p_{n+1}$.
Consider the following coordination law: robots i {2, . . . , n 1} (each having two neighbors) move to
the centroid of the local subset {pi1 , pi , pi+1 }. The robots {1, n} (each having one robotic neighbor and one
neighboring wall) move to the centroid of the local subsets {p0 , p1 , p2 } and {pn1 , pn , pn+1 }, respectively.
Hence, the closed-loop robot dynamics are
$$p_i(t+1) = \frac{1}{3} \big( p_{i-1}(t) + p_i(t) + p_{i+1}(t) \big), \qquad i \in \{1, \dots, n\}.$$
Show that the robots become uniformly spaced on the interval [p0 , pn+1 ] using Theorem 5.3.
(Note: This exercise is a discrete-time version of E2.18(ii) based on averaging with multiple sinks.)
E5.12 The role of the out-degree in averaging systems. Let G be an undirected, connected graph without self-loops. Let each node represent an agent in a network, with the following system dynamics:
$$x(k+1) = A x(k), \qquad \text{where } A = D_{\text{out}}^{-1} A_{01}, \qquad x_i(0) \in [0,1],$$
where $D_{\text{out}}$ is the out-degree matrix and $A_{01}$ is the binary adjacency matrix:
$$(A_{01})_{ij} = \begin{cases} 1, & \text{if } \{i,j\} \in E, \\ 0, & \text{otherwise.} \end{cases}$$
(i) Under which conditions on the network will the system converge to a final value in $\operatorname{span}\{1_n\}$? What is this steady-state value?
(ii) Let $e(k) = x(k) - \lim_{k\to\infty} x(k)$ be the disagreement error at time instant k. Show that the error dynamics evolve as $e(k+1) = B e(k)$ and determine the matrix B.
(iii) Find a function $f(k, \lambda_i, d_{\text{out}}(i))$ depending on the time step k, the eigenvalues $\lambda_i$ of A, and the out-degrees of the nodes $d_{\text{out}}(i)$ such that
x(k + 1) = M x(k),
where y0 = x(0) is the stacked vector of initial hub and authority scores.
(v) Provide expressions for yeven (y0 ) and yodd (y0 ).
Chapter 6
The Laplacian Matrix
So far, we have studied adjacency matrices. In this chapter, we study a second relevant matrix associated
to a digraph, called the Laplacian matrix. More information on adjacency and Laplacian matrices can be
found in standard books on algebraic graph theory such as (Biggs 1994) and (Godsil and Royle 2001). Two
surveys about Laplacian matrices are (Mohar 1991; Merris 1994).
The Laplacian matrix of a weighted digraph G with adjacency matrix A and out-degree matrix $D_{\text{out}}$ is
$$L = D_{\text{out}} - A,$$
or, entrywise,
$$\ell_{ij} = \begin{cases} -a_{ij}, & \text{if } i \neq j, \\ \displaystyle\sum_{h=1, h\neq i}^n a_{ih}, & \text{if } i = j. \end{cases}$$
For an unweighted graph, the entries are
$$\ell_{ij} = \begin{cases} -1, & \text{if } \{i,j\} \text{ is an edge and not a self-loop}, \\ d(i), & \text{if } i = j, \\ 0, & \text{otherwise.} \end{cases}$$
Note:
(i) the sign pattern of L is important: diagonal elements are nonnegative (zero or positive) and off-diagonal elements are nonpositive (zero or negative);
(ii) the Laplacian matrix L of a digraph G does not depend upon the existence and values of self-loops in
G;
(iii) the graph G is undirected (i.e., symmetric adjacency matrix) if and only if L is symmetric. In this
case, Dout = Din = D and A = A> ;
(iv) in a directed graph, `ii = 0 (instead of `ii > 0) if and only if node i has zero out-degree;
(v) L is said to be irreducible if G is strongly connected.
We next define the same concept, but without starting from a digraph.
A Laplacian matrix L induces a weighted digraph G without self-loops in the natural way, that is, by letting $(i, j)$, for $i \neq j$, be an edge of G with weight $a_{ij} = -\ell_{ij}$ if and only if $-\ell_{ij} > 0$. With this definition, L is the Laplacian matrix of G.
We conclude this section with some useful equalities. Recall that
$$(Ax)_i = \sum_{j=1}^n a_{ij} x_j. \tag{6.1}$$
First, for $x \in \mathbb{R}^n$,
$$(Lx)_i = \sum_{j=1}^n \ell_{ij} x_j = \ell_{ii} x_i + \sum_{j=1, j\neq i}^n \ell_{ij} x_j = \sum_{j=1, j\neq i}^n a_{ij} x_i + \sum_{j=1, j\neq i}^n (-a_{ij}) x_j = \sum_{j=1, j\neq i}^n a_{ij} (x_i - x_j) = \sum_{j \in N^{\text{out}}(i)} a_{ij} (x_i - x_j), \tag{6.2}$$
which, for unit weights, equals
$$d_{\text{out}}(i) \Big( x_i - \operatorname{average}\big(\{x_j, \text{ for all out-neighbors } j\}\big) \Big).$$
Second, for an undirected graph with symmetric adjacency matrix,
$$x^\top L x = \frac{1}{2} \sum_{i,j=1}^n a_{ij} (x_i - x_j)^2 = \sum_{\{i,j\} \in E} a_{ij} (x_i - x_j)^2.$$
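The identity (6.2) is easy to verify numerically on a made-up weighted digraph:

```python
import numpy as np

A = np.array([[0, 2, 0, 1],
              [0, 0, 3, 0],
              [1, 0, 0, 0],
              [0, 1, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # L = Dout - A

x = np.array([3.0, -1.0, 4.0, 2.0])
# entrywise form of (6.2): (Lx)_i = sum_j a_ij (x_i - x_j)
Lx = np.array([sum(A[i, j] * (x[i] - x[j]) for j in range(4))
               for i in range(4)])
```

The matrix-vector product $Lx$ and the entrywise sum of differences coincide, and $L 1_4 = 0_4$ since each row of L sums to zero.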
These equalities are useful because it is common to encounter the array of differences $Lx$ and the quadratic error or disagreement function $x^\top L x$. They provide the correct intuition for the definition of the Laplacian matrix. In the following, we will refer to $x \mapsto x^\top L x$ as the Laplacian potential function; this name is justified based on the energy and power interpretation we present in the next two examples.
Let xi R denote the displacement of the ith rigid body. Assume that each spring is ideal linear-elastic
and let aij be the spring constant for the spring connecting the ith and jth bodies.
Define a graph as follows: the nodes are the rigid bodies $\{1, \dots, n\}$ with locations $x_1, \dots, x_n$, and the edges are the springs with weights $a_{ij}$. Each node i is subject to a force
$$F_i = \sum_{j \neq i} a_{ij} (x_j - x_i) = -(Lx)_i,$$
where L is the Laplacian for the network of springs (modeled as an undirected weighted graph). Moreover, recalling that the spring $\{i,j\}$ stores the quadratic energy $\frac{1}{2} a_{ij} (x_i - x_j)^2$, the total elastic energy is
$$E_{\text{elastic}} = \frac{1}{2} \sum_{\{i,j\} \in E} a_{ij} (x_i - x_j)^2 = \frac{1}{2} x^\top L x.$$
In this role, the Laplacian matrix is referred to as the stiffness matrix. Stiffness matrices can be defined
for spring networks in arbitrary dimensions (not only on the line) and with arbitrary topology (not only
a chain graph, or line graph, as in figure). More complex spring networks can be found, for example, in
finite-element discretization of flexible bodies and finite-difference discretization of diffusive media.
Suppose the graph is an electrical network with only pure resistors and ideal voltage sources: (i) each
graph vertex i {1, . . . , n} is possibly connected to an ideal voltage source, (ii) each edge is a resistor, say
with resistance rij between nodes i and j. (This is an undirected weighted graph.)
Ohm's law along each edge $\{i,j\}$ gives the current flowing from i to j as
$$c_{ij} = \frac{v_i - v_j}{r_{ij}} = a_{ij} (v_i - v_j),$$
where $a_{ij} = 1/r_{ij}$ is the inverse resistance, called conductance. We set $a_{ij} = 0$ whenever two nodes are not connected by a resistor. Kirchhoff's current law says that at each node i:
$$c_{\text{injected at } i} = \sum_{j=1, j\neq i}^n c_{ij} = \sum_{j=1, j\neq i}^n a_{ij} (v_i - v_j).$$
Hence, the vector of injected currents cinjected and the vector of voltages at the nodes v satisfy
cinjected = L v.
Moreover, the power dissipated on resistor $\{i,j\}$ is $c_{ij}(v_i - v_j)$, so that the total dissipated power is
$$P_{\text{dissipated}} = \sum_{\{i,j\} \in E} a_{ij} (v_i - v_j)^2 = v^\top L v.$$
Historical Note: Kirchhoff (1847) is a founder of graph theory in that he was an early adopter of graph
models to analyze electrical circuits.
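A small numeric illustration on a made-up resistive triangle (conductances and voltages are illustrative): Kirchhoff's law in vector form, and the edgewise and quadratic-form expressions for the dissipated power agree.

```python
import numpy as np

# conductance-weighted undirected triangle: a_ij = 1/r_ij
A = np.array([[0, 1, 2],
              [1, 0, 4],
              [2, 4, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

v = np.array([5.0, 3.0, 1.0])           # node voltages
c_injected = L @ v                      # Kirchhoff: c_injected = L v

P = v @ L @ v                           # total dissipated power v^T L v
P_edges = sum(A[i, j] * (v[i] - v[j]) ** 2
              for i in range(3) for j in range(i + 1, 3))
```

The injected currents necessarily sum to zero, since $1_n^\top L = 0_n^\top$ for an undirected (hence weight-balanced) graph.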
A first elementary property is that every Laplacian matrix has zero row-sums: $L 1_n = 0_n$.
Indeed, for each $i \in \{1, \dots, n\}$,
$$\sum_{j=1}^n \ell_{ij} = \ell_{ii} + \sum_{j=1, j\neq i}^n \ell_{ij} = \sum_{j=1, j\neq i}^n a_{ij} + \sum_{j=1, j\neq i}^n (-a_{ij}) = 0.$$
Equivalently, in vector format (remembering that the weighted out-degree matrix $D_{\text{out}}$ is diagonal and contains the row-sums of A):
$$L 1_n = D_{\text{out}} 1_n - A 1_n = \begin{bmatrix} d_{\text{out}}(1) \\ \vdots \\ d_{\text{out}}(n) \end{bmatrix} - \begin{bmatrix} d_{\text{out}}(1) \\ \vdots \\ d_{\text{out}}(n) \end{bmatrix} = 0_n.$$
Lemma 6.4 (Zero column-sums). Let G be a weighted digraph with Laplacian L and n nodes. The following statements are equivalent:
(i) G is weight-balanced, that is, $d_{\text{out}}(j) = d_{\text{in}}(j)$ for all $j \in \{1, \dots, n\}$, and
(ii) $1_n^\top L = 0_n^\top$.
Indeed, for each $j \in \{1, \dots, n\}$,
$$(1_n^\top L)_j = (L^\top 1_n)_j = \sum_{i=1}^n \ell_{ij} = \ell_{jj} + \sum_{i=1, i\neq j}^n \ell_{ij} = d_{\text{out}}(j) - d_{\text{in}}(j),$$
where we used
$$\ell_{jj} = d_{\text{out}}(j) - a_{jj} \qquad \text{and} \qquad \sum_{i=1, i\neq j}^n \ell_{ij} = -\big(d_{\text{in}}(j) - a_{jj}\big).$$
Lemma 6.5 (Spectrum of the Laplacian matrix). Given a weighted digraph G with Laplacian L, the
eigenvalues of L different from 0 have strictly-positive real part.
Proof. Recall that $\ell_{ii} = \sum_{j=1, j\neq i}^n a_{ij} \geq 0$ and $\ell_{ij} = -a_{ij} \leq 0$ for $i \neq j$. By the Geršgorin Disks Theorem 2.8, we know that each eigenvalue of L belongs to at least one of the disks
$$\Big\{ z \in \mathbb{C} \;\Big|\; |z - \ell_{ii}| \leq \sum_{j=1, j\neq i}^n |\ell_{ij}| = \ell_{ii} \Big\}.$$
These disks, with radius equal to the center, contain the origin and otherwise only complex numbers with positive real part.
For an undirected graph without self-loops and with symmetric adjacency matrix $A = A^\top$, we know that L is symmetric and positive semidefinite, i.e., all eigenvalues of L are real and nonnegative, and that $\ell_{ii} = d(i)$. In this case, by convention, we write these eigenvalues as
$$0 = \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n.$$
Note:
(i) the second smallest eigenvalue $\lambda_2$ is called the Fiedler eigenvalue or the algebraic connectivity (Fiedler 1973);
(ii) the theorem proof also implies $\lambda_n \leq 2 \max\{d(1), \dots, d(n)\}$; and
(iii) we refer the reader to Exercise E6.16 for a lower bound on $\lambda_n$ based on the maximum degree.
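These spectral facts can be observed numerically, e.g., on the unweighted line graph with 4 nodes (a made-up instance):

```python
import numpy as np

# unweighted line graph 1 - 2 - 3 - 4
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

evals = np.sort(np.linalg.eigvalsh(L))   # 0 = lambda_1 <= ... <= lambda_n
algebraic_connectivity = evals[1]        # Fiedler eigenvalue lambda_2
```

Since the graph is connected, $\lambda_2 > 0$; moreover $\lambda_n \leq 2 \max_i d(i) = 4$, consistent with the bound above.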
Proof. We start by simplifying the problem. Define a new weighted digraph $G'$ by modifying G as follows: at each node, add a self-loop with unit weight if no self-loop is present, or increase the weight of the self-loop by 1 if a self-loop is present. Also, define another weighted digraph $G''$ by modifying $G'$ as follows: for each node, divide the weights of its out-going edges by its out-degree, so that the out-degree of each node is 1. In other words, define $A' = A + I_n$ and note $L' = L$, and define $A'' = (D'_{\text{out}})^{-1} A'$ and $L'' = (D'_{\text{out}})^{-1} L' = I_n - A''$. Clearly, the rank of $L''$ is equal to the rank of L. Therefore, without loss of generality, we consider in what follows only digraphs with row-stochastic adjacency matrices.
Because the condensation digraph C(G) has d sinks, after a renumbering of the nodes, that is, a permutation of rows and columns (see Exercise E3.1), the adjacency matrix A can be written in block lower triangular form as
$$A = \begin{bmatrix} A_{11} & 0 & \cdots & \cdots & 0 & 0 \\ 0 & A_{22} & 0 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & \cdots & \cdots & 0 & A_{dd} & 0 \\ A_{1o} & A_{2o} & \cdots & \cdots & A_{do} & A_{\text{others}} \end{bmatrix} \in \mathbb{R}^{n \times n},$$
where the state vector x is correspondingly partitioned into the vectors x1 , . . . , xd and xothers of dimensions
n1 , . . . , nd and n (n1 + + nd ) respectively, corresponding to the d sinks and all other nodes.
Each sink of C(G) is a strongly connected and aperiodic digraph. Therefore, the square matrices $A_{11}, \dots, A_{dd}$ are nonnegative, irreducible, and primitive. By the Perron-Frobenius Theorem for primitive matrices 2.12, we know that the number 1 is a simple eigenvalue for each of them.
The square matrix $A_{\text{others}}$ is nonnegative and it can itself be written as a block lower triangular matrix, whose diagonal block matrices, say $(A_{\text{others}})_1, \dots, (A_{\text{others}})_N$, are nonnegative and irreducible. Moreover, each of these diagonal block matrices must be row-substochastic because (1) each row-sum for each of these matrices is at most 1, and (2) at least one of the row-sums of each of these matrices must be smaller than 1, otherwise that matrix would correspond to a sink of C(G). In summary, because the matrices $(A_{\text{others}})_1, \dots, (A_{\text{others}})_N$ are irreducible and row-substochastic, the matrix $A_{\text{others}}$ has spectral radius $\rho(A_{\text{others}}) < 1$.
We now write the Laplacian matrix $L = I_n - A$ with the same block lower triangular structure:
$$L = \begin{bmatrix} L_{11} & 0 & \cdots & \cdots & 0 & 0 \\ 0 & L_{22} & 0 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & \cdots & \cdots & 0 & L_{dd} & 0 \\ -A_{1o} & -A_{2o} & \cdots & \cdots & -A_{do} & L_{\text{others}} \end{bmatrix}, \tag{6.5}$$
where, for example, $L_{11} = I_{n_1} - A_{11}$. Because the number 1 is a simple eigenvalue of $A_{11}$, the number 0 is a simple eigenvalue of $L_{11}$. Therefore, $\operatorname{rank}(L_{11}) = n_1 - 1$. This same argument establishes that the rank of L is at most $n - d$, because each one of the matrices $L_{11}, \dots, L_{dd}$ is of rank $n_1 - 1, \dots, n_d - 1$, respectively. Finally, we note that the rank of $L_{\text{others}}$ is maximal, because $L_{\text{others}} = I - A_{\text{others}}$ and $\rho(A_{\text{others}}) < 1$ together imply that 0 is not an eigenvalue of $L_{\text{others}}$.
Consider a partition of the vertex set V into two subsets $V_1$ and $V_2$ satisfying
$$V_1 \cup V_2 = V, \qquad V_1 \cap V_2 = \emptyset, \qquad V_1, V_2 \neq \emptyset.$$
Of course, there are many such partitions. We measure the quality of a partition by the sum of the weights
of all edges that need to be cut to separate the vertices V1 and V2 into two disconnected components.
Formally, the size of the cut separating $V_1$ and $V_2$ is
$$J = \sum_{i \in V_1,\; j \in V_2} a_{ij}.$$
We are interested in finding the cut with minimal size that identifies the two groups of nodes that are most
loosely connected. The problem of minimizing the cut size J is combinatorial and computationally hard
since we need to consider all possible partitions of the vertex set V. We present here a tractable approach based on a so-called relaxation step. First, define a vector $x \in \{-1, +1\}^n$ with entries $x_i = +1$ for $i \in V_1$ and $x_i = -1$ for $i \in V_2$. Then the cut size J can be rewritten via the Laplacian potential as
$$J = \frac{1}{4} \sum_{i,j=1}^n a_{ij} (x_i - x_j)^2 = \frac{1}{2} x^\top L x.$$
(Here we exclude the cases $x \in \{-1_n, 1_n\}$ because they correspond to one of the two groups being empty.) Second, since this problem is still computationally hard, we relax the problem from binary decision variables $x_i \in \{-1, +1\}$ to continuous decision variables $y_i \in [-1, 1]$ (or $\|y\|_\infty \leq 1$), where we exclude $y \in \operatorname{span}(1_n)$ (corresponding to one of the two groups being empty). Then the minimization problem becomes
$$\min_{y \in \mathbb{R}^n,\; y \perp 1_n,\; \|y\|_\infty = 1} \; y^\top L y.$$
As a third and final step, we consider a 2-norm constraint ‖y‖_2 = 1 instead of the ∞-norm constraint ‖y‖_∞ = 1 (recall that ‖y‖_∞ ≤ ‖y‖_2 ≤ √n ‖y‖_∞) to obtain the following heuristic:

minimize_{y ∈ R^n, y ⊥ 1_n, ‖y‖_2 = 1}   y^⊤ L y.

Notice that y^⊤ L y ≥ λ_2 ‖y‖_2^2, and this inequality holds with equality whenever y = v_2, the normalized eigenvector associated with λ_2. Thus, the minimum of the relaxed optimization problem is λ_2 and a minimizer is y = v_2. We can then use the heuristic x = sign(v_2) to find the desired partition {V_1, V_2}.
Hence, the algebraic connectivity λ_2 is an estimate of the size of the minimum cut, and the signs of the entries of v_2 identify the associated partition of the graph. For these reasons, λ_2 and v_2 can be interpreted as the size and the location of a bottleneck in a graph.
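As a quick numerical sanity check on the Laplacian-potential identity above, the following Python sketch compares (1/4) Σ_{i,j} a_ij (x_i − x_j)² with (1/2) x^⊤ L x. The 4-node graph and the partition are made up purely for illustration and do not appear in the text.

```python
# Check the Laplacian-potential identity
#   (1/4) * sum_{i,j} a_ij (x_i - x_j)^2  ==  (1/2) * x^T L x
# on a small hypothetical 4-node graph.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
n = 4

A = [[0.0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

# Laplacian: L = diag(row sums of A) - A
L = [[(sum(A[i]) if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]

x = [1, 1, -1, -1]  # partition V1 = {0, 1}, V2 = {2, 3}

quarter_sum = 0.25 * sum(A[i][j] * (x[i] - x[j]) ** 2
                         for i in range(n) for j in range(n))
half_quad = 0.5 * sum(x[i] * L[i][j] * x[j]
                      for i in range(n) for j in range(n))

print(quarter_sum, half_quad)  # prints: 4.0 4.0
```

Only the edges crossing the cut contribute to either sum, which is why minimizing the relaxed quadratic form approximates the minimum cut.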
To illustrate the above concepts, we borrow an example problem and the corresponding Matlab code from (Gleich 2006). We construct a randomly generated graph as follows. First, we partition n = 1000 nodes into two groups V_1 and V_2 of sizes 450 and 550 nodes, respectively. Second, we connect any pair of nodes in the set V_1 (respectively V_2) with probability 0.3 (respectively 0.2). Third and finally, any two nodes in distinct groups, i ∈ V_1 and j ∈ V_2, are connected with probability 0.1. The sparsity pattern
of the associated adjacency matrix is shown in the left panel of Figure 6.1. No obvious partition is visible at first glance, since the indices are not necessarily sorted; that is, V_1 is not necessarily {1, . . . , 450}. The second panel displays the entries of the eigenvector v_2 sorted according to their magnitude, showing a sharp transition between positive and negative entries. Finally, the third panel displays the correspondingly sorted adjacency matrix A, clearly indicating the partition V = V_1 ∪ V_2.
The Matlab code to generate Figure 6.1 can be found below. For additional analysis of this problem, we
refer the reader to (Gleich 2006).
Figure 6.1: The first panel shows a randomly-generated sparse adjacency matrix A for a graph with 1000 nodes. The second panel displays the eigenvector ṽ_2, which is identical to the normalized eigenvector v_2 after sorting the entries according to their magnitude, and the third panel displays the correspondingly sorted adjacency matrix Ã.
% setup: number of nodes and empty adjacency matrix
n = 1000;
A = zeros(n);

% randomly assign the nodes to two groups of sizes 450 and 550
x = randperm(n);
group_size = 450;
group1 = x(1:group_size);
group2 = x(group_size+1:end);
group2_size = n - group_size;

% assign probabilities of connecting nodes
p_group1 = 0.3;
p_group2 = 0.2;
p_between_groups = 0.1;

% construct adjacency matrix
A(group1, group1) = rand(group_size, group_size) < p_group1;
A(group2, group2) = rand(group2_size, group2_size) < p_group2;
A(group1, group2) = rand(group_size, group2_size) < p_between_groups;
A = triu(A,1); A = A + A';

% can you see the groups?
subplot(1,3,1); spy(A);
xlabel('$A$', 'Interpreter','latex','FontSize',28);

% construct Laplacian and its spectrum
L = diag(sum(A)) - A;
[V, D] = eigs(L, 2, 'SA');

% plot the entries of the eigenvector associated to the algebraic
% connectivity, sorted by magnitude
subplot(1,3,2); plot(sort(V(:,2)), '.');
xlabel('$\tilde v_2$', 'Interpreter','latex','FontSize',28);

% partition the matrix accordingly and spot the communities
[~, p] = sort(V(:,2));
subplot(1,3,3); spy(A(p,p));
xlabel('$\tilde A$', 'Interpreter','latex','FontSize',28);
6.7 Exercises
E6.1 The spectra of Laplacian and row-stochastic adjacency matrices. Consider a row-stochastic matrix A ∈ R^{n×n}. Let L be the Laplacian matrix of the digraph associated to A. Compute the spectrum of L as a function of the spectrum spec(A) of A.
E6.2 The adjacency and Laplacian matrices for the complete graph. For any n ∈ N, the complete graph with n nodes, denoted by K(n), is the undirected and unweighted graph in which any two distinct nodes are connected. For example, see K(6) in the figure.
Compute, for arbitrary n,
(i) the adjacency matrix of K(n) and its eigenvalues; and
(ii) the Laplacian matrix of K(n) and its eigenvalues.
E6.3 The adjacency and Laplacian matrices for the complete bipartite graph. A bipartite graph is a graph whose vertices can be divided into two disjoint sets U and V with the property that every edge connects a vertex in U to one in V. A complete bipartite graph is a bipartite graph in which every vertex of U is connected with every vertex of V. If U has n vertices and V has m vertices, for arbitrary n, m ∈ N, the resulting complete bipartite graph is denoted by K(n, m). For example, see K(1, 6) and K(3, 3) in the figure.
Compute, for arbitrary n and m,
(i) the adjacency matrix of K(n, m) and its eigenvalues; and
(ii) the Laplacian matrix of K(n, m) and its eigenvalues.
E6.4 The Laplacian matrix of an undirected graph is positive semidefinite. Give an alternative proof, without relying on the Geršgorin Disks Theorem 2.8, that the Laplacian matrix L of an undirected weighted graph is symmetric positive semidefinite. (Note that the proof of Lemma 6.5 relies on the Geršgorin Disks Theorem 2.8.)
E6.5 A lower bound. Let G be a weighted undirected graph with adjacency matrix A and Laplacian matrix L. Assume G is connected, let λ_2 be the smallest non-zero eigenvalue of L, and show that, for any x ∈ R^n,

x^\top L x \ge \lambda_2 \Bigl\| x - \frac{1}{n} (1_n^\top x) 1_n \Bigr\|_2^2.
E6.6 The Laplacian matrix plus its transpose. Let G be a weighted digraph with Laplacian matrix L. Prove that the following statements are equivalent:
(i) G is weight-balanced,
(ii) L + L^⊤ is positive semidefinite.
Next, assume G is weight-balanced with adjacency matrix A, and show that
(iii) L + L^⊤ is the Laplacian matrix of the digraph associated to the symmetric adjacency matrix A + A^⊤, and
(iv) (L + L^⊤) 1_n = 0_n.
E6.7 Scaled Laplacian matrices. Let L = L^⊤ ∈ R^{n×n} be the Laplacian matrix of a connected, undirected, and symmetrically weighted graph. Consider a diagonal matrix D = diag{d_1, . . . , d_n}. Define the matrices A and B by

A := DL   and   B := LD.

(i) Give necessary and sufficient conditions on {d_1, . . . , d_n} for A to be a Laplacian matrix.
(ii) Give necessary and sufficient conditions on {d_1, . . . , d_n} for B to be a Laplacian matrix.
Show that:
(i) G is the quadratic form associated with the symmetric positive-semidefinite matrix

P = \frac{1}{2} (D_{out} + D_{in} - A - A^\top),

(ii) P = \frac{1}{2} (L + L^{(rev)}), where the Laplacian of the reverse digraph is L^{(rev)} = D_{in} − A^⊤.
E6.9 The pseudoinverse Laplacian matrix. The Moore–Penrose pseudoinverse of an n × m matrix M is the unique m × n matrix M^† with the following properties:
(i) M M^† M = M,
(ii) M^† M M^† = M^†, and
(iii) M M^† is symmetric and M^† M is symmetric.
Assume L is the Laplacian matrix of a weighted connected undirected graph with n nodes. Let U ∈ R^{n×n} be an orthonormal matrix of eigenvectors of L such that

L = U \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} U^\top.

Show that

(i) L^\dagger = U \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 1/\lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\lambda_n \end{bmatrix} U^\top,

(ii) L L^† = L^† L = I_n − \frac{1}{n} 1_n 1_n^\top, and

(iii) L^† 1_n = 0_n.
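The three defining properties can be verified directly on a tiny example. The following Python sketch is illustrative only (the 2-node path graph is not taken from the text): its Laplacian L = [[1, −1], [−1, 1]] has eigenvalues {0, 2}, and its pseudoinverse is L† = (1/4) L.

```python
# Moore-Penrose pseudoinverse check for the 2-node path-graph Laplacian.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

L = [[1.0, -1.0], [-1.0, 1.0]]
Ldag = [[0.25 * L[i][j] for j in range(2)] for i in range(2)]  # (1/4) L

# property (i): L L^dag L = L
assert matmul(matmul(L, Ldag), L) == L
# property (ii): L^dag L L^dag = L^dag
assert matmul(matmul(Ldag, L), Ldag) == Ldag
# property (iii): both products are symmetric
P = matmul(L, Ldag)
assert P == [[P[j][i] for j in range(2)] for i in range(2)]

# the product L L^dag equals I_2 - (1/2) 1_2 1_2^T, as claimed in the exercise
print(P)  # prints: [[0.5, -0.5], [-0.5, 0.5]]
```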
E6.10 The Green matrix of a Laplacian matrix. Assume L is the Laplacian matrix of a weighted connected undirected graph with n nodes. Show that
(i) the matrix L + \frac{1}{n} 1_n 1_n^\top is positive definite,
(ii) the so-called Green matrix

X = \Bigl( L + \frac{1}{n} 1_n 1_n^\top \Bigr)^{-1} - \frac{1}{n} 1_n 1_n^\top   (E6.1)

is well defined, and
(iii) X = L^†, where L^† is defined in Exercise E6.9. In other words, the Green matrix formula (E6.1) gives an alternative definition of the pseudoinverse Laplacian matrix.
E6.11 Monotonicity of Laplacian eigenvalues. Consider a symmetric Laplacian matrix L ∈ R^{n×n} associated to a weighted and undirected graph G = (V, E, A). Assume G is connected and let λ_2(G) > 0 be its algebraic connectivity, i.e., the second-smallest eigenvalue of L. Show that
(i) λ_2(G) is a monotonically non-decreasing function of each weight a_ij, {i, j} ∈ E; and
(ii) λ_2(G) is a monotonically non-decreasing function of the edge set in the following sense: λ_2(G) ≤ λ_2(G′) for any graph G′ = (V, E′, A′) with E ⊆ E′ and a_ij = a′_ij for all {i, j} ∈ E.
Hint: Use the disagreement function.
E6.12 Invertibility of principal minors of the Laplacian matrix. Consider a connected and undirected graph and an arbitrary partition of the node set V = V_1 ∪ V_2. The associated symmetric and irreducible Laplacian matrix L ∈ R^{n×n} is partitioned accordingly as

L = \begin{bmatrix} L_{11} & L_{12} \\ L_{12}^\top & L_{22} \end{bmatrix}.

Show that the submatrices L_{11} ∈ R^{|V_1|×|V_1|} and L_{22} ∈ R^{|V_2|×|V_2|} are nonsingular.
E6.13 Gaussian elimination and Laplacian matrices. Consider an undirected and connected graph and its associated Laplacian matrix L ∈ R^{n×n}. Consider the associated linear Laplacian equation y = Lx, where x ∈ R^n is unknown and y ∈ R^n is a given vector. Verify that eliminating x_n from the last row of this equation yields the following reduced set of equations:
\begin{bmatrix} y_1 \\ \vdots \\ y_{n-1} \end{bmatrix}
\underbrace{- \begin{bmatrix} L_{1n}/L_{nn} \\ \vdots \\ L_{n-1,n}/L_{nn} \end{bmatrix}}_{=A} y_n
= \underbrace{\begin{bmatrix} \ddots & \vdots & \\ \cdots & L_{ij} - \frac{L_{in} L_{jn}}{L_{nn}} & \cdots \\ & \vdots & \ddots \end{bmatrix}}_{=L_{red}}
\begin{bmatrix} x_1 \\ \vdots \\ x_{n-1} \end{bmatrix},

where the (i, j)-element of L_red is given by L_{ij} − L_{in} L_{jn} / L_{nn}. Show that the matrices A ∈ R^{(n−1)×1} and L_red ∈ R^{(n−1)×(n−1)} obtained after Gaussian elimination have the following properties:
(i) A is a nonnegative and column-stochastic matrix with at least one strictly positive element; and
(ii) L_red is a symmetric and irreducible Laplacian matrix.
Hint: To show the irreducibility of L_red, verify the following property regarding the fill-in of the matrix L_red: the graph associated to the Laplacian L_red has an edge between nodes i and j if and only if (i) {i, j} was an edge in the original graph associated to L, or (ii) {i, n} and {j, n} were both edges in the original graph associated to L.
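The fill-in phenomenon from the hint can be seen on a concrete example. The Python sketch below (a hypothetical 3-node path graph 0 − 2 − 1 with the center node eliminated; 0-based indices, not from the text) computes the reduced Laplacian and the accompanying column vector.

```python
# Gaussian elimination of node 2 from the path graph with edges {0,2}, {1,2}.
# Eliminating the center node creates the fill-in edge {0,1}.
L = [[1.0, 0.0, -1.0],
     [0.0, 1.0, -1.0],
     [-1.0, -1.0, 2.0]]
n = 3

# reduced Laplacian: Lred[i][j] = L[i][j] - L[i][n-1] * L[j][n-1] / L[n-1][n-1]
Lred = [[L[i][j] - L[i][n - 1] * L[j][n - 1] / L[n - 1][n - 1]
         for j in range(n - 1)] for i in range(n - 1)]
# accompanying column vector: A[i] = -L[i][n-1] / L[n-1][n-1]
Avec = [-L[i][n - 1] / L[n - 1][n - 1] for i in range(n - 1)]

print(Lred)  # prints: [[0.5, -0.5], [-0.5, 0.5]]  (symmetric Laplacian, new edge)
print(Avec)  # prints: [0.5, 0.5]  (nonnegative, entries sum to 1)
```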
E6.14 Thomson's principle and energy routing. Consider a connected and undirected resistive electrical network with n nodes, with external nodal current injections c ∈ R^n satisfying the balance condition 1_n^⊤ c = 0, and with resistances R_{ij} > 0 for every undirected edge {i, j} ∈ E. For simplicity, we set R_{ij} = ∞ if there is no
edge connecting i and j. As shown earlier in this chapter, Kirchhoff's and Ohm's laws lead to the network equations

c_{\text{injected at } i} = \sum_{j \in N(i)} c_{ji} = \sum_{j \in N(i)} \frac{1}{R_{ij}} (v_i - v_j),

where v_i is the potential at node i and c_{ji} = (v_i − v_j)/R_{ij} is the current flow from node i to node j. Consider now a more general set of current flows f_{ij} (for all i, j ∈ {1, . . . , n}) routing energy through the network and compatible with the following basic assumptions:
(i) Skew-symmetry: f_{ij} = −f_{ji} for all i, j ∈ {1, . . . , n};
(ii) Consistency: f_{ij} = 0 if {i, j} ∉ E;
(iii) Conservation: c_{injected at i} = \sum_{j \in N(i)} f_{ji} for all i ∈ {1, . . . , n}.
Show that among all possible current flows f_{ij}, the physical current flow f_{ij} = c_{ij} = (v_j − v_i)/R_{ij} uniquely minimizes the energy dissipation:

minimize_{f_{ij}, i,j ∈ {1,...,n}}   J = \frac{1}{2} \sum_{i,j=1}^{n} R_{ij} f_{ij}^2
subject to   f_{ij} = −f_{ji} for all i, j ∈ {1, . . . , n},
             f_{ij} = 0 for all {i, j} ∉ E,
             c_{\text{injected at } i} = \sum_{j \in N(i)} f_{ji} for all i ∈ {1, . . . , n}.
Hint: The solution requires knowledge of the Karush-Kuhn-Tucker (KKT) conditions for optimality; this is a
classic topic in nonlinear constrained optimization discussed in numerous textbooks, e.g., in (Luenberger 1984).
E6.15 Linear spring networks with loads. Consider the two (connected) spring networks with n moving masses in the figure. For the right network, assume one of the masses is connected to a single stationary object with a spring. Refer to the left spring network as free and to the right network as grounded. Let F_load be a load force applied to the n moving masses.
For the left network, let L_{free,n} be the n × n Laplacian matrix describing the free spring network among the n moving masses, as defined in Section 6.2. For the right network, let L_{free,n+1} be the (n+1) × (n+1) Laplacian matrix for the spring network among the n masses and the stationary object. Let L_grounded be the n × n grounded Laplacian of the n masses, constructed by removing the row and column of L_{free,n+1} corresponding to the stationary object.
For the free spring network subject to F_load,
(i) do equilibrium displacements exist for arbitrary loads?
(ii) if the load force F_load is balanced in the sense that 1_n^⊤ F_load = 0, is the resulting equilibrium displacement unique?
(iii) compute the equilibrium displacement if unique, or the set of equilibrium displacements otherwise, assuming a balanced force profile is applied.
For the grounded spring network,
(iv) derive an expression relating Lgrounded to Lfree,n ,
E6.17 Distributed averaging-based PI control. Consider a set of n controllable agents governed by the second-order dynamics

\dot x_i = y_i,   (E6.2a)
\dot y_i = u_i + δ_i,   (E6.2b)

where i ∈ {1, . . . , n} is the index set, u_i ∈ R is a control input to agent i, and δ_i ∈ R is an unknown disturbance affecting agent i. Given an undirected, connected, and weighted graph G = (V, E, A) with node set V = {1, . . . , n}, edge set E ⊆ V × V, and adjacency matrix A = A^⊤ ∈ R^{n×n}, we assume each agent can measure its velocity y_i ∈ R as well as the relative position x_i − x_j for each neighbor {i, j} ∈ E. Based on these measurements, consider now the distributed averaging-based proportional-integral (PI) controller

u_i = -\sum_{j=1}^{n} a_{ij} (x_i - x_j) - y_i - q_i,   (E6.3a)
\dot q_i = y_i - \sum_{j=1}^{n} a_{ij} (q_i - q_j),   (E6.3b)

where q_i ∈ R is a dynamic control state for each agent i ∈ {1, . . . , n}. Your tasks are as follows:
(i) show that the center of mass \frac{1}{n} \sum_{i=1}^{n} x_i(t) is bounded for all t ≥ 0,
(ii) characterize the set of equilibria (x*, y*, q*) of the closed-loop system (E6.2)–(E6.3), and
(iii) show that all trajectories converge to these closed-loop equilibria.
E6.18 Maximum power dissipation. As in Subsection 6.3, consider an electrical network composed of three voltage sources (v_1, v_2, v_3) connected by three resistors (each with unit resistance R = 1) in an undirected ring topology. Recall that the total power dissipated by the circuit is P = v^⊤ L v, where L is the Laplacian matrix of the ring graph. What is the maximum dissipated power if the voltages v are such that ‖v‖_2 = 1?
Hint: Recall the notion of induced 2-norm.
Chapter 7
Continuous-time Averaging Systems
In this chapter we consider averaging algorithms in which the variables evolve in continuous time, instead of
discrete time. Therefore we look at some interesting differential equations. We borrow ideas from (Mesbahi
and Egerstedt 2010; Ren et al. 2007).
Figure 7.1: Alignment rule: the center fish rotates clockwise to align itself with the average heading of its neighbors.
where L is the Laplacian of an appropriate weighted digraph G: each bird is a node and each directed edge (i, j) has weight 1/d_out(i). Here it is useful to recall the interpretation of (Lx)_i as a force perceived by node i in a network of springs.
Note: it is weird (i.e., mathematically ill-posed) to compute averages on a circle, but let us not worry
about it for now.
Note: this incomplete model does not concern itself with positions. In other words, we do not discuss collision avoidance and formation/cohesion maintenance. Moreover, note that the graph G should really be state-dependent. For example, we may assume that two birds see each other and interact if and only if their pairwise Euclidean distance is below a certain threshold.
Figure 7.2: Many animal species exhibit flocking behaviors that arise from decentralized interactions. On the left:
pacific threadfins (Polydactylus sexfilis); public domain image from the U.S. National Oceanic and Atmospheric
Administration. On the right: flock of snow geese (Chen caerulescens); public domain image from the U.S. Fish and
Wildlife Service.
Consider an electrical network with only pure resistors and with pure capacitors connecting each node
to ground; this example is taken from (Mesbahi and Egerstedt 2010; Ren et al. 2007).
From the previous chapter, we know that the vector of injected currents c_injected and the vector of node voltages v satisfy

c_injected = L v,
where L is the Laplacian for the graph with coefficients a_{ij} = 1/r_{ij}. Additionally, assuming C_i is the capacitance at node i, and keeping proper track of the current into each capacitor, we have

C_i \frac{d}{dt} v_i = -c_{\text{injected at } i},

so that, with the shorthand C = diag(C_1, . . . , C_n),

\frac{d}{dt} v = -C^{-1} L v.

Note: C^{−1} L is again a Laplacian matrix (for a directed weighted graph).
Note: it is physically intuitive that after some transient all nodes will have the same potential. This
intuition will be proved later in the chapter.
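Before the formal proof, the intuition can be checked numerically. The following Python sketch (the 3-node ring, the unit resistances, and the capacitances are made-up illustrative values, not from the text) integrates dv/dt = −C⁻¹ L v with the forward Euler method; since the weighted total charge Σ_i C_i v_i is conserved, all voltages converge to the capacitance-weighted average of the initial voltages.

```python
# Forward-Euler simulation of the RC network dynamics dv/dt = -C^{-1} L v
# on a hypothetical 3-node ring with unit resistances and C = (1, 2, 1).
L = [[2.0, -1.0, -1.0],
     [-1.0, 2.0, -1.0],
     [-1.0, -1.0, 2.0]]
C = [1.0, 2.0, 1.0]
v0 = [1.0, 0.0, -1.0]
v = v0[:]

dt = 0.01
for _ in range(5000):  # integrate until well past the transient
    Lv = [sum(L[i][j] * v[j] for j in range(3)) for i in range(3)]
    v = [v[i] - dt * Lv[i] / C[i] for i in range(3)]

# common limit voltage: capacitance-weighted average of the initial condition
expected = sum(C[i] * v0[i] for i in range(3)) / sum(C)
print([round(vi, 6) for vi in v], expected)  # all entries approach 0.0
```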
7.2 Continuous-time linear systems and their convergence properties

The matrix exponential is a remarkable operation with numerous properties; we ask the reader to review a few basic ones in Exercise E7.1. A matrix A ∈ R^{n×n} is
The spectral abscissa of a square matrix A is the maximum of the real parts of the eigenvalues of A, that is,

α(A) = max{ Re(λ) | λ ∈ spec(A) }.
Theorem 7.1 (Convergence and spectral abscissa). For a square matrix A, the following statements hold:
We leave the proof of this theorem to the reader and mention that most of the required steps are similar to the discussion in Section 2.1 and are discussed later in this chapter.
7.3 The Laplacian flow

Given a Laplacian matrix L, the associated Laplacian flow is the dynamical system

\dot x = -L x.   (7.2)
Theorem 7.2 (The matrix exponential of a Laplacian matrix). Let L be an n × n Laplacian matrix with associated digraph G and with maximum diagonal entry ℓ_max = max{ℓ_11, . . . , ℓ_nn}. Then
Proof. From the equality L 1_n = 0_n and the definition of the matrix exponential, we compute

\exp(-L) 1_n = \Bigl( I_n + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} L^k \Bigr) 1_n = 1_n.

Similarly, when the Laplacian additionally satisfies 1_n^⊤ L = 0_n^⊤, we compute

1_n^\top \exp(-L) = 1_n^\top \Bigl( I_n + \sum_{k=1}^{\infty} \frac{(-1)^k}{k!} L^k \Bigr) = 1_n^\top.

Next, define the nonnegative matrix A_L = ℓ_max I_n − L ≥ 0, so that −L = −ℓ_max I_n + A_L. Because A_L I_n = I_n A_L, we know
Here we used the following properties of the matrix exponential operation: exp(A + B) = exp(A) exp(B) if AB = BA, and exp(a I_n) = e^a I_n. Next, because A_L ≥ 0, we know that \exp(A_L) = \sum_{k=0}^{\infty} A_L^k / k! is lower bounded by the first n − 1 terms of the series, so that

\exp(-L) = e^{-\ell_{max}} \exp(A_L) \ge e^{-\ell_{max}} \sum_{k=0}^{n-1} \frac{1}{k!} A_L^k.   (7.3)
Next, we derive two useful lower bounds on exp(−L) based on the inequality (7.3). First, by keeping just the first term, we establish statement (iii):

\exp(-L) \ge e^{-\ell_{max}} I_n \ge 0.

Second, we lower bound the coefficients 1/k! and write:

\exp(-L) \ge e^{-\ell_{max}} \sum_{k=0}^{n-1} \frac{1}{k!} A_L^k \ge \frac{e^{-\ell_{max}}}{(n-1)!} \sum_{k=0}^{n-1} A_L^k.
Notice now that the digraph G associated to L is the same as that associated to A_L (we do not need to worry about self-loops here). Hence, if node j is globally reachable in G, then Lemma 4.3 implies that the jth column of \sum_{k=0}^{n-1} A_L^k is positive and, by inequality (7.3), also the jth column of exp(−L) is positive. This statement establishes (iv). Moreover, if L is irreducible, then A_L is irreducible, that is, A_L satisfies \sum_{k=0}^{n-1} A_L^k > 0, so that also exp(−L) > 0. This establishes statement (v).
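These statements can also be verified numerically on a small example. The following Python sketch (the 3-node ring graph is a made-up illustration) evaluates the truncated power series for exp(−L) and checks that every row sums to 1 and that all entries are strictly positive, as predicted for a strongly connected graph.

```python
# Numerical check of Theorem 7.2 on the 3-node ring Laplacian: exp(-L) is
# row-stochastic and, since the graph is strongly connected, strictly positive.
L = [[2.0, -1.0, -1.0],
     [-1.0, 2.0, -1.0],
     [-1.0, -1.0, 2.0]]
n = 3

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# exp(-L) = sum_{k>=0} (-L)^k / k!, truncated once the terms are negligible
negL = [[-L[i][j] for j in range(n)] for i in range(n)]
E = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
term = [row[:] for row in E]
for k in range(1, 40):
    term = [[t / k for t in row] for row in matmul(term, negL)]
    E = [[E[i][j] + term[i][j] for j in range(n)] for i in range(n)]

print([round(sum(row), 9) for row in E])  # each row sums to 1.0
print(all(E[i][j] > 0 for i in range(n) for j in range(n)))  # prints: True
```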
Lemma 7.3 (Equilibrium points). If G contains a globally reachable node, then the set of equilibrium points of the Laplacian flow (7.2) is {α 1_n | α ∈ R}.

Proof. A point x is an equilibrium of the Laplacian flow if and only if Lx = 0_n. Hence, any point in the kernel of the matrix L is an equilibrium. From Theorem 6.6, if G contains a globally reachable node, then rank(L) = n − 1, so that the kernel of L is one-dimensional. The lemma follows by recalling that L 1_n = 0_n.
In what follows, we are interested in characterizing the evolution of the Laplacian flow (7.2). To build some intuition, let us first consider an undirected graph G and write the modal decomposition of the solution, as we did in Remark 2.3 in Section 2.1 for a discrete-time linear system. We proceed in two steps. First, because G is undirected, the matrix L is symmetric and has real eigenvalues 0 = λ_1 ≤ λ_2 ≤ · · · ≤ λ_n with corresponding orthonormal (i.e., orthogonal and unit-length) eigenvectors v_1, . . . , v_n. Define y_i(t) = v_i^⊤ x(t) and left-multiply \dot x = −Lx by v_i^⊤:

\frac{d}{dt} y_i(t) = -\lambda_i y_i(t),   y_i(0) = v_i^\top x(0).

These n decoupled ordinary differential equations are immediately solved to give

x(t) = y_1(t) v_1 + y_2(t) v_2 + \cdots + y_n(t) v_n
     = e^{-\lambda_1 t} (v_1^\top x(0)) v_1 + e^{-\lambda_2 t} (v_2^\top x(0)) v_2 + \cdots + e^{-\lambda_n t} (v_n^\top x(0)) v_n.
Second, recall that λ_1 = 0 and v_1 = 1_n/√n because L is a symmetric Laplacian matrix (L 1_n = 0_n). Therefore, we compute (v_1^⊤ x(0)) v_1 = average(x(0)) 1_n and substitute to obtain

x(t) = average(x(0)) 1_n + e^{-\lambda_2 t} (v_2^\top x(0)) v_2 + \cdots + e^{-\lambda_n t} (v_n^\top x(0)) v_n.

Now, let us assume that G is connected, so that its second smallest eigenvalue λ_2 is strictly positive. In this case, we can infer that

\| x(t) - average(x(0)) 1_n \|_2 \le e^{-\lambda_2 t} \| x(0) - average(x(0)) 1_n \|_2.
In summary, we discovered that, for a connected undirected graph, the disagreement vector converges to zero with exponential rate λ_2. In what follows, we state a more general convergence-to-consensus result for the continuous-time Laplacian flow. This result parallels Theorem 5.2; early references for it include (Lin et al. 2005; Ren and Beard 2005).
Theorem 7.4 (Consensus for Laplacian matrices with a globally reachable node). If a Laplacian matrix L has an associated digraph G with a globally reachable node, then

(i) the eigenvalue 0 of −L is simple and all other eigenvalues of −L have negative real part,
(ii) lim_{t→∞} e^{−Lt} = 1_n w^⊤, where w ≥ 0 is the left eigenvector of L with eigenvalue 0 satisfying w_1 + · · · + w_n = 1,
(iii) w_i > 0 if and only if node i is globally reachable. Accordingly, w_i = 0 if and only if node i is not globally reachable,
(iv) the solution to \frac{d}{dt} x(t) = −L x(t) satisfies

lim_{t→∞} x(t) = ( w^⊤ x(0) ) 1_n,

(v) if additionally G is weight-balanced, then

lim_{t→∞} x(t) = \frac{1_n^\top x(0)}{n} 1_n = average( x(0) ) 1_n.
Note: as a corollary of statement (iii), the left eigenvector w ∈ R^n associated to the eigenvalue 0 has strictly positive entries if and only if G is strongly connected.
Proof. Because the associated digraph has a globally reachable node, Theorem 6.6 establishes that L has rank n − 1 and that all eigenvalues of L have nonnegative real part. Therefore, also remembering the property L 1_n = 0_n, we conclude that 0 is a simple eigenvalue of L with right eigenvector 1_n and that all other eigenvalues of L have strictly positive real part. This concludes the proof of (i). In what follows, we let w denote the left eigenvector associated to the eigenvalue 0, that is, w^⊤ L = 0_n^⊤, normalized so that 1_n^⊤ w = 1.
To prove statement (ii), we proceed in three steps. First, we write the Laplacian matrix in its Jordan normal form:

L = P J P^{-1} = P \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_m \end{bmatrix} P^{-1},   (7.4)

where m ≤ n is the number of Jordan blocks, the first block is the scalar 0 (being the only eigenvalue we know), the other Jordan blocks J_2, . . . , J_m (unique up to re-ordering) are associated with eigenvalues with strictly positive real part, and where the columns of P are the generalized eigenvectors of L (unique up to rescaling).
Second, using some properties from Exercise E7.1, we compute the limit as t → ∞ of e^{−Lt} = P e^{−Jt} P^{−1} as

\lim_{t\to\infty} e^{-Lt} = P \Bigl( \lim_{t\to\infty} e^{-Jt} \Bigr) P^{-1} = P \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} P^{-1} = (P e_1)(e_1^\top P^{-1}) = c_1 r_1,

where c_1 is the first column of P and r_1 is the first row of P^{−1}. The contributions of the Jordan blocks J_2, . . . , J_m vanish because the corresponding eigenvalues of −L have negative real part; for more details see, e.g., (Hespanha 2009).
Third and finally, we characterize c_1 and r_1. By definition, the first column of P (unique up to rescaling) is a right eigenvector of the eigenvalue 0 for the matrix L; that is, c_1 = α 1_n for some scalar α, since we know L 1_n = 0_n. Of course, it is convenient to define c_1 = 1_n. Next, equation (7.4) can be rewritten as P^{−1} L = J P^{−1}, whose first row is r_1 L = 0_n^⊤. This equality implies r_1 = β w^⊤ for some scalar β. Finally, we note that P^{−1} P = I_n implies r_1 c_1 = 1, that is, β w^⊤ 1_n = 1. Since we know w^⊤ 1_n = 1, we infer that β = 1 and that r_1 = w^⊤. This concludes the proof of statement (ii).
Next, we prove statement (iii). Pick a positive constant ε < 1/d_max, where the maximum out-degree is d_max = max{d_out(1), . . . , d_out(n)}. Define B = I_n − εL. It is easy to show that B is nonnegative, row-stochastic, and has strictly positive diagonal elements. Moreover, w^⊤ L = 0_n^⊤ implies w^⊤ B = w^⊤, so that w is the left eigenvector with unit eigenvalue for B. Now, note that the digraph G(L) associated to L (without self-loops) is identical to the digraph G(B) associated to B, except for the fact that B has self-loops at each node. By assumption G(L) has a globally reachable node and therefore so does G(B), where the subgraph induced by the set of globally reachable nodes is aperiodic (due to the self-loops). Therefore, statement (iii) is now an immediate transcription of the same statement for row-stochastic matrices established in Theorem 5.2 (statement (iii)).
Statements (iv) and (v) are straightforward and left as Exercise E7.3.
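Theorem 7.4 can be illustrated on a concrete digraph. In the Python sketch below (the 3-node digraph is made up for illustration), nodes 1 and 2 are globally reachable while node 3 is not, so the left eigenvector is w = (1/2, 1/2, 0), and every Euler-integrated solution converges to (w^⊤ x(0)) 1_n.

```python
# Digraph with unit-weight edges 1 -> 2, 2 -> 1, 3 -> 1; node 3 is not
# globally reachable, so Theorem 7.4 (iii) predicts w = (1/2, 1/2, 0).
L = [[1.0, -1.0, 0.0],   # node 1, out-edge to node 2
     [-1.0, 1.0, 0.0],   # node 2, out-edge to node 1
     [-1.0, 0.0, 1.0]]   # node 3, out-edge to node 1
w = [0.5, 0.5, 0.0]

# verify w^T L = 0^T and w_1 + w_2 + w_3 = 1
wL = [sum(w[i] * L[i][j] for i in range(3)) for j in range(3)]
print(wL, sum(w))  # prints: [0.0, 0.0, 0.0] 1.0

# forward-Euler integration of the Laplacian flow from x(0) = (3, 1, 5)
x = [3.0, 1.0, 5.0]
dt = 0.01
for _ in range(5000):
    Lx = [sum(L[i][j] * x[j] for j in range(3)) for i in range(3)]
    x = [x[i] - dt * Lx[i] for i in range(3)]
print([round(xi, 6) for xi in x])  # all entries approach w^T x(0) = 2.0
```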
7.4 Second-order Laplacian flows

Consider the second-order Laplacian flow

\ddot x(t) + (k_d I_n + \gamma_d L) \dot x(t) + (k_p I_n + \gamma_p L) x(t) = 0_n.   (7.6)
By introducing the second-order Laplacian matrix 𝓛 ∈ R^{2n×2n}, we write the system in first-order form:

\frac{d}{dt} \begin{bmatrix} x(t) \\ v(t) \end{bmatrix} = \begin{bmatrix} 0_{n\times n} & I_n \\ -k_p I_n - \gamma_p L & -k_d I_n - \gamma_d L \end{bmatrix} \begin{bmatrix} x(t) \\ v(t) \end{bmatrix} =: 𝓛 \begin{bmatrix} x(t) \\ v(t) \end{bmatrix}.
It turns out that it is possible to compute the eigenvalues of the second-order Laplacian matrix; we refer
to Exercise E7.12 for its eigenvectors.
Theorem 7.5 (Eigenvalues of second-order Laplacian matrices). Given a Laplacian matrix L and coefficients k_p, k_d, γ_p, γ_d ∈ R,

(i) the characteristic polynomial of 𝓛 satisfies

det(\mu I_{2n} - 𝓛) = det( \mu^2 I_n + \mu (k_d I_n + \gamma_d L) + (k_p I_n + \gamma_p L) ); and

(ii) given the eigenvalues λ_i, i ∈ {1, . . . , n}, of L, the 2n eigenvalues μ_{i,±}, i ∈ {1, . . . , n}, of 𝓛 are the solutions to

\mu^2 + (k_d + \gamma_d \lambda_i) \mu + (k_p + \gamma_p \lambda_i) = 0,   i ∈ {1, . . . , n}.   (7.7)
Proof. Regarding statement (i), we recall equality (E7.1b) from Exercise E7.11 and compute the characteristic polynomial of 𝓛 as:

det(\mu I_{2n} - 𝓛) = det \begin{bmatrix} \mu I_n & -I_n \\ k_p I_n + \gamma_p L & (\mu + k_d) I_n + \gamma_d L \end{bmatrix}
 = det( (\mu I_n)((\mu + k_d) I_n + \gamma_d L) + (I_n)(k_p I_n + \gamma_p L) )
 = det( \mu^2 I_n + \mu (k_d I_n + \gamma_d L) + (k_p I_n + \gamma_p L) ).

Regarding statement (ii), let J_L be the Jordan normal form of L, i.e., let L = T J_L T^{−1} for an appropriate invertible T, and note

det(\mu I_{2n} - 𝓛) = det( \mu^2 I_n + \mu (k_d I_n + \gamma_d J_L) + (k_p I_n + \gamma_p J_L) )
 = \prod_{i=1}^{n} ( \mu^2 + (k_d + \gamma_d \lambda_i) \mu + (k_p + \gamma_p \lambda_i) ).

Therefore, the 2n solutions to the characteristic equation det(\mu I_{2n} − 𝓛) = 0 are the n pairs of solutions μ_{i,±}, i ∈ {1, . . . , n}, of the second-order equations (7.7). This concludes the proof of (ii).
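Equation (7.7) can be spot-checked numerically. The following Python sketch uses the 2-node graph with Laplacian L = [[1, −1], [−1, 1]] (eigenvalues 0 and 2) and illustrative gains k_p = 1, k_d = 2, γ_p = γ_d = 1, all chosen only for this example: each root μ of the quadratic must satisfy det(μ I − 𝓛) = 0 for the 4 × 4 second-order Laplacian matrix.

```python
import cmath

kp, kd, gp, gd = 1.0, 2.0, 1.0, 1.0
L = [[1.0, -1.0], [-1.0, 1.0]]

# second-order Laplacian LL = [[0, I], [-kp*I - gp*L, -kd*I - gd*L]]
LL = [[0.0] * 4 for _ in range(4)]
for i in range(2):
    LL[i][2 + i] = 1.0
    for j in range(2):
        LL[2 + i][j] = -(kp * (1.0 if i == j else 0.0) + gp * L[i][j])
        LL[2 + i][2 + j] = -(kd * (1.0 if i == j else 0.0) + gd * L[i][j])

def det(M):
    # Laplace expansion along the first row (fine for a 4x4 matrix)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

for lam in (0.0, 2.0):
    a, b = kd + gd * lam, kp + gp * lam
    disc = cmath.sqrt(a * a - 4 * b)
    for mu in ((-a + disc) / 2, (-a - disc) / 2):
        M = [[(mu if i == j else 0) - LL[i][j] for j in range(4)] for i in range(4)]
        assert abs(det(M)) < 1e-9, (lam, mu)

print("all roots of (7.7) are eigenvalues of the second-order Laplacian")
```

Here the roots are μ = −1 (three times) and μ = −3, all with negative real part, consistent with asymptotic consensus.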
Theorem 7.6 (Asymptotic second-order consensus). Consider the second-order Laplacian flow (7.6). The following statements are equivalent:
(i) the second-order Laplacian flow achieves asymptotic second-order consensus, that is, |x_i − x_j| → 0 and |ẋ_i − ẋ_j| → 0 as t → ∞ for all i, j ∈ {1, . . . , n}, and
(ii) the 2(n − 1) eigenvalues μ_{i,±}, i ∈ {2, . . . , n}, of the second-order Laplacian matrix 𝓛 have strictly negative real part.
The following proof is based on elementary calculations. An equivalent proof can be obtained using
the Jordan normal form, e.g., see (Ren and Atkins 2005; Ren 2008b).
Proof of Theorem 7.6. We introduce the following change of coordinates: T x(t) = \begin{bmatrix} x_{ave}(t) \\ \delta(t) \end{bmatrix}, where x_ave(t) = average(x(t)), δ(t) ∈ R^{n−1}, and, from Exercise E2.3,

T = \begin{bmatrix} 1/n & 1/n & \cdots & 1/n \\ -1 & 1 & & \\ \vdots & & \ddots & \\ -1 & & & 1 \end{bmatrix}.

Correspondingly, we also have T \dot x(t) = \begin{bmatrix} \dot x_{ave}(t) \\ \dot\delta(t) \end{bmatrix}. To write the system in the new coordinates, we observe T 1_n = e_1 and compute

T L T^{-1} e_1 = T L T^{-1} (T 1_n) = T L 1_n = 0_n,

where the last equality follows from L 1_n = 0_n. This implies that the first column of T L T^{−1} is 0_n, that is,

T L T^{-1} = \begin{bmatrix} 0 & c^\top \\ 0_{n-1} & L_{red} \end{bmatrix},   for some L_{red} ∈ R^{(n−1)×(n−1)} and c ∈ R^{n−1}.   (7.8)
Based on equations (7.8) and (7.9), we write the system in these new coordinates as

\frac{d}{dt} \begin{bmatrix} x_{ave} \\ \delta \\ \dot x_{ave} \\ \dot\delta \end{bmatrix} =
\begin{bmatrix}
0 & 0_{n-1}^\top & 1 & 0_{n-1}^\top \\
0_{n-1} & 0_{(n-1)\times(n-1)} & 0_{n-1} & I_{n-1} \\
-k_p & -\gamma_p c^\top & -k_d & -\gamma_d c^\top \\
0_{n-1} & -k_p I_{n-1} - \gamma_p L_{red} & 0_{n-1} & -k_d I_{n-1} - \gamma_d L_{red}
\end{bmatrix}
\begin{bmatrix} x_{ave} \\ \delta \\ \dot x_{ave} \\ \dot\delta \end{bmatrix}.

We reorder the variables to obtain a block-triangular matrix, whose eigenvalues are the eigenvalues of the diagonal blocks:

\frac{d}{dt} \begin{bmatrix} x_{ave} \\ \dot x_{ave} \\ \delta \\ \dot\delta \end{bmatrix} =
\begin{bmatrix}
0 & 1 & 0_{n-1}^\top & 0_{n-1}^\top \\
-k_p & -k_d & -\gamma_p c^\top & -\gamma_d c^\top \\
0_{n-1} & 0_{n-1} & 0_{(n-1)\times(n-1)} & I_{n-1} \\
0_{n-1} & 0_{n-1} & -k_p I_{n-1} - \gamma_p L_{red} & -k_d I_{n-1} - \gamma_d L_{red}
\end{bmatrix}
\begin{bmatrix} x_{ave} \\ \dot x_{ave} \\ \delta \\ \dot\delta \end{bmatrix}.
We are now ready to conclude the proof: asymptotic second-order consensus is achieved if and only if δ → 0_{n−1} and δ̇ → 0_{n−1} as t → ∞, if and only if all eigenvalues of

\begin{bmatrix} 0_{(n-1)\times(n-1)} & I_{n-1} \\ -k_p I_{n-1} - \gamma_p L_{red} & -k_d I_{n-1} - \gamma_d L_{red} \end{bmatrix}

have strictly negative real part. But these eigenvalues are precisely the 2(n − 1) eigenvalues μ_{i,±}, i ∈ {2, . . . , n}, of the second-order Laplacian matrix 𝓛.
Finally, we present convergence results for undirected graphs and positive gains; we refer to (Zhu et al.
2009) for the general case.
Theorem 7.7 (Asymptotic convergence of second-order Laplacian flows). Consider the second-order Laplacian flow (7.6). Assume L is symmetric and irreducible (i.e., its associated graph is undirected and connected). Define the state average and its time derivative by x_ave(t) = average(x(t)) and ẋ_ave(t) = average(ẋ(t)). Then the state averages satisfy

\frac{d}{dt} \begin{bmatrix} x_{ave}(t) \\ \dot x_{ave}(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -k_p & -k_d \end{bmatrix} \begin{bmatrix} x_{ave}(t) \\ \dot x_{ave}(t) \end{bmatrix},   (7.10)

and, moreover,
(i) for the second-order consensus protocol (k_p = k_d = 0, γ_d = 1, γ_p > 0), asymptotic consensus on a ramp signal is achieved, that is, as t → ∞,

x(t) → ( x_ave(0) + ẋ_ave(0) t ) 1_n;

(iii) for the position-averaging flow with absolute velocity damping (k_p = γ_d = 0, γ_p = 1, k_d > 0), asymptotic consensus on a composite average value is achieved, that is, as t → ∞,

x(t) → ( x_ave(0) + ẋ_ave(0)/k_d ) 1_n.
Proof. First, we show that, in the similarity transformation (7.8), if L is symmetric, then c = 0_{n−1}. To do this, we observe e_1^⊤ T = (1/n) 1_n^⊤ and compute

e_1^\top T L T^{-1} = \frac{1}{n} 1_n^\top L T^{-1} = 0_n^\top,

where the last equality follows from L^⊤ 1_n = 0_n. This implies that the first row of T L T^{−1} is 0_n^⊤ and, in turn, that equations (7.10) are correct. Second, for the index range i ∈ {2, . . . , n}, in all three cases the second-order polynomial (7.7) has strictly positive coefficients, which implies that the 2(n − 1) eigenvalues μ_{i,±}, i ∈ {2, . . . , n}, of the second-order Laplacian matrix 𝓛 have strictly negative real part. Therefore, by Theorem 7.6, the second-order Laplacian flow achieves asymptotic second-order consensus and, more specifically, x_i(t) − x_ave(t) → 0 and ẋ_i(t) − ẋ_ave(t) → 0 for all i ∈ {1, . . . , n}. Third and finally, the specific values for x_ave(t) follow from explicitly solving the state average dynamics (7.10). We leave the details to the reader.
$$L_{\text{rescaled}} = \operatorname{diag}(w) L.$$
Note that:
- $L_{\text{rescaled}}$ is again a Laplacian matrix because (i) its row-sums are zero, (ii) its diagonal entries are positive, and (iii) its non-diagonal entries are nonpositive;
- $L_{\text{rescaled}}$ is the Laplacian matrix for a new digraph $G_{\text{rescaled}}$ with the same nodes and directed edges as $G$, but whose weights are rescaled as follows: $a_{ij} \mapsto w_i a_{ij}$. In other words, the weight of each out-edge of node $i$ is rescaled by $w_i$.
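Both observations are easy to check numerically. The sketch below (plain Python; the 3-node weighted digraph and the rescaling vector $w$ are illustrative choices, not from the text) builds $\operatorname{diag}(w)L$ and verifies the sign pattern, the zero row-sums, and the reweighting interpretation.

```python
# Sketch: rescaling a Laplacian by positive node weights w,
# L_rescaled = diag(w) L.  The digraph and weights are hypothetical.

def laplacian(A):
    """Laplacian L = diag(out-degrees) - A for a weighted adjacency matrix A."""
    n = len(A)
    return [[(sum(A[i]) if i == j else 0) - A[i][j] for j in range(n)]
            for i in range(n)]

A = [[0, 2, 1],
     [1, 0, 0],
     [0, 3, 0]]          # hypothetical weighted digraph
w = [0.5, 2.0, 1.0]      # positive rescaling weights, one per node

L = laplacian(A)
L_rescaled = [[w[i] * L[i][j] for j in range(3)] for i in range(3)]

# (i) row sums are still zero; (iii) off-diagonal entries stay nonpositive
for i in range(3):
    assert abs(sum(L_rescaled[i])) < 1e-12
    assert all(L_rescaled[i][j] <= 0 for j in range(3) if j != i)

# L_rescaled is the Laplacian of the digraph with weights w_i * a_ij
A_rescaled = [[w[i] * A[i][j] for j in range(3)] for i in range(3)]
assert laplacian(A_rescaled) == L_rescaled
```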
7.6 Appendix: Distributed optimization using the Laplacian flow

In this appendix we consider a network of $n$ processors that collectively aim to solve the optimization problem
$$\operatorname{minimize}_{x \in \mathbb{R}} \; f(x) = \sum_{i=1}^n f_i(x), \qquad (7.11)$$
where $f_i : \mathbb{R} \to \mathbb{R}$ is a strictly convex and twice continuously differentiable cost function known only to processor $i \in \{1, \dots, n\}$. In a centralized setup, the decision variable $x$ is globally available and the minimizers $x^* \in \mathbb{R}$ of the optimization problem (7.11) can be found by solving for the critical points of $f(x)$:
$$0 = \frac{\partial}{\partial x} f(x) = \sum_{i=1}^n \frac{\partial}{\partial x} f_i(x).$$
A centralized continuous-time algorithm converging to the set of critical points is the negative gradient flow
$$\dot x = -\frac{\partial}{\partial x} f(x).$$
To find a distributed approach to solving the optimization problem (7.11), we associate a local estimate $y_i \in \mathbb{R}$ of the global variable $x \in \mathbb{R}$ to every processor and solve the equivalent problem
$$\operatorname{minimize}_{y \in \mathbb{R}^n} \; \tilde f(y) = \sum_{i=1}^n f_i(y_i) + \frac{1}{2} y^\top L y \quad \text{subject to} \quad L y = 0_n, \qquad (7.12)$$
where the consistency constraint $L y = 0_n$ assures that $y_i = y_j$ for all $i, j \in \{1, \dots, n\}$, that is, the local estimates of all processors coincide. We also augmented the cost function with the term $\frac{1}{2} y^\top L y$, which clearly has no effect on the minimizers of (7.12) (due to the consistency constraint), but it provides supplementary damping and favorable convergence properties for our algorithm. The minimizers of the optimization problems (7.11) and (7.12) are then related by $y^* = x^* \mathbf{1}_n$.
Without any further motivation, consider the function $\mathcal{L} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ given by
$$\mathcal{L}(y, z) = f(y) + \frac{1}{2} y^\top L y + z^\top L y.$$
In the literature on convex optimization this function is known as the (augmented) Lagrangian function and $z \in \mathbb{R}^n$ is referred to as the Lagrange multiplier. What is important for us is that the augmented Lagrangian function is strictly convex in $y$ and linear (and hence¹ concave) in $z$. Hence, the augmented Lagrangian function admits a set of saddle points $(y^*, z^*) \in \mathbb{R}^n \times \mathbb{R}^n$, that is, points satisfying
$$\mathcal{L}(y^*, z) \le \mathcal{L}(y^*, z^*) \le \mathcal{L}(y, z^*) \quad \text{for all } y, z \in \mathbb{R}^n.$$
Since $\mathcal{L}(y, z)$ is differentiable in $y$ and $z$, the saddle points can be obtained as solutions to the equations
$$0_n = \frac{\partial}{\partial y} \mathcal{L}(y, z) = \frac{\partial}{\partial y} f(y) + L y + L z, \qquad 0_n = \frac{\partial}{\partial z} \mathcal{L}(y, z) = L y.$$
Our motivation for introducing the Lagrangian is the following lemma.
Lemma 7.8 (Properties of saddle points). Let $L = L^\top \in \mathbb{R}^{n \times n}$ be a symmetric Laplacian associated to an undirected, connected, and weighted graph, and consider the Lagrangian function $\mathcal{L}$, where each $f_i$ is strictly convex and twice continuously differentiable for all $i \in \{1, \dots, n\}$. Then
¹A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be concave (resp. strictly concave) if $-f(x)$ is a convex (resp. strictly convex) function.
We leave the proof to the reader in Exercise E7.15. Since the Lagrangian function is convex in $y$ and concave in $z$, we can compute its saddle points by following the so-called saddle-point dynamics, consisting of a negative and a positive gradient:
$$\dot y = -\frac{\partial}{\partial y} \mathcal{L}(y, z) = -\frac{\partial}{\partial y} f(y) - L y - L z, \qquad (7.13a)$$
$$\dot z = +\frac{\partial}{\partial z} \mathcal{L}(y, z) = L y. \qquad (7.13b)$$
For processor $i \in \{1, \dots, n\}$, the saddle-point dynamics (7.13) read component-wise as
$$\dot y_i = -\frac{\partial}{\partial y_i} f_i(y_i) - \sum_{j=1}^n a_{ij} (y_i - y_j) - \sum_{j=1}^n a_{ij} (z_i - z_j),$$
$$\dot z_i = \frac{\partial}{\partial z_i} \mathcal{L}(y, z) = \sum_{j=1}^n a_{ij} (y_i - y_j).$$
Hence, the saddle-point dynamics can be implemented in a distributed processor network using only local knowledge of $f_i(y_i)$, local computation, nearest-neighbor communication and, of course, after discretizing the continuous-time dynamics; see Exercise E7.18. As shown in (Wang and Elia 2010; Gharesifard and Cortés 2014; Droge et al. 2013; Cherukuri and Cortés 2015), this distributed optimization setup is very versatile and robust and extends to directed graphs and non-differentiable convex objective functions. We will later use a powerful tool, termed the LaSalle Invariance Principle, to show that the saddle-point dynamics (7.13) always converge to the set of saddle points; see Exercise E13.5.
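The behavior of the saddle-point dynamics (7.13) can be previewed numerically. The sketch below (plain Python, forward-Euler discretization; the 3-node path graph, the quadratic costs $f_i(y) = P_i (y - \bar x_i)^2$, the step size, and the horizon are all illustrative assumptions) shows the local estimates converging to the weighted average of the local minimizers.

```python
# Sketch: forward-Euler integration of the saddle-point dynamics (7.13)
# for quadratic costs f_i(y) = P_i*(y - xbar_i)^2 on a hypothetical
# 3-node path graph.  Step size and horizon are illustrative choices.

A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]               # unweighted path graph (symmetric)
P = [1.0, 2.0, 1.0]           # cost curvatures P_i > 0
xbar = [0.0, 3.0, 6.0]        # local minimizers

n, dt, steps = 3, 0.01, 20000
y = [0.0] * n                 # local estimates y_i
z = [0.0] * n                 # Lagrange multipliers z_i

for _ in range(steps):
    Ly = [sum(A[i][j] * (y[i] - y[j]) for j in range(n)) for i in range(n)]
    Lz = [sum(A[i][j] * (z[i] - z[j]) for j in range(n)) for i in range(n)]
    grad = [2 * P[i] * (y[i] - xbar[i]) for i in range(n)]   # df_i/dy_i
    y = [y[i] + dt * (-grad[i] - Ly[i] - Lz[i]) for i in range(n)]
    z = [z[i] + dt * Ly[i] for i in range(n)]

# at the saddle point, all estimates equal the weighted average of minimizers
x_star = sum(P[i] * xbar[i] for i in range(n)) / sum(P)
assert all(abs(yi - x_star) < 1e-3 for yi in y)
```

Note that the multiplier average $\operatorname{average}(z(t))$ is conserved by (7.13b), so the particular saddle point reached depends on the initialization of $z$.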
For now we restrict our analysis to the case of quadratic cost functions $f_i(x) = (x - x_i)^\top P_i (x - x_i)$, where $P_i > 0$ and $x_i \in \mathbb{R}$ is the minimizer of the cost $f_i(x)$. Thus, the cost reads, up to a constant scalar, as
$$f(x) = \sum_{i=1}^n (x - x_i)^\top P_i (x - x_i) = \sum_{i=1}^n (x - x^*)^\top P_i (x - x^*) + O(x^0),$$
where $x^*$ is the weighted average $x^* = \big( \sum_{i=1}^n P_i \big)^{-1} \sum_{i=1}^n P_i x_i$, which is the global minimizer of $f(x)$ (as obtained by setting $\partial f(x)/\partial x = 0$); see Exercise E7.17. In this case, the saddle-point dynamics (7.13) reduce to the linear system
$$\begin{bmatrix} \dot{\tilde y} \\ \dot z \end{bmatrix} = \underbrace{\begin{bmatrix} -P - L & -L \\ L & 0 \end{bmatrix}}_{=A} \begin{bmatrix} \tilde y \\ z \end{bmatrix}, \qquad (7.14)$$
where $\tilde y = y - x^* \mathbf{1}_n$ and $P = \operatorname{diag}(\{P_i\}_{i \in \{1, \dots, n\}})$. The matrix $A$ is a so-called saddle matrix (Benzi et al. 2005). We will in the following establish the convergence of the dynamics (7.14) to the set of saddle points.
First, observe that $0$ is an eigenvalue of $A$ with multiplicity 1 and the corresponding eigenvector, given by $\begin{bmatrix} 0_n^\top & \mathbf{1}_n^\top \end{bmatrix}^\top$, corresponds to the set of saddle points:
$$\begin{bmatrix} 0_n \\ 0_n \end{bmatrix} = \begin{bmatrix} -P - L & -L \\ L & 0 \end{bmatrix} \begin{bmatrix} \tilde y \\ z \end{bmatrix} \implies (P + L) \tilde y + L z = 0_n \text{ and } L \tilde y = 0_n \implies \tilde y \in \operatorname{span}(\mathbf{1}_n)$$
$$\implies \tilde y^\top P \tilde y = 0 \quad \text{(obtained by multiplying } (P + L) \tilde y + L z = 0_n \text{ by } \tilde y^\top \text{)}$$
$$\implies \tilde y = 0_n \text{ and } z \in \operatorname{span}(\mathbf{1}_n).$$
Lemma 7.9 (Absence of sustained oscillations in saddle matrices). Consider a negative semidefinite matrix $B \in \mathbb{R}^{n \times n}$ and a not necessarily square matrix $C \in \mathbb{R}^{n \times m}$. If $\operatorname{kernel}(B) \cap \operatorname{image}(C) = \{0_n\}$, then the composite block-matrix
$$A = \begin{bmatrix} B & C \\ -C^\top & 0 \end{bmatrix}$$
has no eigenvalues on the imaginary axis except for $0$.
It follows that the saddle-point dynamics (7.14) converge to the set of saddle points $\begin{bmatrix} \tilde y^\top & z^\top \end{bmatrix}^\top \in \operatorname{span}\big( \begin{bmatrix} 0_n^\top & \mathbf{1}_n^\top \end{bmatrix}^\top \big)$. Since $\mathbf{1}_n^\top \dot z = 0$, it follows that $\operatorname{average}(z(t)) = \operatorname{average}(z_0)$, and we can further conclude that the dynamics converge to a unique saddle point satisfying $\lim_{t \to \infty} y(t) = x^* \mathbf{1}_n$ and $\lim_{t \to \infty} z(t) = \operatorname{average}(z_0) \mathbf{1}_n$.
7.7 Exercises
E7.1 Properties of the matrix exponential. Recall the definition $e^A = \sum_{k=0}^\infty \frac{1}{k!} A^k$ for any square matrix $A$. Complete the following tasks:
(i) show that $\sum_{k=0}^\infty \frac{1}{k!} A^k$ converges absolutely for all square matrices $A$,
Hint: Recall that a matrix series $\sum_{k=1}^\infty A_k$ is said to converge absolutely if $\sum_{k=1}^\infty \|A_k\|$ converges, where $\|\cdot\|$ is a matrix norm. Introduce a sub-multiplicative matrix norm $\|\cdot\|$ and show $\|e^A\| \le e^{\|A\|}$.
(ii) show that, if $A = \operatorname{diag}(a_1, \dots, a_n)$, then $e^A = \operatorname{diag}(e^{a_1}, \dots, e^{a_n})$,
(iii) show that $e^{T A T^{-1}} = T e^A T^{-1}$ for any invertible $T$,
(iv) show that $A B = B A$ implies $e^{A+B} = e^A e^B$,
(v) give an example of matrices $A$ and $B$ such that $e^{A+B} \ne e^A e^B$, and
(vi) compute the matrix exponential $e^{tJ}$, where $J$ is a Jordan block of arbitrary size and $t \in \mathbb{R}$.
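As a numerical companion to E7.1, the sketch below (plain Python; the truncation order of the series and the test matrices are illustrative assumptions) evaluates the truncated series and checks the diagonal case of part (ii) and the commuting case of part (iv).

```python
# Sketch: truncated matrix-exponential series e^A ~= sum_{k<=N} A^k / k!,
# checked on small diagonal (hence commuting) matrices.
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=30):
    """Truncated series; 30 terms is plenty for the small matrices below."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    power = [row[:] for row in result]                              # A^0
    for k in range(1, terms + 1):
        power = mat_mul(power, A)                                   # A^k
        result = [[result[i][j] + power[i][j] / math.factorial(k)
                   for j in range(n)] for i in range(n)]
    return result

# part (ii): exponential of a diagonal matrix is the diagonal of exponentials
D = [[1.0, 0.0], [0.0, -2.0]]
E = expm(D)
assert abs(E[0][0] - math.exp(1)) < 1e-9 and abs(E[1][1] - math.exp(-2)) < 1e-9

# part (iv): A and D commute here (both diagonal), so e^(A+D) = e^A e^D
A = [[0.5, 0.0], [0.0, 1.0]]
S = expm([[A[i][j] + D[i][j] for j in range(2)] for i in range(2)])
EA_ED = mat_mul(expm(A), expm(D))
assert all(abs(S[i][j] - EA_ED[i][j]) < 1e-9 for i in range(2) for j in range(2))
```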
E7.2 Continuous-time affine systems. Given $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, consider the continuous-time affine system
$$\dot x(t) = A x(t) + b.$$
Assume $A$ is Hurwitz and, similarly to Exercise E2.11, show that
(i) the matrix $A$ is invertible,
(ii) the only equilibrium point of the system is $-A^{-1} b$, and
(iii) $\lim_{t \to \infty} x(t) = -A^{-1} b$ for all initial conditions $x(0) \in \mathbb{R}^n$.
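A quick numerical sanity check of E7.2 (plain Python; the Hurwitz matrix $A$, the vector $b$, the initial condition, and the Euler step size are illustrative assumptions): the Euler iteration of $\dot x = A x + b$ approaches $-A^{-1} b$.

```python
# Sketch: Euler simulation of the affine system xdot = A x + b for a
# hypothetical Hurwitz matrix A; trajectories approach -A^{-1} b.

A = [[-1.0, 0.5],
     [0.0, -2.0]]      # upper triangular, eigenvalues -1 and -2 (Hurwitz)
b = [1.0, 2.0]

# equilibrium -A^{-1} b, solved by hand for this triangular A:
x2_eq = b[1] / 2.0                      # from 0 = -2 x2 + 2
x1_eq = b[0] + 0.5 * x2_eq              # from 0 = -x1 + 0.5 x2 + 1
equilibrium = [x1_eq, x2_eq]

x, dt = [5.0, -3.0], 0.001
for _ in range(20000):                  # integrate to t = 20
    dx = [A[0][0] * x[0] + A[0][1] * x[1] + b[0],
          A[1][0] * x[0] + A[1][1] * x[1] + b[1]]
    x = [x[0] + dt * dx[0], x[1] + dt * dx[1]]

assert abs(x[0] - equilibrium[0]) < 1e-3 and abs(x[1] - equilibrium[1]) < 1e-3
```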
E7.3 Consensus for Laplacian matrices: missing proofs. Complete the proof of Theorem 7.4, that is, prove
statements (iv) and (v).
E7.4 Laplacian average consensus in directed networks. Consider the directed network in Figure E7.1 with arbitrary positive weights and its associated Laplacian flow $\dot x(t) = -L x(t)$.
(i) Can the network reach consensus, that is, as $t \to \infty$, does $x(t)$ converge to a limiting point in $\operatorname{span}\{\mathbf{1}_n\}$?
(ii) Does $x(t)$ achieve average consensus, that is, $\lim_{t \to \infty} x(t) = \operatorname{average}(x_0) \mathbf{1}_n$?
(iii) Will your answers change if you smartly add one directed edge and adapt the weights?
E7.5 Convergence of discrete-time and continuous-time averaging. Consider the following two weighted digraphs and their associated nonnegative adjacency matrices $A$ and Laplacian matrices $L$ of appropriate dimensions. Consider the associated discrete-time iterations $x(t+1) = A x(t)$ and continuous-time Laplacian flows $\dot x(t) = -L x(t)$. For each of these two digraphs, argue about whether the discrete and/or continuous-time systems converge as $t \to \infty$. If they converge, what do they converge to? Please justify your answers.
E7.6 Euler discretization of the Laplacian. Given a weighted digraph $G$ with Laplacian matrix $L$ and maximum out-degree $d_{\max} = \max\{d_{\text{out}}(1), \dots, d_{\text{out}}(n)\}$, show that:
(i) if $\varepsilon < 1/d_{\max}$, then the matrix $I_n - \varepsilon L$ is row-stochastic,
(Figure for Exercise E7.5: the two weighted digraphs, Digraph 1 and Digraph 2, with the indicated edge weights.)
(ii) if $\varepsilon < 1/d_{\max}$ and $G$ is weight-balanced, then the matrix $I_n - \varepsilon L$ is doubly-stochastic, and
(iii) if $\varepsilon < 1/d_{\max}$ and $G$ is strongly connected, then $I_n - \varepsilon L$ is primitive.
Given these results, note that (no additional assignment in what follows)
- $I_n - \varepsilon L$ is the one-step Euler discretization of the continuous-time Laplacian flow and is a discrete-time consensus algorithm; and
- $I_n - \varepsilon L$ is a possible choice of weights for an undirected unweighted graph (which is therefore also weight-balanced) in the design of a doubly-stochastic matrix (as we did in the discussion about Metropolis-Hastings).
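Statement (i) of E7.6 is easy to verify on an example. The sketch below (plain Python; the weighted digraph and the particular $\varepsilon$ are illustrative choices) forms $I_n - \varepsilon L$ and checks row-stochasticity.

```python
# Sketch: checking E7.6(i) on a hypothetical weighted digraph.
# For eps < 1/dmax, the matrix I_n - eps*L is row-stochastic.

A = [[0, 2, 0],
     [1, 0, 1],
     [0, 4, 0]]                    # hypothetical weighted adjacency matrix
n = 3
dout = [sum(A[i]) for i in range(n)]          # out-degrees: [2, 2, 4]
L = [[(dout[i] if i == j else 0) - A[i][j] for j in range(n)]
     for i in range(n)]

eps = 0.9 / max(dout)              # any eps < 1/dmax works
M = [[(1.0 if i == j else 0.0) - eps * L[i][j] for j in range(n)]
     for i in range(n)]

for row in M:
    assert abs(sum(row) - 1.0) < 1e-12        # rows sum to one
    assert all(m >= 0 for m in row)           # entries are nonnegative
```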
E7.7 Doubly-stochastic matrices on strongly-connected digraphs. Given a strongly-connected unweighted
digraph G, design weights along the edges of G (and possibly add self-loops) so that the weighted adjacency
matrix is doubly-stochastic.
E7.8 Constants of motion. In the study of mechanics, energy and momentum are two constants of motion, that is, these quantities are constant along each evolution of the mechanical system. Show that
(i) if $A$ is a row-stochastic matrix with $w^\top A = w^\top$, then $w^\top x(k) = w^\top x(0)$ for all times $k \in \mathbb{Z}_{\ge 0}$, where $x(k+1) = A x(k)$;
(ii) if $L$ is a Laplacian matrix with $w^\top L = 0_n^\top$, then $w^\top x(t) = w^\top x(0)$ for all times $t \in \mathbb{R}_{\ge 0}$, where $\dot x(t) = -L x(t)$.
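Claim (i) of E7.8 can be observed numerically. In the sketch below (plain Python; the row-stochastic matrix, the left fixed vector $w$, and the initial state are illustrative choices), the quantity $w^\top x(k)$ stays constant along the iteration.

```python
# Sketch: conservation of w^T x(k) along x(k+1) = A x(k) when w^T A = w^T.

A = [[0.5,  0.5,  0.0 ],
     [0.25, 0.5,  0.25],
     [0.0,  0.5,  0.5 ]]           # row-stochastic matrix
w = [1.0, 2.0, 1.0]                # satisfies w^T A = w^T (checked below)
n = 3

wA = [sum(w[i] * A[i][j] for i in range(n)) for j in range(n)]
assert all(abs(wA[j] - w[j]) < 1e-12 for j in range(n))

x = [3.0, -1.0, 4.0]               # hypothetical initial state
c0 = sum(w[i] * x[i] for i in range(n))      # the constant of motion
for _ in range(50):                # iterate x(k+1) = A x(k)
    x = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    assert abs(sum(w[i] * x[i] for i in range(n)) - c0) < 1e-9
```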
E7.9 Weight-balanced digraphs with a globally reachable node. Given a weighted directed graph G, show
that if G is weight-balanced and has a globally reachable node, then G is strongly connected.
E7.10 The Lyapunov equation for the Laplacian matrix of a strongly-connected digraph. Let $L$ be the Laplacian matrix of a strongly-connected weighted digraph. Find a positive-definite matrix $P$ such that $P L + L^\top P$ is positive semidefinite.
E7.12 Eigenvectors of the second-order Laplacian matrix. Consider a Laplacian matrix $L$, coefficients $k_p, k_d, \gamma_p, \gamma_d \in \mathbb{R}$, and the induced second-order Laplacian matrix $\mathcal{L}$. Let $v_{l,i}$ and $v_{r,i}$ be the left and right eigenvectors of $L$ corresponding to the eigenvalue $\lambda_i$. Show that:
(i) the right eigenvectors of $\mathcal{L}$ corresponding to the eigenvalues $\lambda_{i,\pm}$ are
$$\begin{bmatrix} v_{r,i} \\ \lambda_{i,\pm} v_{r,i} \end{bmatrix},$$
(ii) for $k_p > 0$, the left eigenvectors of $\mathcal{L}$ corresponding to the eigenvalues $\lambda_{i,\pm}$ are
$$\begin{bmatrix} v_{l,i} \\ -\dfrac{\lambda_{i,\pm}}{k_p + \gamma_p \lambda_i} v_{l,i} \end{bmatrix}.$$
E7.13 Laplacian oscillators. Given the Laplacian matrix $L = L^\top \in \mathbb{R}^{n \times n}$ of an undirected, weighted, and connected graph with edge weights $a_{ij}$, $i, j \in \{1, \dots, n\}$, define the Laplacian oscillator flow by
$$\ddot x(t) + L x(t) = 0_n. \qquad (E7.3)$$
This flow is written as a first-order differential equation as
$$\begin{bmatrix} \dot x(t) \\ \dot z(t) \end{bmatrix} = \begin{bmatrix} 0_{n \times n} & I_n \\ -L & 0_{n \times n} \end{bmatrix} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix} =: \mathcal{L} \begin{bmatrix} x(t) \\ z(t) \end{bmatrix}.$$
(i) Write the second-order Laplacian flow in components.
(ii) Write the characteristic polynomial of the matrix $\mathcal{L}$ using only the determinant of an $n \times n$ matrix.
(iii) Given the eigenvalues $\lambda_1 = 0, \lambda_2, \dots, \lambda_n$ of $L$, show that the eigenvalues $\mu_1, \dots, \mu_{2n}$ of $\mathcal{L}$ satisfy
$$\mu_1 = \mu_2 = 0, \qquad \mu_{2i, 2i-1} = \pm \mathrm{i} \sqrt{\lambda_i} \quad \text{for } i \in \{2, \dots, n\},$$
where $\mathrm{i}$ is the imaginary unit.
(iv) Show that the solution is the superposition of a ramp signal and of $n - 1$ harmonics, that is,
$$x(t) = \big( \operatorname{average}(x(0)) + \operatorname{average}(\dot x(0))\, t \big) \mathbf{1}_n + \sum_{i=2}^n a_i \sin\big( \sqrt{\lambda_i}\, t + \varphi_i \big) v_i,$$
where $\{\mathbf{1}_n / \sqrt{n}, v_2, \dots, v_n\}$ are the orthonormal eigenvectors of $L$ and where the amplitudes $a_i$ and phases $\varphi_i$ are determined by the initial conditions $x(0), \dot x(0)$.
E7.14 Delayed Laplacian flow. Define the delayed Laplacian flow dynamics over a connected, weighted, and undirected graph $G$ by
$$\dot x_i(t) = \sum_{j \in \mathcal{N}(i)} a_{ij} \big( x_j(t - \tau) - x_i(t - \tau) \big), \quad i \in \{1, \dots, n\},$$
where $a_{ij} > 0$ is the weight on the edge $\{i, j\} \in E$, and $\tau > 0$ is a positive scalar delay term. The Laplace-domain representation of the system is $X(s) = G(s) x(0)$, where $G(s)$ is the associated transfer function
$$G(s) = \big( s I_n + e^{-s\tau} L \big)^{-1}$$
and $L = L^\top \in \mathbb{R}^{n \times n}$ is the network Laplacian matrix. Show that the transfer function $G(s)$ admits poles on the imaginary axis if the following resonance condition is true for an eigenvalue $\lambda_i$, $i \in \{1, \dots, n\}$, of the Laplacian matrix:
$$\tau = \frac{\pi}{2 \lambda_i}.$$
E7.15 Properties of saddle points. Prove Lemma 7.8.
E7.16 Absence of sustained oscillations in saddle matrices. Prove Lemma 7.9.
E7.17 Centralized formulation of sum-of-squares cost. Consider a distributed optimization problem with $n$ agents, where the cost function $f_i(x)$ of each agent $i \in \{1, \dots, n\}$ is described by $f_i(x) = (x - x_i)^\top P_i (x - x_i)$, where $P_i > 0$ and $x_i \in \mathbb{R}$ is the minimizer of the cost $f_i(x)$. Consider the joint sum-of-squares cost function
$$f_{\text{sos}}(x) = \sum_{i=1}^n (x - x_i)^\top P_i (x - x_i).$$
Calculate the global minimizer $x^*$ of $f_{\text{sos}}(x)$, and show that the sum-of-squares cost $f_{\text{sos}}(x)$ is, up to constant terms, equivalent to the centralized cost function
$$f_{\text{centralized}}(x) = \sum_{i=1}^n (x - x^*)^\top P_i (x - x^*).$$
E7.18 Discrete saddle-point algorithm for distributed optimization. Consider the centralized optimization problem
$$z^\star := \operatorname{argmin}_{z \in \mathbb{R}} \; \frac{1}{2} \sum_{i=1}^n p_i (z - r_i)^2, \qquad (E7.4)$$
where $p_i > 0$ and $r_i \in \mathbb{R}$ are fixed scalar quantities for each $i \in \{1, \dots, n\}$. Our aim is to solve this optimization problem in a distributed fashion, that is, distributing the computation among a population of $n$ agents. Each agent $i$ has access only to $p_i$ and $r_i$ and can communicate with the other agents via a network defined by the Laplacian matrix $L$. We assume that this network is undirected and connected.
(iii) Analogous to (7.13), consider the discrete-time distributed saddle-point algorithm
$$x_i(k+1) = x_i(k) - \varepsilon \Big( p_i \big( x_i(k) - r_i \big) + \sum_{j \in \mathcal{N}^{\text{in}}(i)} L_{ji} \, \nu_j(k) \Big), \qquad (E7.6a)$$
$$\nu_i(k+1) = \nu_i(k) + \varepsilon \sum_{j \in \mathcal{N}^{\text{out}}(i)} L_{ij} \, x_j(k), \qquad (E7.6b)$$
where $\varepsilon > 0$ is a sufficiently small step size. Show that, if the algorithm (E7.6) converges, then it converges to a solution of Problem (E7.5).
(iv) Define the error vector by
$$e(k) := \begin{bmatrix} x(k) - x^\star \\ \nu(k) - \nu^\star \end{bmatrix}.$$
Find the error dynamics of the algorithm (E7.6), that is, the matrix $G$ such that $e(k+1) = G e(k)$.
(v) Show that, for $\varepsilon > 0$ small enough, if $\mu$ is an eigenvalue of $G$ then either $\mu = 1$ or $|\mu| < 1$, and that $\begin{bmatrix} 0_n^\top & \mathbf{1}_n^\top \end{bmatrix}^\top$ is the only eigenvector relative to the eigenvalue $\mu = 1$. Use these results to study the convergence properties of the distributed algorithm (E7.6). Will $x(k) \to x^\star$ as $k \to \infty$?
Hint: Use Lemma 7.8.
Chapter 8
The Incidence Matrix and its Applications
After studying adjacency and Laplacian matrices, in this chapter we introduce one final matrix associated
with a graph: the incidence matrix. We study the properties of incidence matrices and their application
to a class of estimation problems with relative measurements. For simplicity we restrict our attention to
undirected graphs. We borrow ideas from (Barooah 2007; Barooah and Hespanha 2007; Bolognani et al.
2010; Piovan et al. 2013) and refer to (Foulds 1995; Biggs 1994; Godsil and Royle 2001) for more information.
Here, we adopt the convention that an edge (i, j) has the source i and the sink j.
It is useful to consider the following example graph, as depicted in the figure below.
(Figure: a four-node example graph, shown undirected on the left and with oriented, labeled edges $e_1, \dots, e_4$ on the right.)
As depicted on the right, we add an orientation to all edges, and we order and label them as follows:
$e_1 = (1, 2)$, $e_2 = (2, 3)$, $e_3 = (4, 2)$, and $e_4 = (3, 4)$. Accordingly, the incidence matrix is
$$B = \begin{bmatrix} +1 & 0 & 0 & 0 \\ -1 & +1 & -1 & 0 \\ 0 & -1 & 0 & +1 \\ 0 & 0 & +1 & -1 \end{bmatrix}.$$
For an edge $e = (i, j)$, note that $(B^\top x)_e = x_i - x_j$.
Lemma 8.1 (From the incidence to the Laplacian matrix). If $\operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}})$ is the diagonal matrix of edge weights, then
$$L = B \operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}}) B^\top.$$
Lemma 8.2 (Rank of the incidence matrix). Let $B$ be the incidence matrix of an undirected graph $G$ with $n$ nodes. Let $d$ be the number of connected components of $G$. Then
$$\operatorname{rank}(B) = n - d.$$
Proof. We prove this result for a connected graph with $d = 1$, but the proof strategy easily extends to $d > 1$. Recall that the rank of the Laplacian matrix $L$ equals $n - d = n - 1$. Since the Laplacian matrix can be factorized as $L = B \operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}}) B^\top$, where $\operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}})$ has full rank $m$ (and $m \ge n - 1$ due to connectivity), we have that necessarily $\operatorname{rank}(B) \ge n - 1$. On the other hand, $\operatorname{rank}(B) \le n - 1$ since $B^\top \mathbf{1}_n = 0_m$. It follows that $B$ has rank $n - 1$.
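The incidence matrix of the four-node example above makes Lemma 8.1 easy to check numerically. In the sketch below (plain Python; unit edge weights, so $\operatorname{diag}(\{a_e\}) = I_m$, and the node values $x$ are illustrative), $B^\top$ acts as a difference operator and $B B^\top$ reproduces the degree and adjacency pattern of the graph.

```python
# Sketch: the incidence matrix of the four-node example graph above
# (edges e1=(1,2), e2=(2,3), e3=(4,2), e4=(3,4)) with unit edge weights.

B = [[ 1,  0,  0,  0],
     [-1,  1, -1,  0],
     [ 0, -1,  0,  1],
     [ 0,  0,  1, -1]]
n, m = 4, 4

# (B^T x)_e = x_source(e) - x_sink(e)
x = [10, 20, 30, 40]                   # hypothetical node values
BTx = [sum(B[i][e] * x[i] for i in range(n)) for e in range(m)]
assert BTx == [10 - 20, 20 - 30, 40 - 20, 30 - 40]

# Lemma 8.1 with unit weights: L = B B^T has node degrees on the diagonal
L = [[sum(B[i][e] * B[j][e] for e in range(m)) for j in range(n)]
     for i in range(n)]
assert [L[i][i] for i in range(n)] == [1, 3, 2, 2]

# column sums of B are zero: 1_n^T B = 0_m^T (used in the proof of Lemma 8.2)
assert all(sum(col) == 0 for col in zip(*B))
```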
The factorization of the Laplacian matrix as $L = B \operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}}) B^\top$ plays an important role in relative sensing networks. For example, we can decompose the Laplacian flow $\dot x = -L x$ into
$$\text{open-loop plant:} \quad \dot x_i = u_i, \; i \in \{1, \dots, n\}, \quad \text{or} \quad \dot x = u,$$
$$\text{measurements:} \quad y_{ij} = x_i - x_j, \; \{i, j\} \in E, \quad \text{or} \quad y = B^\top x,$$
$$\text{control gains:} \quad z_{ij} = a_{ij} y_{ij}, \; \{i, j\} \in E, \quad \text{or} \quad z = \operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}}) y,$$
$$\text{control inputs:} \quad u_i = -\sum_{\{i, j\} \in E} z_{ij}, \; i \in \{1, \dots, n\}, \quad \text{or} \quad u = -B z.$$
Indeed, this control structure, illustrated as a block-diagram in Figure 8.2, is required to implement flocking-type behavior as in Example 7.1.1. The control structure in Figure 8.2 has emerged as a canonical control structure in many relative sensing and flow network problems, also for more complicated open-loop dynamics and possibly nonlinear control gains (Bai et al. 2011).
Figure 8.2: Illustration of the canonical control structure for a relative sensing network.
Figure 8.3: A wireless sensor network in which sensors can measure each other's relative distance and bearing. We assume that, for each link between node $i$ and node $j$, the relative distance along the $x$-axis $x_i - x_j$ is available, where $x_i$ is the $x$-coordinate of node $i$.
where $B$ is the graph incidence matrix and the measurement noises $v_{(i,j)}$, $(i, j) \in E$, are independent jointly-Gaussian variables with zero mean $\mathbb{E}[v_{(i,j)}] = 0$ and variance $\mathbb{E}[v_{(i,j)}^2] = \sigma_{(i,j)}^2 > 0$. The joint covariance matrix is the diagonal matrix $\Sigma = \operatorname{diag}(\{\sigma_{(i,j)}^2\}_{(i,j) \in E}) \in \mathbb{R}^{m \times m}$. (For later use, it is convenient to define also $y_{(j,i)} = -y_{(i,j)} = x_j - x_i - v_{(i,j)}$.)
The optimal estimate $\hat x$ of the unknown vector $x \in \mathbb{R}^n$ via the relative measurements $y \in \mathbb{R}^m$ is the solution to
$$\min_{\hat x} \; \| B^\top \hat x - y \|_{\Sigma^{-1}}^2.$$
Since no absolute information is available about $x$, we add the additional constraint that the optimal estimate should have zero mean and summarize this discussion as follows.
Definition 8.3 (Optimal estimation based on relative measurements). Given an incidence matrix $B$ and a set of relative measurements $y$ with covariance $\Sigma$, find $\hat x$ satisfying
$$\min_{\hat x \perp \mathbf{1}_n} \; \| B^\top \hat x - y \|_{\Sigma^{-1}}^2. \qquad (8.2)$$
Specifically:
$$0_n = \frac{\partial}{\partial \hat x} \| B^\top \hat x - y \|_{\Sigma^{-1}}^2 = 2 B \Sigma^{-1} B^\top \hat x - 2 B \Sigma^{-1} y.$$
The optimal solution is therefore obtained as the unique vector $\hat x \in \mathbb{R}^n$ satisfying
$$L \hat x = B \Sigma^{-1} y, \qquad \mathbf{1}_n^\top \hat x = 0, \qquad (8.3)$$
where the Laplacian matrix $L$ is defined by $L = B \Sigma^{-1} B^\top$. This matrix is the Laplacian for the weighted graph whose weights are the inverse noise covariances associated to each relative measurement edge.
Before proceeding we review the definition and properties of the pseudoinverse Laplacian matrix given in Exercise E6.9. Recall that the Moore-Penrose pseudoinverse of an $n \times m$ matrix $M$ is the unique $m \times n$ matrix $M^\dagger$ with the following properties:
(i) $M M^\dagger M = M$,
(ii) $M^\dagger M M^\dagger = M^\dagger$, and
(iii) $M M^\dagger$ is symmetric and $M^\dagger M$ is symmetric.
For our Laplacian matrix $L$, let $U \in \mathbb{R}^{n \times n}$ be an orthonormal matrix of eigenvectors of $L$. It is known that
$$L = U \begin{bmatrix} 0 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix} U^\top \implies L^\dagger = U \begin{bmatrix} 0 & 0 & \dots & 0 \\ 0 & 1/\lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1/\lambda_n \end{bmatrix} U^\top.$$
Moreover, it is known that $L L^\dagger = L^\dagger L = I_n - \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top$ and $L^\dagger \mathbf{1}_n = 0_n$.
Lemma 8.4 (Unique optimal estimate). If the undirected graph $G$ is connected, then
(i) there exists a unique solution to equations (8.3) solving the optimization problem in equation (8.2); and
(ii) this unique solution is given by
$$\hat x = L^\dagger B \Sigma^{-1} y.$$
Proof. We claim there exists a unique solution to equation (8.3) and prove it as follows. Since $G$ is connected, the rank of $L$ is $n - 1$. Moreover, since $L$ is symmetric and since $L \mathbf{1}_n = 0_n$, the image of $L$ is the $(n-1)$-dimensional vector subspace orthogonal to the subspace spanned by the vector $\mathbf{1}_n$. The vector $B \Sigma^{-1} y$ belongs to the image of $L$ because the column-sums of $B$ are zero, that is, $\mathbf{1}_n^\top B = 0_m^\top$, so that $\mathbf{1}_n^\top B \Sigma^{-1} y = 0$. Finally, the requirement that $\mathbf{1}_n^\top \hat x = 0$ ensures $\hat x$ is perpendicular to the kernel of $L$.
The expression $\hat x = L^\dagger B \Sigma^{-1} y$ follows from left-multiplying the left and right hand sides of equation (8.3) by the pseudoinverse Laplacian matrix $L^\dagger$ and using the property $L^\dagger L = I_n - \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top$. One can also verify that $\mathbf{1}_n^\top L^\dagger B \Sigma^{-1} y = 0$, because $L^\dagger \mathbf{1}_n = 0_n$.
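Lemma 8.4 can be checked numerically without forming $L^\dagger$ explicitly. The sketch below (plain Python; unit covariances $\Sigma = I_m$ and a hypothetical measurement vector $y$ on the four-node example graph) uses the standard trick that, since $B \Sigma^{-1} y \perp \mathbf{1}_n$, the estimate is the solution of $\big( L + \frac{1}{n} \mathbf{1}_n \mathbf{1}_n^\top \big) \hat x = B \Sigma^{-1} y$.

```python
# Sketch: optimal estimate of Lemma 8.4 on the four-node example graph,
# unit covariances (Sigma = I), solved via (L + (1/n) 1 1^T) xhat = B y.

def solve(M, c):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(c)
    M = [row[:] + [c[i]] for i, row in enumerate(M)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            M[r] = [M[r][j] - f * M[k][j] for j in range(n + 1)]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][j] * x[j] for j in range(k + 1, n))) / M[k][k]
    return x

B = [[ 1,  0,  0,  0],
     [-1,  1, -1,  0],
     [ 0, -1,  0,  1],
     [ 0,  0,  1, -1]]
n, m = 4, 4
y = [1.0, 2.0, -1.5, 0.5]          # hypothetical relative measurements

By = [sum(B[i][e] * y[e] for e in range(m)) for i in range(n)]
L = [[sum(B[i][e] * B[j][e] for e in range(m)) for j in range(n)]
     for i in range(n)]
M = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
xhat = solve(M, By)

assert abs(sum(xhat)) < 1e-9                        # zero-mean constraint
Lx = [sum(L[i][j] * xhat[j] for j in range(n)) for i in range(n)]
assert all(abs(Lx[i] - By[i]) < 1e-9 for i in range(n))   # L xhat = B y
```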
Lemma 8.5. Given a graph $G$ describing a relative measurement problem for the unknown variables $x \in \mathbb{R}^n$, with measurements $y \in \mathbb{R}^m$ and measurement covariance matrix $\Sigma = \operatorname{diag}(\{\sigma_{(i,j)}^2\}_{(i,j) \in E}) \in \mathbb{R}^{m \times m}$:
(ii) if $G$ is connected and if $\varepsilon < 1/d_{\max}$, where $d_{\max}$ is the maximum weighted out-degree of $G$, then the solution $k \mapsto \hat x(k)$ of the affine averaging algorithm (8.4) converges to the unique solution $\hat x$ of the optimization problem (8.2).
Proof. To show fact (i), note that the algorithm can be written in vector form as
$$\hat x(k+1) = \hat x(k) - \varepsilon B \Sigma^{-1} \big( B^\top \hat x(k) - y \big) = (I_n - \varepsilon L) \hat x(k) + \varepsilon B \Sigma^{-1} y.$$
Hence, the estimation error $\eta(k) := \hat x(k) - \hat x$ satisfies
$$\eta(k+1) = (I_n - \varepsilon L) \eta(k) + \varepsilon \big( B \Sigma^{-1} y - L \hat x \big) = (I_n - \varepsilon L) \eta(k),$$
where the last equality follows from $L \hat x = B \Sigma^{-1} y$ in equation (8.3). Now, according to Exercise E7.6, $\varepsilon$ is sufficiently small so that $I_n - \varepsilon L$ is nonnegative. Moreover, since $I_n - \varepsilon L$ is doubly-stochastic and symmetric, and its corresponding undirected graph is connected and aperiodic, Corollary 5.1 implies that $\eta(k) \to \operatorname{average}(\eta(0)) \mathbf{1}_n = 0_n$.
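The error contraction in this proof is easy to observe numerically. The sketch below (plain Python; unit covariances, and an illustrative measurement vector and step size) iterates the affine averaging algorithm on the four-node example graph and checks that the limit satisfies the optimality conditions (8.3).

```python
# Sketch: iterating xhat(k+1) = xhat(k) - eps*B*(B^T xhat(k) - y) with
# Sigma = I on the four-node example graph; y and eps are illustrative.

B = [[ 1,  0,  0,  0],
     [-1,  1, -1,  0],
     [ 0, -1,  0,  1],
     [ 0,  0,  1, -1]]
n, m = 4, 4
y = [1.0, 2.0, -1.5, 0.5]      # hypothetical relative measurements
eps = 0.2                      # below 1/dmax = 1/3 for this graph

xhat = [0.0] * n               # zero-mean initialization
for _ in range(500):
    # residuals on the edges: r_e = (B^T xhat)_e - y_e
    r = [sum(B[i][e] * xhat[i] for i in range(n)) - y[e] for e in range(m)]
    # node update: xhat <- xhat - eps * B r
    xhat = [xhat[i] - eps * sum(B[i][e] * r[e] for e in range(m))
            for i in range(n)]

# the limit satisfies (8.3): 1^T xhat = 0 and L xhat = B y
L = [[sum(B[i][e] * B[j][e] for e in range(m)) for j in range(n)]
     for i in range(n)]
By = [sum(B[i][e] * y[e] for e in range(m)) for i in range(n)]
Lx = [sum(L[i][j] * xhat[j] for j in range(n)) for i in range(n)]
assert abs(sum(xhat)) < 1e-9
assert all(abs(Lx[i] - By[i]) < 1e-6 for i in range(n))
```

Note that the iteration preserves $\mathbf{1}_n^\top \hat x(k)$, so the zero-mean initialization selects the zero-mean optimum.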
that is, all relative measurements around the ring cancel out. Equivalently, $\mathbf{1}_n \in \operatorname{kernel}(B)$. This consistency check can be used as additional information to process corrupted measurements.
These insights generalize to arbitrary graphs, and the nullspace of B and its orthogonal complement,
the image of B > , can be related to cycles and cutsets in the graph. In what follows, we present some of
these generalizations; the presentation in this section is inspired by (Biggs 1994; Zelazo 2009). As a running
example in this section we use the graph and the incidence matrix illustrated in Figure 8.4.
$$B = \begin{bmatrix} +1 & +1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & +1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & +1 & +1 & +1 & 0 \\ 0 & 0 & -1 & -1 & 0 & 0 & +1 \\ 0 & 0 & 0 & 0 & -1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 & -1 & 0 \end{bmatrix}$$
Figure 8.4: An undirected graph with arbitrary edge orientation and its associated incidence matrix $B \in \mathbb{R}^{6 \times 7}$.
Definition 8.6 (Signed path vector). Given an undirected graph $G$ with an arbitrary orientation of its $m$ edges, let $\gamma$ be a simple path. The signed path vector $v \in \{-1, 0, +1\}^m$ of the simple path $\gamma$ is defined by, for $e \in \{1, \dots, m\}$,
$$v_e = \begin{cases} +1, & \text{if edge } e \text{ is traversed positively by } \gamma, \\ -1, & \text{if edge } e \text{ is traversed negatively by } \gamma, \\ 0, & \text{otherwise}. \end{cases}$$
Proposition 8.7 (Cycle space). Consider an undirected graph $G$ with an arbitrary orientation of its edges and incidence matrix $B \in \mathbb{R}^{n \times m}$. The kernel of $B$, called the cycle space of $G$, is the subspace of $\mathbb{R}^m$ spanned by the signed path vectors corresponding to all the cycles in $G$.
Lemma 8.8. Given an undirected graph $G$, consider an arbitrary orientation of its edges, its incidence matrix $B \in \mathbb{R}^{n \times m}$, and a simple path $\gamma$ with distinct initial and final nodes described by a signed path vector $v \in \mathbb{R}^m$. Then the vector $B v \in \mathbb{R}^n$ has entry $+1$ at the initial node of $\gamma$, entry $-1$ at its final node, and entry $0$ otherwise.
Proof. We write $B v = B \operatorname{diag}(v) \mathbf{1}_m$. The $(i, e)$ element of the matrix $B \operatorname{diag}(v)$ takes the value $-1$ (respectively $+1$) if edge $e$ is used by the path $\gamma$ to enter (respectively leave) node $i$. Now, if node $i$ is not the initial or final node of the path $\gamma$, then the $i$th row-sum of $B \operatorname{diag}(v)$, that is, $(B \operatorname{diag}(v) \mathbf{1}_m)_i$, is zero. For the initial node, $(B \operatorname{diag}(v) \mathbf{1}_m)_i = +1$, and for the final node, $(B \operatorname{diag}(v) \mathbf{1}_m)_i = -1$.
For the example graph in Figure 8.4, two cycles and their signed path vectors are illustrated in Figure 8.5. Observe that $v_1, v_2 \in \operatorname{kernel}(B)$ and the cycle traversing the edges $(1, 3, 7, 5, 2)$ in counter-clockwise orientation has a signed path vector given by the linear combination $v_1 + v_2$.
$$\operatorname{kernel}(B) = \operatorname{span} \begin{bmatrix} +1 & 0 \\ -1 & 0 \\ +1 & 0 \\ -1 & +1 \\ 0 & -1 \\ 0 & 0 \\ 0 & +1 \end{bmatrix} = \operatorname{span}(v_1, v_2)$$
Figure 8.5: Two cycles and their respective signed path vectors in $\operatorname{kernel}(B)$.
Definition 8.9 (Cutset orientation vector). Given an undirected graph $G$, consider an arbitrary orientation of its edges and a partition of its vertices $V$ in two non-empty and disjoint sets $V_1$ and $V_2$. The cutset orientation vector $v \in \{-1, 0, +1\}^m$ corresponding to the partition $V = V_1 \cup V_2$ has components
$$v_e = \begin{cases} +1, & \text{if edge } e \text{ has its source node in } V_1 \text{ and its sink node in } V_2, \\ -1, & \text{if edge } e \text{ has its sink node in } V_1 \text{ and its source node in } V_2, \\ 0, & \text{otherwise}. \end{cases}$$
Proposition 8.10 (Cutset space). Given an undirected graph $G$, consider an arbitrary orientation of its edges and its incidence matrix $B \in \mathbb{R}^{n \times m}$. The image of $B^\top$, called the cutset space of $G$, is the subspace of $\mathbb{R}^m$ spanned by all cutset orientation vectors corresponding to all partitions of $G$.
where $b_i^\top$ is the $i$th row of the incidence matrix. If $B x = 0_n$ for some $x \in \mathbb{R}^m$, then $b_i^\top x = 0$ for all $i \in \{1, \dots, n\}$. It follows that $v^\top x = 0$, or equivalently, $v$ belongs to the orthogonal complement of
$\operatorname{kernel}(B)$, which is the image of $B^\top$. Finally, notice that the image of $B^\top$ can be constructed this way: the $k$th column of $B^\top$ is obtained by choosing the partition $V_1 = \{k\}$ and $V_2 = V \setminus \{k\}$. Thus, the cutset orientation vectors span the image of $B^\top$.
Since $\operatorname{rank}(B) = n - 1$, any $n - 1$ columns of the matrix $B^\top$ form a basis for the cutset space. For instance, the $i$th column corresponds to the cut isolating node $i$ as $V = \{i\} \cup (V \setminus \{i\})$. For the example in Figure 8.4, five cuts and their cutset orientation vectors are illustrated in Figure 8.6. Observe that $v_i \in \operatorname{image}(B^\top)$, for $i \in \{1, \dots, 5\}$, and the cut isolating node 6 has a cutset orientation vector given by the linear combination $-(v_1 + v_2 + v_3 + v_4 + v_5)$. Likewise, the cut separating nodes $\{1, 2, 3\}$ from $\{4, 5, 6\}$ has the cutset vector $v_1 + v_2 + v_3$, corresponding to the sum of the first three columns of $B^\top$.
$$\operatorname{image}(B^\top) = \operatorname{span} \begin{bmatrix} +1 & -1 & 0 & 0 & 0 \\ +1 & 0 & -1 & 0 & 0 \\ 0 & +1 & 0 & -1 & 0 \\ 0 & 0 & +1 & -1 & 0 \\ 0 & 0 & +1 & 0 & -1 \\ 0 & 0 & +1 & 0 & 0 \\ 0 & 0 & 0 & +1 & -1 \end{bmatrix} = \operatorname{span}(v_1, v_2, v_3, v_4, v_5)$$
Figure 8.6: Five cuts and their cutset orientation vectors in $\operatorname{image}(B^\top)$.
Example 8.11 (Kirchhoff's and Ohm's laws revisited). In the following, we revisit the electrical resistor network from Section 6.3, and re-derive its governing equations via the incidence matrix. Recall that with each node $i \in \{1, \dots, n\}$ of the network, we associate an external current injection $c_{\text{injected at } i}$. With each edge $\{i, j\} \in E$ we associate a positive conductance (i.e., the inverse of the resistance) $a_{ij} > 0$ and (after introducing an arbitrary direction for each edge) a current flow $c_{ij}$ and a voltage drop $u_{ij}$.
Kirchhoff's voltage law states that the sum of all voltage drops around each cycle must be zero, that is, for each cycle in the network there is a signed path vector $c \in \mathbb{R}^m$ so that $c^\top u = 0$. Equivalently, by Proposition 8.7, there is a vector $v \in \mathbb{R}^n$ so that $u = B^\top v$, where $B \in \mathbb{R}^{n \times m}$ is the incidence matrix of the (oriented) network. In Section 6 we referred to $v$ as the vector of nodal voltages or potentials.
Kirchhoff's current law states that the sum of all current injections at every node must be zero, that is, for each node $i \in \{1, \dots, n\}$ in the network, we have that $c_{\text{injected at } i} = \sum_{j=1}^n c_{ji}$. Consider now the cut isolating node $i$, characterized by a cutset vector corresponding to the $i$th column $b_i$ of $B^\top$; see Figure 8.6. Then we have that $c_{\text{injected at } i} = \sum_{j=1}^n c_{ji} = b_i^\top c$. Equivalently, we have that $c_{\text{injected}} = B c$.
Finally, Ohm's law states that the current $c_{ji}$ and the voltage drop $u_{ij}$ over a resistor with resistance $1/a_{ij}$ are related as $c_{ji} = a_{ij} u_{ij}$. By combining Kirchhoff's and Ohm's laws, we arrive at
$$c_{\text{injected}} = B \operatorname{diag}(\{a_e\}_{e \in \{1, \dots, m\}}) B^\top v = L v.$$
Example 8.12 (Nonlinear network flow problem). Consider a (static) network flow problem where a
commodity (e.g., power or water) is transported through a network (e.g., a power grid or a piping system). We
model this scenario with an undirected and connected graph with n nodes. With each node we associate an
external supply/demand P variable (positive for a source and negative for a sink) yi and assume that the overall
network is balanced: ni=1 yi = 0. We also associate a potential variable xi with every node (e.g., voltage or
pressure), and assume the flow of commodity between two connected nodes i and j depends on the potential
difference as fij (xi xj ), where fij is a strictly increasing function satisfying fij (0) = 0. For example, for
piping systems and power grids these functions fij are given by the rational Hazen-Williams flow and the
trigonometric power flow, which are both monotone in the region of interest. By balancing the flow at each
node (akin to the Kirchhoffs current law), we obtain at node i
n
X
yi = aij fij (xi xj ) , i {1, . . . , n} ,
j=1
where aij {0, 1} is the (i, j) element of the network adjacency matrix. In vector notation, the flow balance is
y = Bf B > x ,
where f RE is the vector-valued function with components fij . Consider also the associated linearized
problem y = BB > x = Lx, where L is the network Laplacian matrix, where we implicitly assumed fij0 (0) = 1.
The flows in the linear problem are obtained as B > x? = B > L y, where L is the Moore-Pennrose inverse of
L; see Exercises E6.9 and E6.10. In the following we restrict ourselves to an acyclic network and show that the
nonlinear solution can be obtained from the solution of the linear problem.
We formally replace the flow f(B^T x) by a new variable v := f(B^T x) and arrive at

    y = B v,            (8.7a)
    v = f(B^T x).       (8.7b)

In the acyclic case, kernel(B) = {0} and necessarily v ∈ image(B^T), that is, v = B^T w for some w ∈ ℝ^n. Thus,
equation (8.7a) reads as y = Bv = BB^T w = Lw and its solution is w = L† y. Equation (8.7b) then reads as
f(B^T x) = v = B^T w = B^T L† y, and its unique solution (due to monotonicity) is B^T x* = f^{-1}(B^T L† y).
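As a numerical sanity check of this construction, here is a sketch on a three-node path graph. The flow functions f_ij(z) = z + z³ and the specific injections are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Path graph 1-2-3 (acyclic, connected), with an arbitrary edge orientation.
B = np.array([[ 1.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])            # incidence matrix: n=3 nodes, m=2 edges
y = np.array([2.0, -0.5, -1.5])         # balanced injections: sum(y) = 0

f = lambda z: z + z**3                  # strictly increasing with f(0) = 0

def f_inv(w):
    """Invert f componentwise by bisection (valid since f is increasing)."""
    w = np.atleast_1d(np.asarray(w, dtype=float))
    out = np.empty_like(w)
    for i, wi in enumerate(w):
        a, b = -1e3, 1e3
        for _ in range(200):
            m = 0.5 * (a + b)
            if f(m) < wi:
                a = m
            else:
                b = m
        out[i] = 0.5 * (a + b)
    return out

L = B @ B.T                              # Laplacian of the unweighted graph
v = B.T @ np.linalg.pinv(L) @ y          # edge flows of the linearized problem
diffs = f_inv(v)                         # B^T x* = f^{-1}(B^T L-pseudoinverse y)
x_star, *_ = np.linalg.lstsq(B.T, diffs, rcond=None)  # recover potentials

# On the acyclic network the nonlinear balance y = B f(B^T x*) holds.
residual = np.linalg.norm(B @ f(B.T @ x_star) - y)
```

The potentials x_star are determined only up to an additive constant; `lstsq` simply picks the minimum-norm representative.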
8.5 Exercises
E8.1 Continuous distributed estimation from relative measurements. Consider the continuous distributed
estimation algorithm given by the affine Laplacian flow (8.5). Show that, for an undirected and connected
graph G and appropriately chosen initial conditions x̂(0) = 0_n, the affine Laplacian flow (8.5) converges to the unique
solution x̂ of the estimation problem given in Lemma 8.4.
E8.2 The edge Laplacian matrix (Zelazo and Mesbahi 2011). For an unweighted undirected graph with n
nodes and m edges, introduce an arbitrary orientation for the edges. Recall the notions of incidence matrix
B ∈ ℝ^{n×m} and Laplacian matrix L = BB^T ∈ ℝ^{n×n}, and define the edge Laplacian matrix by L_edge = B^T B ∈ ℝ^{m×m}.
[Block diagram omitted.] Figure E8.1: A relative sensing network with a constant disturbance input in ℝ^{|E|}.
[Block diagram omitted.] Figure E8.2: Relative sensing network with a disturbance in ℝ^{|E|} and distributed integral action.
E8.5 Sensitivity of Laplacian eigenvalues. Consider an unweighted undirected graph G = (V, E) with incidence
matrix B ∈ ℝ^{n×m} and Laplacian matrix L = BB^T ∈ ℝ^{n×n}. Consider now a graph G′ obtained by adding
one unweighted edge e ∉ E to G, that is, G′ = (V, E ∪ {e}). Show that
Hint: You may want to take a detour via the edge Laplacian matrix L_edge = B^T B ∈ ℝ^{m×m} (see Exercise
E8.2) and use the following fact (Horn and Johnson 1985, Theorem 4.3.17): if A is a symmetric matrix with
eigenvalues ordered as λ_1 ≤ λ_2 ≤ ⋯ ≤ λ_n, and B is a principal submatrix of A with eigenvalues ordered as
μ_1 ≤ μ_2 ≤ ⋯ ≤ μ_{n−1}, then the eigenvalues of A and B interlace, that is,

    λ_1 ≤ μ_1 ≤ λ_2 ≤ ⋯ ≤ μ_{n−1} ≤ λ_n.
Chapter 9
Positive and Compartmental Systems
This chapter is inspired by the excellent text (Walter and Contreras 1999) and the tutorial treatment
in (Jacquez and Simon 1993); see also the texts (Luenberger 1979; Farina and Rinaldi 2000; Haddad et al.
2010). Additional results on Metzler matrices are available in (Berman and Plemmons 1994; Santesso and
Valcher 2007). For nonlinear extensions of the material in this chapter, including recent studies of traffic
networks, we refer to Como et al. (2013); Coogan and Arcak (2015).
Ecological and environmental systems The flow of energy and nutrients (water, nitrates, phosphates,
etc) in ecosystems is typically studied using compartmental modelling. For example, Figure 9.1 illustrates a
widely-cited water flow model for a desert ecosystem (Noy-Meir 1973). Other classic ecological network
systems include models for dissolved oxygen in streams, nutrient flow in forest growth, and biomass flow in
fisheries (Walter and Contreras 1999).
Epidemiology of infectious diseases To study the propagation of infectious diseases, the population at
risk is typically divided into compartments consisting of individuals who are susceptible (S), infected
(I), and, possibly, recovered and no longer susceptible (R). As illustrated in Figure 9.2, the three basic
epidemiological models (Hethcote 2000) are called SI, SIS, and SIR, depending upon how the disease spreads. A
detailed discussion is postponed until Chapter 16.
Drug and chemical kinetics in biomedical systems Compartmental models are also widely adopted to
characterize the kinetics of drugs and chemicals in biomedical systems. Here is a classic example (Charkes
et al. 1978) from nuclear medicine: bone scintigraphy (also called a bone scan) is a medical test in which
the patient is injected with a small amount of radioactive material and then scanned with an appropriate
radiation camera.

Figure 9.1: Water flow model for a desert ecosystem. The blue line denotes an inflow from the outside environment.
The red lines denote outflows into the outside environment. (Diagram omitted; flow labels include drinking, herbivory, and evaporation.)

Figure 9.2: The three basic models SI, SIS and SIR for the propagation of an infectious disease. (Diagram omitted.)

Figure 9.3: The kinetics of a radioactive isotope through the human body (ECF = extra-cellular fluid). (Diagram omitted.)
9.2 Positive systems
    ẋ(t) = A x(t),    and    ẋ(t) = A x(t) + b.

Note that the set of affine systems includes the set of linear systems (each linear system is affine with
b = 0_n).
It is now convenient to introduce a second useful definition.
Definition 9.2 (Metzler matrix). A matrix A ∈ ℝ^{n×n}, n ≥ 2, is Metzler (sometimes also referred to as
quasi-positive or essentially nonnegative) if all its off-diagonal elements are nonnegative.
In other words, A is Metzler if and only if there exists a scalar a > 0 such that A + aI_n is nonnegative.
For example, if G is a weighted digraph with Laplacian matrix L, then −L is a Metzler matrix with zero
row-sums.
A Metzler matrix A induces a weighted digraph G without self-loops in the natural way, that is, by
letting (i, j) be an edge of G if and only if a_ij > 0. We say a Metzler matrix is irreducible if its induced
digraph is strongly connected.
We are now ready to classify which affine systems are positive.
Theorem 9.3 (Positive affine systems and Metzler matrices). For the affine system ẋ(t) = A x(t) + b,
the following statements are equivalent:
(i) the system is positive, that is, x(t) ≥ 0_n for all t ≥ 0 and all x(0) ≥ 0_n,
(ii) A is Metzler and b ≥ 0_n.
Proof. We start by showing that statement (i) implies statement (ii). If x(0) = 0_n, then ẋ(0) cannot have any
negative components, hence b ≥ 0_n. If any off-diagonal entry (i, j), i ≠ j, of A is strictly negative, then
consider an initial condition x(0) with all zero entries except for x_j(0) > b_i/|a_ij|. It is easy to see that
ẋ_i(0) < 0, which is a contradiction.
Next, we show that statement (ii) implies statement (i). It suffices to note that, anytime there exists i such
that x_i(t) = 0, the conditions x(t) ≥ 0_n, A Metzler and b ≥ 0_n together imply ẋ_i(t) = ∑_{j≠i} a_ij x_j(t) + b_i ≥ 0.
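The mechanism in this proof can be checked numerically. The sketch below uses an assumed, randomly generated Metzler matrix with strongly negative diagonal and a nonnegative input; forward-Euler steps x ← (I + εA)x + εb preserve the nonnegative orthant once ε is small enough that I + εA is a nonnegative matrix:

```python
import numpy as np

# Hypothetical example: Metzler A (nonnegative off-diagonal entries) with a
# dominant negative diagonal, so that A is also Hurwitz.
rng = np.random.default_rng(0)
n = 5
A = rng.random((n, n))                   # entries in [0, 1)
A[np.diag_indices(n)] = -5.0             # strictly diagonally dominant
b = rng.random(n)                        # nonnegative input
eps = 0.1
assert np.all(np.eye(n) + eps * A >= 0)  # I + eps*A is a nonnegative matrix

x = rng.random(n)                        # nonnegative initial condition
trajectory_nonnegative = True
for _ in range(2000):
    x = x + eps * (A @ x + b)            # Euler step = (I + eps*A) x + eps*b
    trajectory_nonnegative = trajectory_nonnegative and bool(np.all(x >= 0))

x_star = -np.linalg.solve(A, b)          # equilibrium -A^{-1} b
```

Since the Euler map is a contraction here, x also approaches the equilibrium −A⁻¹b, anticipating Corollary 9.6.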
This result motivates the importance of Metzler matrices. Therefore we now study their properties in
two theorems. We start by writing a version, for Metzler matrices, of the Perron–Frobenius Theorem 2.12 for nonnegative matrices.

Theorem 9.4 (Perron–Frobenius theorem for Metzler matrices). For a Metzler matrix A,
(i) there exists a real eigenvalue λ such that λ ≥ ℜ(μ) for all other eigenvalues μ, and
(ii) the right and left eigenvectors of λ can be selected nonnegative.
If additionally A is irreducible, then
(iii) there exists a real simple eigenvalue λ such that λ > ℜ(μ) for all other eigenvalues μ, and
(iv) the right and left eigenvectors of λ are unique and positive (up to rescaling).
As for nonnegative matrices, we refer to λ as the dominant eigenvalue. We invite the reader to
work out the details of the proof in Exercise E9.2. Next, we give necessary and sufficient conditions for the
dominant eigenvalue of a Metzler matrix to be strictly negative.
Theorem 9.5 (Properties of Hurwitz Metzler matrices). For a Metzler matrix A, the following statements
are equivalent:
(i) A is Hurwitz,
(ii) A is invertible and −A^{-1} ≥ 0, and
(iii) for all b ≥ 0_n, there exists x ≥ 0_n solving Ax + b = 0_n.
Moreover, if A is Metzler, Hurwitz and irreducible, then −A^{-1} > 0.
Proof. We start by showing that (i) implies (ii). Clearly, if A is Hurwitz, then it is also invertible. So it suffices
to show that −A^{-1} is nonnegative. Pick ε > 0 and define A_ε = I_n + εA, that is, −εA = I_n − A_ε.
Because A is Metzler, ε can be selected small enough so that A_ε ≥ 0. Moreover, because the spectrum
of A is strictly in the left half plane, one can verify that, for ε small enough, spec(εA) is inside the disk
of unit radius centered at the point −1, as illustrated in Figure 9.4. In turn, this last property implies

Figure 9.4: For any λ ∈ ℂ with strictly negative real part, there exists ε such that the segment from the origin to ελ is
inside the disk of unit radius centered at the point −1. (Diagram omitted.)

that spec(I_n + εA) is strictly inside the disk of unit radius centered at the origin, that is, ρ(A_ε) < 1.
We now adopt the Neumann series as defined in Exercise E2.13: because ρ(A_ε) < 1, we know that
I_n − A_ε = −εA is invertible and that

    (−εA)^{-1} = (I_n − A_ε)^{-1} = ∑_{k=0}^∞ A_ε^k.    (9.1)

Note now that the right-hand side is nonnegative because it is the sum of nonnegative matrices. In summary,
we have shown that A is invertible and that −A^{-1} ≥ 0. This statement proves that (i) implies (ii).
Next we show that (ii) implies (i). We know A is Metzler, invertible and satisfies −A^{-1} ≥ 0. By the
Perron–Frobenius Theorem 9.4 for Metzler matrices, we know there exists v ≥ 0_n, v ≠ 0_n, satisfying
Av = λ_Metzler v, where λ_Metzler = max{ℜ(λ) | λ ∈ spec(A)}. Clearly, A invertible implies λ_Metzler ≠ 0 and,
moreover, v = λ_Metzler A^{-1} v. Now, we know v is nonnegative and A^{-1} v is nonpositive. Hence, λ_Metzler must
be negative and, in turn, A is Hurwitz. This statement establishes the equivalence between (i) and (ii).
Finally, regarding the equivalence between statement (ii) and statement (iii), note that, if −A^{-1} ≥ 0 and
b ≥ 0_n, then clearly x = −A^{-1} b ≥ 0_n solves Ax + b = 0_n. This proves that (ii) implies (iii). Vice versa,
if statement (iii) holds, then let x_i be the nonnegative solution of A x_i + e_i = 0_n and let X be the nonnegative
matrix with columns x_1, …, x_n. Therefore, we know AX = −I_n, so that A is invertible, −X is its inverse,
and −A^{-1} = X is nonnegative. This statement proves that (iii) implies (ii).
Finally, the statement that −A^{-1} > 0 for each Metzler, Hurwitz and irreducible matrix A is proved as
follows. Because A is irreducible, the matrix A_ε = I_n + εA is nonnegative (for ε sufficiently small) and
primitive. Therefore, the right-hand side of equation (9.1) is strictly positive.
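The Neumann-series device in this proof is easy to verify numerically. The matrix below is an illustrative assumption: Metzler and diagonally dominant (hence Hurwitz); a truncated Neumann series of A_ε = I + εA recovers (−εA)⁻¹, which is therefore nonnegative:

```python
import numpy as np

# Assumed example: Metzler (nonnegative off-diagonal) and strictly
# diagonally dominant with negative diagonal, hence Hurwitz.
A = np.array([[-4.0,  1.0,  0.5],
              [ 2.0, -3.0,  0.0],
              [ 0.5,  1.0, -2.0]])
eps = 0.1
A_eps = np.eye(3) + eps * A              # nonnegative for this small eps
assert np.all(A_eps >= 0)
rho = max(abs(np.linalg.eigvals(A_eps)))
assert rho < 1                           # spectrum inside the unit disk

S = np.zeros((3, 3))                     # partial Neumann sum of A_eps^k
P = np.eye(3)
for _ in range(2000):
    S += P
    P = P @ A_eps

neg_A_inv = -np.linalg.inv(A)            # should equal eps * S (eq. (9.1))
```

Since A here is also irreducible, the theorem predicts −A⁻¹ is in fact strictly positive.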
This theorem about Metzler matrices immediately leads to the following corollary about positive affine
systems, which extends the results in Exercise E7.2.
Corollary 9.6 (Existence, positivity and stability of equilibria for positive affine systems). Consider
a continuous-time positive affine system ẋ = Ax + b, where A is Metzler and b is nonnegative. If the
matrix A is Hurwitz, then
(i) the system has a unique equilibrium point x* ∈ ℝ^n, that is, a unique solution to Ax* + b = 0_n,
(ii) the equilibrium point x* is nonnegative, and
(iii) all trajectories converge asymptotically to x*.
Several other properties of positive affine systems and Metzler matrices are reviewed in (Berman and
Plemmons 1994), albeit with a slightly different language.
The following properties extend the previous results and show that positive systems admit very nice
convergence properties.
Theorem 9.7 (Convergence of positive systems). Consider a continuous-time positive system ẋ = Ax,
where A is Metzler and Hurwitz. Then,
(i) there is a positive vector w ∈ ℝ^n so that the weighted 1-norm (i.e., the weighted mass) ∑_{i=1}^n w_i |x_i(t)|
is exponentially convergent;
(ii) there is a positive vector v ∈ ℝ^n so that the weighted ∞-norm (i.e., the weighted max) max_{i∈{1,…,n}} |x_i(t)|/v_i
is exponentially convergent; and
(iii) there is a positive vector p ∈ ℝ^n so that the weighted 2-norm (i.e., the weighted energy) ∑_{i=1}^n p_i x_i(t)^2 is
exponentially convergent.
Proof. In the following, let v and w be the dominant right and left eigenvectors corresponding to the
dominant real eigenvalue λ with largest real part; see Theorem 9.4. Since A is Hurwitz by assumption, it
follows that λ < 0.
To prove claim (i), consider the function

    V_1(x(t)) = ∑_{i=1}^n w_i |x_i(t)| = ∑_{i=1}^n w_i x_i(t) = w^T x(t),

where we used the positivity of the dynamical system in the second equality. The derivative of V_1(x(t)) is
then

    d/dt V_1(x(t)) = w^T ẋ(t) = w^T A x(t) = λ w^T x(t) = λ V_1(x(t)).
To prove claim (ii), consider the function

    V_2(x(t)) = max_{i∈{1,…,n}} |x_i(t)|/v_i = max_{i∈{1,…,n}} x_i(t)/v_i,

where we used the positivity of the dynamical system in the second equality. Pick a time t and assume that
k ∈ arg max_{i∈{1,…,n}} x_i(t)/v_i. Then, at time t, the time derivative of V_2(x(t)) is given by

    d/dt V_2(x(t)) = (diag(v)^{-1} ẋ(t))_k = (diag(v)^{-1} A x(t))_k
                   = (diag(v)^{-1} A diag(v) diag(v)^{-1} x(t))_k ≤ (x_k(t)/v_k) (diag(v)^{-1} A diag(v) 1_n)_k,

where we used the element-wise inequality diag(v)^{-1} x(t) ≤ 1_n x_k(t)/v_k, which follows from the fact that all
elements of the vector diag(v)^{-1} x(t), except for the kth, are multiplied by the (nonnegative) off-diagonal elements of the
Metzler matrix diag(v)^{-1} A diag(v). We continue by further analyzing the derivative of V_2(x(t)) as

    d/dt V_2(x(t)) ≤ (x_k(t)/v_k) (diag(v)^{-1} A diag(v) 1_n)_k = (x_k(t)/v_k) (diag(v)^{-1} A v)_k
                   = λ (x_k(t)/v_k) (diag(v)^{-1} v)_k = λ x_k(t)/v_k = λ V_2(x(t)).

To prove claim (iii), define the diagonal and positive matrix P = diag(p) by p_i = w_i/v_i, so that Pv = w, and
consider the function V_3(x(t)) = x(t)^T P x(t) = ∑_{i=1}^n p_i x_i(t)^2. Next notice that Q = PA + A^T P is a symmetric matrix, Q is Metzler since P is diagonal and positive, and
Qv = PAv + A^T Pv = λPv + A^T w = 2λw < 0, that is, there is a positive vector v so that Qv < 0. Hence Q is
Hurwitz, that is, λ_max(Q) < 0. Therefore

    d/dt V_3(x(t)) = x(t)^T Q x(t) ≤ λ_max(Q) ‖x(t)‖_2^2 ≤ (λ_max(Q)/λ_min(P)) V_3(x(t)),

and accordingly V_3(x(t)) ≤ e^{(λ_max(Q)/λ_min(P)) t} V_3(x(0)) is exponentially convergent since λ_max(Q) < 0 and
λ_min(P) > 0.
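These three Lyapunov constructions can be checked on a concrete matrix. The matrix below is an assumed example (Metzler, Hurwitz, irreducible); the code extracts the dominant eigenpair, builds P = diag(w_i/v_i), and verifies the key identity Qv = 2λw < 0:

```python
import numpy as np

# Assumed example: Metzler, strictly diagonally dominant (hence Hurwitz),
# with all off-diagonal entries positive (hence irreducible).
A = np.array([[-3.0,  1.0,  1.0],
              [ 1.0, -2.0,  0.5],
              [ 0.5,  1.0, -2.5]])

lams, V = np.linalg.eig(A)
lam = float(lams.real.max())                    # dominant eigenvalue (real)
v = np.abs(V[:, lams.real.argmax()].real)       # dominant right eigenvector > 0
mus, W = np.linalg.eig(A.T)
w = np.abs(W[:, mus.real.argmax()].real)        # dominant left eigenvector > 0

P = np.diag(w / v)                              # p_i = w_i / v_i, so P v = w
Q = P @ A + A.T @ P                             # symmetric and Metzler
Qv = Q @ v                                      # equals 2*lam*w, negative
```

Taking absolute values of the computed eigenvectors is safe here because the dominant eigenvector of an irreducible Metzler matrix has entries of uniform sign.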
9.3 Compartmental systems
[Diagram of an example compartmental system omitted: each compartment i carries a mass q_i, possibly an inflow u_i from the environment, an outflow F_{i→0} to the environment, and flows F_{i→j} along the edges.]

In general, the flow along the edge (i, j) is a function of the entire system state q = (q_1, …, q_n) and of time t, so
that F_{i→j} = F_{i→j}(q, t).
Remarks 9.8 (Basic properties). (i) The mass in each of the compartments as well as the mass flowing
along each of the edges must be nonnegative at all times (recall we assume u_i ≥ 0). Specifically, we
require the mass flow functions to satisfy

    F_{i→j}(q, t) ≥ 0 for all (q, t), and F_{i→j}(q, t) = 0 for all (q, t) such that q_i = 0.    (9.3)

Under these conditions, if at some time t_0 one of the compartments has no mass, that is, q_i(t_0) = 0 and
q(t_0) ∈ ℝ^n_{≥0}, it follows that q̇_i(t_0) = ∑_{j=1, j≠i}^n F_{j→i}(q(t_0), t_0) + u_i ≥ 0, so that q_i does not become
negative. The compartmental system (9.2) is therefore a positive system, as introduced in Definition 9.1.
(ii) If M(q) = ∑_{i=1}^n q_i = 1_n^T q denotes the total mass in the system, then along the solutions of (9.2)

    d/dt M(q(t)) = 1_n^T q̇(t) = −∑_{i=1}^n F_{i→0}(q(t), t) + ∑_{i=1}^n u_i,    (9.4)

where the first term is the outflow into the environment and the second term is the inflow from the environment.
This equality implies that the total mass t ↦ M(q(t)) is constant in systems without inflows and
outflows.
Definition 9.9 (Linear compartmental systems). A linear compartmental system with n compartments
is a triplet (F, f_0, u) consisting of
(i) a nonnegative n × n matrix F = (f_ij)_{i,j∈{1,…,n}} with zero diagonal, called the flow rate matrix,
(ii) a vector f_0 ≥ 0_n, called the outflow rates vector, and
(iii) a vector u ≥ 0_n, called the inflow vector.
The flow rate matrix F is the adjacency matrix of the compartmental digraph G_F (a weighted digraph without
self-loops).
The flow rate matrix F encodes the following information: the nodes are the compartments {1, …, n},
there is an edge (i, j) if there is a flow from compartment i to compartment j, and the weight f_ij of the
(i, j) edge is the corresponding flow rate constant. In a linear compartmental system,

    F_{i→j}(q, t) = f_ij q_i,    for j ∈ {1, …, n},
    F_{i→0}(q, t) = f_{0i} q_i,    and
    u_i(q, t) = u_i.

Indeed, this model is also referred to as donor-controlled flow. Note that this model satisfies the physically-meaningful constraints (9.3). The affine dynamics describing a linear compartmental system are

    q̇_i(t) = −(f_{0i} + ∑_{j=1, j≠i}^n f_ij) q_i(t) + ∑_{j=1, j≠i}^n f_ji q_j(t) + u_i.    (9.5)
Definition 9.10 (Compartmental matrix). The compartmental matrix C = (c_ij)_{i,j∈{1,…,n}} of a compartmental system (F, f_0, u) is defined by

    c_ij = f_ji,                             if i ≠ j,
    c_ii = −f_{0i} − ∑_{h=1, h≠i}^n f_ih,    if i = j.
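This definition translates directly into code. The 3-compartment data below is a hypothetical example; the check verifies the column-sum identity 1_n^T C = −f_0^T and the resulting mass-balance rate:

```python
import numpy as np

# Hypothetical 3-compartment system.
F = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 2.0],
              [0.0, 0.0, 0.0]])          # flow rate matrix, zero diagonal
f0 = np.array([0.0, 0.0, 1.5])          # outflow rates to the environment
u = np.array([1.0, 0.0, 0.0])           # inflow vector

# c_ij = f_ji for i != j and c_ii = -f0_i - sum_h f_ih, in one expression:
C = F.T - np.diag(F.sum(axis=1) + f0)

column_sums = np.ones(3) @ C            # should equal -f0^T
q = np.array([1.0, 2.0, 3.0])           # an arbitrary nonnegative state
mass_rate = float(np.ones(3) @ (C @ q + u))  # d/dt M(q) = -f0^T q + 1^T u
```

The identity C = F^T − diag(F 1_n + f_0) reappears below in Theorem 9.13 for the reduced system.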
In what follows it is convenient to call compartmental any matrix C that is Metzler and has nonpositive column sums.
With the notion of compartmental matrix, the dynamics of the linear compartmental system (9.5) can
be written as

    q̇(t) = C q(t) + u.    (9.7)

Moreover, since L_F 1_n = 0_n, we know 1_n^T C = −f_0^T and, consistently with equation (9.4),

    d/dt M(q(t)) = −f_0^T q(t) + 1_n^T u.
Lemma 9.11 (Spectral properties of compartmental matrices). For a compartmental system (F, f0 , u)
with compartmental matrix C,
Proof. Statement (i) is akin to the result in Lemma 6.5 and can be proved by an application of the Geršgorin
Disks Theorem 2.8. We invite the reader to fill out the details in Exercise E9.5. Statement (i) immediately
implies statement (ii).
Next, we introduce some useful graph-theoretical notions, illustrated in Figure 9.6. In the compartmental
digraph, a set of compartments S is
(i) outflow-connected if there exists a directed path from every compartment in S to the environment,
that is, to a compartment j with a positive flow rate constant f0j > 0,
(ii) inflow-connected if there exists a directed path from the environment to every compartment in S,
that is, from a compartment i with a positive inflow ui > 0,
(iii) a trap if there is no directed path from any of the compartments in S to the environment or to any
compartment outside S, and
(iv) a simple trap if it is a trap that contains no traps strictly inside it.
It is immediate to verify the following equivalence: the system is outflow-connected (i.e., all compartments
are outflow-connected) if and only if the system contains no trap.
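Outflow-connectedness is a plain reachability property, so it can be tested with a breadth-first search on the reversed digraph. This is a sketch with hypothetical node labels, not an algorithm from the text:

```python
from collections import deque

def is_outflow_connected(edges, n, outflow_nodes):
    """True iff every compartment 0..n-1 has a directed path to some node
    in outflow_nodes (the compartments j with f0_j > 0).

    Equivalently: every node is backward-reachable from the outflow nodes,
    which we test by BFS on the reversed graph."""
    reverse = {i: [] for i in range(n)}
    for i, j in edges:
        reverse[j].append(i)             # traverse edges backwards
    seen, queue = set(outflow_nodes), deque(outflow_nodes)
    while queue:
        j = queue.popleft()
        for i in reverse[j]:
            if i not in seen:
                seen.add(i)
                queue.append(i)
    return len(seen) == n

# Chain 0 -> 1 -> 2 with outflow at node 2: every compartment drains out.
connected = is_outflow_connected([(0, 1), (1, 2)], 3, [2])      # True
# Outflow only at node 0, but flows point into node 1: {1} is a trap.
trapped = is_outflow_connected([(0, 1), (2, 1)], 3, [0])        # False
```

By the equivalence above, a False answer means the system contains a trap (and hence, by Theorem 9.12, the compartmental matrix fails to be Hurwitz).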
Theorem 9.12 (Algebraic graph theory of compartmental systems). Consider the linear compartmen-
tal system (F, f0 , u) with dynamics (9.7) with compartmental matrix C and compartmental digraph GF . The
following statements are equivalent:
Figure 9.6: (a) An example compartmental system and its strongly connected components: this system is outflow-connected because its two sinks in the condensation digraph are outflow-connected. (b) This compartmental system is not outflow-connected because one of its sink strongly connected components is a trap. (Diagrams omitted.)

Moreover, the sinks of the condensation of G_F that are not outflow-connected are precisely the simple traps of
the system, and their number equals the multiplicity of 0 as a semisimple eigenvalue of C.
Therefore, A is row-substochastic, i.e., all its row-sums are at most 1 and at least one row-sum is strictly less than
1. Moreover, because A is irreducible, Corollary 4.10 implies that ρ(A) < 1. Now, let μ_1, …, μ_n denote the
eigenvalues of A. Because A = I_n + εC^T, we know that the eigenvalues λ_1, …, λ_n of C satisfy μ_i = 1 + ελ_i,
so that max_i ℜ(μ_i) = 1 + ε max_i ℜ(λ_i). Finally, we note that ρ(A) < 1 implies max_i ℜ(μ_i) < 1, so that

    max_i ℜ(λ_i) = (1/ε)(max_i ℜ(μ_i) − 1) < 0.

This concludes the proof that if G_F is strongly connected, then C has eigenvalues with strictly negative real
part. The converse is easy to prove by contradiction: if f_0 = 0_n, then the matrix C has zero column sums, but
this is a contradiction with the assumption that C is invertible.
Next, to prove the equivalence between (ii) and (iii) for a graph G_F whose condensation digraph has an
arbitrary number of sinks, we proceed as in the proof of Theorem 6.6: we reorder the compartments as
described in Exercise E3.1 so that the Laplacian matrix L_F is block lower-triangular as in equation (6.5).
We then define an appropriately small ε and the matrix A = I_n + εC^T as above. We leave the remaining
details to the reader.
An alternative clever proof strategy for the equivalence between (ii) and (iii) is given as follows. Define
the matrix

    C_augmented = [ C      0_n ]
                  [ f_0^T    0 ]  ∈ ℝ^{(n+1)×(n+1)},

and consider the augmented linear system ẋ = C_augmented x with x ∈ ℝ^{n+1}. Note that L_augmented =
−C_augmented^T is the Laplacian matrix of the graph whose nodes {1, …, n, n+1} are the n compartments
and the environment as (n+1)st node, and whose edges are the edges of the compartmental graph G_F
as well as the outflow edges to the environment. By Theorem 7.4, the Laplacian flow ẋ = −L_augmented x
with x ∈ ℝ^{n+1} is semi-convergent, lim_{t→∞} x(t) = (w^T x(0)) 1_{n+1}, with w_1 = ⋯ = w_n = 0, if and only if
node n+1 (the environment) is globally reachable. Equivalently, the system ẋ = C_augmented x satisfies
lim_{t→∞} x(t) = (0_n, 1_n^T x_0) if and only if the environment is globally reachable (i.e., the system is outflow-connected). Equivalently, (x_1, …, x_n) converges to zero (i.e., C is Hurwitz) if and only if the system is
outflow-connected.
Figure 9.7: (a) A compartmental system that is not outflow-connected. (b) The corresponding reduced compartmental system. (Diagrams omitted.)
We now state our main result about the asymptotic behavior of linear compartmental systems.
Theorem 9.13 (Asymptotic behavior of compartmental systems). The linear compartmental system
(F, f_0, u) with compartmental matrix C and compartmental digraph G_F has the following possible asymptotic
behaviors:
(i) if the system is outflow-connected, then the compartmental matrix C is invertible, every solution tends
exponentially to the unique equilibrium q* = −C^{-1} u ≥ 0_n, and in the ith compartment q*_i > 0 if and
only if the ith compartment is inflow-connected to a positive inflow;
(ii) if the system contains one or more simple traps, then:
a) the reduced compartmental system (F_rd, f_{0,rd}, u_rd) is outflow-connected and all its solutions converge exponentially fast to the unique nonnegative equilibrium −C_rd^{-1} u_rd, for C_rd = F_rd^T −
diag(F_rd 1_n + f_{0,rd});
b) any simple trap H contains non-decreasing mass along time. If H is inflow-connected to a positive
inflow, then the mass inside H goes to infinity. Otherwise, the state of the compartments in H converges to a scalar
multiple of the right eigenvector corresponding to the eigenvalue 0 of the compartmental submatrix
for H.
Proof. Statement (i) is an immediate consequence of Corollary 9.6. We leave the proof of statement (ii) to
the reader.
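Statement (i) is easy to illustrate numerically. The sketch below uses a hypothetical outflow-connected chain 1 → 2 → 3 → environment with inflow at compartment 1:

```python
import numpy as np

# Hypothetical chain: flow 1 -> 2 -> 3, outflow from 3, inflow into 1.
F = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
f0 = np.array([0.0, 0.0, 0.5])
u = np.array([1.0, 0.0, 0.0])

C = F.T - np.diag(F.sum(axis=1) + f0)   # compartmental matrix
q_star = -np.linalg.solve(C, u)         # unique equilibrium -C^{-1} u

# Every trajectory converges to q_star; forward-Euler simulation from 0.
q = np.zeros(3)
for _ in range(20000):
    q = q + 0.01 * (C @ q + u)
```

All three compartments are inflow-connected to the positive inflow at compartment 1, so the theorem predicts q* > 0 componentwise; here q* = (1, 1, 2).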
9.4 Table of asymptotic behaviors for averaging and positive systems
Discrete-time systems:

- Averaging system x(k+1) = A x(k), with A row-stochastic.
  If the associated digraph has a globally reachable node, then lim_{k→∞} x(k) = (w^T x(0)) 1_n, where w ≥ 0 is the left eigenvector of A with eigenvalue 1 satisfying 1_n^T w = 1.
  Convergence properties: Theorem 5.2. Examples: opinion dynamics and averaging in Chapter 1.

- Affine system x(k+1) = A x(k) + b.
  If A is convergent (that is, its spectral radius is less than 1), then lim_{k→∞} x(k) = (I_n − A)^{-1} b.
  Convergence properties: Exercise E2.11. Example: Friedkin–Johnsen system in Exercise E5.6.

- Positive affine system x(k+1) = A x(k) + b, with A ≥ 0 and b ≥ 0_n.
  Positivity: x(0) ≥ 0_n implies x(k) ≥ 0_n for all k (Exercise E9.9). If A is convergent (that is, |λ| < 1 for all λ ∈ spec(A)), then lim_{k→∞} x(k) = (I_n − A)^{-1} b ≥ 0_n.
  Example: Leslie population model in Exercise E4.12.

Continuous-time systems:

- Averaging system ẋ(t) = −L x(t), with L a Laplacian matrix.
  If the associated digraph has a globally reachable node, then lim_{t→∞} x(t) = (w^T x(0)) 1_n, where w ≥ 0 is the left eigenvector of L with eigenvalue 0 satisfying 1_n^T w = 1.
  Convergence properties: Theorem 7.4. Example: flocking system in Section 7.1.1.

- Affine system ẋ(t) = A x(t) + b.
  If A is Hurwitz (that is, its spectral abscissa is negative), then lim_{t→∞} x(t) = −A^{-1} b.
  Convergence properties: Exercise E7.2.

- Positive affine system ẋ(t) = A x(t) + b, with A Metzler and b ≥ 0_n.
  Positivity: x(0) ≥ 0_n implies x(t) ≥ 0_n for all t (Theorem 9.3 and Corollary 9.6). If A is Hurwitz (that is, ℜ(λ) < 0 for all λ ∈ spec(A)), then lim_{t→∞} x(t) = −A^{-1} b ≥ 0_n.
  Example: compartmental systems in Section 9.1.
9.5 Exercises
E9.1 The matrix exponential of a Metzler matrix. In this exercise we extend and adapt Theorem 7.2 about the
matrix exponential of a Laplacian matrix to the setting of Metzler matrices.
Let M be an n × n Metzler matrix with minimum diagonal entry m_min = min{m_11, …, m_nn}. As usual,
associate to M a digraph G without self-loops in the natural way, that is, (i, j) is an edge if and only if
m_ij > 0. Prove that
(i) exp(M) ≥ e^{m_min} I_n ≥ 0, for any digraph G,
(ii) exp(M) e_j > 0, for a digraph G whose j-th node is globally reachable,
(iii) exp(M) > 0, for a strongly connected digraph G (i.e., for an irreducible M).
Moreover, prove that, for any square matrix A,
(iv) exp(At) ≥ 0 for all t ≥ 0 if and only if A is Metzler.
E9.2 Proof of the Perron-Frobenius Theorem for Metzler matrices. Prove Theorem 9.4.
E9.3 Metzler invariance under nonnegative change of basis. Consider a positive system with Metzler matrix
A and constant input b ≥ 0:

    ẋ = A x + b.

Show that, under the change of basis z = T^{-1} x, with T ≥ 0 invertible and T^{-1} ≥ 0, the transformed matrix T^{-1} A T is also Metzler.
E9.4 Equilibrium points for positive systems. Consider two continuous-time positive affine systems

    ẋ = A x + b,    and    ẋ = Â x + b̂.

Assume that A and Â are Hurwitz and, by Corollary 9.6, let x* and x̂* denote the equilibrium points of the
two systems. Show that the inequalities A ≤ Â and b ≤ b̂ imply x* ≤ x̂*.
E9.5 Establishing the spectral properties of compartmental matrices. Prove Lemma 9.11 about the spectral
properties of compartmental matrices.
E9.6 Simple traps and strong connectivity. Show that a compartmental system that has no outflows and that is
a simple trap, is strongly connected.
E9.7 On Metzler matrices and compartmental systems with growth and decay. Let M be an n × n symmetric
Metzler matrix. Recall Lemma 9.11 and define v ∈ ℝ^n by M = −L + diag(v), where L is a symmetric Laplacian
matrix. Show that:
(i) if M is Hurwitz, then 1_n^T v < 0.
Next, assume n = 2 and assume v has both nonnegative and nonpositive entries. (If v is nonnegative, lack of
stability can be established from statement (i); if v is nonpositive, stability can be established via Theorem 9.12.)
Show that
(ii) there exist nonnegative numbers f, d and g such that, modulo a permutation, M can be written in the
form:

    M = f [ −1   1 ] + [ g   0 ] = [ g − f       f     ],
          [  1  −1 ]   [ 0  −d ]   [   f     −(d + f)  ]
combined with the value of V along the boundary of the enclosure; see the left image in Figure E9.1.

Figure E9.1: Laplace's equation ∂²V/∂x² + ∂²V/∂y² = 0 over a rectangular enclosure (left) and a regular Cartesian grid with interior nodes V_1, …, V_8 and boundary values b_1, …, b_8 (right). (Diagrams omitted.)

For arbitrary enclosures and boundary conditions, it is impossible to solve Laplace's equation in closed
form. An approximate solution is computed by (i) introducing a regular Cartesian grid of points with spacing
h, e.g., see the right image in Figure E9.1, and (ii) approximating the second-order derivatives by second-order
finite differences. Specifically, at node 2 of the grid, we have along the x direction

    ∂²V/∂x² (V_2) ≈ (1/h²)(V_3 − V_2) − (1/h²)(V_2 − V_1) = (1/h²)(V_3 + V_1 − 2V_2),

so that equation (E9.1) is approximated as follows:

    0 = ∂²V/∂x² (V_2) + ∂²V/∂y² (V_2) ≈ (1/h²)(V_1 + V_3 + V_6 + b_2 − 4V_2)  ⟹  4V_2 = V_1 + V_3 + V_6 + b_2.
This approximation translates into the matrix equation:

    4V = A_grid V + C_grid-boundary b,    (E9.2)

where V ∈ ℝ^n is the vector of unknown potentials, b ∈ ℝ^m is the vector of boundary conditions, A_grid ∈
{0, 1}^{n×n} is the binary adjacency matrix of the (interior) grid graph (that is, (A_grid)_ij = 1 if and only if
the interior nodes i and j are connected), and C_grid-boundary ∈ {0, 1}^{n×m} is the connection matrix between
interior and boundary nodes (that is, (C_grid-boundary)_iℓ = 1 if and only if grid interior node i is connected with
boundary node ℓ). Show that
(i) A_grid is irreducible but not primitive,
(ii) ρ(A_grid) < 4,
Hint: Recall Theorem 4.8.
(iii) there exists a unique solution V* to equation (E9.2),
(iv) the unique solution V* satisfies V* ≥ 0_n if b ≥ 0_m, and
(v) each solution to the following iteration converges to V*:

    V(k+1) = (1/4)(A_grid V(k) + C_grid-boundary b),

whereby, at each step, the value of V at each node is updated to be equal to the average of its neighboring
nodes.
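The averaging iteration described above can be sketched on a minimal instance: a hypothetical 2×2 interior grid (a 4-cycle as interior graph) in which each interior node has two interior neighbors and two boundary neighbors, so that every node still averages exactly four neighbors:

```python
import numpy as np

# 2x2 interior grid: node layout 0-1 (top row), 2-3 (bottom row).
A_grid = np.array([[0, 1, 1, 0],
                   [1, 0, 0, 1],
                   [1, 0, 0, 1],
                   [0, 1, 1, 0]], dtype=float)   # grid-graph adjacency

# Hypothetical boundary wiring: each interior node touches 2 of 8 boundary nodes.
C = np.zeros((4, 8))
C[0, [0, 2]] = 1
C[1, [1, 3]] = 1
C[2, [4, 6]] = 1
C[3, [5, 7]] = 1
b = np.linspace(0.0, 1.0, 8)             # nonnegative boundary potentials

V = np.zeros(4)
for _ in range(200):
    V = (A_grid @ V + C @ b) / 4.0       # each node = average of its 4 neighbors

# Fixed point of the iteration solves (4I - A_grid) V = C b, i.e. (E9.2).
V_exact = np.linalg.solve(4 * np.eye(4) - A_grid, C @ b)
```

The iteration matrix is A_grid/4, whose spectral radius is 2/4 = 1/2 here, so convergence is geometric; and since every operation preserves nonnegativity, V ≥ 0 whenever b ≥ 0, as claimed in item (iv).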
Part II
Topics in Averaging Systems
Chapter 10
Convergence Rates, Scalability and
Optimization
In this chapter we discuss the convergence rate of averaging algorithms. We borrow ideas from (Xiao and
Boyd 2004; Carli et al. 2009; Garin and Schenato 2010; Fagnani 2014). We focus on discrete-time systems
and their convergence factors. The study of continuous-time systems is analogous.
Before proceeding, we recall a few basic facts. Given a square matrix A,
(i) the spectral radius of A is ρ(A) = max{|λ| | λ ∈ spec(A)};
(ii) the p-induced norm of A, for p ∈ ℕ ∪ {∞}, is

    ‖A‖_p = max{‖Ax‖_p | x ∈ ℝ^n and ‖x‖_p = 1} = max_{x ≠ 0_n} ‖Ax‖_p / ‖x‖_p,

and, specifically, the induced 2-norm of A is ‖A‖_2 = max{√λ | λ ∈ spec(A^T A)};
(iii) for any p, ρ(A) ≤ ‖A‖_p; and
(iv) if A = A^T, then ‖A‖_2 = ρ(A).
Definition 10.1 (Essential spectral radius of a row-stochastic matrix). The essential spectral radius
of a row-stochastic matrix A is

    ρ_ess(A) = 0,                              if spec(A) = {1, …, 1},
    ρ_ess(A) = max{|λ| | λ ∈ spec(A) \ {1}},   otherwise.
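This definition can be sketched directly in code. The helper below assumes the eigenvalue 1 is simple (as for a primitive A) and handles the all-ones-spectrum case separately; the example matrix is an assumed row-stochastic matrix with spectrum {1, 0.5, 0}:

```python
import numpy as np

def ess_spectral_radius(A, tol=1e-9):
    """Essential spectral radius per Definition 10.1 (one eigenvalue 1 removed)."""
    lams = list(np.linalg.eigvals(A))
    if all(abs(l - 1) < tol for l in lams):
        return 0.0                       # spec(A) = {1, ..., 1}
    lams.sort(key=abs, reverse=True)
    for k, l in enumerate(lams):
        if abs(l - 1) < tol:
            lams.pop(k)                  # drop a single eigenvalue equal to 1
            break
    return max(abs(l) for l in lams)

A = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])       # row-stochastic, eigenvalues 1, 0.5, 0
```

For this A, ρ(A) = 1 as for every row-stochastic matrix, while ρ_ess(A) = 0.5 governs the convergence rate discussed next.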
Note that G undirected implies that A is symmetric. Therefore, A has real eigenvalues λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_n
and corresponding orthonormal eigenvectors v_1, …, v_n. Because A is row-stochastic, λ_1 = 1 and v_1 =
1_n/√n. Next, along the same lines of the modal decomposition given in Section 2.1, we know that the
solution can be decoupled into n independent evolution equations.
Moreover, A being primitive implies that max{|λ_2|, …, |λ_n|} < 1. Specifically, for a symmetric and
primitive A, we have ρ_ess(A) = max{|λ_2|, |λ_n|} < 1. Therefore, as predicted by Corollary 2.15, the solution
converges to average consensus. To upper bound the error, since the vectors v_1, …, v_n are orthonormal, we compute

    ‖x(k) − average(x(0)) 1_n‖_2 = ‖∑_{j=2}^n λ_j^k (v_j^T x(0)) v_j‖_2 = (∑_{j=2}^n |λ_j|^{2k} (v_j^T x(0))²)^{1/2}
        ≤ ρ_ess(A)^k (∑_{j=2}^n (v_j^T x(0))²)^{1/2} = ρ_ess(A)^k ‖x(0) − average(x(0)) 1_n‖_2.    (10.1)
A note on convergence factors for asymmetric matrices Consider now the asymmetric matrix

    A_large-gain = [ 0.1  10^10 ]
                   [  0    0.1  ].

Clearly, the two eigenvalues are 0.1 and so is the spectral radius. This is therefore a convergent matrix. It is
however false that the evolution of the system x(k+1) = A_large-gain x(k),
with an initial condition with non-zero second entry, satisfies a bound of the form in equation (10.1). It is
still true, of course, that the solution does eventually converge to zero exponentially fast.
The problem is that the eigenvalues (alone) of a non-symmetric matrix do not fully describe the state
amplification that may take place during a transient period of time. (Note that the 2-norm of Alarge-gain is
order 1010 .)
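A quick simulation makes this transient amplification visible (Python/NumPy sketch): the norm of the state first grows by roughly ten orders of magnitude before decaying to zero.

```python
import numpy as np

A = np.array([[0.1, 1e10],
              [0.0, 0.1]])

x = np.array([0.0, 1.0])              # initial condition with non-zero second entry
norms = [np.linalg.norm(x)]
for _ in range(50):
    x = A @ x
    norms.append(np.linalg.norm(x))

assert max(abs(np.linalg.eigvals(A))) < 0.2   # spectral radius is 0.1: convergent
assert max(norms) > 1e8                        # ...yet a huge transient occurs
assert norms[-1] < 1.0                         # ...before eventual decay to zero
```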
Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
10.2 Convergence factors for row-stochastic matrices
Consider the averaging system
$$x(k+1) = A\, x(k),$$
where $A$ is doubly-stochastic and not necessarily symmetric. If $A$ is primitive (i.e., the associated digraph is aperiodic and strongly connected), we know
$$\lim_{k\to\infty} x(k) = \operatorname{average}(x(0))\,\mathbf{1}_n = \big(\mathbf{1}_n\mathbf{1}_n^\top/n\big)\, x(0).$$
We now define two possible notions of convergence factors. The per-step convergence factor is
$$r_{\text{step}}(A) = \sup_{x(k) \neq x_{\text{final}}} \frac{\|x(k+1) - x_{\text{final}}\|_2}{\|x(k) - x_{\text{final}}\|_2},$$
where $x_{\text{final}} = \operatorname{average}(x(0))\mathbf{1}_n = \operatorname{average}(x(k))\mathbf{1}_n$ and where the supremum is taken over any possible sequence. Moreover, the asymptotic convergence factor is
$$r_{\text{asym}}(A) = \sup_{x(0) \neq x_{\text{final}}} \lim_{k\to\infty} \left( \frac{\|x(k) - x_{\text{final}}\|_2}{\|x(0) - x_{\text{final}}\|_2} \right)^{1/k}.$$
Given these definitions and the preliminary calculations in the previous Section 10.1, we can now state our main results.
Theorem 10.2 (Convergence factor and solution bounds). Let $A$ be doubly-stochastic and primitive.
(i) The convergence factors of $A$ satisfy
$$r_{\text{step}}(A) = \big\|A - \mathbf{1}_n\mathbf{1}_n^\top/n\big\|_2, \qquad r_{\text{asym}}(A) = \rho_{\text{ess}}(A) = \rho\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big). \tag{10.2}$$
Moreover, $r_{\text{asym}}(A) \le r_{\text{step}}(A)$, and $r_{\text{step}}(A) = r_{\text{asym}}(A)$ if $A$ is symmetric.
(ii) For any initial condition $x(0)$ with corresponding $x_{\text{final}} = \operatorname{average}(x(0))\mathbf{1}_n$,
$$\|x(k) - x_{\text{final}}\|_2 \le r_{\text{step}}(A)^k\, \|x(0) - x_{\text{final}}\|_2, \tag{10.3}$$
$$\|x(k) - x_{\text{final}}\|_2 \le c\,\big(r_{\text{asym}}(A) + \varepsilon\big)^k\, \|x(0) - x_{\text{final}}\|_2, \tag{10.4}$$
where $\varepsilon > 0$ is an arbitrarily small constant and $c$ is a sufficiently large constant independent of $x(0)$.
Note: A sufficient condition for rstep (A) < 1 is given in Exercise E10.1.
Before proving Theorem 10.2, we introduce an interesting intermediate result. For $x_{\text{final}} = \operatorname{average}(x(0))\mathbf{1}_n$, the disagreement vector is the error signal
$$\delta(k) = x(k) - x_{\text{final}}. \tag{10.5}$$
Lemma 10.3 (Disagreement or error dynamics). Given a doubly-stochastic matrix $A$, the disagreement vector $\delta(k)$ satisfies
(i) $\operatorname{average}(x(k)) = \operatorname{average}(x(0))$ and $\delta(k) \perp \mathbf{1}_n$ for all $k$;
(ii) $\delta(k+1) = \big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\,\delta(k)$;
(iii) the following statements are equivalent:
a) $\lim_{k\to\infty} A^k = \mathbf{1}_n\mathbf{1}_n^\top/n$ (that is, the averaging algorithm achieves average consensus),
b) $A$ is primitive (that is, the digraph is aperiodic and strongly connected),
c) $\rho\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big) < 1$ (that is, the error dynamics is convergent).
Proof. To study the error dynamics, note that $\mathbf{1}_n^\top x(k+1) = \mathbf{1}_n^\top A x(k) = \mathbf{1}_n^\top x(k)$ and, in turn, that $\mathbf{1}_n^\top x(k) = \mathbf{1}_n^\top x(0)$; see also Exercise E7.8. Therefore, $\operatorname{average}(x(0)) = \operatorname{average}(x(k))$ and $\delta(k) \perp \mathbf{1}_n$ for all $k$. This completes the proof of statement (i). To prove statement (ii), we compute
$$\delta(k+1) = A x(k) - x_{\text{final}} = A x(k) - \big(\mathbf{1}_n\mathbf{1}_n^\top/n\big) x(k) = \big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\, x(k),$$
and the equation in statement (ii) follows from $\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\mathbf{1}_n = 0_n$.
Next, let us prove the equivalence among the three properties. From the Perron–Frobenius Theorem 2.12 for primitive matrices in Chapter 2 and from Corollary 2.15, we know that $A$ primitive (statement (iii)b) implies average consensus (statement (iii)a). The converse is true because $\mathbf{1}_n\mathbf{1}_n^\top/n$ is a positive matrix and, by the definition of limit, there must exist $k$ such that each entry of $A^k$ becomes positive.
Finally, we prove the equivalence between statements (iii)a and (iii)c. First, note that $P = I_n - \mathbf{1}_n\mathbf{1}_n^\top/n$ is a projection matrix, that is, $P^2 = P$; this can be easily verified by expanding the matrix power $P^2$. Then we compute
$$A^k - \mathbf{1}_n\mathbf{1}_n^\top/n = A^k \big(I_n - \mathbf{1}_n\mathbf{1}_n^\top/n\big) \qquad \text{(because $A$ is row-stochastic)}$$
$$= A^k \big(I_n - \mathbf{1}_n\mathbf{1}_n^\top/n\big)^k \qquad \text{(because $I_n - \mathbf{1}_n\mathbf{1}_n^\top/n$ is a projection)}$$
$$= \big(A\, (I_n - \mathbf{1}_n\mathbf{1}_n^\top/n)\big)^k = \big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)^k,$$
where the third equality holds because $A$ and the projection commute for doubly-stochastic $A$. The statement follows from taking the limit as $k \to \infty$ in this identity and by recalling that a matrix is convergent if and only if its spectral radius is less than one.
Proof of Theorem 10.2. Regarding the equalities (10.2), the formula for $r_{\text{step}}$ is a consequence of the definition of induced 2-norm:
$$r_{\text{step}}(A) = \sup_{x(k)\neq x_{\text{final}}} \frac{\|x(k+1) - x_{\text{final}}\|_2}{\|x(k) - x_{\text{final}}\|_2} = \sup_{\delta(k)\perp\mathbf{1}_n} \frac{\|\delta(k+1)\|_2}{\|\delta(k)\|_2} = \sup_{\delta(k)\perp\mathbf{1}_n} \frac{\big\|\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\delta(k)\big\|_2}{\|\delta(k)\|_2} = \sup_{y\neq 0_n} \frac{\big\|\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)y\big\|_2}{\|y\|_2}.$$
The equality $r_{\text{asym}}(A) = \rho\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)$ is a consequence of the error dynamics in Lemma 10.3, statement (ii).
Next, note that $\lambda = 1$ is a simple eigenvalue of $A$ and $A$ is semiconvergent. Hence, by Exercise E2.2 on the Jordan normal form of $A$, there exists a nonsingular $T$ such that
$$A = T \begin{bmatrix} 1 & 0_{n-1}^\top \\ 0_{n-1} & B \end{bmatrix} T^{-1},$$
where $B \in \mathbb{R}^{(n-1)\times(n-1)}$ is convergent, that is, $\rho(B) < 1$. Moreover, we know $\rho_{\text{ess}}(A) = \rho(B)$. Usual properties of similarity transformations imply
$$A^k = T \begin{bmatrix} 1 & 0_{n-1}^\top \\ 0_{n-1} & B^k \end{bmatrix} T^{-1}, \qquad \lim_{k\to\infty} A^k = T \begin{bmatrix} 1 & 0_{n-1}^\top \\ 0_{n-1} & 0_{(n-1)\times(n-1)} \end{bmatrix} T^{-1}.$$
Because $A$ is doubly-stochastic and primitive, we know $\lim_{k\to\infty} A^k = \mathbf{1}_n\mathbf{1}_n^\top/n$, so that $A$ can be decomposed as
$$A = \mathbf{1}_n\mathbf{1}_n^\top/n + T \begin{bmatrix} 0 & 0_{n-1}^\top \\ 0_{n-1} & B \end{bmatrix} T^{-1},$$
and we conclude with $\rho_{\text{ess}}(A) = \rho(B) = \rho\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)$. This concludes the proof of the equalities (10.2).
The bound (10.3) is an immediate consequence of the definition of induced norm.
Finally, we leave to the reader the proof of the bound (10.4) in Exercise E10.3. Note that the arbitrarily-small positive parameter $\varepsilon$ is required because the eigenvalue corresponding to the essential spectral radius may have an algebraic multiplicity strictly larger than its geometric multiplicity.
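The equalities (10.2) and the ordering $r_{\text{asym}}(A) \le r_{\text{step}}(A)$ can be verified numerically on a doubly-stochastic, primitive, non-symmetric example (Python/NumPy sketch; the particular circulant matrix is an assumption of this illustration):

```python
import numpy as np

# Doubly-stochastic, primitive, non-symmetric example on 4 nodes.
P = np.roll(np.eye(4), 1, axis=1)              # cyclic permutation matrix
A = 0.5 * np.eye(4) + 0.3 * P + 0.2 * P.T      # self-loops + weighted ring

n = 4
M = A - np.ones((n, n)) / n

r_step = np.linalg.norm(M, 2)                  # = ||A - 1 1^T / n||_2
rho_ess = max(abs(l) for l in np.linalg.eigvals(A) if abs(l - 1) > 1e-9)
r_asym = max(abs(np.linalg.eigvals(M)))        # = rho(A - 1 1^T / n)

assert np.isclose(r_asym, rho_ess)             # second equality in (10.2)
assert r_asym <= r_step + 1e-12                # r_asym <= r_step
```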
10.3 Cumulative quadratic index for symmetric matrices

Recall the disagreement vector $\delta(k) = x(k) - \operatorname{average}(x(0))\mathbf{1}_n$ defined in (10.5) and the associated disagreement dynamics
$$\delta(k+1) = \big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\,\delta(k),$$
and observe that, if $x(0)$ is a random vector with $\mathbb{E}\big[x(0)x(0)^\top\big] = I_n$, then the initial conditions of the disagreement vector $\delta(0)$ satisfy $\mathbb{E}\big[\delta(0)\delta(0)^\top\big] = I_n - \mathbf{1}_n\mathbf{1}_n^\top/n$.
To define an average transient and asymptotic performance of this averaging algorithm, we define the cumulative quadratic index of the matrix $A$ by
$$J_{\text{cum}}(A) = \lim_{K\to\infty} \frac{1}{n} \sum_{k=0}^{K} \mathbb{E}\big[\|\delta(k)\|_2^2\big]. \tag{10.6}$$
Theorem 10.4 (Cumulative quadratic index for symmetric matrices). The cumulative quadratic index (10.6) of a row-stochastic, primitive, and symmetric matrix $A$ satisfies
$$J_{\text{cum}}(A) = \frac{1}{n} \sum_{\lambda \in \operatorname{spec}(A)\setminus\{1\}} \frac{1}{1 - \lambda^2}.$$
Proof. Pick a terminal time $K \in \mathbb{N}$ and define $J_K(A) = \frac{1}{n}\sum_{k=0}^{K} \mathbb{E}\big[\|\delta(k)\|_2^2\big]$. From the definition (10.6) and the disagreement dynamics, we compute
$$J_K(A) = \frac{1}{n}\sum_{k=0}^{K} \operatorname{trace}\Big(\mathbb{E}\big[\delta(k)\delta(k)^\top\big]\Big)$$
$$= \frac{1}{n}\sum_{k=0}^{K} \operatorname{trace}\Big(\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)^k\, \mathbb{E}\big[\delta(0)\delta(0)^\top\big]\, \Big(\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)^k\Big)^{\!\top}\Big)$$
$$= \frac{1}{n}\sum_{k=0}^{K} \operatorname{trace}\Big(\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big)^{2k}\big(I_n - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\Big) \qquad \text{(because $\operatorname{trace}(AB) = \operatorname{trace}(BA)$ and $A = A^\top$)}$$
$$= \frac{1}{n}\sum_{k=0}^{K}\ \sum_{\lambda \in \operatorname{spec}(A)\setminus\{1\}} \lambda^{2k}$$
$$= \frac{1}{n} \sum_{\lambda \in \operatorname{spec}(A)\setminus\{1\}} \frac{1 - \lambda^{2(K+1)}}{1 - \lambda^2}. \qquad \text{(because of the geometric series)}$$
The formula for $J_{\text{cum}}$ follows from taking the limit as $K \to \infty$ and recalling that $A$ primitive implies $\rho_{\text{ess}}(A) < 1$.
Note: All eigenvalues of A appear in the computation of the cumulative quadratic index (10.6), not
only the dominant eigenvalue as in the asymptotic convergence factor. Similar results can be obtained
for normal matrices, as opposed to symmetric, as illustrated in (Carli et al. 2009); it is not known how to
compute the cumulative quadratic index for arbitrary doubly-stochastic primitive matrices.
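Since the expectation can be evaluated in closed form, the eigenvalue formula of Theorem 10.4 can be checked deterministically. In the sketch below (Python/NumPy), the example matrix and the disagreement covariance $\mathbb{E}[\delta(0)\delta(0)^\top] = I_n - \mathbf{1}_n\mathbf{1}_n^\top/n$ are assumptions of this illustration:

```python
import numpy as np

A = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])       # row-stochastic, primitive, symmetric
n = 3
M = A - np.ones((n, n)) / n           # disagreement dynamics matrix
Pi = np.eye(n) - np.ones((n, n)) / n  # assumed covariance of delta(0)

# Truncated cumulative index: (1/n) * sum_k trace(M^(2k) Pi).
J_K = sum(np.trace(np.linalg.matrix_power(M, 2 * k) @ Pi) for k in range(200)) / n

# Eigenvalue formula of Theorem 10.4.
eigs = [l for l in np.linalg.eigvalsh(A) if abs(l - 1) > 1e-9]
J_formula = sum(1 / (1 - l**2) for l in eigs) / n

assert abs(J_K - J_formula) < 1e-9
```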
10.4 Circulant network examples and scalability analysis

For $n \ge 3$ and a parameter $\varepsilon \in [0, 1/2[$, consider the averaging matrix $A_{n,\varepsilon}$ over the ring graph with $n$ nodes, with self-loop weight $1 - 2\varepsilon$ at each node and weight $\varepsilon$ on each of the two edges to the neighboring nodes.
This matrix is circulant, that is, each row-vector is equal to the preceding row-vector rotated one element to the right. The associated digraph is illustrated in Figure 10.1. Circulant matrices have remarkable properties (Davis 1994). For example, from Exercise E10.4, the eigenvalues of $A_{n,\varepsilon}$ can be computed to be (not ordered in magnitude)
$$\lambda_i = 2\varepsilon \cos\Big(\frac{2\pi(i-1)}{n}\Big) + (1 - 2\varepsilon), \qquad \text{for } i \in \{1, \dots, n\}. \tag{10.7}$$
An illustration is given in Figure 10.2. For $n$ even (similar results hold for $n$ odd), plotting the eigenvalues on the segment $[-1, 1]$ shows that
$$\rho_{\text{ess}}(A_{n,\varepsilon}) = \max\{|\lambda_2|, |\lambda_{n/2+1}|\}, \qquad \text{where } \lambda_2 = 2\varepsilon\cos\Big(\frac{2\pi}{n}\Big) + (1 - 2\varepsilon) \text{ and } \lambda_{n/2+1} = 1 - 4\varepsilon.$$
Figure 10.2: The eigenvalues of $A_{n,\varepsilon}$ as given in equation (10.7), obtained by sampling $f_\varepsilon(x) = 2\varepsilon\cos(2\pi x) + (1-2\varepsilon)$ at the points $x = (i-1)/n$, $i \in \{1,\dots,n\}$, here with $n = 5$ and $\varepsilon \in \{.1, .2, .3, .4, .5\}$. The left figure illustrates also the case of $\varepsilon = .5$, even if that value is strictly outside the allowed range $[0, .5[$.
If we fix $\varepsilon \in\ ]0, 1/2[$ and consider sufficiently large values of $n$, then $|\lambda_2| > |\lambda_{n/2+1}|$. In the limit of large graphs $n \to \infty$, the Taylor expansion $\cos(x) = 1 - x^2/2 + O(x^4)$ leads to
$$\rho_{\text{ess}}(A_{n,\varepsilon}) = 1 - 4\pi^2\varepsilon\,\frac{1}{n^2} + O\Big(\frac{1}{n^4}\Big).$$
Note that $\rho_{\text{ess}}(A_{n,\varepsilon}) < 1$ for any $n$, but the separation from $\rho_{\text{ess}}(A_{n,\varepsilon})$ to 1, called the spectral gap, shrinks with $1/n^2$.
In summary, this discussion leads to the broad statement that certain large-scale graphs have slow convergence factors. For more results along these lines (specifically, an elegant study of the case of Cayley graphs), we refer to (Carli et al. 2008). These results can also be easily mapped to the eigenvalues of the associated Laplacian matrices; e.g., see Exercise E6.1.
We conclude this section by computing the cumulative quadratic cost introduced in Section 10.3. For the circulant network example, one can compute (Carli et al. 2009)
$$C_1\, \frac{n}{\varepsilon} \;\le\; J_{\text{cum}}(A_{n,\varepsilon}) \;\le\; C_2\, \frac{n}{\varepsilon},$$
where $C_1$ and $C_2$ are positive constants. It is instructive to compare this result with the worst-case asymptotic or per-step convergence factor, which scales as $\rho_{\text{ess}}(A_{n,\varepsilon}) = 1 - 4\pi^2\varepsilon\,\frac{1}{n^2}$.
10.5 Design of fastest distributed averaging

Given a graph $G$ with edge set $E$, we now consider the problem of selecting the weights of an averaging matrix $A$ compatible with $G$ so as to optimize the convergence factor, where $A$ is compatible with $G$ if its only non-zero entries correspond to the edges $E$ of the graph. In other words, if $E_{ij} = e_i e_j^\top$ is the matrix with entry $(i,j)$ equal to one and all other entries equal to zero, then $A = \sum_{(i,j)\in E} a_{ij} E_{ij}$ for arbitrary weights $a_{ij} \in \mathbb{R}$. We refer to such problems as fastest distributed averaging (FDA) problems.
Note: In what follows, we remove the constraint A 0 to widen the set of matrices of interest. Accord-
ingly, we remove the constraint of A being primitive. Convergence to average consensus is guaranteed by
(1) achieving convergence factors less than 1, (2) subject to row-sums and column-sums equal to 1.
Problem 1: Asymmetric FDA with asymptotic convergence factor
$$\begin{aligned} &\text{minimize} && \rho\big(A - \mathbf{1}_n\mathbf{1}_n^\top/n\big) \\ &\text{subject to} && A = \textstyle\sum_{(i,j)\in E} a_{ij}E_{ij}, \quad A\mathbf{1}_n = \mathbf{1}_n, \quad \mathbf{1}_n^\top A = \mathbf{1}_n^\top \end{aligned}$$
The asymmetric FDA is a hard optimization problem. Even though the constraints are linear, the objective function, i.e., the spectral radius of a matrix, is not convex (and, additionally, not even Lipschitz continuous).
Problem 2: Asymmetric FDA with per-step convergence factor
$$\begin{aligned} &\text{minimize} && \big\|A - \mathbf{1}_n\mathbf{1}_n^\top/n\big\|_2 \\ &\text{subject to} && A = \textstyle\sum_{(i,j)\in E} a_{ij}E_{ij}, \quad A\mathbf{1}_n = \mathbf{1}_n, \quad \mathbf{1}_n^\top A = \mathbf{1}_n^\top \end{aligned}$$
Problem 3: Symmetric FDA with per-step convergence factor
$$\begin{aligned} &\text{minimize} && \big\|A - \mathbf{1}_n\mathbf{1}_n^\top/n\big\|_2 \\ &\text{subject to} && A = \textstyle\sum_{(i,j)\in E} a_{ij}E_{ij}, \quad A = A^\top, \quad A\mathbf{1}_n = \mathbf{1}_n \end{aligned}$$
Both Problems 2 and 3 are convex and can be rewritten as so-called semi-definite programs (SDPs); see (Xiao and Boyd 2004). An SDP is an optimization problem where (1) the variable is a positive semidefinite matrix, (2) the objective function is linear, and (3) the constraints are affine equations. SDPs can be efficiently solved by software tools such as CVX; see (Grant and Boyd 2014).
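As the simplest possible illustration of the FDA idea — a brute-force sweep over a single uniform edge weight on the ring graph, not the SDP formulation itself — consider the following sketch (Python/NumPy; with CVX or CVXPY one would instead optimize all edge weights via an SDP):

```python
import numpy as np

def ring_matrix(n, eps):
    """Averaging matrix on the ring with uniform neighbor weight eps."""
    A = (1 - 2 * eps) * np.eye(n)
    for i in range(n):
        A[i, (i + 1) % n] = eps
        A[i, (i - 1) % n] = eps
    return A

n = 10
ones = np.ones((n, n)) / n

def r_step(eps):
    """Per-step convergence factor ||A - 1 1^T / n||_2 for the ring."""
    return np.linalg.norm(ring_matrix(n, eps) - ones, 2)

# Brute-force sweep over the single weight parameter.
grid = np.linspace(0.05, 0.45, 81)
best = min(grid, key=r_step)

assert r_step(best) < r_step(0.1)      # optimized weight beats a naive choice
assert r_step(best) < 1.0              # contraction: average consensus achieved
```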
10.6 Exercises
E10.1 Induced norm of certain doubly stochastic matrices. Assume $A$ is doubly stochastic, primitive and has a strictly-positive diagonal. Show that $\big\|A - \mathbf{1}_n\mathbf{1}_n^\top/n\big\|_2 < 1$.
E10.2 Spectrum of $A - \mathbf{1}_n\mathbf{1}_n^\top/n$. Consider a matrix $A$ that is doubly stochastic, primitive and symmetric. Assume $1 = \lambda_1 \ge \dots \ge \lambda_n$ are its real eigenvalues with corresponding orthonormal eigenvectors $v_1, \dots, v_n$. Show that the matrix $A - \mathbf{1}_n\mathbf{1}_n^\top/n$ has eigenvalues $0, \lambda_2, \dots, \lambda_n$ with eigenvectors $v_1, \dots, v_n$.
E10.3 Bounds on the norm of a matrix power. Given a matrix $B \in \mathbb{R}^{n\times n}$ and an index $k \in \mathbb{N}$, show that
(i) there exists $c > 0$ such that
$$\|B^k\|_2 \le c\, k^{n-1}\, \rho(B)^k,$$
(ii) for all $\varepsilon > 0$, there exists $c > 0$ such that
$$\|B^k\|_2 \le c\, \big(\rho(B) + \varepsilon\big)^k.$$
$$L P + P L = Q^\top Q. \tag{E10.1}$$
(ii) Show $\|H\|_2 = \sqrt{\operatorname{trace}\big(L^\dagger Q^\top Q\big)/2}$, where $L^\dagger$ is the pseudoinverse of $L$.
(iii) Define short-range and long-range output matrices $Q_{\text{sr}}$ and $Q_{\text{lr}}$ by $Q_{\text{sr}}^\top Q_{\text{sr}} = L$ and $Q_{\text{lr}}^\top Q_{\text{lr}} = I_n - \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top$, respectively. Show:
$$\|H\|_2^2 = \begin{cases} \dfrac{n-1}{2}, & \text{for } Q = Q_{\text{sr}}, \\[1ex] \displaystyle\sum_{i=2}^{n} \frac{1}{2\lambda_i(L)}, & \text{for } Q = Q_{\text{lr}}. \end{cases}$$
Hint: The H2 norm has several interesting interpretations, including the total output signal energy in response to a unit impulse input or the root mean square of the output signal in response to a white noise input with identity covariance. You may find useful Theorem 7.4 and Exercise E6.9.
E10.8 Convergence rate for the Laplacian flow. Consider a weight-balanced, strongly connected digraph $G$ with self-loops, degree matrices $D_{\text{out}} = D_{\text{in}} = I_n$, doubly-stochastic adjacency matrix $A$, and Laplacian matrix $L$. Consider the associated Laplacian flow
$$\dot{x}(t) = -L\, x(t).$$
For $x_{\text{ave}} := \mathbf{1}_n^\top x(0)/n$, define the disagreement vector by $\delta(t) = x(t) - x_{\text{ave}}\mathbf{1}_n$.
(i) Show that the average $t \mapsto \mathbf{1}_n^\top x(t)/n$ is conserved and that, consequently, $\mathbf{1}_n^\top \delta(t) = 0$ for all $t \ge 0$.
(ii) Derive the matrix $E$ describing the disagreement dynamics
$$\dot{\delta}(t) = E\,\delta(t).$$
(iii) Describe the spectrum $\operatorname{spec}(E)$ of $E$ as a function of the spectrum $\operatorname{spec}(A)$ of the doubly-stochastic adjacency matrix $A$ associated with $G$. Show that $\operatorname{spec}(E)$ has a simple eigenvalue at $\lambda = 0$ with corresponding normalized eigenvector $v_1 := \mathbf{1}_n/\sqrt{n}$.
(iv) The Jordan form $J$ of $E$ can be described as follows:
$$E = P \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_m \end{bmatrix} P^{-1} =: \begin{bmatrix} c_1 & C \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 0 & J \end{bmatrix} \begin{bmatrix} r_1 \\ R \end{bmatrix},$$
where $c_1$ is the first column of $P$ and $r_1$ is the first row of $P^{-1}$. Show that
$$\delta(t) = C \exp(Jt)\, R\, \delta(0).$$
(v) Use statements (iii) and (iv) to show that, for all $\varepsilon > 0$, there exists $c_\varepsilon > 0$ satisfying
$$\|\delta(t)\| \le c_\varepsilon\, \big(\mathrm{e}^{\mu} + \varepsilon\big)^t\, \|\delta(0)\|,$$
where $\mu = \max\{\Re(\lambda) - 1 \mid \lambda \in \operatorname{spec}(A)\setminus\{1\}\} < 0$. Show that, if $A = A^\top$, then $\mu \le \rho_{\text{ess}}(A) - 1$.
Hint: Use arguments similar to those in Exercise E10.3 and in the proof of Theorem 7.4.
E10.9 Convergence factors in digraphs with equal out-degree. Consider the unweighted digraphs in Figure E10.1 with their associated discrete-time consensus protocols $x(t+1) = A_a x(t)$ and $x(t+1) = A_b x(t)$. For which digraph is the worst-case discrete-time consensus protocol (i.e., the evolution starting from the worst-case initial condition) guaranteed to converge faster? Assign to each edge the same weight equal to $1/3$.
Figure E10.1: Digraph 1 (left) and Digraph 2 (right), each on the node set $\{1, 2, 3, 4\}$.
Chapter 11
Time-varying Averaging Algorithms
In this chapter we discuss time-varying consensus algorithms. We borrow ideas from (Hendrickx 2008;
Bullo et al. 2009). Relevant references include (Tsitsiklis 1984; Tsitsiklis et al. 1986; Cao et al. 2008; Carli
et al. 2008).
Figure 11.1: Example communication digraph $G_{\text{shared-comm}}$ on the node set $\{1, 2, 3, 4\}$.
Snapshots of the round robin schedule over the digraph of Figure 11.1: agent 1 transmits at times 1, 5, 9, ...; agent 2 at times 2, 6, 10, ...; agent 3 at times 3, 7, 11, ...; and agent 4 at times 4, 8, 12, ...
Formally, let $A_i$ denote the averaging matrix corresponding to the transmission by agent $i$ to its out-neighbors. With round robin scheduling, we have
$$x(k+1) = A_{(k \bmod n) + 1}\, x(k), \qquad \text{where each receiving out-neighbor $j$ of agent $i$ updates } x_j^+ = \tfrac{1}{2}x_j + \tfrac{1}{2}x_i. \tag{11.1}$$
We let $\{G(k)\}_{k \in \mathbb{Z}_{\ge 0}}$ be the sequence of weighted digraphs associated to the matrices $\{A(k)\}_{k \in \mathbb{Z}_{\ge 0}}$. Note that $(1, \mathbf{1}_n)$ is an eigenpair for each matrix $A(k)$. Hence, all points in the consensus set $\{\alpha\mathbf{1}_n \mid \alpha \in \mathbb{R}\}$ are equilibria for the algorithm. We aim to provide conditions under which each solution converges to consensus.
We start with a useful definition: for two digraphs $G = (V, E)$ and $G' = (V', E')$, the union of $G$ and $G'$ is defined by
$$G \cup G' = (V \cup V', E \cup E').$$
In what follows, we will need to compute only the union of digraphs with the same set of vertices; in
that case, the graph union is essentially defined by the union of the edge sets. Some useful properties of
the product of multiple row-stochastic matrices and of the unions of multiple digraphs are presented in
Exercise E11.1.
11.2 Convergence over time-varying connected graphs
Theorem 11.1 (Convergence under point-wise connectivity). Let $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ be a sequence of symmetric and doubly-stochastic matrices with associated digraphs $\{G(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ so that
(i) each non-zero edge weight $a_{ij}(k)$, including the self-loop weights $a_{ii}(k)$, is larger than a constant $\eta > 0$; and
(ii) each graph $G(k)$ is connected and aperiodic point-wise in time.
Then the solution to $x(k+1) = A(k)x(k)$ converges exponentially fast to $\operatorname{average}\big(x(0)\big)\,\mathbf{1}_n$.
The first assumption in Theorem 11.1 prevents the weights from becoming arbitrarily close to zero as $k \to \infty$ and assures that $\rho_{\text{ess}}(A(k))$ is upper bounded by a number strictly lower than 1 at every time $k \in \mathbb{Z}_{\ge 0}$. To gain some intuition into this non-degeneracy assumption, consider a sequence of symmetric and doubly-stochastic averaging matrices $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ with entries given by
$$A(k) = \begin{bmatrix} 1 - \exp(-1/(k+1)^\alpha) & \exp(-1/(k+1)^\alpha) \\ \exp(-1/(k+1)^\alpha) & 1 - \exp(-1/(k+1)^\alpha) \end{bmatrix}$$
for $k \in \mathbb{Z}_{\ge 0}$ and exponent $\alpha \ge 1$. Clearly, for $k \to \infty$ and for any $\alpha \ge 1$ this matrix converges to $A_\infty = \left[\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right]$ with spectrum $\operatorname{spec}(A_\infty) = \{-1, +1\}$ and essential spectral radius $\rho_{\text{ess}}(A_\infty) = 1$. One can show that, for $\alpha = 1$, the convergence of $A(k)$ to $A_\infty$ is sufficiently slow so that $\{x(k)\}_{k}$ converges to $\operatorname{average}(x(0))\mathbf{1}_n$, whereas this property is not satisfied for faster convergence rates $\alpha > 1$, and the iteration oscillates indefinitely.¹
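The threshold at $\alpha = 1$ can be observed numerically on the scalar analogue $x(k+1) = \exp(-1/(k+1)^\alpha)\, x(k)$ (a Python sketch; the step count below is an arbitrary choice for this illustration):

```python
import numpy as np

def final_value(alpha, steps=200000):
    """Scalar analogue x(k+1) = exp(-1/(k+1)**alpha) * x(k) with x(0) = 1."""
    log_x = -sum(1.0 / (k + 1) ** alpha for k in range(steps))
    return np.exp(log_x)

# alpha = 1: the harmonic series diverges, so x(k) -> 0.
assert final_value(1.0) < 0.01
# alpha = 2: the series converges, so x(k) stays bounded away from zero.
assert final_value(2.0) > 0.1
```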
Proof of Theorem 11.1. Under assumptions (i) and (ii), there exists a $c \in [0,1[$ so that $\rho_{\text{ess}}(A(k)) \le c < 1$ for all $k \in \mathbb{Z}_{\ge 0}$. Recall the notion of the disagreement vector $\delta(k) = x(k) - \operatorname{average}(x(0))\mathbf{1}_n$ and define $V(\delta) = \|\delta\|_2^2$. It is immediate to compute
$$V(\delta(k+1)) = \big\|\big(A(k) - \mathbf{1}_n\mathbf{1}_n^\top/n\big)\delta(k)\big\|_2^2 \le \rho_{\text{ess}}(A(k))^2\, V(\delta(k)) \le c^2\, V(\delta(k)).$$
It follows that $V(\delta(k)) \le c^{2k}\, V(\delta(0))$, or $\|\delta(k)\| \le c^k\, \|\delta(0)\|$, that is, $\delta(k)$ converges to zero exponentially fast. Equivalently, as $k \to \infty$, $x(k)$ converges exponentially fast to $\operatorname{average}\big(x(0)\big)\,\mathbf{1}_n$.
The proof idea of Theorem 11.1 is based on the disagreement vector and a so-called common Lyapunov function, that is, a positive function that decreases along the system's evolutions (we postpone the general definition of Lyapunov function to Chapter 13). The quadratic function $V$ proposed above is useful also for sequences of primitive row-stochastic matrices $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ with a common positive left eigenvector associated to the eigenvalue $\rho(A(k)) = 1$; see Exercise E11.5. If the matrices $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ do not share a
¹ To understand the essence of this example, consider the scalar iteration $x(k+1) = \exp(-1/(k+1)^\alpha)\, x(k)$. In logarithmic coordinates the solution is given by $\log(x(k)) = -\sum_{\tau=0}^{k-1} \frac{1}{(\tau+1)^\alpha} + \log(x_0)$. For $\alpha = 1$, $\lim_{k\to\infty}\log(x(k))$ diverges to $-\infty$, and $\lim_{k\to\infty} x(k)$ converges to zero. Conversely, for $\alpha > 1$, $\lim_{k\to\infty}\log(x(k))$ exists finite, and thus $\lim_{k\to\infty} x(k)$ does not converge to zero.
common left eigenvector associated to the eigenvalue $\rho(A(k)) = 1$, then there exists in general no common quadratic Lyapunov function of the form $V(\delta) = \delta^\top P \delta$ with $P$ a positive definite matrix; e.g., see (Olshevsky and Tsitsiklis 2008). Likewise, if a sequence of symmetric matrices $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ does not induce a connected and aperiodic graph point-wise in time, then the above analysis fails, and we need to search for non-quadratic common Lyapunov functions.
Theorem 11.2 (Consensus for time-varying algorithms). Let $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ be a sequence of row-stochastic matrices with associated digraphs $\{G(k)\}_{k\in\mathbb{Z}_{\ge 0}}$. Assume that
(A1) each digraph $G(k)$ has a self-loop at each node;
(A2) each non-zero edge weight $a_{ij}(k)$, including the self-loop weights $a_{ii}(k)$, is larger than a constant $\eta > 0$; and
(A3) there exists a duration $\delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\ge 0}$, the union digraph $G(k\delta) \cup \dots \cup G((k+1)\delta - 1)$ contains a globally reachable node.
Then
(i) there exists a nonnegative $w \in \mathbb{R}^n$ normalized to $w_1 + \dots + w_n = 1$ such that $\lim_{k\to\infty} A(k)A(k-1)\cdots A(0) = \mathbf{1}_n w^\top$;
(ii) each solution to $x(k+1) = A(k)x(k)$ converges exponentially fast to $\big(w^\top x(0)\big)\,\mathbf{1}_n$.
Note: In a sequence with property (A2), edges can appear and disappear, but the weight of each edge
(that appears an infinite number of times) does not go to zero as k .
Note: This result is analogous to the time-invariant result that we saw in Chapter 5. The existence of a
globally reachable node is the connectivity requirement in both cases.
Note: Assumption (A3) is a uniform connectivity requirement, that is, any interval of length must
have the connectivity property. In equivalent words, the connectivity property holds for any contiguous
interval of duration .
Note: the theorem provides only a sufficient condition. For results on necessary and sufficient conditions
we refer the reader to the recent works (Blondel and Olshevsky 2014; Xia and Cao 2014) and references
therein.
11.3 Convergence over digraphs connected over time
Consider now the assumptions in Theorem 11.2. Assumption (A1) is satisfied because in equation (11.1)
the self-loop weight is equal to 1/2. Similarly, Assumption (A2) is satisfied because the edge weight is equal
to 1/2. Finally, Assumption (A3) is satisfied with duration selected equal to n, because after n rounds
each node has transmitted precisely once and so all edges of the communication graph Gshared-comm are
present in the union graph. Therefore, the algorithm converges to consensus. However, the algorithm does
not converge to average consensus since it is false that the averaging matrices are doubly-stochastic.
Note: round robin is not necessarily the only scheduling protocol with convergence guarantees. Indeed,
consensus is achieved so long as each node is guaranteed a transmission slot once every bounded period of
time.
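The following sketch (Python/NumPy) simulates one such round robin schedule on a hypothetical ring digraph, where the transmitting agent's out-neighbor averages its own value with the received one using weights 1/2 and 1/2; the specific digraph is an assumption of this illustration:

```python
import numpy as np

n = 4

def transmit(i):
    """Averaging matrix when agent i transmits to its out-neighbor j = i+1 (mod n)
    on an assumed ring digraph: the receiver updates x_j <- (x_j + x_i)/2."""
    A = np.eye(n)
    j = (i + 1) % n
    A[j, j] = 0.5
    A[j, i] = 0.5
    return A

x = np.array([1.0, -1.0, 3.0, 7.0])
for k in range(400):                  # round robin: agents 0, 1, 2, 3, 0, 1, ...
    x = transmit(k % n) @ x

spread = x.max() - x.min()
assert spread < 1e-6                  # consensus reached (not average consensus)
```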
Theorem 11.3 (Consensus for symmetric time-varying algorithms). Let $\{A(k)\}_{k\in\mathbb{Z}_{\ge 0}}$ be a sequence of symmetric and doubly-stochastic matrices with associated graphs $\{G(k)\}_{k\in\mathbb{Z}_{\ge 0}}$. Assume (A1) and (A2) as in Theorem 11.2 and, additionally,
(A4) for all times $k \in \mathbb{Z}_{\ge 0}$, the union graph $\bigcup_{\tau \ge k} G(\tau)$ is connected.
Then
(i) $\lim_{k\to\infty} A(k)A(k-1)\cdots A(0) = \frac{1}{n}\mathbf{1}_n\mathbf{1}_n^\top$;
(ii) each solution to $x(k+1) = A(k)x(k)$ converges exponentially fast to $\operatorname{average}\big(x(0)\big)\,\mathbf{1}_n$.
Note: this result is analogous to the time-invariant result that we saw in Chapter 5. For symmetric
row-stochastic matrices and undirected graphs, the connectivity of an appropriate graph is the requirement
in both cases.
Note: Assumption (A3) in Theorem 11.2 requires the existence of a finite time-interval of duration $\delta$ so that the union graph $G(k\delta) \cup \dots \cup G((k+1)\delta - 1)$ contains a globally reachable node for all times $k \ge 0$. This assumption is weakened in the symmetric case in Theorem 11.3 to Assumption (A4), requiring that the union graph $\bigcup_{\tau \ge k} G(\tau)$ is connected for all times $k \ge 0$.
Step 2: Perform $x_1^+ := x_1$, $x_2^+ := x_2$, $x_3^+ := (x_2 + x_3)/2$ a number of times $\nu_2$ until
Step 3: Perform $x_1^+ := x_1$, $x_2^+ := (x_1 + x_2)/2$, $x_3^+ := x_3$ a number of times $\nu_3$ until
[Figure: the digraphs on the node set $\{1, 2, 3\}$ corresponding to the averaging steps above.]
Observe that on steps 1, 7, 15, ..., the variable $x_1$ is made to become larger than $+1$ by computing averages with $x_3 > +1$. Note that every time this happens, the variable $x_3 > +1$ is increasingly smaller and closer to $+1$. Hence, $\nu_1 < \nu_7 < \nu_{15} < \dots$, that is, it takes more and more steps for $x_1$ to become larger than $+1$. Indeed, one can formally show that the resulting sequence $\{x(k)\}_k$ does not converge.
Hence, it is not possible to predict the convergence of arbitrary products of matrices based on their spectral radii alone, and we need to work harder and with sharper tools.
11.4 Analysis methods and proofs
Lemma 11.4 (Monotonicity and bounded evolutions). If $A$ is row-stochastic, then for all $x \in \mathbb{R}^n$
$$\min(x) \le \min(Ax) \le \max(Ax) \le \max(x).$$
For any sequence of row-stochastic matrices, the solution $x(k)$ of the corresponding time-varying averaging algorithm satisfies, from any initial condition $x(0)$ and at any time $k$,
$$\min(x(0)) \le \min(x(k)) \le \max(x(k)) \le \max(x(0)).$$
Lemma 11.5 (Uniformly connected sequences of digraphs). For a sequence of digraphs $\{G(k)\}_{k\in\mathbb{Z}_{\ge 0}}$, the following statements are equivalent:
(i) there exists a duration $\delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\ge 0}$, the digraph $G(k) \cup \dots \cup G(k + \delta - 1)$ contains a directed spanning tree;
(ii) there exists a duration $\Delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\ge 0}$, there exists a node $j = j(k)$ that reaches all nodes $i \in \{1, \dots, n\}$ over the interval $\{k, \dots, k + \Delta - 1\}$ in the following sense: there exists a sequence of nodes $\{j, h_1, \dots, h_{\Delta-1}, i\}$ such that $(j, h_1)$ is an edge at time $k$, $(h_1, h_2)$ is an edge at time $k+1$, ..., and $(h_{\Delta-1}, i)$ is an edge at time $k + \Delta - 1$;
(iii) there exists a duration $\delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\ge 0}$, the digraph $G(k) \cup \dots \cup G(k + \delta - 1)$ contains a globally reachable node;
(iv) there exists a duration $\Delta \in \mathbb{N}$ such that, for all times $k \in \mathbb{Z}_{\ge 0}$, there exists a node $j$ reachable from all nodes $i \in \{1, \dots, n\}$ over the interval $\{k, \dots, k + \Delta - 1\}$ in the following sense: there exists a sequence of nodes $\{j, h_1, \dots, h_{\Delta-1}, i\}$ such that $(h_1, j)$ is an edge at time $k$, $(h_2, h_1)$ is an edge at time $k+1$, ..., and $(i, h_{\Delta-1})$ is an edge at time $k + \Delta - 1$.
Note: It is sometimes easy to see if a sequence of digraphs satisfies properties (i) and (iii). Property (iv) is directly useful in the analysis later in the chapter. Regarding the proof of the lemma, it is easy to check that (ii) implies (i) and that (iv) implies (iii) with $\delta = \Delta$. The converse is left as Exercise E11.3.
By Assumption (A3), we know that there exists a node $j$ reachable from all nodes $i$ over the interval $\{\delta h, \dots, \delta(h+1) - 1\}$ in the following sense: there exists a sequence of nodes $\{j, h_1, \dots, h_{\delta-1}, i\}$ such that all the following edges exist in the sequence of digraphs: $(h_1, j)$ at time $\delta h$, $(h_2, h_1)$ at time $\delta h + 1$, ..., $(i, h_{\delta-1})$ at time $\delta(h+1) - 1$. Therefore, Assumption (A2) implies
$$a_{h_1, j}(\delta h) \ge \eta, \quad a_{h_2, h_1}(\delta h + 1) \ge \eta, \quad \dots, \quad a_{i, h_{\delta-1}}(\delta(h+1) - 1) \ge \eta,$$
so that the product of these entries is lower bounded by $\eta^{\delta}$. Remarkably, this product is one term in the $(i,j)$ entry of the row-stochastic matrix $\bar A = A(\delta(h+1) - 1) \cdots A(\delta h)$. In other words, Assumption (A3) implies $\bar A_{ij} \ge \eta^{\delta}$.
Hence, for all nodes $i$, given the globally reachable node $j$ during the interval $\{\delta h, \dots, \delta(h+1) - 1\}$, we compute
$$x_i(\delta(h+1)) = \bar A_{i,j}\, x_j(\delta h) + \sum_{p \ne j} \bar A_{i,p}\, x_p(\delta h) \qquad \text{(by definition)}$$
$$\le \bar A_{i,j}\, x_j(\delta h) + (1 - \bar A_{i,j}) \max x(\delta h) \qquad \text{(because } x_p(\delta h) \le \max x(\delta h))$$
$$\le \eta^{\delta}\, x_j(\delta h) + (1 - \eta^{\delta}) \max x(\delta h). \qquad \text{(because } x_j(\delta h) \le \max x(\delta h))$$
Similarly, one can show $x_i(\delta(h+1)) \ge \eta^{\delta}\, x_j(\delta h) + (1 - \eta^{\delta}) \min x(\delta h)$, so that
$$V_{\text{max-min}}\big(x(\delta(h+1))\big) = \max_i x_i(\delta(h+1)) - \min_i x_i(\delta(h+1))$$
$$\le \big(\eta^{\delta} x_j(\delta h) + (1-\eta^{\delta})\max x(\delta h)\big) - \big(\eta^{\delta} x_j(\delta h) + (1-\eta^{\delta})\min x(\delta h)\big) = (1-\eta^{\delta})\, V_{\text{max-min}}\big(x(\delta h)\big).$$
This final inequality, together with Lemma 11.4, proves exponential convergence of the cost function $k \mapsto V_{\text{max-min}}(x(k))$ to zero and convergence of $x(k)$ to a multiple of $\mathbf{1}_n$. We leave the other statements in Theorem 11.2 to the reader and refer to (Moreau 2005; Hendrickx 2008) for further details.
11.5 Time-varying algorithms in continuous-time

In this section we consider the continuous-time averaging algorithm defined by a time-varying Laplacian matrix $L(t)$, that is, the time-varying Laplacian flow
$$\dot{x}(t) = -L(t)\, x(t).$$
We associate a time-varying graph $G(t)$ (without self loops) to the time-varying Laplacian $L(t)$ in the usual manner.
For example, in Chapter 7, we discussed how the heading in some flocking models is described by the continuous-time Laplacian flow
$$\dot{\theta} = -L\theta,$$
where each $\theta_i$ is the heading of a bird, and where $L$ is the Laplacian of an appropriate weighted digraph $G$: each bird is a node and each directed edge $(i,j)$ has weight $1/d_{\text{out}}(i)$. We discussed also the need to consider time-varying graphs: birds average their heading only with other birds within sensing range, but this sensing relationship may change with time.
Recall that the solution to a continuous-time time-varying system can be given in terms of the state transition matrix $\Phi$:
$$x(t) = \Phi(t, 0)\, x(0).$$
We refer to (Hespanha 2009) for the proper definition and study of the state transition matrix.
Theorem 11.6 (Convergence under point-wise connectivity). Let $t \mapsto L(t) = L(t)^\top$ be a time-varying symmetric Laplacian matrix with associated time-varying graph $t \mapsto G(t)$, $t \in \mathbb{R}_{\ge 0}$. Assume
(A1) each non-zero edge weight $a_{ij}(t)$ is larger than a constant $\eta > 0$,
(A2) for all $t \in \mathbb{R}_{\ge 0}$, the graph associated to the symmetric Laplacian matrix $L(t)$ is undirected and connected.
Then
(i) the state transition matrix $\Phi(t, 0)$ associated to $-L(t)$ satisfies $\lim_{t\to\infty} \Phi(t, 0) = \mathbf{1}_n\mathbf{1}_n^\top/n$;
(ii) the solution to $\dot{x}(t) = -L(t)x(t)$ converges exponentially fast to
$$\lim_{t\to\infty} x(t) = \operatorname{average}\big(x(0)\big)\,\mathbf{1}_n.$$
The first assumption in Theorem 11.6 prevents the weights from becoming arbitrarily close to zero as $t \to \infty$, and it assures that $\lambda_2(L(t))$ is strictly positive for all $t \in \mathbb{R}_{\ge 0}$. To see the necessity of this non-degeneracy assumption, consider the time-varying Laplacian
$$L(t) = a(t)\, L, \tag{11.3}$$
where $L$ is a fixed Laplacian matrix and $t \mapsto a(t) > 0$ is a scalar signal.
Theorem 11.7 (Limitations of quadratic Lyapunov functions). Let L be a Laplacian matrix associated
with a weighted digraph G. The following statements are equivalent:
Proof sketch. The equivalence of statements (i) and (ii) has been shown in Lemma 6.4. The equivalence of (i) and (iii) can be proved with a Lyapunov argument similar to the discrete-time case; see Theorem 11.1. The implication (iv) $\implies$ (iii) is trivial. To complete the proof, we show that (ii) $\implies$ (iv). Recall that the matrix exponential of a Laplacian matrix, $\exp(-Lt)$, is a nonnegative doubly-stochastic matrix (see Theorem 7.2) that can be decomposed into a convex combination of finitely many permutation matrices by the Birkhoff–von Neumann theorem (see Exercise E2.15). In particular, $\exp(-Lt) = \sum_i \lambda_i(t) P_i$,
where $P_i$ are permutation matrices and $\lambda_i(t)$ are convex coefficients for every $t \ge 0$. By convexity of $V(x)$ and invariance under coordinate permutations, we have, for any initial condition $x_0 \in \mathbb{R}^n$ and for any $t \ge 0$,
$$V\big(\exp(-Lt)x_0\big) = V\Big(\sum_i \lambda_i(t) P_i x_0\Big) \le \sum_i \lambda_i(t)\, V(P_i x_0) = \sum_i \lambda_i(t)\, V(x_0) = V(x_0).$$
It follows that $V(\delta) = \|\delta\|_2^2$ serves as a common Lyapunov function for the time-varying Laplacian flow $\dot{x}(t) = -L(t)x(t)$ only if $L(t)$ is weight-balanced and connected point-wise in time. To partially remedy these strong assumptions, consider now the case when $L(t)$ induces an undirected graph at any point in time $t \ge 0$ and an integral connectivity condition holds, similar to the discrete-time case. To motivate the general case, recall the example in (11.3) with a single time-varying parameter $a(t)$. In this simple example, a necessary and sufficient condition for convergence to consensus is that the integral $\int_0^\infty a(\tau)\,\mathrm{d}\tau$ diverges. The following result from (Hendrickx and Tsitsiklis 2013) generalizes this case.
Theorem 11.8 (Convergence under integral connectivity). Let $t \mapsto A(t) = A(t)^\top$ be a time-varying symmetric adjacency matrix. Consider the associated undirected graph $G = (V, E)$ that has an edge $(i,j) \in E$ if $\int_0^\infty a_{ij}(\tau)\,\mathrm{d}\tau$ is divergent. Assume
(A1) each non-zero edge weight $a_{ij}(t)$ is larger than a constant $\eta > 0$,
(A2) the graph $G$ is connected.
Then
(i) the state transition matrix $\Phi(t, 0)$ associated to $-L(t)$ satisfies $\lim_{t\to\infty} \Phi(t, 0) = \mathbf{1}_n\mathbf{1}_n^\top/n$;
(ii) the solution to $\dot{x}(t) = -L(t)x(t)$ converges exponentially fast to
$$\lim_{t\to\infty} x(t) = \operatorname{average}\big(x(0)\big)\,\mathbf{1}_n.$$
Theorem 11.8 is the continuous-time analog of Theorem 11.3. We remark that the original statement
in (Hendrickx and Tsitsiklis 2013) does not require Assumption (A1) thus allowing for weights such as
aij = 1/t which lead to non-uniform convergence, i.e., the convergence rate depends on the time t0
when the system is initialized. The proof method of Theorem 11.8 is based on the fact that the minimal
(respectively maximal) element of x(t), the sum of the two smallest (respectively two largest) elements, the
sum of the three smallest (respectively three largest) elements, etc., are all bounded and non-decreasing
(respectively non-increasing). A continuity argument can then be used to show average consensus.
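The dichotomy in Theorem 11.8 can be observed on the single-parameter example $L(t) = a(t)L$, for which the solution is $x(t) = \exp\big({-L}\int_0^t a(\tau)\,\mathrm{d}\tau\big)\, x(0)$. The sketch below (Python/NumPy, computing the matrix exponential by eigendecomposition; the path graph and the two weight signals are assumptions of this illustration) contrasts a divergent with a convergent integral:

```python
import numpy as np

L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])   # Laplacian of the connected path graph on 3 nodes
w, V = np.linalg.eigh(L)             # symmetric: exact eigendecomposition
x0 = np.array([3.0, 0.0, -6.0])
avg = x0.mean() * np.ones(3)

def x_at(s):
    """Exact solution x(t) = exp(-L s) x0 with s = int_0^t a(tau) dtau."""
    return V @ (np.exp(-w * s) * (V.T @ x0))

# a(t) = 1/(1+t): int_0^t a = log(1+t) -> infinity, so consensus is reached.
assert np.linalg.norm(x_at(np.log(1 + 1e8)) - avg) < 1e-5
# a(t) = 1/(1+t)^2: int_0^t a = t/(1+t) < 1 for all t, so consensus fails.
assert np.linalg.norm(x_at(1e8 / (1 + 1e8)) - avg) > 0.1
```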
Theorem 11.9 (Consensus for time-varying algorithms in continuous time). Let $t \mapsto A(t)$ be a time-varying adjacency matrix with associated time-varying digraph $t \mapsto G(t)$, $t \in \mathbb{R}_{\ge 0}$. Assume
(A1) each non-zero edge weight $a_{ij}(t)$ is larger than a constant $\eta > 0$,
(A2) there exists a duration $T > 0$ such that, for all $t \in \mathbb{R}_{\ge 0}$, the digraph associated to the adjacency matrix
$$\int_t^{t+T} A(\tau)\,\mathrm{d}\tau$$
contains a globally reachable node.
Then
(i) there exists a nonnegative $w \in \mathbb{R}^n$ normalized to $w_1 + \dots + w_n = 1$ such that the state transition matrix $\Phi(t,0)$ associated to $-L(t)$ satisfies $\lim_{t\to\infty} \Phi(t,0) = \mathbf{1}_n w^\top$;
(ii) the solution to $\dot{x}(t) = -L(t)x(t)$ converges exponentially fast to $\big(w^\top x(0)\big)\,\mathbf{1}_n$;
(iii) if, additionally, $\mathbf{1}_n^\top L(t) = 0_n^\top$ for almost all times $t$ (that is, the digraph is weight-balanced at almost all times), then $w = \mathbf{1}_n/n$ and each solution converges to $\operatorname{average}\big(x(0)\big)\,\mathbf{1}_n$.
11.6 Exercises
E11.1 On the product of stochastic matrices (Jadbabaie et al. 2003). Let k ≥ 2 and A_1, A_2, . . . , A_k be non-negative n × n matrices with positive diagonal entries. Let a_min (resp. a_max) be the smallest (resp. largest) diagonal entry of A_1, A_2, . . . , A_k and let G_1, . . . , G_k be the digraphs associated with A_1, . . . , A_k. Show that
(i) A_1 A_2 ⋯ A_k ≥ (a_min² / (2 a_max))^{k−1} (A_1 + A_2 + ⋯ + A_k), and
(ii) if the digraph G_1 ∪ . . . ∪ G_k is strongly connected, then the matrix A_1 ⋯ A_k is irreducible.
Hint: Set A_i = a_min I_n + B_i for a nonnegative B_i, and show statement (i) by induction on k.
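Before attempting the induction, one can sanity-check the entrywise inequality (i) numerically; the sketch below (pure Python, with arbitrarily generated nonnegative matrices with positive diagonal entries) verifies the bound for one random instance:

```python
import random

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def check_bound(mats):
    """Entrywise check of A1*...*Ak >= (amin^2/(2 amax))^(k-1) (A1+...+Ak)."""
    n, k = len(mats[0]), len(mats)
    diags = [A[i][i] for A in mats for i in range(n)]
    amin, amax = min(diags), max(diags)
    c = (amin ** 2 / (2 * amax)) ** (k - 1)
    P = mats[0]
    for A in mats[1:]:
        P = mat_mul(P, A)
    S = [[sum(A[i][j] for A in mats) for j in range(n)] for i in range(n)]
    return all(P[i][j] >= c * S[i][j] - 1e-12 for i in range(n) for j in range(n))

random.seed(1)
n, k = 4, 3
mats = []
for _ in range(k):
    A = [[random.random() if random.random() < 0.5 else 0.0 for _ in range(n)]
         for _ in range(n)]
    for i in range(n):
        A[i][i] = 0.5 + random.random()  # positive diagonal entries
    mats.append(A)
print(check_bound(mats))  # expected: True
```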
E11.2 Products of primitive matrices with positive diagonal. Let A_1, A_2, . . . , A_{n−1} be primitive n × n matrices with positive diagonal entries. Show that A_1 A_2 ⋯ A_{n−1} > 0.
E11.3 A simple proof. Prove Lemma 11.5.
Hint: You will want to use Exercise E3.6.
E11.4 Alternative sufficient condition. As in Theorem 11.2, let {A(k)}_{k∈ℤ≥0} be a sequence of row-stochastic matrices with associated digraphs {G(k)}_{k∈ℤ≥0}. Prove that the same asymptotic properties in Theorem 11.2 hold true under the following Assumption (A5), instead of Assumptions (A1), (A2), and (A3):
(A5) there exists a node j such that, for all times k ∈ ℤ≥0, each edge weight a_ij(k), i ∈ {1, . . . , n}, is larger than a constant ε > 0.
In other words, Assumption (A5) requires that all digraphs G(k) contain the edges (i, j), i ∈ {1, . . . , n}, and that all these edges have weights larger than a strictly positive constant.
Hint: Modify the proof of Theorem 11.2.
E11.5 Convergence for strongly-connected graphs point-wise in time: discrete time. Consider a sequence {A(k)}_{k∈ℤ≥0} of row-stochastic matrices with associated graphs {G(k)}_{k∈ℤ≥0} so that
(A1) each non-zero edge weight a_ij(k), including the self-loop weights a_ii(k), is larger than a constant ε > 0;
(A2) each graph G(k) is strongly connected and aperiodic point-wise in time; and
(A3) there is a positive vector w ∈ ℝⁿ satisfying wᵀ 1_n = 1 and wᵀ A(k) = wᵀ for all k ∈ ℤ≥0.
Without relying on Theorem 11.2, show that the solution to x(k + 1) = A(k)x(k) converges to lim_{k→∞} x(k) = (wᵀ x(0)) 1_n.
Hint: Search for a common quadratic Lyapunov function.
E11.6 Convergence for strongly-connected graphs point-wise in time: continuous time. Let t ↦ L(t) be a time-varying Laplacian matrix with associated time-varying digraph t ↦ G(t), t ∈ ℝ≥0, so that
(A1) each non-zero edge weight a_ij(t) is larger than a constant ε > 0;
(A2) each graph G(t) is strongly connected point-wise in time; and
(A3) there is a positive vector w ∈ ℝⁿ satisfying 1_nᵀ w = 1 and wᵀ L(t) = 0_nᵀ for all t ∈ ℝ≥0.
Without relying on Theorem 11.9, show that the solution to ẋ(t) = −L(t)x(t) converges to (wᵀ x(0)) 1_n.
Chapter 12
Randomized Averaging Algorithms
In this chapter we discuss averaging algorithms defined by sequences of random stochastic matrices. In
other words, we imagine that at each discrete instant, the averaging matrix is selected randomly according
to some stochastic model. We refer to such algorithms as randomized averaging algorithms.
Randomized averaging algorithms are well behaved and easy to study in the sense that much information
can be learned simply from the expectation of the averaging matrix. Also, as compared with time-varying
algorithms, it is possible to study convergence rates for randomized algorithms. In this chapter we present
results from (Fagnani and Zampieri 2008; Tahbaz-Salehi and Jadbabaie 2008; Garin and Schenato 2010;
Frasca 2012). Relevant references include (Chatterjee and Seneta 1977; Cogburn 1984; Hatano and Mesbahi
2005; Touri and Nedić 2014).
In this book we will not discuss averaging algorithms in the presence of quantization effects; we refer the reader instead to (Kashyap et al. 2007; Nedić et al. 2009; Frasca et al. 2009). Similarly, regarding averaging in the presence of noise, we refer to (Xiao et al. 2007; Bamieh et al. 2012; Lovisari et al. 2013; Jadbabaie and Olshevsky 2015).
Uniform Symmetric Gossip. Given an undirected graph G, at each iteration, select one of the graph edges uniformly at random, say {i, j}; agents i and j talk and both perform the (1/2, 1/2) averaging update, that is:
    x_i(k + 1) = x_j(k + 1) := (1/2) (x_i(k) + x_j(k)).
A detailed analysis of this model is given by Boyd et al. (2006).
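A minimal sketch of uniform symmetric gossip (assuming, for illustration, a ring graph on four nodes) shows empirically that the state contracts to the initial average, which every (1/2, 1/2) update preserves:

```python
import random

random.seed(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # ring graph on four nodes
x = [4.0, 0.0, 2.0, 6.0]
avg = sum(x) / len(x)                      # preserved by every update

for _ in range(2000):
    i, j = random.choice(edges)            # edge selected uniformly at random
    x[i] = x[j] = 0.5 * (x[i] + x[j])

print([round(v, 3) for v in x], avg)       # all entries approach avg = 3.0
```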
Packet Loss in Communication Network. Given a strongly connected and aperiodic digraph, at each
communication round, packets travel over directed edges and, with some likelihood, each edge may
drop the packet. (If information is not received, then the receiving node can either do no update
whatsoever, or adjust its averaging weights to compensate for the packet loss).
Opinion Dynamics with Stochastic Interactions and Prominent Agents. (Somewhat similar to uniform gossip.) Given an undirected graph and a probability 0 < p < 1, at each iteration, select one of the graph edges uniformly at random and perform: with probability p both agents perform the (1/2, 1/2) update, and with probability (1 − p) only one agent performs the update while the prominent agent does not. A detailed analysis of this model is given by (Acemoglu and Ozdaglar 2011); see also (Acemoglu et al. 2013).
Note that, in the second, third and fourth example models, the row-stochastic matrices at each iteration
are not symmetric in general, even if the original digraph was undirected.
Loosely speaking, a random variable X : Ω → E is a measurable function from the set Ω of possible outcomes to some set E, which is typically a subset of ℝ.
The probability of an event (i.e., a subset of possible outcomes) is the measure of the likelihood that
the event will occur. An event occurs almost surely if it occurs with probability equal to 1.
The random variable X is called discrete if its image is finite or countably infinite. In this case, X is
described by a probability mass function assigning a probability to each value in the image of X.
x(k + 1) = A(k)x(k).
We now present the main result of this chapter; for its proof we refer to (Tahbaz-Salehi and Jadbabaie
2008), see also (Fagnani and Zampieri 2008).
Theorem 12.1 (Consensus for randomized algorithms). Let {A(k)}kZ0 be a sequence of random
row-stochastic matrices with associated digraphs {G(k)}kZ0 . Assume
Note: if each random matrix is doubly-stochastic, then E[A(k)] is doubly-stochastic. The converse is
easily seen to be false.
Proof based on Theorem 12.1. The corollary can be established by verifying that Assumptions (A1)–(A3) in Theorem 12.1 are satisfied. Regarding (A3), note that the graph associated to the expected averaging matrix is G.
Proof based on Theorem 11.3. For any time k_0 ≥ 0 and any edge (i, j) ∈ E, consider the event "the edge (i, j) is not selected for update at any time larger than k_0". Since the probability that (i, j) is not selected at any given time k is 1 − 1/|E|, the probability that (i, j) is not selected at any of the times after k_0 is
    lim_{k→∞} (1 − 1/|E|)^{k−k_0} = 0.
With this fact one can verify that all assumptions in Theorem 11.3 are satisfied by the random sequence of matrices almost surely. Hence, almost sure convergence follows. Finally, since each matrix is doubly stochastic, average(x(k)) is preserved, and the solution converges to average(x(0)) 1_n.
We now present upper and lower bounds for the mean-square convergence factor; for a comprehensive
proof we refer to (Fagnani and Zampieri 2008, Proposition 4.4).
Theorem 12.3 (Upper and lower bounds on the mean-square convergence factor). Under the same assumptions as in Theorem 12.1, the mean-square convergence factor satisfies
    ρ_ess(E[A(k)])² ≤ r_mean-square ≤ ρ( E[ A(k)ᵀ (I_n − 1_n 1_nᵀ / n) A(k) ] ).
Model: time-varying discrete-time
    x(k + 1) = A(k)x(k), A(k) row-stochastic adjacency matrix of digraph G(k), k ∈ ℤ≥0
Assumptions: (i) at each time k, G(k) has a self-loop at each node; (ii) each a_ij(k) > 0 is larger than ε > 0; (iii) there exists a duration δ such that, for all times k, the union G(k) ∪ ⋯ ∪ G(k + δ) has a globally reachable node
Asymptotic behavior: lim_{k→∞} x(k) = (wᵀ x(0)) 1_n, where w ≥ 0, 1_nᵀ w = 1   (Thm 11.2)

Model: time-varying symmetric discrete-time
    x(k + 1) = A(k)x(k), A(k) symmetric stochastic adjacency matrix of G(k), k ∈ ℤ≥0
Assumptions: (i) at each time k, G(k) has a self-loop at each node; (ii) each a_ij(k) > 0 is larger than ε > 0; (iii) for all times k, the union ∪_{τ≥k} G(τ) is connected
Asymptotic behavior: lim_{k→∞} x(k) = average(x(0)) 1_n   (Thm 11.3)

Model: time-varying continuous-time
    ẋ(t) = −L(t)x(t), L(t) Laplacian matrix of digraph G(t), t ∈ ℝ≥0
Assumptions: (i) each a_ij(t) > 0 is larger than ε > 0; (ii) there exists a duration T such that, for all times t, the digraph associated to ∫_t^{t+T} A(τ) dτ has a globally reachable node
Asymptotic behavior: lim_{t→∞} x(t) = (wᵀ x(0)) 1_n, where w ≥ 0, 1_nᵀ w = 1   (Thm 11.9)

Table 12.1: Averaging systems: definitions, assumptions, asymptotic behavior, and reference
Part III
Nonlinear Systems
Chapter 13
Nonlinear Systems and Robotic
Coordination
Coordination in relative sensing networks: rendezvous, flocking, and formations. The material in this section is self-contained. Further information on flocking can be found in (Tanner et al. 2007; Olfati-Saber 2006), and further material on formation control and graph rigidity can be found in (Dörfler and Francis 2010; Krick et al. 2009; Anderson et al. 2008; Oh et al. 2015).
(i) Agent dynamics: We consider a simple and fully actuated agent model: ṗ_i = u_i, where p_i ∈ ℝ² and u_i ∈ ℝ² are the position and steering control input of agent i.
(ii) Relative sensing model: We consider the following sensing model.
Each agent is equipped with onboard sensors only and has no communication devices.
The sensing topology is encoded by an undirected and connected graph G = (V, E).
Each agent i can measure the relative position of neighboring agents: p_i − p_j for {i, j} ∈ E.
To formalize the relative sensing model, we introduce an arbitrary orientation and labeling k ∈ {1, . . . , |E|} for each undirected edge {i, j} ∈ E. Recall the incidence matrix B ∈ ℝ^{n×|E|} of the associated oriented graph and define the 2n × 2|E| matrix B̄ = B ⊗ I₂ via the Kronecker product. The Kronecker product A ⊗ B is the element-wise matrix product in which each scalar entry A_ij of A is replaced by the block entry A_ij B in the matrix A ⊗ B. For example, if B is given by

    B = [ +1   0   0   0
          −1  +1  −1   0
           0  −1   0  +1
           0   0  +1  −1 ],

then B̄ is given by

    B̄ = B ⊗ I₂ = [ +I₂   0    0    0
                   −I₂  +I₂  −I₂   0
                    0   −I₂   0   +I₂
                    0    0   +I₂  −I₂ ].
Figure 13.1: A ring graph with three agents. The first panel shows the agents embedded in the plane ℝ² with positions p_i and relative positions e_i. The second panel shows the artificial potentials as springs connecting the robots, and the third panel shows the resulting forces.
With this notation the vector of relative positions is given by e = B̄ᵀ p.
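As a sketch in pure Python (using the 4-node example matrix B above), one can form B̄ = B ⊗ I₂ and verify that B̄ᵀp stacks the relative positions along the oriented edges:

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def mat_vec(M, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

B = [[+1,  0,  0,  0],
     [-1, +1, -1,  0],
     [ 0, -1,  0, +1],
     [ 0,  0, +1, -1]]
I2 = [[1, 0], [0, 1]]
Bbar = kron(B, I2)                     # the 2n x 2|E| matrix B (x) I2

# stacked positions p = (p1, p2, p3, p4) with each pi in R^2
p = [0.0, 0.0,  1.0, 0.0,  1.0, 1.0,  0.0, 1.0]
e = mat_vec(transpose(Bbar), p)        # stacked relative positions

# the first edge is oriented from node 1 to node 2, so e1 = p1 - p2
print(e[:2])  # [-1.0, 0.0]
```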
(iii) Geometric objective: The objective is to achieve a desired geometric configuration which can be expressed as a function of the relative distances ‖p_i − p_j‖ for each {i, j} ∈ E. Examples include rendezvous (‖p_i − p_j‖ = 0), collision avoidance (‖p_i − p_j‖ > 0), and desired relative spacings (‖p_i − p_j‖ = d_ij > 0).
(iv) Potential-based control: We specify the geometric objective for each edge {i, j} ∈ E as the minimum of an artificial potential function V_ij : D_ij ⊆ ℝ → ℝ≥0. We require the potential functions to be twice continuously differentiable on their domain D_ij.
It is instructive to think of V_ij(‖p_i − p_j‖) as a spring coupling the neighboring agents {i, j} ∈ E. The resulting spring forces acting on agents i and j are f_ij(p_i − p_j) = −∂V_ij(‖p_i − p_j‖)/∂p_i and f_ji(p_i − p_j) = −f_ij(p_i − p_j) = −∂V_ij(‖p_i − p_j‖)/∂p_j; see Figure 13.1 for an illustration. The overall network potential function is then
    V(p) = Σ_{{i,j}∈E} V_ij(‖p_i − p_j‖).
Lemma 13.1 (Symmetries of relative sensing networks). Consider the closed-loop relative sensing network (13.1) with an undirected and connected graph G = (V, E). For every initial condition p₀ ∈ ℝ²ⁿ, we have that
(i) the center of mass is stationary: average(p(t)) = average(p₀) for all t ≥ 0; and
(ii) the closed loop ṗ = −(∂V(p)/∂p)ᵀ is invariant under rigid body transformations: if σ_i = R p_i + q, where R ∈ O(2) and q ∈ ℝ² is a translation vector, then σ̇ = −(∂V(σ)/∂σ)ᵀ.
Proof. Regarding statement (i), since Σ_{i=1}^n ṗ_i = 0, it follows that Σ_{i=1}^n p_i(t) = Σ_{i=1}^n p_{i0}.
Regarding statement (ii), first notice that the potential function is invariant under translations, since V(p) = V(p + 1_n ⊗ q) for any translation q ∈ ℝ². Second, notice that the potential function is invariant under rotations and reflections, since V_ij(‖R(p_i − p_j)‖) = V_ij(‖p_i − p_j‖) and thus V(R̄p) = V(p), where R̄ = I_n ⊗ R. From the chain rule we obtain (∂V/∂x)(R̄p) R̄ = ∂V(p)/∂p, or (∂V/∂x)(R̄p) = (∂V(p)/∂p) R̄ᵀ. By combining these insights when changing coordinates via σ_i = R p_i + q (or σ = R̄p + 1_n ⊗ q), we find that
    σ̇ = R̄ ṗ = −R̄ (∂V(p)/∂p)ᵀ = −( (∂V(p)/∂p) R̄ᵀ )ᵀ = −( (∂V/∂x)(R̄p) )ᵀ = −(∂V(σ)/∂σ)ᵀ.
Example 13.2 (The linear-quadratic rendezvous problem). An undirected consensus system is a relative sensing network coordination problem where the objective is rendezvous: p_i = p_j for all {i, j} ∈ E. For each edge {i, j} ∈ E consider an artificial potential V_ij : ℝ² → ℝ≥0 which has a minimum at the desired objective. For example, for the quadratic potential function
    V_ij(p_i − p_j) = (1/2) a_ij ‖p_i − p_j‖₂²,
the overall potential function is obtained as the Laplacian potential V(p) = (1/2) pᵀ L̄ p, where L̄ = L ⊗ I₂. The resulting gradient-descent control law gives rise to the linear Laplacian flow
    ṗ_i = u_i = −∂V(p)/∂p_i = −Σ_{{i,j}∈E} a_ij (p_i − p_j).    (13.2)
So far, we analyzed the consensus problem (13.2) using matrix theory and exploiting the linearity of the problem.
In the following, we introduce numerous tools that will allow us to analyze nonlinear consensus-type interactions
and more general nonlinear dynamical systems.
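For a concrete illustration of (13.2), the sketch below (forward Euler, with hypothetical unit weights a_ij = 1 on a path graph with three agents) simulates the Laplacian flow in the plane and checks that all agents approach the initial center of mass:

```python
# path graph on three agents: edges {1,2} and {2,3}, unit weights a_ij = 1
edges = [(0, 1), (1, 2)]
p = [[0.0, 0.0], [3.0, 0.0], [3.0, 3.0]]           # positions in R^2
avg0 = [sum(q[c] for q in p) / 3 for c in (0, 1)]  # initial center of mass

dt = 0.01
for _ in range(5000):
    u = [[0.0, 0.0] for _ in p]
    for (i, j) in edges:
        for c in (0, 1):
            u[i][c] -= p[i][c] - p[j][c]           # Laplacian flow (13.2)
            u[j][c] -= p[j][c] - p[i][c]
    for i in range(3):
        for c in (0, 1):
            p[i][c] += dt * u[i][c]

# all agents rendezvous at the (stationary) center of mass avg0 = [2.0, 1.0]
print([[round(q[c], 3) for c in (0, 1)] for q in p])
```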
ẋ(t) = f(x(t)), x(0) = x₀.
A dynamical system (X, f) is linear if x ↦ f(x) = Ax for some square matrix A. Typically, the map f is assumed to have some continuity properties so that the solution exists and is unique for at least small times; we do not discuss this topic here and refer, for example, to (Khalil 2002).
Examples of continuous-time dynamical systems include the (linear) Laplacian flow ẋ = −Lx (see equation (7.2) in Section 7.3) and the (nonlinear) Kuramoto coupled-oscillator model θ̇_i = ω_i − (K/n) Σ_{j=1}^n sin(θ_i − θ_j) (which we discuss in Chapter 14).
An equilibrium point for the dynamical system (X, f) is a point x* ∈ X such that f(x*) = 0_n. If the initial state is x(0) = x*, then the solution exists, is unique for all time, and is constant: x(t) = x* for all t ∈ ℝ≥0.
Convergence and invariant sets. A curve t ↦ x(t) approaches a set S ⊆ ℝⁿ as t → +∞ if the distance from x(t) to the set S converges to 0 as t → +∞. If the set S consists of a single point s, then x(t) converges to s in the usual sense: lim_{t→+∞} x(t) = s.
Given a dynamical system (X, f), a set W ⊆ X is invariant if each solution starting in W remains in W, that is, if x(0) ∈ W implies x(t) ∈ W for all t ≥ 0. We also need the following general properties: an equilibrium point x* ∈ X is
(i) stable (or Lyapunov stable) if, for each ε > 0, there exists δ = δ(ε) > 0 so that if ‖x(0) − x*‖ < δ, then ‖x(t) − x*‖ < ε for all t ≥ 0,
(ii) unstable if it is not stable,
(iii) locally asymptotically stable if it is stable and if there exists δ > 0 such that lim_{t→∞} x(t) = x* for all trajectories satisfying ‖x(0) − x*‖ < δ.
(i) the set of initial conditions x₀ ∈ X whose corresponding solution x(t) converges to x* is a set termed the region of attraction of x*,
(ii) x* is said to be globally asymptotically stable if its region of attraction is the whole space X,
(iii) x* is said to be globally (respectively, locally) exponentially stable if it is globally (respectively, locally) asymptotically stable and all trajectories starting in the region of attraction satisfy
    ‖x(t) − x*‖ ≤ c₁ ‖x(0) − x*‖ e^{−c₂ t},
for some positive constants c₁, c₂ > 0.
Some of these concepts are illustrated in Figure 13.3.
Figure 13.3: (a) a stable equilibrium, (b) an unstable equilibrium, and (c) an asymptotically stable equilibrium.
Energy functions: non-increasing functions, sublevel sets and critical points. In order to establish the stability and convergence properties of a dynamical system, we will use the concept of an energy function that is non-increasing along the system's solutions.
The Lie derivative (also called the directional derivative) of a function V : ℝⁿ → ℝ with respect to a vector field f : ℝⁿ → ℝⁿ is the function L_f V : ℝⁿ → ℝ defined by
    L_f V(x) = (∂V/∂x)(x) f(x).    (13.3)
A differentiable function V : ℝⁿ → ℝ is said to be non-increasing along every trajectory of the system if each solution x : ℝ≥0 → X satisfies
    (d/dt) V(x(t)) = L_f V(x(t)) ≤ 0,
or, equivalently, if each point x ∈ X satisfies L_f V(x) ≤ 0.
A critical point for a differentiable function V : ℝⁿ → ℝ is a point x̄ ∈ X satisfying
    (∂V/∂x)(x̄) = 0_nᵀ.
Every critical point of a differentiable function is either a local minimum, local maximum or a saddle point.
Given a function V : ℝⁿ → ℝ and a constant ℓ ∈ ℝ, the ℓ-level set of V is {y ∈ ℝⁿ | V(y) = ℓ}, and the ℓ-sublevel set of V is {y ∈ ℝⁿ | V(y) ≤ ℓ}. These concepts are illustrated in Figure 13.4.
Figure 13.4: A differentiable function, its sublevel sets and its critical points. The sublevel set {x | V(x) ≤ ℓ₁} is unbounded. The sublevel set {x | V(x) ≤ ℓ₂} = [x₁, x₅] is compact and contains three critical points (x₂ and x₄ are local minima and x₃ is a local maximum). Finally, the sublevel set {x | V(x) ≤ ℓ₃} is compact and contains a single critical point x₄.
Theorem 13.3 (Lyapunov Theorem). Consider a dynamical system (ℝⁿ, f) with differentiable vector field f and with an equilibrium point x* ∈ ℝⁿ. The equilibrium point x* is
Note: the Lyapunov Theorem assumes the existence of a Lyapunov function with certain properties, but does not provide any constructive method to design or compute one. In what follows we will see that Lyapunov functions can be designed easily for certain classes of systems. In general, however, the computation of a Lyapunov function is a challenging task.
Theorem 13.4 (LaSalle Invariance Principle). Consider a dynamical system (X, f ) with differentiable
f . Assume there exist
Then each solution t ↦ x(t) starting in W, that is, x(0) ∈ W, converges to the largest invariant set contained in
    {x ∈ W | L_f V(x) = 0}.
Note: If the set S is composed of multiple disconnected components and t 7 x(t) approaches S, then it
must approach one of its disconnected components. Specifically, if the set S is composed of a finite number
of points, then t 7 x(t) must converge to one of the points.
Theorem 13.5 (Convergence of linear systems). For a matrix A ∈ ℝ^{n×n}, the following properties are equivalent:
One can show that statement (iii) implies statement (i) using the LaSalle Invariance Principle with the function V(x) = xᵀ P x, whose derivative along the system's solutions is V̇ = xᵀ(Aᵀ P + P A)x = −xᵀ Q x ≤ 0.
The linearization at the equilibrium point x* of the dynamical system (X, f) is the linear dynamical system defined by the differential equation ẋ = Ax, where
    A = (∂f/∂x)(x*).
Theorem 13.6 (Convergence of nonlinear systems via linearization). Consider a dynamical system
(X, f ) with an equilibrium point x , with twice differentiable vector field f , and with linearization A at x .
The following statements hold:
(i) the equilibrium point x is locally exponentially stable if all the eigenvalues of A have strictly-negative
real parts; and
(ii) the equilibrium point x is unstable if at least one eigenvalue of A has strictly-positive real part.
Theorem 13.6 can often be invoked to analyze the local stability of a nonlinear system. For example, for ω ∈ ℝ, consider the dynamical system
    θ̇ = f(θ) = ω − sin(θ),
which we will study extensively in Chapters 14 and 15. If ω ∈ [0, 1[, then two equilibrium points are θ₁ = arcsin(ω) ∈ [0, π/2[ and θ₂ = π − arcsin(ω) ∈ ]π/2, π]. Moreover, the 2π-periodic sets of equilibria are given by {θ₁ + 2kπ | k ∈ ℤ} and {θ₂ + 2kπ | k ∈ ℤ}. The linearization matrix A(θ_i) = (∂f/∂θ)(θ_i) = −cos(θ_i) for i ∈ {1, 2} shows that θ₁ is locally exponentially stable and θ₂ is unstable.
On the other hand, pick a scalar c and, for x ∈ ℝ, consider the dynamical system
    ẋ = f(x) = c x³.
The linearization at the equilibrium x* = 0 is indefinite: A(x*) = 0. Thus, Theorem 13.6 offers no conclusions other than that the equilibrium cannot be exponentially stable. On the other hand, the LaSalle Invariance Principle shows that for c < 0 every trajectory converges to x* = 0. Here, a non-increasing and differentiable function is given by V(x) = x², with Lie derivative L_f V(x) = 2c x⁴ ≤ 0. Since V(x(t)) is non-increasing along the solution to the dynamical system, a compact invariant set is then readily given by any sublevel set {x | V(x) ≤ ℓ} for ℓ ≥ 0.
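This example is easy to verify numerically; the sketch below (forward Euler with c = −1) checks that V(x) = x² is non-increasing along the solution and that the state approaches x* = 0, albeit slower than exponentially:

```python
c, x, dt = -1.0, 2.0, 0.001
V = x * x
monotone = True
for _ in range(200_000):                 # integrate up to t = 200
    x += dt * c * x ** 3                 # forward Euler for xdot = c x^3
    V_new = x * x
    monotone = monotone and V_new <= V + 1e-15
    V = V_new
print(abs(x), monotone)                  # abs(x) small but nonzero; True
```

The exact solution x(t) = x₀/√(1 + 2x₀²t) decays only like t^{−1/2}, consistent with the lack of exponential stability.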
Theorem 13.7 (Convergence of negative gradient flows). Let U : ℝⁿ → ℝ be twice differentiable and assume its sublevel set {x | U(x) ≤ ℓ} is compact for some ℓ ∈ ℝ. Then the negative gradient flow (13.4) has the following properties:
Proof. To show statements (i) and (ii), we verify that the assumptions of the LaSalle Invariance Principle are satisfied as follows. First, as the set W we adopt the sublevel set {x | U(x) ≤ ℓ}, which is compact by assumption and is invariant because, as we show next, the value of t ↦ U(x(t)) is non-increasing. Second, the derivative of the function U along its negative gradient flow is
    U̇(x) = −‖(∂U/∂x)(x)‖₂² ≤ 0.
The first two facts are now an immediate consequence of the LaSalle Invariance Principle. Statements (iii) and (iv) follow from observing that the linearization of the negative gradient system at the equilibrium x* is minus the Hessian matrix evaluated at x* and from applying Theorem 13.6.
Note: If the function U has isolated critical points, then the negative gradient flow evolving in a compact set must converge to a single critical point. In such circumstances, it is also true that from almost all initial conditions the solution will converge to a local minimum rather than a local maximum or a saddle point.
Note: Given a critical point x*, a positive-definite Hessian matrix Hess U(x*) is a sufficient but not a necessary condition for x* to be a local minimum. As a counterexample, consider the function U(x) = x⁴ and the critical point x* = 0.
Note: If the function U is radially unbounded, that is, lim_{‖x‖→∞} U(x) = ∞ (where the limit is taken along any path resulting in ‖x‖ → ∞), then all its sublevel sets are compact.
Note from (Łojasiewicz 1984): if the function U is analytic, then every solution starting in a compact sublevel set has finite length and converges to a single equilibrium point.
Example 13.8 (Dissipative mechanical system). Consider a dissipative mechanical system of the form
    ṗ = v,
    m v̇ = −d v − (∂U/∂p)(p),
where (p, v) ∈ ℝ² are the position and velocity coordinates, m and d are the positive inertia and damping coefficients, and U : ℝ → ℝ is a twice differentiable potential energy function. We assume that U is strictly convex with a unique global minimum at p*. Consider the mechanical energy E : ℝ × ℝ → ℝ≥0 given by the sum of kinetic and potential energy:
    E(p, v) = (1/2) m v² + U(p).
Its derivative along the solutions is
    Ė(p, v) = m v v̇ + (∂U/∂p)(p) ṗ = −d v² ≤ 0.
Notice that the assumptions of the LaSalle Invariance Principle in Theorem 13.4 are satisfied: the function E and the vector field (the right-hand side of the mechanical system) are continuously differentiable; the derivative Ė is nonpositive; and for any initial condition (p₀, v₀) ∈ ℝ² the sublevel set {(p, v) ∈ ℝ² | E(p, v) ≤ E(p₀, v₀)} is compact due to the strict convexity of U. It follows that (p(t), v(t)) converges to the largest invariant set contained in {(p, v) ∈ ℝ² | E(p, v) ≤ E(p₀, v₀), v = 0}, that is, {(p, v) ∈ ℝ² | E(p, v) ≤ E(p₀, v₀), v = 0, (∂U/∂p)(p) = 0}. Because U is strictly convex and twice differentiable, (∂U/∂p)(p) = 0 if and only if p = p*. Therefore, we conclude
    lim_{t→+∞} (p(t), v(t)) = (p*, 0).
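The LaSalle argument above can be illustrated numerically; the sketch below (semi-implicit Euler, assuming the hypothetical strictly convex potential U(p) = (p − 1)², so that p* = 1, and m = 1, d = 1/2) checks that the energy is non-increasing and that (p(t), v(t)) → (p*, 0):

```python
m, d = 1.0, 0.5
U  = lambda q: (q - 1.0) ** 2            # hypothetical strictly convex potential
dU = lambda q: 2.0 * (q - 1.0)           # dU/dp, zero only at p* = 1
E  = lambda q, w: 0.5 * m * w * w + U(q) # mechanical energy

p, v, dt = 3.0, 0.0, 0.001
E_prev, monotone = E(p, v), True
for _ in range(100_000):                 # integrate up to t = 100
    v += dt * (-d * v - dU(p)) / m       # semi-implicit Euler: velocity first
    p += dt * v
    E_now = E(p, v)
    monotone = monotone and E_now <= E_prev + 1e-8
    E_prev = E_now
print(round(p, 3), monotone)             # p approaches 1 (= p*); energy non-increasing
```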
where (for each {i, j} ∈ E) g_ij = g_ji is a continuously differentiable, strictly increasing, and anti-symmetric function satisfying eᵀ g_ij(e) ≥ 0 and g_ij(e) = 0 if and only if e = 0. Notice that the linearization of the system around the consensus subspace may be zero and thus not very informative, for example, when g_ij(e) = ‖e‖² e. The nonlinear rendezvous system (13.5) can be written as a gradient flow:
    ṗ_i = −∂V(p)/∂p_i = −Σ_{j=1}^n (∂/∂p_i) V_ij(‖p_i − p_j‖),
with the associated edge potential function V_ij(‖p_i − p_j‖) = ∫₀^{‖p_i−p_j‖} g_ij(s) ds.
Theorem 13.9 (Nonlinear rendezvous). Consider the nonlinear rendezvous system (13.5) with an undirected and connected graph G = (V, E). Assume that the associated edge potential functions V_ij(‖p_i − p_j‖) = ∫₀^{‖p_i−p_j‖} g_ij(s) ds are radially unbounded. For every initial condition p₀ ∈ ℝ²ⁿ, we have that
Proof. Note that the nonlinear rendezvous system (13.5) is the negative gradient system defined by the network potential function
    V(p) = Σ_{{i,j}∈E} V_ij(‖p_i − p_j‖).
Recall from Lemma 13.1 that the center of mass is stationary, and observe that the function V(p) is radially unbounded with the exception of the direction span(1₂ₙ) associated with a translation of the stationary center of mass. Thus, for every initial condition p₀ ∈ ℝ²ⁿ, the set of points (with fixed center of mass) {p ∈ ℝ²ⁿ | average(p(t)) = average(p₀), V(p) ≤ V(p₀)} is compact. By the LaSalle Invariance Principle in Theorem 13.4, each solution converges to the largest invariant set contained in
    {p ∈ ℝ²ⁿ | average(p(t)) = average(p₀), V(p) ≤ V(p₀), ∂V(p)/∂p = 0ᵀ}.
It follows that the only positive limit set is the set of equilibria: lim_{t→∞} p(t) = 1_n ⊗ average(p₀).
We embed the graph G into the plane ℝ² by assigning to each node i a location p_i ∈ ℝ². We refer to the pair (G, p) as a framework, and we denote the set of frameworks (G, F) as the target formation. A target formation is a realization of F in the configuration space ℝ²ⁿ. A triangular example is shown in Figure 13.5.
Figure 13.5: A triangular formation specified by the distance constraints d₁₂, d₁₃, and d₂₃. The left subfigure shows one possible target formation, the middle subfigure shows a rotation of this target formation, and the right subfigure shows a flip of the left target formation. All of these triangles satisfy the specified distance constraints and are elements of F.
We make the following three observations on the geometry of the target formation:
To be non-empty, the formation F has to be realizable in the plane. For example, for the triangular formation in Figure 13.5 the distance constraints d_ij need to satisfy the triangle inequalities:
A framework (G, p) with p ∈ F is invariant under rigid body transformations, that is, rotations and translations, as seen in Figure 13.5. Hence, the formation F is a set of dimension at least 3.
The formation F may consist of multiple disconnected components. For instance, for the triangular example in Figure 13.5 there is no continuous deformation from the left framework to the right flipped framework, even though both are target formations. In the state space ℝ⁶, this absence of a continuous deformation corresponds to two disconnected components of the set F.
To steer the agents towards the target formation, consider an artificial potential function for each edge {i, j} ∈ E which mimics the Hookean potential of a spring with rest length d_ij:
    V_ij(‖p_i − p_j‖) = (1/2) (‖p_i − p_j‖₂ − d_ij)².
Since this potential function is not differentiable, we choose the modified potential function
    V_ij(‖p_i − p_j‖) = (1/4) (‖p_i − p_j‖₂² − d_ij²)².    (13.6)
The resulting closed loop under the gradient control law u = −∇_p V(p) is given by
    ṗ_i = u_i = −∂V(p)/∂p_i = −Σ_{{i,j}∈E} (‖p_i − p_j‖₂² − d_ij²)(p_i − p_j).    (13.7)
Observe that the set of equilibria of the closed loop (13.7) is the set of critical points of V(p), which is a strict superset of the target formation F. For example, it includes the set of points where two neighbors are collocated: p_i = p_j for some {i, j} ∈ E. In the following, we show convergence to the equilibrium set.
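To illustrate the closed loop (13.7), the sketch below (forward Euler, assuming a hypothetical equilateral target formation d₁₂ = d₁₃ = d₂₃ = 1 on a triangle graph) runs the gradient flow from generic initial positions and checks that the realized inter-agent distances approach the specified values:

```python
import math

edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0)]    # (i, j, d_ij)
p = [[0.0, 0.0], [1.3, 0.1], [0.4, 0.9]]           # generic initial positions

dt = 0.005
for _ in range(20000):
    u = [[0.0, 0.0] for _ in p]
    for (i, j, d) in edges:
        # squared-distance error (||p_i - p_j||^2 - d_ij^2) as in (13.7)
        err = sum((p[i][c] - p[j][c]) ** 2 for c in (0, 1)) - d * d
        for c in (0, 1):
            u[i][c] -= err * (p[i][c] - p[j][c])
            u[j][c] -= err * (p[j][c] - p[i][c])
    for i in range(3):
        for c in (0, 1):
            p[i][c] += dt * u[i][c]

dists = [math.dist(p[i], p[j]) for (i, j, _) in edges]
print([round(x, 3) for x in dists])  # expected: [1.0, 1.0, 1.0]
```

Starting from a non-degenerate (non-collinear) initial condition, the flow settles on a critical point that is in fact an element of the target formation F.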
Theorem 13.10 (Flocking). Consider the nonlinear flocking system (13.7) with an undirected and connected graph G = (V, E) and a realizable formation F. For every initial condition p₀ ∈ ℝ²ⁿ, we have that
Proof. As in the proof of Theorem 13.9, the center of mass is stationary and the potential is non-increasing:
    V̇(p) = −‖(∂V(p)/∂p)ᵀ‖₂² ≤ 0.
Observe further that, for a fixed initial center of mass, the sublevel sets of V(p) form a compact set. By the LaSalle Invariance Principle in Theorem 13.4, p(t) converges to the largest invariant set contained in
    {p ∈ ℝ²ⁿ | average(p(t)) = average(p₀), V(p) ≤ V(p₀), ∂V(p)/∂p = 0ᵀ}.
It follows that the positive limit set is the set of critical points of the potential function.
Observe that Theorem 13.10 guarantees at most convergence to the set of critical points of the potential function. Depending on the problem scenario of interest, we still have to investigate which of these critical points are locally asymptotically stable or unstable on a case-by-case basis; see Exercise E13.8 for an application to a linear formation and Section 13.5 for a more general analysis.
The above Theorem 13.10 also holds true for non-smooth potential functions V_ij : ]−d_ij², ∞[ → ℝ that satisfy
An illustration of possible potential functions can be found in Figure 13.6. These potential functions can
also be easily modified to include input constraints; see Exercise E13.4.
Figure 13.6: Illustration of the quadratic potential function (13.6) (blue solid plot) and a logarithmic barrier potential function (red dashed plot) that approaches ∞ as two neighboring agents become collocated. Panel (a) shows the artificial potential functions V_ij versus ‖p_i − p_j‖; panel (b) shows the induced artificial spring forces ‖f_ij‖ versus ‖p_i − p_j‖.
Theorem 13.11 (Flocking with collision avoidance). Consider the gradient flow (13.1) with an undirected and connected graph G = (V, E), a realizable formation F, and artificial potential functions satisfying (P1) through (P4). For every initial condition p_0 ∈ R^{2n} satisfying p_i(0) ≠ p_j(0) for all {i, j} ∈ E, we have that
(i) the solution to the non-smooth dynamical system exists for all times t ≥ 0;
(ii) the center of mass average(p(t)) = average(p(0)) is stationary for all t ≥ 0;
(iii) neighboring robots will not collide, that is, p_i(t) ≠ p_j(t) for all {i, j} ∈ E and for all t ≥ 0; and
(iv) the agents asymptotically converge to the set of critical points of the potential function.
Proof. The proof of Theorem 13.11 is identical to that of Theorem 13.10 after realizing that, for initial conditions satisfying p_i(0) ≠ p_j(0) for all {i, j} ∈ E, the dynamics are confined to the compact and forward invariant set
\{ p \in \mathbb{R}^{2n} \mid \operatorname{average}(p) = \operatorname{average}(p_0), \; V(p) \le V(p_0) \} .
Within this set, the dynamics (13.7) are twice continuously differentiable and collisions are avoided.
A few questions remain open:
(i) Do the agents actually stop, that is, does there exist a p^* ∈ R^{2n} so that lim_{t→∞} p(t) = p^*?
(ii) The formation F is a subset of the set of critical points of the potential function. How can we render this particular subset stable (among the other possible critical points)? What are the other critical points?
(iii) Does our specification of the target formation make sense? For example, in Figure 13.7 the target
formation can be infinitesimally deformed, such that the resulting geometric configurations are not
congruent.
Figure 13.7: A rectangular target formation among four robots, which is specified by four distance constraints.
The initial geometric configuration (solid circles) can be continuously deformed such that the resulting geometric
configuration is not congruent anymore. All of the displayed configurations are part of the target formation set and satisfy the distance constraints, including the case when the agents are collinear.
The answers to all of these questions are tied to a graph-theoretic concept called rigidity.
To formalize this concept, define the rigidity function
r_G : \mathbb{R}^{2n} \to \mathbb{R}^{|E|}, \qquad r_G(p) \triangleq \frac{1}{2}\bigl(\dots, \|p_i - p_j\|_2^2, \dots\bigr)^{\top} ,
where each component of r_G(p) corresponds to the squared length of the relative position p_i − p_j for one edge {i, j} ∈ E.
Definition 13.12 (Rigidity). Given an undirected graph G = (V, E) and p ∈ R^{2n}, the framework (G, p) is said to be rigid if there is an open neighbourhood U of p such that if q ∈ U and r_G(p) = r_G(q), then (G, p) is congruent to (G, q).
13.5 Rigidity and stability of the target formation
Figure 13.8: The framework in Figure 13.8a is not rigid since a slight perturbation of the upper two points of the
framework results in a framework that is not congruent to the original one although their rigidity functions coincide.
If an additional cross link is added to the framework as in Figure 13.8b, small perturbations that do not change the
rigidity function result in a congruent framework. Thus, the framework in Figure 13.8b is rigid.
Consider now a perturbation δp of the point p and the first-order expansion of the rigidity function
r_G(p + δp) = r_G(p) + \frac{\partial r_G(p)}{\partial p} \, δp + O(\|δp\|^2) .
The rigidity function then remains constant up to first order if δp ∈ kernel(∂r_G(p)/∂p). The matrix ∂r_G(p)/∂p ∈ R^{|E|×2n} is called the rigidity matrix of the graph G. If the perturbation δp is a rigid body motion, that is a translation and rotation of the framework, then, by Definition 13.12, the framework is still rigid. Thus, the dimension of the kernel of the rigidity matrix is at least 3. The idea that rigidity is preserved under infinitesimal perturbations motivates the following definition.
Definition 13.13 (Infinitesimal rigidity). The framework (G, p) is infinitesimally rigid if the kernel of the rigidity matrix ∂r_G(p)/∂p has dimension exactly 3 or, equivalently, rank(∂r_G(p)/∂p) = 2n − 3.
If a framework is infinitesimally rigid, then it is also rigid, but the converse is not necessarily true (Asimow and Roth 1979). Also note that an infinitesimally rigid framework must have at least 2n − 3 edges. If it has exactly 2n − 3 edges, then we call it a minimally rigid framework. Finally, if (G, p) is infinitesimally rigid at p, so is (G, p′) for p′ in an open neighborhood of p. Thus, infinitesimal rigidity is a generic property that depends almost only on the graph G and not on the specific point p ∈ R^{2n}. Throughout the literature, (infinitesimally, minimally) rigid frameworks are often denoted as (infinitesimally, minimally) rigid graphs.
Example 13.14 (Rigidity and infinitesimal rigidity of triangular formation). Consider the triangular
framework in Figure 13.9a and the collapsed triangular framework in Figure 13.9b which are both embeddings
of the same triangular graph. The rigidity function for both frameworks is given by
r_G(p) = \frac{1}{2} \begin{pmatrix} \|p_2 - p_1\|^2 \\ \|p_3 - p_2\|^2 \\ \|p_1 - p_3\|^2 \end{pmatrix} .
Both frameworks are rigid but only the left framework is infinitesimally rigid. To see this, consider the rigidity matrix
\frac{\partial r_G(p)}{\partial p} = \begin{pmatrix} (p_1 - p_2)^{\top} & (p_2 - p_1)^{\top} & 0_2^{\top} \\ 0_2^{\top} & (p_2 - p_3)^{\top} & (p_3 - p_2)^{\top} \\ (p_1 - p_3)^{\top} & 0_2^{\top} & (p_3 - p_1)^{\top} \end{pmatrix} .
The rank of the rigidity matrix at a collinear point is 2 < 2n − 3 = 3. Hence, the collapsed triangle in Figure 13.9b is not infinitesimally rigid. All non-collinear realizations are infinitesimally and minimally rigid. Hence, the triangular framework in Figure 13.9a is generically minimally rigid (for almost every p ∈ R^6).
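This rank computation is easy to verify numerically. The following sketch builds the rigidity matrix of the triangle from its definition and compares a generic (non-collinear) realization against a collapsed, collinear one; the coordinates are hypothetical.

```python
import numpy as np

def rigidity_matrix(p, edges):
    """Jacobian of r_G(p) = (..., ||pi - pj||^2 / 2, ...): the row for edge
    {i, j} carries (pi - pj)^T in block i and (pj - pi)^T in block j."""
    n = p.shape[0]
    R = np.zeros((len(edges), 2 * n))
    for row, (i, j) in enumerate(edges):
        e = p[i] - p[j]
        R[row, 2 * i:2 * i + 2] = e
        R[row, 2 * j:2 * j + 2] = -e
    return R

edges = [(0, 1), (1, 2), (0, 2)]
generic = np.array([[0.0, 0.0], [1.0, 0.0], [0.4, 0.9]])    # proper triangle
collinear = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # collapsed triangle

rank_generic = np.linalg.matrix_rank(rigidity_matrix(generic, edges))
rank_collinear = np.linalg.matrix_rank(rigidity_matrix(collinear, edges))
print(rank_generic, rank_collinear)  # 2n - 3 = 3 versus 2
```

The generic realization attains the full rank 2n − 3 = 3 (infinitesimally and minimally rigid), while the collinear one drops to rank 2.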
Minimally rigid graphs can be constructed by adding a new node with two undirected edges to an
existing minimally rigid graph; see Figure 13.10. This construction is known under the name Henneberg
sequence.
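The Henneberg vertex-addition step can be sketched in a few lines: grow a graph from a single edge by repeatedly attaching a new node with two edges, then check the minimal-rigidity edge count |E| = 2n − 3 and the generic rank of the rigidity matrix at random positions. The construction below is an illustration, not the general Henneberg theory (which also allows edge-splitting steps).

```python
import numpy as np

def henneberg(n, rng):
    """Grow a minimally rigid graph by vertex addition: start from one edge
    and attach each new node v by two edges to distinct existing nodes."""
    edges = [(0, 1)]
    pts = [rng.standard_normal(2), rng.standard_normal(2)]
    for v in range(2, n):
        u, w = rng.choice(v, size=2, replace=False)
        edges += [(int(u), v), (int(w), v)]
        pts.append(rng.standard_normal(2))
    return np.array(pts), edges

rng = np.random.default_rng(0)
p, edges = henneberg(6, rng)
print(len(edges), 2 * len(p) - 3)  # |E| = 2n - 3 = 9

# rigidity matrix at the random (generic) positions
R = np.zeros((len(edges), 2 * len(p)))
for row, (i, j) in enumerate(edges):
    e = p[i] - p[j]
    R[row, 2 * i:2 * i + 2] = e
    R[row, 2 * j:2 * j + 2] = -e
print(np.linalg.matrix_rank(R))  # generically 2n - 3 = 9
```

Each vertex addition adds one node and two edges, so the invariant |E| = 2n − 3 is preserved, and random positions are generic with probability one.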
The flocking result in Theorem 13.10 identifies the critical points of the potential function as the positive limit set. For minimally rigid graphs, we can perform a more insightful stability analysis. To do so, we first reformulate the formation control problem in the coordinates of the relative positions e = B^{\top} p. The rigidity function can be conveniently rewritten in terms of the relative positions e_{ij} = p_i − p_j for every edge {i, j} ∈ E:
r_G : B^{\top} \mathbb{R}^{2n} \to \mathbb{R}^{|E|}, \qquad r_G(e) = \frac{1}{2}\bigl(\dots, \|e_{ij}\|_2^2, \dots\bigr)^{\top} .
Figure 13.9: (a) A rigid and infinitesimally rigid triangular framework (triangle inequalities are strict); (b) a rigid but not infinitesimally rigid collapsed framework (triangle inequalities hold with equality).
Theorem 13.15 (Stability of minimally rigid formations (Dörfler and Francis 2009)). Consider the nonlinear flocking system (13.7) with an undirected and connected graph G = (V, E) and a realizable and minimally rigid formation F. For every initial condition p_0 ∈ R^{2n}, we have that
(i) the agents converge to the set
W_{p_0} = \{ p \in \mathbb{R}^{2n} \mid \operatorname{average}(p) = \operatorname{average}(p_0), \; V(p) \le V(p_0), \; \|R(e)^{\top}(v(e) - d)\|_2 = 0 \} ,
where e = B^{\top} p. In particular, the limit set W_{p_0} is a union of realizations of the target formation (G, p) with p ∈ W_{p_0} ∩ F and the set of points p ∈ W_{p_0} where the framework (G, p) is not infinitesimally rigid; and
(ii) for every p_0 ∈ R^{2n} such that the framework (G, p) is minimally rigid for all p in a suitable sublevel set of the potential function, the agents converge exponentially fast to a stationary target formation (G, p^*) with p^* ∈ W_{p_0} ∩ F.
In particular, the limit set W_{e_0} includes (i) realizations of the target formation (G, p) with p ∈ W_{p_0} ∩ F, e = B^{\top} p, and v(e) − d = 0_{|E|}, and (ii) the set of points e ∈ W_{e_0} where the rigidity matrix R(e)^{\top} ∈ R^{2n×|E|} loses rank, corresponding to points p ∈ W_{p_0} where the framework (G, p) is not infinitesimally rigid.
Due to minimal rigidity of the target formation, the matrix R(e)^{\top} ∈ R^{2n×|E|} has full rank |E| = 2n − 3 for all e ∈ B^{\top}F or, said differently, R(e)R(e)^{\top} has no zero eigenvalues for all e ∈ B^{\top}F. The minimal eigenvalue of R(e)R(e)^{\top} is positive for all e ∈ B^{\top}F and thus (due to continuity of eigenvalues with respect to the matrix elements) also in an open neighborhood of B^{\top}F. In particular, for any strictly positive λ > 0, we can find a sublevel set Ω(ρ) of the potential function, with ρ = ρ(λ) > 0, on which the matrix R(e)R(e)^{\top} is positive definite with eigenvalues lower-bounded by λ. Formally, ρ is obtained as
ρ = \operatorname{argmax} \bar ρ \quad \text{subject to} \quad λ \le \min_{e \in Ω(\bar ρ)} \operatorname{eig}\bigl( R(e)R(e)^{\top} \bigr) .
Then, for all e ∈ Ω(ρ), we can upper-bound the derivative of V(e) along trajectories as \dot V(e) ≤ −4λ V(e). By the Grönwall–Bellman Comparison Lemma in Exercise E13.1, we have that for every e_0 ∈ Ω(ρ), V(e(t)) ≤ V(e_0) e^{−4λt}. It follows that the target formation set (parameterized in terms of relative positions) B^{\top}F is exponentially stable with Ω(ρ) as guaranteed region of attraction.
Although the e-dynamics (13.8) and the p-dynamics (13.7) both have the formation F as a limit set,
convergence of the e-dynamics does not automatically imply convergence to a stationary target formation
(but only convergence of the point-to-set distance to F). To establish stationarity, we rewrite the p-dynamics
(13.7) as
p(t) = p_0 + \int_0^t f(\tau) \, d\tau , \qquad (13.11)
where f(t) = −B \operatorname{diag}(e(t)) \bigl(v(e(t)) − d\bigr). Due to the exponential convergence rate of the e-dynamics in W_{e_0}, the function f(t) is exponentially decaying in time and thus an integrable (L^1) function. It follows that the integral on the right-hand side of (13.11) exists even in the limit as t → ∞ and thus a solution of the p-dynamics converges to a finite point in F, that is, the agents converge to a stationary target formation. In conclusion, for every p_0 ∈ R^{2n} so that e_0 = B^{\top} p_0 ∈ Ω(ρ), the agents converge exponentially fast to a stationary target formation.
Theorem 13.15 formulated for minimally rigid formations can also be extended to more redundant
infinitesimally rigid formations; see (Oh et al. 2015).
13.6 Exercises
E13.1 Grönwall–Bellman Comparison Lemma. Given a continuous function of time t ↦ a(t) ∈ R, suppose the signal t ↦ x(t) satisfies
\dot x(t) \le a(t) x(t) .
Define a new signal t ↦ y(t) satisfying \dot y(t) = a(t) y(t) with y(0) = x(0). Show that
(i) y(t) = y(0) \exp\bigl( \int_0^t a(\tau) \, d\tau \bigr), and
(ii) x(t) ≤ y(t).
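Both claims of the lemma can be checked numerically. The sketch below (illustrative, with arbitrary choices of a(t) and a nonnegative slack b(t) so that \dot x = a x − b satisfies \dot x ≤ a x) Euler-integrates both signals from the same initial condition.

```python
import numpy as np

# x satisfies xdot = a(t) x - b(t) <= a(t) x (since b >= 0);
# y satisfies ydot = a(t) y with y(0) = x(0).
a = lambda t: np.sin(t) - 0.2
b = lambda t: 0.1 * (1 + np.cos(t) ** 2)   # nonnegative slack term

dt, T = 1e-4, 10.0
x = y = 1.0
ok = True
t = 0.0
while t < T:
    x += dt * (a(t) * x - b(t))
    y += dt * (a(t) * y)
    t += dt
    ok = ok and (x <= y + 1e-9)            # claim (ii): x(t) <= y(t)

I_a = (1 - np.cos(T)) - 0.2 * T            # closed form of integral of a
print(ok, abs(y - np.exp(I_a)) < 1e-2)     # claim (i): y = y(0) exp(int a)
```

The comparison x ≤ y holds at every step, and y matches the exponential-of-the-integral formula up to discretization error.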
E13.2 The Lotka–Volterra predator/prey dynamics. In mathematical ecology (Takeuchi 1996), the Lotka–Volterra equations are frequently used to describe the dynamics of biological systems in which two animal species interact, a predator and a prey. According to this model, the animal populations change through time according to
\dot x(t) = α x(t) − β x(t) y(t) ,
\dot y(t) = −γ y(t) + δ x(t) y(t) , \qquad (E13.1)
where x is the nonnegative number of prey, y is the nonnegative number of predator individuals, and α, β, γ, and δ are fixed positive system parameters.
(i) Compute the unique non-zero equilibrium point (x^*, y^*) of the system.
(ii) Determine, if possible, the stability properties of the equilibrium points (0, 0) and (x^*, y^*) via linearization (Theorem 13.6).
(iii) Define the function V(x, y) = δx + βy − γ ln(x) − α ln(y) and note its level sets as illustrated in Figure E13.1.
a) Compute the Lie derivative of V(x, y) with respect to the Lotka–Volterra vector field.
b) What can you say about the stability properties of (x^*, y^*)?
c) Sketch the trajectories of the system for some initial conditions in the x–y positive orthant.
Figure E13.1: Level sets of the function V(x, y) for unit parameter values, centered at the equilibrium (x^*, y^*).
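A quick numerical check, assuming the four-parameter form of (E13.1) and of V given above (with unit parameters, as in Figure E13.1): integrating the vector field with a fourth-order Runge–Kutta scheme, the function V stays constant along the trajectory, which is exactly what part (iii)a asks you to prove.

```python
import numpy as np

alpha, beta, gamma, delta = 1.0, 1.0, 1.0, 1.0   # unit parameter values
f = lambda z: np.array([alpha * z[0] - beta * z[0] * z[1],
                        -gamma * z[1] + delta * z[0] * z[1]])
V = lambda z: delta * z[0] + beta * z[1] - gamma * np.log(z[0]) - alpha * np.log(z[1])

# RK4 integration: V should be (numerically) conserved along trajectories
z = np.array([1.5, 0.7])
V0 = V(z)
dt = 1e-3
for _ in range(20000):
    k1 = f(z)
    k2 = f(z + dt / 2 * k1)
    k3 = f(z + dt / 2 * k2)
    k4 = f(z + dt * k3)
    z += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
print(abs(V(z) - V0))
```

The drift in V over the whole run is only the integrator's discretization error, consistent with the closed orbits in Figure E13.1.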
E13.3 On the gradient flow of a strictly convex function. Let f : R^n → R be a strictly convex and twice differentiable function with global minimizer x^*. Show convergence of the associated negative gradient flow, \dot x = −∇f(x), to the global minimizer x^* of f using the Lyapunov function V(x) = (x − x^*)^{\top}(x − x^*) and the LaSalle Invariance Principle in Theorem 13.4.
Hint: Use the global underestimate property of a strictly convex function stated as follows: f(x′) − f(x) > ∇f(x)^{\top}(x′ − x) for all distinct x and x′ in the domain of f.
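A discretized version of this gradient flow is easy to试... rather: easy to test numerically. The sketch below uses a hypothetical strictly convex test function (log-sum-exp plus a quadratic) and forward-Euler steps; the gradient norm decays to (numerical) zero, i.e., the flow reaches the unique minimizer.

```python
import numpy as np

# hypothetical strictly convex test function:
# f(x) = log(sum_i exp(x_i)) + ||x||^2 / 2, so grad f(x) = softmax(x) + x
def grad_f(x):
    w = np.exp(x - x.max())          # shift for numerical stability
    return w / w.sum() + x

x = np.array([3.0, -2.0, 0.5])
dt = 1e-2
for _ in range(5000):                # Euler steps of xdot = -grad f(x)
    x -= dt * grad_f(x)
print(np.linalg.norm(grad_f(x)))     # ~0: converged to the minimizer
```

Strong convexity of the quadratic part guarantees a unique minimizer and exponential decay of the distance to it, mirroring the LaSalle argument the exercise asks for.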
E13.4 Consensus with input constraints. Consider a set of n agents, each with first-order dynamics \dot x_i = u_i.
(i) Design a consensus protocol that respects the input constraints u_i(t) ∈ [−1, 1] for all t ≥ 0, and prove that your protocol achieves consensus.
Hint: Adopt the hyperbolic tangent function (or the arctangent function) and Theorem 13.9.
(ii) Extend the protocol and the proof to the case of second-order dynamics \ddot x_i = u_i to achieve consensus of the position states and convergence of the velocity states to zero.
Hint: Recall Example 13.8.
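One possible design for part (i), sketched here as an assumption rather than the intended solution: saturate each relative-state term with tanh and scale the edge weights by 1/n, so that |u_i| ≤ (n − 1)/n < 1, while symmetry of the weights preserves the average.

```python
import numpy as np

n = 5
A = np.zeros((n, n))
for i in range(n - 1):                 # path graph; weights 1/n keep |ui| < 1
    A[i, i + 1] = A[i + 1, i] = 1.0 / n

x = np.array([4.0, -3.0, 0.0, 2.0, -1.0])
x0_mean = x.mean()
dt = 1e-2
for _ in range(100000):
    # ui = sum_j aij * tanh(xj - xi): bounded, odd, and symmetric in the edges
    u = (A * np.tanh(x[None, :] - x[:, None])).sum(axis=1)
    x += dt * u
print(np.ptp(x), abs(x.mean() - x0_mean))  # spread -> 0, average preserved
```

The odd saturation keeps the protocol a negative gradient-like flow (cf. Theorem 13.9), so the states contract to the initial average while every input stays strictly inside [−1, 1].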
E13.5 Distributed optimization using the Laplacian flow. Consider the saddle point dynamics (7.13) that
solve the optimization problem (7.12) in a distributed fashion. Assume that the objective functions are
strictly convex and twice differentiable and that the underlying communication graph among the distributed
processors is connected and undirected. By using the LaSalle Invariance Principle show that all solutions of
the saddle point dynamics converge to the set of saddle points.
Hint: Use the following global underestimate property of a strictly convex function: f(x′) − f(x) > ∇f(x)^{\top}(x′ − x) for all distinct x and x′ in the domain of f; and the corresponding global overestimate property of a concave function.
Is the origin locally asymptotically stable? Can you comment on the region of attraction?
E13.7 Pentagon formation. Consider n = 5 agents that should form a pentagon with unit side lengths ac-
cording to the formation control protocol (13.7). Design a graph so that the pentagon formation is locally
asymptotically stable.
E13.8 Global analysis of a linear formation. Consider two agents with positions p_i = (x_i, y_i) ∈ R², i ∈ {1, 2}, with controllable integrator dynamics \dot p_i = u_i, where u_i ∈ R² is the steering command that serves as control input. The two agents have access only to relative position measurements p_1 − p_2. Your tasks are as follows:
(i) propose a control law for u_1 and u_2 as a function of the relative position p_1 − p_2 and a design parameter d_{12} > 0 so that the agents achieve a desired distance ‖p_1 − p_2‖ = d_{12} > 0 in steady state (possibly next to other undesired equilibria);
(ii) study the convergence properties of the closed loop under your proposed control law.
(iii) show that your proposed control law (or a modification thereof) achieves that almost all trajectories
converge to the desired formation. Possibly you need to modify your controller accordingly.
Chapter 14
Coupled Oscillators: Basic Models
In this chapter we discuss networks of coupled oscillators. We borrow ideas from (Dörfler and Bullo 2011, 2014). This chapter focuses on phase-coupled oscillators and does not discuss models of impulse-coupled oscillators. Further information on coupled oscillator models can be found in Mauroy et al. (2012); Acebrón et al. (2005); Strogatz (2000); Arenas et al. (2008).
14.1 History
The scientific interest in synchronization of coupled oscillators can be traced back to the work by Christiaan Huygens on "an odd kind of sympathy" between coupled pendulum clocks (Huygens 1673). The model of coupled oscillators which we study was originally proposed by Arthur Winfree (Winfree 1967). For complete interaction graphs, this model is nowadays known as the Kuramoto model due to the work by Yoshiki Kuramoto (Kuramoto 1975, 1984). Stephen Strogatz provides an excellent historical account in (Strogatz 2000).
The Kuramoto model and its variations appear in the study of biological synchronization phenomena
such as pacemaker cells in the heart (Michaels et al. 1987), circadian rhythms (Liu et al. 1997), neuroscience
(Varela et al. 2001; Brown et al. 2003; Crook et al. 1997), metabolic synchrony in yeast cell populations
(Ghosh et al. 1971), flashing fireflies (Buck 1988), chirping crickets (Walker 1969), and rhythmic applause (Néda et al. 2000), among others. The Kuramoto model also appears in physics and chemistry in modeling
and analysis of spin glass models (Daido 1992; Jongen et al. 2001), flavor evolutions of neutrinos (Pantaleone
1998), and in the analysis of chemical oscillations (Kiss et al. 2002). Some technological applications include
deep brain stimulation (Tass 2003), vehicle coordination (Paley et al. 2007; Sepulchre et al. 2007; Klein
et al. 2008), semiconductor lasers (Kozyreff et al. 2000; Hoppensteadt and Izhikevich 2000), microwave
oscillators (York and Compton 2002), clock synchronization in wireless networks (Simeone et al. 2008), and
droop-controlled inverters in microgrids (Simpson-Porco et al. 2013).
[Figure: a spring network on a ring, with particles coupled by elastic springs of stiffnesses k_{12}, k_{23}, k_{24}, k_{34}.]
14.2 Examples
14.2.1 Example #1: A spring network on a ring
This coupled-oscillator network consists of particles rotating around a unit-radius circle, assumed to possibly overlap without colliding. Each particle is subject to (1) a non-conservative torque τ_i, (2) a linear damping torque, and (3) a total elastic torque.
Pairs of interacting particles i and j are coupled through elastic springs with stiffness k_{ij} > 0. The elastic energy stored by the spring between particles at angles θ_i and θ_j is
E_{ij}(θ_i, θ_j) = \frac{k_{ij}}{2} \, \text{distance}^2 = \frac{k_{ij}}{2} \bigl( (\cos θ_i − \cos θ_j)^2 + (\sin θ_i − \sin θ_j)^2 \bigr)
= k_{ij} \bigl( 1 − \cos θ_i \cos θ_j − \sin θ_i \sin θ_j \bigr) = k_{ij} \bigl( 1 − \cos(θ_i − θ_j) \bigr) .
The total elastic torque on particle i is therefore −∂/∂θ_i Σ_j E_{ij}(θ_i, θ_j) = −Σ_j k_{ij} sin(θ_i − θ_j), and the resulting equations of motion are
M_i \ddot θ_i = −D_i \dot θ_i + τ_i − \sum_{j=1}^n k_{ij} \sin(θ_i − θ_j) ,
where M_i and D_i are inertia and damping coefficients. In the limit of small masses M_i and uniformly-high viscous damping D = D_i, that is, M_i/D → 0, the model simplifies to:
\dot θ_i = ω_i − \sum_{j=1}^n a_{ij} \sin(θ_i − θ_j) , \qquad i ∈ \{1, \dots, n\} ,
with natural rotation frequencies ω_i = τ_i/D and with coupling strengths a_{ij} = k_{ij}/D.
Figure 14.2: Line diagram and graph representation for a simplified model of the New England Power Grid, with 10 synchronous generators and 39 buses; generators and load buses are represented by distinct node symbols.

14.2.2 Example #2: The structure-preserving power network model

We consider an AC power network, visualized in Figure 14.2, with n buses including generators and load buses. We present two simplified models for this network, a static power-balance model and a dynamic continuous-time model.

The transmission network is described by an admittance matrix Y ∈ C^{n×n} that is symmetric and sparse with line impedances Z_{ij} = Z_{ji} for each branch {i, j} ∈ E. The network admittance matrix is a sparse matrix with nonzero off-diagonal entries Y_{ij} = −1/Z_{ij} for each branch {i, j} ∈ E; the diagonal elements Y_{ii} = −\sum_{j=1, j≠i}^n Y_{ij} assure zero row-sums.

The static model is described by the following two concepts. Firstly, according to Kirchhoff's current law, the current injection at node i is balanced by the current flows from adjacent nodes:
I_i = \sum_{j=1}^n \frac{V_i − V_j}{Z_{ij}} = \sum_{j=1}^n Y_{ij} V_j .
Here, I_i and V_i are the phasor representations of the nodal current injections and nodal voltages, e.g., V_i = |V_i| e^{\mathrm{i} θ_i} corresponds to the signal |V_i| \cos(ω_0 t + θ_i). (Recall \mathrm{i} = \sqrt{−1}.) The complex power injection S_i = V_i \bar I_i (where \bar z denotes the complex conjugate of z ∈ C) then satisfies the power balance equation
S_i = V_i \sum_{j=1}^n \bar Y_{ij} \bar V_j = \sum_{j=1}^n \bar Y_{ij} |V_i| |V_j| e^{\mathrm{i}(θ_i − θ_j)} .

Secondly, for a lossless network the real part of the power balance equations at each node is
P_i = \sum_{j=1}^n a_{ij} \sin(θ_i − θ_j) , \qquad i ∈ \{1, \dots, n\} , \qquad (14.1)
where a_{ij} = |V_i||V_j||Y_{ij}| denotes the maximum power transfer over the transmission line {i, j}, P_i = ℜ(S_i) is the active power injection into the network at node i, which is positive for generators and negative for loads, and each summand a_{ij} \sin(θ_i − θ_j) is the active power flow from node j to node i. The systems of equations (14.1) are the so-called (balanced) active power flow equations.
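The active power flow equations (14.1) can be solved numerically with Newton's method, grounding one bus angle to remove the rotational degree of freedom. The 3-bus network data below is hypothetical, chosen only so that a balanced, feasible solution exists.

```python
import numpy as np

# hypothetical 3-bus network: line strengths a_ij and balanced injections P
A = np.array([[0.0, 1.0, 0.8],
              [1.0, 0.0, 1.2],
              [0.8, 1.2, 0.0]])
P = np.array([0.6, 0.3, -0.9])         # two generators (+), one load (-)

def mismatch(th):
    """P_i - sum_j a_ij sin(th_i - th_j), the residual of (14.1)."""
    return P - np.array([sum(A[i, j] * np.sin(th[i] - th[j]) for j in range(3))
                         for i in range(3)])

th = np.zeros(3)                       # flat start; th[2] is grounded to 0
for _ in range(50):
    J = np.zeros((3, 3))               # Jacobian of the mismatch
    for i in range(3):
        for j in range(3):
            if i != j:
                c = A[i, j] * np.cos(th[i] - th[j])
                J[i, j] = c
                J[i, i] -= c
    th[:2] -= np.linalg.solve(J[:2, :2], mismatch(th)[:2])
print(np.max(np.abs(mismatch(th))))    # ~0 at a power flow solution
```

The reduced Jacobian J[:2, :2] is minus a grounded Laplacian with weights a_ij cos(θ_i − θ_j); it is invertible as long as the angles stay phase cohesive, which is why the flat start converges here.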
Next, we discuss a simplified dynamic model. Many appropriate dynamic models have been proposed
for each network node: zeroth order (for so-called constant power loads), first-order models (for so-called
frequency-dependent loads and inverter-based generators), and second and higher order for generators;
see (Bergen and Hill 1981). For extreme simplicity here, we assume that every node is described by a
first-order integrator with the following intuition: node i speeds up (i.e., \dot θ_i increases) when the power balance at node i is positive, and slows down (i.e., \dot θ_i decreases) when the power balance at node i is negative. In other words, we assume
\dot θ_i = P_i − \sum_{j=1}^n a_{ij} \sin(θ_i − θ_j) . \qquad (14.2)
The systems of equations (14.2) are a first-order simplified version of the so-called coupled swing equations.
Note that, when every node is connected to every other node with identical connections of strength K > 0, our simplified model of the power network is identical to the so-called Kuramoto oscillators model:
\dot θ_i = ω_i − \frac{K}{n} \sum_{j=1}^n \sin(θ_i − θ_j) . \qquad (14.3)
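A short simulation (a sketch, with arbitrary frequencies and a coupling gain chosen well above the synchronization threshold) illustrates the behavior of (14.3): in the rotating frame, all oscillators lock to the same frequency.

```python
import numpy as np

n, K = 10, 10.0
rng = np.random.default_rng(2)
omega = rng.uniform(-1, 1, n)
omega -= omega.mean()                  # rotating frame: average frequency is 0
theta = rng.uniform(0, 2 * np.pi, n)

def thetadot(th):
    """Right-hand side of (14.3)."""
    return omega - (K / n) * np.sin(th[:, None] - th[None, :]).sum(axis=1)

dt = 1e-2
for _ in range(20000):                 # Euler integration up to t = 200
    theta += dt * thetadot(theta)

freqs = thetadot(theta)
print(np.max(np.abs(freqs - freqs.mean())))  # ~0: frequency synchronized
```

For this large K the solution becomes phase locked, so all instantaneous frequencies coincide; the common value equals average(ω) = 0, as Lemma 14.1 below predicts.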
14.2.3 Example #3: Flocking and vehicle coordination
Consider a group of n kinematic particles in the plane, identified with the complex plane C, where each particle i has position r_i ∈ C, heading angle θ_i ∈ S¹, unit speed, and steering control u_i:
\dot r_i = e^{\mathrm{i} θ_i} ,
\dot θ_i = u_i(r, θ) , \qquad (14.4)
for i ∈ {1, …, n}. If no control is applied, then particle i travels in a straight line with orientation θ_i(0), and if u_i = ω_i ∈ R is a nonzero constant, then particle i traverses a circle with radius 1/|ω_i|.
The interaction among the particles is modeled by an interaction graph G = ({1, …, n}, E, A) determined by communication and sensing patterns. As shown by Vicsek et al. (1995), interesting motion patterns emerge if the controllers use only relative phase information between neighboring particles. As discussed in the previous chapter, we may adopt potential-function-based gradient control strategies (i.e., negative gradient flows) to coordinate the relative heading angles θ_i(t) − θ_j(t). As shown in Example #1, an intuitive extension of the quadratic Hookean spring potential to the circle is the function U_{ij} : S¹ × S¹ → R defined by
U_{ij}(θ_i, θ_j) = a_{ij} \bigl( 1 − \cos(θ_i − θ_j) \bigr) ,
for each edge {i, j} ∈ E. Notice that the potential U_{ij}(θ_i, θ_j) achieves its unique minimum if the heading angles θ_i and θ_j are synchronized, and it achieves its maximum when θ_i and θ_j are out of phase by an angle π.
These observations suggest the gradient control law
\dot θ_i = ω_0 − K \frac{\partial}{\partial θ_i} \sum_{\{i,j\} \in E} U_{ij}(θ_i, θ_j) = ω_0 − K \sum_{j=1}^n a_{ij} \sin(θ_i − θ_j) , \qquad i ∈ \{1, \dots, n\} , \qquad (14.5)
to synchronize the heading angles of the particles for K > 0 (gradient descent), respectively, to disperse the heading angles for K < 0 (gradient ascent). The term ω_0 can induce additional rotations (for ω_0 ≠ 0) or translations (for ω_0 = 0). A few representative trajectories are illustrated in Figure 14.3.
The controlled phase dynamics (14.5) give rise to elegant and useful coordination patterns that mimic
animal flocking behavior and fish schools. Inspired by these biological phenomena, scientists have studied
the controlled phase dynamics (14.5) and their variations in the context of tracking and formation controllers
in swarms of autonomous vehicles (Paley et al. 2007).
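The synchronizing regime of (14.4)–(14.5) is easy to reproduce numerically. The following sketch (assumed parameters: complete graph with a_ij = 1/n, ω_0 = 0, K = 1) shows the headings aligning, after which all particles translate in a common direction.

```python
import numpy as np

n, K = 6, 1.0
rng = np.random.default_rng(3)
theta = rng.uniform(0, 2 * np.pi, n)
r = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # positions in C

dt = 1e-2
for _ in range(5000):
    # heading dynamics (14.5) with omega0 = 0, complete graph, aij = 1/n
    coupling = (1.0 / n) * np.sin(theta[:, None] - theta[None, :]).sum(axis=1)
    theta += dt * (-K * coupling)
    r += dt * np.exp(1j * theta)       # position kinematics (14.4)

# spread of the unit heading vectors around their centroid
spread = np.max(np.abs(np.exp(1j * theta) - np.exp(1j * theta).mean()))
print(spread)  # ~0: headings synchronized, motion is translational
```

Flipping the sign to K = −1 (gradient ascent) instead disperses the headings, reproducing the splay-like patterns of panels (c) and (e) in Figure 14.3.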
Figure 14.3: Panel (a) illustrates the particle kinematics (14.4). Panels (b)–(e) illustrate the controlled dynamics (14.4)–(14.5) with n = 6 particles, a complete interaction graph, and identical and constant natural frequencies: ω_0(t) = 0 in panels (b) and (c) and ω_0(t) = 1 in panels (d) and (e). The values of K are K = +1 in panels (b) and (d) and K = −1 in panels (c) and (e). The arrows depict the orientation, the dashed curves show the long-term position dynamics, and the solid curves show the initial transient position dynamics. As illustrated, the resulting motion displays synchronized or dispersed heading angles for K = ±1, and translational motion for ω_0 = 0, respectively circular motion for ω_0 = 1.
14.3 Coupled phase oscillator networks
The examples above all feature variations of the coupled phase oscillator model
\dot θ_i = ω_i − \sum_{j=1}^n a_{ij} \sin(θ_i − θ_j) , \qquad i ∈ \{1, \dots, n\} . \qquad (14.6)
A special case of the coupled oscillator model (14.6) is the so-called Kuramoto model (Kuramoto 1975) with a complete homogeneous network (i.e., with identical edge weights a_{ij} = K/n):
\dot θ_i = ω_i − \frac{K}{n} \sum_{j=1}^n \sin(θ_i − θ_j) , \qquad i ∈ \{1, \dots, n\} . \qquad (14.7)
Geodesic distance: The clockwise arc-length from θ_i to θ_j is the length of the clockwise arc from θ_i to θ_j. The counterclockwise arc-length is defined analogously. The geodesic distance between θ_i and θ_j is the minimum between clockwise and counterclockwise arc-lengths and is denoted by |θ_i − θ_j|.
Arc subsets of the n-torus: Given a length γ ∈ [0, 2π[, the arc subset \bar Γ_{arc}(γ) ⊂ T^n is the set of n-tuples (θ_1, …, θ_n) such that there exists an arc of length γ containing all θ_1, …, θ_n. The set Γ_{arc}(γ) is the interior of \bar Γ_{arc}(γ). For example, θ ∈ \bar Γ_{arc}(π) implies all angles θ_1, …, θ_n belong to a closed half circle. Note:
(i) If (θ_1, …, θ_n) ∈ \bar Γ_{arc}(γ), then |θ_i − θ_j| ≤ γ for all i and j. The converse is not true in general. For example, the set { θ ∈ T^n | |θ_i − θ_j| ≤ π for all i, j } is equal to the entire T^n. However, the converse statement is true in the following form (see also Exercise E14.2): if |θ_i − θ_j| ≤ γ for all i and j and (θ_1, …, θ_n) ∈ \bar Γ_{arc}(π), then (θ_1, …, θ_n) ∈ \bar Γ_{arc}(γ).
(ii) If θ = (θ_1, …, θ_n) ∈ Γ_{arc}(π), then average(θ) is well posed. (The average of n angles is ill-posed in general. For example, there is no reasonable definition of the average of two diametrically-opposed points.)
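The subtlety in note (i) — small pairwise geodesic distances without containment in a small arc — has a concrete witness: three equally spaced angles. A short sketch:

```python
import numpy as np

def geodesic(a, b):
    """Geodesic distance on the circle: min of cw and ccw arc-lengths."""
    d = abs(a - b) % (2 * np.pi)
    return min(d, 2 * np.pi - d)

def smallest_arc(angles):
    """Length of the smallest arc containing all the given angles."""
    s = np.sort(np.mod(angles, 2 * np.pi))
    gaps = np.diff(np.concatenate([s, [s[0] + 2 * np.pi]]))
    return 2 * np.pi - gaps.max()

third = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]   # three equally spaced angles
# every pairwise geodesic distance is 2*pi/3 ...
print(all(geodesic(a, b) <= 2 * np.pi / 3 for a in third for b in third))
# ... yet the smallest containing arc has length 4*pi/3 > 2*pi/3
print(smallest_arc(third))
```

All pairwise distances equal 2π/3, but no arc shorter than 4π/3 contains all three angles, so pairwise bounds alone do not place the configuration in a correspondingly short arc.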
Frequency synchrony: A solution θ : R_{≥0} → T^n is frequency synchronized if \dot θ_i(t) = \dot θ_j(t) for all time t and for all i and j.
Phase synchrony: A solution θ : R_{≥0} → T^n is phase synchronized if θ_i(t) = θ_j(t) for all time t and for all i and j.
Phase cohesiveness: A solution θ : R_{≥0} → T^n is phase cohesive with respect to γ > 0 if one of the following conditions holds for all time t: θ(t) ∈ \bar Γ_{arc}(γ), or |θ_i(t) − θ_j(t)| ≤ γ for all edges {i, j} ∈ E.
Asymptotic notions: We will also talk about solutions that asymptotically achieve certain synchronization properties. For example, a solution θ : R_{≥0} → T^n achieves phase synchronization if lim_{t→∞} |θ_i(t) − θ_j(t)| = 0 for all i and j. Analogous definitions can be given for asymptotic frequency synchronization and asymptotic phase cohesiveness.
Finally, notice that phase synchrony is the extreme case of all phase cohesiveness notions with γ = 0.
Lemma 14.1 (Synchronization frequency). If a solution of the coupled oscillator model (14.6) achieves frequency synchronization, then it does so with a constant synchronization frequency equal to

    ω_sync := (1/n) Σ_{i=1}^n ωi = average(ω).

Proof. This fact is obtained by summing all equations (14.6) for i ∈ {1, . . . , n}.
Lemma 14.1 implies that, by expressing each angle with respect to a rotating frame with frequency ω_sync and by replacing ωi with ωi − ω_sync, we obtain ω_sync = 0 or, equivalently, ω ⊥ 1_n. In this rotating frame a frequency-synchronized solution is an equilibrium. Due to the rotational invariance of the coupled oscillator model (14.6), it follows that if θ* ∈ T^n is an equilibrium point, then every point in the rotation set

    [θ*] = { θ ∈ T^n | θ = rot_s(θ*), s ∈ [0, 2π[ }

is also an equilibrium. Notice that the set [θ*] is a connected circle in T^n, and we refer to it as an equilibrium set. See Figure 14.4 for the two-dimensional case.
Figure 14.4: Illustration of the state space T^2, the equilibrium set [θ*] associated to a phase-synchronized equilibrium (dotted blue line), the (meshed red) phase cohesive set |θ2 − θ1| < π/2, and the tangent space with translation vector 1_2 at θ* arising from the rotational symmetry.
(ii) Local stability: if there exists an equilibrium θ* such that |θ*_i − θ*_j| < π/2 for all {i, j} ∈ E, then
a) −J(θ*) is a Laplacian matrix; and
b) the equilibrium set [θ*] is locally exponentially stable.
Proof. We start with statements (i) and (ii)a. Given θ ∈ T^n, we define the undirected graph G_cosine(θ) with the same nodes and edges as G and with edge weights a_ij cos(θi − θj). Next, we compute

    ∂/∂θi ( −Σ_{j=1}^n a_ij sin(θi − θj) ) = −Σ_{j=1}^n a_ij cos(θi − θj),
    ∂/∂θj ( −Σ_{k=1}^n a_ik sin(θi − θk) ) = a_ij cos(θi − θj).

Therefore, the Jacobian J(θ) is equal to minus the Laplacian matrix of the (possibly negatively weighted) graph G_cosine(θ), and statement (i) follows from Lemma 8.1. Regarding statement (ii)a, if |θi − θj| < π/2 for all {i, j} ∈ E, then cos(θi − θj) > 0 for all {i, j} ∈ E, so that G_cosine(θ) has strictly positive weights and all usual properties of Laplacian matrices hold.
To prove statement (ii)b, notice that J(θ*) is negative semidefinite with nullspace span(1_n) arising from the rotational symmetry; see Figure 14.4. All other eigenvectors are orthogonal to 1_n and correspond to negative eigenvalues. We now restrict our analysis to the orthogonal complement of 1_n: we define a coordinate transformation matrix Q ∈ R^{(n−1)×n} with orthonormal rows orthogonal to 1_n,

    Q 1_n = 0_{n−1} and Q Q^T = I_{n−1},

and we note that Q J(θ*) Q^T has negative eigenvalues. Therefore, in the original coordinates, every direction orthogonal to the zero eigenspace span(1_n) is exponentially stable. By Theorem 13.6, the corresponding equilibrium set [θ*] is locally exponentially stable.
Corollary 14.3 (Frequency synchronization). If a solution of the coupled oscillator model (14.6) satisfies the phase cohesiveness property |θi(t) − θj(t)| ≤ γ for some γ ∈ [0, π/2[ and for all t ≥ 0, then the coupled oscillator model (14.6) achieves exponential frequency synchronization.
The order parameter (14.8) is the centroid of all oscillators represented as points on the unit circle in C^1. The magnitude r of the order parameter is a synchronization measure: r = 1 when all oscillators are phase synchronized, and r = 0 when the oscillators are spaced equally on the unit circle (see Exercise E14.3).
By means of the order parameter r e^{iψ}, the all-to-all Kuramoto model (14.7) can be rewritten in the insightful form

    θ̇i = ωi + K r sin(ψ − θi),   i ∈ {1, . . . , n}.   (14.9)

(We ask the reader to establish this identity in Exercise E14.4.) Equation (14.9) gives the intuition that the oscillators synchronize because of their coupling to a mean field represented by the order parameter r e^{iψ}, which itself is a function of θ(t). Intuitively, for small coupling strength K each oscillator rotates with its distinct natural frequency ωi, whereas for large coupling strength K all angles θi(t) entrain to the mean field r e^{iψ}, and the oscillators synchronize. The transition from incoherence to synchrony occurs at a critical threshold value of the coupling strength, denoted by Kcritical.
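This mean-field intuition can be checked numerically. The following Python sketch (our own construction: forward Euler, with step size, horizon, and seed chosen arbitrarily) integrates the mean-field form (14.9) and compares the order-parameter magnitude r for weak and strong coupling:

```python
import math, random

def order_parameter(theta):
    """Centroid of the oscillators on the unit circle: returns (r, psi)."""
    cx = sum(math.cos(t) for t in theta) / len(theta)
    cy = sum(math.sin(t) for t in theta) / len(theta)
    return math.hypot(cx, cy), math.atan2(cy, cx)

def simulate_kuramoto(omega, K, theta0, dt=1e-3, steps=20000):
    """Forward-Euler integration of the all-to-all Kuramoto model in its
    mean-field form: dtheta_i/dt = omega_i + K * r * sin(psi - theta_i)."""
    theta = list(theta0)
    for _ in range(steps):
        r, psi = order_parameter(theta)
        theta = [t + dt * (w + K * r * math.sin(psi - t))
                 for t, w in zip(theta, omega)]
    return theta

random.seed(1)
n = 10
omega = [-1 + 2 * i / (n - 1) for i in range(n)]  # uniformly spaced in [-1, 1]
theta0 = [random.uniform(0, 2 * math.pi) for _ in range(n)]
r_weak, _ = order_parameter(simulate_kuramoto(omega, K=0.1, theta0=theta0))
r_strong, _ = order_parameter(simulate_kuramoto(omega, K=10.0, theta0=theta0))
```

For strong coupling the oscillators entrain and r approaches 1; for weak coupling the oscillators drift at their own frequencies and remain incoherent.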
14.4 Exercises
E14.1 Simulating coupled oscillators. Simulate in your favorite programming language and software package the coupled Kuramoto oscillators in equation (14.3). Set n = 10 and define a vector ω ∈ R^10 with entries deterministically uniformly-spaced between −1 and 1. Select random initial phases.
(i) Simulate the resulting differential equations for K = 10 and K = 0.1.
(ii) Find the approximate value of K at which the qualitative behavior of the system changes from asynchrony to synchrony.
Turn in your code, a few printouts (as few as possible), and your written responses.
E14.2 Phase cohesiveness and arc length. Pick γ < 2π/3 and n ≥ 3. Show the following statement: if θ ∈ T^n satisfies |θi − θj| ≤ γ for all i, j ∈ {1, . . . , n}, then there exists an arc of length γ containing all angles, that is, θ ∈ arc_n(γ).
E14.3 Order parameter and arc length. Given n ≥ 2 and θ ∈ T^n, the shortest arc length γ(θ) is the length of the shortest arc containing all angles, i.e., the smallest γ(θ) such that θ ∈ arc_n(γ(θ)). Given θ ∈ T^n, the order parameter is the centroid of (θ1, . . . , θn) understood as points on the unit circle in the complex plane C:

    r(θ) e^{i ψ(θ)} := (1/n) Σ_{j=1}^n e^{i θj},

where recall i = √−1. Prove the following statements:
(i) if γ(θ) ∈ [0, π], then r(θ) ∈ [cos(γ(θ)/2), 1]; and
(ii) if θ ∈ arc_n(π), then γ(θ) ∈ [2 arccos(r(θ)), π].
The order parameter magnitude r is known to measure synchronization. Show the following statements:
(iii) if all oscillators are phase-synchronized, then r = 1; and
(iv) if all oscillators are spaced equally on the unit circle (the so-called splay state), then r = 0.
E14.4 Order parameter and mean-field dynamics. Show that the Kuramoto model (14.7) is equivalent to the
so-called mean-field model (14.9) with the order parameter r defined in (14.8).
E14.5 Multiplicity of equilibria in the Kuramoto model. A common misconception in the literature is that the Kuramoto model has a unique equilibrium set in the phase cohesive set {θ ∈ T^n | |θi − θj| < π/2 for all {i, j} ∈ E}. Consider now the example of a Kuramoto oscillator network defined over a symmetric ring graph with identical unit weights and zero natural frequencies. The equilibria are determined by

    0 = sin(θi − θ_{i−1}) + sin(θi − θ_{i+1}),

where i ∈ {1, . . . , n} and all indices are evaluated modulo n. Show that for n > 4 there are at least two disjoint equilibrium sets in the phase cohesive set {θ ∈ T^n | |θi − θj| < π/2 for all {i, j} ∈ E}.
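For intuition on this exercise: besides phase synchronization, one candidate equilibrium is the twisted ("splay") state θi = 2πi/n. A quick numerical check (our own construction) verifies that both configurations satisfy the ring-graph equilibrium equations and that, for n = 5, neighboring twisted angles differ by 2π/5 < π/2:

```python
import math

def residual(theta):
    """Largest absolute residual of the ring-graph equilibrium equations
    0 = sin(theta_i - theta_{i-1}) + sin(theta_i - theta_{i+1}),
    with indices modulo n."""
    n = len(theta)
    return max(abs(math.sin(theta[i] - theta[i - 1])
                   + math.sin(theta[i] - theta[(i + 1) % n]))
               for i in range(n))

n = 5
sync = [0.0] * n                                   # phase-synchronized state
twist = [2 * math.pi * i / n for i in range(n)]    # twisted / splay state

# Both are equilibria; for n = 5 neighboring twisted angles differ by
# 2*pi/5, which is less than pi/2, so both lie in the phase cohesive set.
```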
Chapter 15
Networks of Coupled Oscillators
Lemma 15.1. Consider the coupled oscillator model (14.6). If ωi ≠ ωj for some distinct i, j ∈ {1, . . . , n}, then the oscillators cannot achieve phase synchronization.
Proof. We prove the lemma by contraposition. Assume that all oscillators are in phase synchrony, θi(t) = θj(t) for all t ≥ 0 and all i, j ∈ {1, . . . , n}. Then, by equating the dynamics θ̇i(t) = θ̇j(t), it follows necessarily that ωi = ωj.
Motivated by Lemma 15.1, we consider oscillators with identical natural frequencies, ωi = ω ∈ R for all i ∈ {1, . . . , n}. By working in a rotating frame with frequency ω, we have ω = 0. Thus, we consider the model

    θ̇i = −Σ_{j=1}^n a_ij sin(θi − θj),   i ∈ {1, . . . , n}.   (15.1)

Notice that phase synchronization is an equilibrium of this model. Conversely, phase synchronization cannot be an equilibrium of the original coupled oscillator model (14.6) if ωi ≠ ωj for some i and j.
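A direct Euler simulation of model (15.1) illustrates phase synchronization (this is our own sketch; the ring graph, step size, and seed are arbitrary choices). Phases starting inside a semicircle contract to a common value:

```python
import math, random

def step(theta, A, dt):
    """One forward-Euler step of model (15.1):
    dtheta_i/dt = -sum_j a_ij * sin(theta_i - theta_j)."""
    n = len(theta)
    return [theta[i] - dt * sum(A[i][j] * math.sin(theta[i] - theta[j])
                                for j in range(n))
            for i in range(n)]

random.seed(0)
n = 4
# ring graph with unit weights (a connected undirected example of our choosing)
A = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
theta = [random.uniform(0.0, 0.9 * math.pi) for _ in range(n)]  # in a semicircle
for _ in range(40000):
    theta = step(theta, A, 2e-3)
spread = max(theta) - min(theta)  # shrinks toward phase synchronization
```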
where b_ij(t) = a_ij √((1 + x_i(t)²)/(1 + x_j(t)²)) and b_ij(t) ≥ a_ij cos(γ/2); see Exercise E15.3 for a derivation. Similarly, in the y_i-coordinates, the coupled oscillator model reads as

    ẏ_i(t) = −Σ_{j=1}^n c_ij(t) ( y_i(t) − y_j(t) ),   (15.3)

where c_ij(t) = a_ij sinc(y_i(t) − y_j(t)) and c_ij(t) ≥ a_ij sinc(γ). Notice that both averaging formulations (15.2) and (15.3) are well defined as long as the oscillators remain in a semi-circle, θ ∈ arc_n(γ) for some γ ∈ [0, π[.
Theorem 15.2 (Phase cohesiveness and synchronization in an open semicircle). Consider the coupled oscillator model (15.1) with a connected, undirected, and weighted graph G = ({1, . . . , n}, E, A). The following statements hold:
(i) phase cohesiveness: for each γ ∈ [0, π[, each solution originating in arc_n(γ) remains in arc_n(γ) for all times;
(ii) asymptotic phase synchronization: each trajectory originating in arc_n(γ) for γ ∈ [0, π[ achieves exponential phase synchronization; an explicit convergence estimate is given in equation (15.4).
Proof. Consider the averaging formulations (15.2) and (15.3) with initial conditions θ(0) ∈ arc_n(γ) for some γ ∈ [0, π[. By continuity, for small positive times t > 0, the oscillators remain in a semi-circle, the time-varying weights b_ij(t) ≥ a_ij cos(γ/2) and c_ij(t) ≥ a_ij sinc(γ) are strictly positive for each {i, j} ∈ E, and the associated time-dependent graph is connected. As one establishes in the proof of Theorem 11.9, the max-min functions t ↦ max_i x_i(t) − min_i x_i(t) and t ↦ max_i y_i(t) − min_i y_i(t) are strictly decreasing for the time-varying consensus systems (15.2) and (15.3) until consensus is reached. Thus, the oscillators remain in arc_n(γ) and achieve phase synchronization exponentially fast. Since the graph is undirected, we can also conclude convergence to the average phase. Finally, the explicit convergence estimate (15.4) follows, for example, by analyzing (15.2) with the disagreement Lyapunov function and using b_ij(t) ≥ a_ij cos(γ/2).
15.1 Synchronization of identical oscillators
Then the coupled oscillator model (14.6) (with all ωi = 0) can be formulated as the gradient flow

    θ̇ = −( ∂U(θ)/∂θ )^T.   (15.6)

Among the many critical points of the potential function (15.5), the set of phase-synchronized angles is the global minimum of the potential function (15.5). This can be easily seen since each summand in (15.5) is bounded in [0, 2a_ij] and the lower bound is reached only if neighboring oscillators are phase-synchronized. This global minimum is locally exponentially stable.
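As a numerical sanity check of the gradient-flow property (our own construction; the ring graph, unit weights, and step size are arbitrary choices), the following Python sketch evaluates the potential U(θ) = Σ_{{i,j}∈E} a_ij (1 − cos(θi − θj)) along the Euler-discretized flow (15.6) and confirms that it is non-increasing and approaches its minimum value zero:

```python
import math

def potential(theta, edges):
    """U(theta) = sum over edges {i,j} of a_ij * (1 - cos(theta_i - theta_j))."""
    return sum(a * (1 - math.cos(theta[i] - theta[j])) for i, j, a in edges)

def gradient_step(theta, edges, dt):
    """Forward-Euler step of the gradient flow dtheta/dt = -dU/dtheta."""
    g = [0.0] * len(theta)
    for i, j, a in edges:
        s = a * math.sin(theta[i] - theta[j])
        g[i] += s   # d/dtheta_i of the {i,j} summand
        g[j] -= s   # d/dtheta_j of the {i,j} summand
    return [t - dt * gi for t, gi in zip(theta, g)]

edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]  # ring, unit weights
theta = [0.0, 0.5, 1.0, 1.5]
values = [potential(theta, edges)]
for _ in range(20000):
    theta = gradient_step(theta, edges, 1e-2)
    values.append(potential(theta, edges))
# U is non-increasing along the flow and approaches 0 (phase synchronization)
```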
Theorem 15.3 (Phase synchronization). Consider the coupled oscillator model (15.1) with a connected, undirected, and weighted graph G = ({1, . . . , n}, E, A). Then
(i) Global convergence: For all initial conditions θ(0) ∈ T^n, the phases θi(t) converge to the set of critical points {θ ∈ T^n | ∂U(θ)/∂θ = 0_n^T}; and
(ii) Local stability: Phase synchronization is a locally exponentially stable equilibrium set.
Proof. Since the potential function and its derivative are smooth and the dynamics evolve in a compact forward invariant set (T^n), we can apply the Invariance Principle in Theorem 13.4 to arrive at statement (i). Statement (ii) follows from the Jacobian result in Lemma 14.2 and Theorem 13.6.
Theorem 15.3 together with Theorem 15.2 gives a fairly complete picture of the convergence and phase
synchronization properties of the coupled oscillator model (15.1).
According to Theorem 15.3, phase synchronization is only locally stable. A stronger result holds in the case of an all-to-all homogeneous coupling graph, that is, for the Kuramoto model (14.7).
Corollary 15.4 (Almost global phase synchronization for the Kuramoto model). Consider the Kuramoto model (14.7) with identical natural frequencies ωi = ωj for all i, j ∈ {1, . . . , n}. Then, for almost all initial conditions in T^n, the oscillators achieve phase synchronization.
Proof. For identical natural frequencies, the Kuramoto model (14.7) can be put in rotating coordinates so that ωi = 0 for all i ∈ {1, . . . , n}; see Section 15.1. The Kuramoto model then reads in the order-parameter formulation (14.9) as

    θ̇i = K r sin(ψ − θi),   i ∈ {1, . . . , n}.   (15.7)
and its unique global minimum is obtained for r = 1, that is, in the phase-synchronized state. By Theorem 15.3, all angles converge to the set of equilibria, which by (15.7) satisfy either (i) r = 0, (ii) r > 0 and in-phase with the order parameter, θi = ψ, or (iii) r > 0 and out-of-phase with the order parameter, θi = ψ + kπ for k ∈ Z \ {0}, for all i ∈ {1, . . . , n}. In the latter case, any infinitesimal deviation from an out-of-phase equilibrium causes the potential (15.8) to decrease, that is, the out-of-phase equilibria are unstable. Likewise, the equilibria with r = 0 correspond to the global maxima of the potential (15.8), and any infinitesimal deviation from these equilibria causes the potential (15.8) to decrease. It follows that, from almost all initial conditions¹, the oscillators converge to the phase-synchronized equilibria θi = ψ for all i ∈ {1, . . . , n}.
that is, asymptotically the oscillators are uniformly distributed over the unit circle S1 so that their centroid
converges to the origin.
For a complete homogeneous graph with coupling strength a_ij = K/n, i.e., for the Kuramoto model (14.7), we have a remarkable identity between the magnitude r of the order parameter and the potential function U(θ):

    U(θ) = (K n / 2) (1 − r²).   (15.9)

(We ask the reader to establish this identity in Exercise E15.1.) For the complete graph, the correspondence (15.9) shows that the global minimum of the potential function, U(θ) = 0 (for r = 1), corresponds to phase synchronization, and the global maximum, U(θ) = Kn/2 (for r = 0), corresponds to phase balancing. This motivates the following gradient ascent dynamics to reach phase balancing:

    θ̇ = +( ∂U(θ)/∂θ )^T,   that is,   θ̇i = Σ_{j=1}^n a_ij sin(θi − θj).   (15.10)
Theorem 15.5 (Phase balancing). Consider the coupled oscillator model (15.10) with a connected, undirected, and weighted graph G = ({1, . . . , n}, E, A). Then
(i) Global convergence: For all initial conditions θ(0) ∈ T^n, the phases θi(t) converge to the set of critical points {θ ∈ T^n | ∂U(θ)/∂θ = 0_n^T}; and

¹ To be precise, further analysis is needed. A linearization of the Kuramoto model (15.7) at the unstable out-of-phase equilibria yields that these are exponentially unstable. The region of attraction (the so-called stable manifold) of such exponentially unstable equilibria is known to be a zero-measure set (Potrie and Monzón 2009, Proposition 4.1).
15.2 Synchronization of heterogeneous oscillators
(ii) Local stability: For a complete graph with uniform weights a_ij = K/n, phase balancing is the global maximizer of the potential function (15.9) and is a locally asymptotically stable equilibrium set.
Proof. The proof of statement (i) is analogous to the proof of statement (i) in Theorem 15.3. To prove statement (ii), notice that, for a complete graph, the phase-balanced set characterized by r = 0 achieves the global maximum of the potential U(θ) = (Kn/2)(1 − r²). By Theorem 13.7, local maxima of the potential are locally asymptotically stable for the gradient ascent dynamics (15.10).
Lemma 15.6 (Necessary synchronization condition). Consider the coupled oscillator model (14.6) with graph G = ({1, . . . , n}, E, A), frequencies ω ⊥ 1_n, and nodal degree deg_i = Σ_{j=1}^n a_ij for each node i ∈ {1, . . . , n}. If there exists a frequency-synchronized solution satisfying the phase cohesiveness property |θi − θj| ≤ γ for all {i, j} ∈ E and for some γ ∈ [0, π/2], then the following conditions hold:

(i) deg_i sin(γ) ≥ |ωi| for each node i ∈ {1, . . . , n};   (15.11)
(ii) (deg_i + deg_j) sin(γ) ≥ ωi − ωj for all distinct i, j ∈ {1, . . . , n}.   (15.12)

Proof. Statement (i) follows directly from the fact that synchronized solutions must satisfy the equilibrium equation θ̇i = 0. Since the sinusoidal interaction terms in equation (14.6) are upper bounded by the nodal degree deg_i = Σ_{j=1}^n a_ij, condition (15.11) is necessary for the existence of an equilibrium.
Statement (ii) follows from the fact that frequency-synchronized solutions must satisfy θ̇i − θ̇j = 0. By analogous arguments, we arrive at the necessary condition (15.12).
As discussed in Subsection 14.3.4, the Kuramoto model synchronizes provided that the coupling gain K is larger than some critical value Kcritical. The necessary condition (15.12) delivers a lower bound for Kcritical given by

    K ≥ ( n / (2(n − 1)) ) ( max_i ωi − min_i ωi ).
Here we evaluated the left-hand side of (15.12) for a_ij = K/n, for the maximum γ = π/2, and for all distinct i, j ∈ {1, . . . , n}. Perhaps surprisingly, this necessary lower bound is a factor 1/2 away from the upper sufficient bound.
Theorem 15.7 (Synchronization test for all-to-all Kuramoto model). Consider the Kuramoto model (15.13) with natural frequencies ω ⊥ 1_n and coupling strength K. Assume

    K > Kcritical := max_i ωi − min_i ωi,   (15.14)

and define the arc lengths γmin ∈ [0, π/2[ and γmax ∈ ]π/2, π] as the unique solutions to sin(γmin) = sin(γmax) = Kcritical/K. The following statements hold:
(i) phase cohesiveness: each solution starting in arc_n(γ), for γ ∈ [γmin, γmax], remains in arc_n(γ) for all times;
(ii) asymptotic phase cohesiveness: each solution starting in arc_n(γmax) asymptotically reaches the set arc_n(γmin); and
(iii) asymptotic frequency synchronization: each solution starting in arc_n(γmax) achieves frequency synchronization.
Moreover, the following converse statement is true: given an interval of frequencies [ωmin, ωmax], the coupling strength K must satisfy K > ωmax − ωmin if, for all frequencies ω supported on [ωmin, ωmax] and for the arc length γmax computed as above, the set arc_n(γmax) is positively invariant.
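The threshold quantities in Theorem 15.7 are easy to compute. A small Python helper (our own construction; it uses Kcritical = max_i ωi − min_i ωi as in the theorem) returns the critical coupling and the two arc lengths:

```python
import math

def kuramoto_arcs(omega, K):
    """For the all-to-all Kuramoto model with frequencies omega and coupling K,
    return (K_critical, gamma_min, gamma_max) as in Theorem 15.7:
    sin(gamma_min) = sin(gamma_max) = K_critical / K, with
    gamma_min in [0, pi/2[ and gamma_max in ]pi/2, pi]."""
    K_critical = max(omega) - min(omega)
    if K <= K_critical:
        raise ValueError("assumption K > K_critical is violated")
    gamma_min = math.asin(K_critical / K)
    gamma_max = math.pi - gamma_min
    return K_critical, gamma_min, gamma_max

Kc, gmin, gmax = kuramoto_arcs([-0.5, 0.0, 0.5], K=2.0)
# Kc = 1.0; sin(gmin) = sin(gmax) = 0.5, so gmin = pi/6 and gmax = 5*pi/6
```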
Proof. We start with statement (i). Define the function W : arc_n(π) → [0, π] mapping θ to the length of the shortest arc containing all angles θ1, . . . , θn. This arc has two boundary points: a counterclockwise maximum and a counterclockwise minimum. If U_max(θ) (resp. U_min(θ)) denotes the set of indices of the angles θ1, . . . , θn that are equal to the counterclockwise maximum (resp. the counterclockwise minimum), then W(θ) is the counterclockwise distance from the counterclockwise minimum to the counterclockwise maximum.
We now assume θ(0) ∈ arc_n(γ), for γ ∈ [γmin, γmax], and aim to show that θ(t) ∈ arc_n(γ) for all times t > 0. By continuity, arc_n(γ) is positively invariant if and only if W(θ(t)) does not increase at any time t such that W(θ(t)) = γ.
In the next equation we compute the maximum possible amount of infinitesimal increase of W(θ(t)) along system (15.13). We do this in a loose way here and refer to (Lin et al. 2007, Lemma 2.2) for a rigorous treatment. The statement is:

    D⁺W(θ(t)) := limsup_{Δt→0⁺} ( W(θ(t + Δt)) − W(θ(t)) ) / Δt = θ̇m(t) − θ̇k(t),

where m ∈ U_max(θ(t)) and k ∈ U_min(θ(t)) have the property that θ̇m(t) = max{θ̇m'(t) | m' ∈ U_max(θ(t))} and θ̇k(t) = min{θ̇k'(t) | k' ∈ U_min(θ(t))}. In components,

    D⁺W(θ(t)) = ωm − ωk − (K/n) Σ_{j=1}^n ( sin(θm(t) − θj(t)) + sin(θj(t) − θk(t)) ).

By the identity sin(x) + sin(y) = 2 sin((x + y)/2) cos((x − y)/2), this expression equals

    D⁺W(θ(t)) = ωm − ωk − (K/n) Σ_{i=1}^n 2 sin( (θm(t) − θk(t))/2 ) cos( (θm(t) − θi(t))/2 − (θi(t) − θk(t))/2 ).
Measuring angles counterclockwise and modulo 2π, the equality W(θ(t)) = γ implies θm(t) − θk(t) = γ, θm(t) − θi(t) ∈ [0, γ], and θi(t) − θk(t) ∈ [0, γ]. Moreover,

    min cos( (θm − θi)/2 − (θi − θk)/2 ) = cos( max | (θm − θi)/2 − (θi − θk)/2 | ) = cos(γ/2),

so that

    D⁺W(θ(t)) ≤ ωm − ωk − (K/n) Σ_{i=1}^n 2 sin(γ/2) cos(γ/2).

Applying the reverse identity 2 sin(x) cos(y) = sin(x − y) + sin(x + y), we obtain

    D⁺W(θ(t)) ≤ ωm − ωk − K sin(γ) ≤ ( max_i ωi − min_i ωi ) − K sin(γ).
Hence, W(θ(t)) does not increase at any t such that W(θ(t)) = γ if K sin(γ) ≥ Kcritical := max_i ωi − min_i ωi.
Given the structure of the level sets of γ ↦ K sin(γ), there exists an open interval of arc lengths γ ∈ [0, π] satisfying K sin(γ) ≥ max_i ωi − min_i ωi if and only if equation (15.14) is true with the strict inequality sign at γ = π/2, that is, if K > Kcritical. Additionally, if K > Kcritical, there exist a unique γmin ∈ [0, π/2[ and a unique γmax ∈ ]π/2, π] that satisfy equation (15.14) with the equality sign. In summary, for every γ ∈ [γmin, γmax], if W(θ(t)) = γ, then the arc-length W(θ(t)) is non-increasing. This concludes the proof of statement (i).
Moreover, pick ε ≤ γmax − γmin. For all γ ∈ [γmin + ε, γmax], there exists a positive δ(ε) with the property that, if W(θ(t)) = γ, then D⁺W(θ(t)) ≤ −δ(ε). Hence, each solution θ : R≥0 → T^n starting in arc_n(γmax) must satisfy W(θ(t)) ≤ γmin + ε after time at most (γmax − γmin)/δ(ε). This proves statement (ii).
Regarding statement (iii), we just proved that for every θ(0) ∈ arc_n(γmax) and for all γ ∈ ]γmin, γmax] there exists a finite time T ≥ 0 such that θ(t) ∈ arc_n(γ) for all t ≥ T and for some γ < π/2. It follows that |θi(t) − θj(t)| < π/2 for all {i, j} ∈ E and for all t ≥ T. We now invoke Corollary 14.3 to conclude the proof of statement (iii).
The converse statement can be established by noticing that all of the above inequalities and estimates are exact for a bipolar distribution of natural frequencies ωi ∈ {ωmin, ωmax} for all i ∈ {1, . . . , n}. The full proof is in (Dörfler and Bullo 2011).
Theorem 15.8 (Synchronization test I). Consider the coupled oscillator model (15.15) with frequencies ω ⊥ 1_n defined over a weighted undirected graph with Laplacian matrix L. Assume that λ2(L) is larger than a critical value Γcritical (condition (15.16)), and define γmax ∈ ]π/2, π] and γmin ∈ [0, π/2[ as the solutions to (π/2) sinc(γmax) = sin(γmin) = Γcritical/λ2(L). The following statements hold:
(i) phase cohesiveness: each solution starting in {θ ∈ arc_n(π) | ‖θ‖_{2,pairs} ≤ γ}, for γ ∈ [γmin, γmax], remains in {θ ∈ arc_n(π) | ‖θ‖_{2,pairs} ≤ γ} for all times;
(ii) asymptotic phase cohesiveness: each solution starting in {θ ∈ arc_n(π) | ‖θ‖_{2,pairs} < γmax} asymptotically reaches the set {θ ∈ arc_n(π) | ‖θ‖_{2,pairs} ≤ γmin}; and
(iii) asymptotic frequency synchronization: each solution starting in {θ ∈ arc_n(π) | ‖θ‖_{2,pairs} < γmax} achieves frequency synchronization.
The proof of Theorem 15.8 follows the reasoning of the proof of Theorem 15.7 using the quadratic Lyapunov function ‖θ‖²_{2,pairs}. The full proof is in (Dörfler and Bullo 2012, Appendix B).
Theorem 15.9 (Synchronization test II). Consider the coupled oscillator model (15.15) with frequencies ω ⊥ 1_n defined over a weighted undirected graph with Laplacian matrix L. Assume

    λ2(L) > ‖ω‖_{2,edges},   (15.17)

and define γmin ∈ [0, π/2[ as the solution to sin(γmin) = ‖ω‖_{2,edges}/λ2(L). Then there exists a locally exponentially stable equilibrium set [θ*] satisfying |θ*_i − θ*_j| ≤ γmin for all {i, j} ∈ E.
Proof. Lemma 14.2 guarantees local exponential stability of an equilibrium set [θ*] satisfying |θ*_i − θ*_j| ≤ γ for all {i, j} ∈ E and for some γ ∈ [0, π/2[. In the following we establish conditions for the existence of equilibria in this particular set Δ(γ) = {θ ∈ T^n | |θi − θj| ≤ γ for all {i, j} ∈ E}. The equilibrium equations can be written as

    ω = L(B^T θ) θ,   (15.18)

where L(B^T θ) = B diag({a_ij sinc(θi − θj)}_{{i,j}∈E}) B^T is the Laplacian matrix associated with the graph G̃ = ({1, . . . , n}, E, Ã) with nonnegative edge weights ã_ij = a_ij sinc(θi − θj) ≥ a_ij sinc(γ) > 0 for {i, j} ∈ E and θ ∈ Δ(γ). Since for any weighted Laplacian matrix L we have L†L = LL† = I_n − (1/n)1_n 1_n^T, a multiplication of equation (15.18) from the left by B^T L(B^T θ)† yields

    B^T L(B^T θ)† ω = B^T θ.   (15.19)
Note that the left-hand side of equation (15.19) is a continuous² function of θ for θ ∈ Δ(γ). Consider the formal substitution x = B^T θ, the compact and convex set S_∞(γ) = {x ∈ Img(B^T) | ‖x‖_∞ ≤ γ} (corresponding to Δ(γ)), and the continuous map f : S_∞(γ) → Img(B^T) given by f(x) = B^T L(x)† ω. Then equation (15.19) is equivalent to the fixed-point equation

    f(x) = x.

We invoke Brouwer's Fixed Point Theorem, which states that every continuous map from a compact and convex set to itself has a fixed point; see for instance (Spanier 1994, Section 7, Corollary 8).
Since the analysis of the map f in the ∞-norm is very hard in the general case, we resort to a 2-norm analysis and restrict ourselves to the set S_2(γ) = {x ∈ Img(B^T) | ‖x‖_2 ≤ γ} ⊆ S_∞(γ). The set S_2(γ) corresponds to the set {θ ∈ T^n | ‖θ‖_{2,edges} ≤ γ} in θ-coordinates. By Brouwer's Fixed Point Theorem, there exists a solution x ∈ S_2(γ) to the equation x = f(x) if ‖f(x)‖_2 ≤ γ for all x ∈ S_2(γ), or equivalently if

    max_{x ∈ S_2(γ)} ‖ B^T L(x)† ω ‖_2 ≤ γ.   (15.20)

After some bounding (see (Dörfler and Bullo 2012, Appendix C) for details), we arrive at

    max_{x ∈ S_2(γ)} ‖ B^T L(x)† ω ‖_2 ≤ ‖ω‖_{2,edges} / ( λ2(L) sinc(γ) ).
² The continuity can be established by re-writing equations (15.18) and (15.19) in the quotient space orthogonal to 1_n, where L(B^T θ) is nonsingular, and by using the fact that the inverse of a nonsingular matrix is a continuous function of its elements. See also (Rakočević 1997, Theorem 4.2) for necessary and sufficient conditions for continuity of the Moore-Penrose inverse, requiring that L(B^T θ) has constant rank for θ ∈ Δ(γ).
The term on the right-hand side of the above inequality has to be less than or equal to γ. In summary, we conclude that there is a locally exponentially stable synchronization set [θ*] ⊆ {θ ∈ T^n | ‖θ‖_{2,edges} ≤ γ} ∩ Δ(γ) if

    λ2(L) sin(γ) ≥ ‖ω‖_{2,edges}.   (15.21)

Since the left-hand side of (15.21) is a concave function of γ ∈ [0, π/2[, there exists an open set of γ ∈ [0, π/2[ satisfying equation (15.21) if and only if equation (15.21) is true with the strict inequality sign at γ = π/2, which corresponds to condition (15.17). Additionally, if these two equivalent statements are true, then there exists a unique γmin ∈ [0, π/2[ that satisfies equation (15.21) with the equality sign, namely sin(γmin) = ‖ω‖_{2,edges}/λ2(L). This concludes the proof.
15.3 Exercises
E15.1 Potential and order parameter. Recall U(θ) = Σ_{{i,j}∈E} a_ij (1 − cos(θi − θj)). Prove U(θ) = (Kn/2)(1 − r²) for a complete homogeneous graph with coupling strength a_ij = K/n.
E15.2 Analysis of the two-node case. Present a complete analysis of a system of two coupled oscillators:

    θ̇1 = ω1 − a12 sin(θ1 − θ2),
    θ̇2 = ω2 − a21 sin(θ2 − θ1),

where a12 = a21 and ω1 + ω2 = 0. When do equilibria exist? What are their stability properties and their basins of attraction?
E15.3 Averaging analysis of coupled oscillators in a semi-circle. Consider the coupled oscillator model (15.1) with θ ∈ arc_n(γ) for some γ < π. Show that the coordinate transformation x_i = tan(θ_i), with x_i ∈ R, gives the averaging system (15.2) with b_ij ≥ a_ij cos(γ/2).
E15.4 Phase synchronization in a spring network. Consider the spring network from Example #1 in Subsection 14.2.1 with identical oscillators, no external torques, and a connected, undirected, and weighted graph:

    M_i θ̈_i + D_i θ̇_i + Σ_{j=1}^n a_ij sin(θi − θj) = 0,   i ∈ {1, . . . , n}.

Prove the phase synchronization result (as in Theorem 15.3) for this spring network.
E15.5 Synchronization on acyclic graphs. For frequencies satisfying Σ_{i=1}^n ωi = 0, consider the coupled oscillator model

    θ̇i = ωi − Σ_{j=1}^n a_ij sin(θi − θj).

Assume the adjacency matrix A with elements a_ij = a_ji ∈ {0, 1} is associated to an undirected, connected, and acyclic graph. Show that the following statements are equivalent:
(i) there exists a locally stable frequency-synchronized solution in the set {θ ∈ T^n | |θi − θj| < π/2 for all {i, j} ∈ E};
(ii) ‖B^T L† ω‖_∞ < 1, where B and L are the network incidence and Laplacian matrices.
Hint: Follow the derivation in Example 8.12.
E15.6 Distributed averaging-based PI control for coupled oscillators. Consider a set of n controllable coupled oscillators governed by the second-order dynamics

    θ̇i = ωi,   (E15.1a)
    M_i ω̇i = −D_i ωi − Σ_{j=1}^n a_ij sin(θi − θj) + u_i,   (E15.1b)

where i ∈ {1, . . . , n} is the index set, each oscillator has the state (θi, ωi) ∈ T^1 × R, u_i ∈ R is a control input to oscillator i, and M_i > 0 and D_i > 0 are the inertia and damping coefficients. The oscillators are coupled through an undirected, connected, and weighted graph G = (V, E, A) with node set V = {1, . . . , n}, edge set E ⊆ V × V, and adjacency matrix A = A^T ∈ R^{n×n}. To reject disturbances affecting the oscillators, consider the distributed averaging-based integral controller (see Exercise E6.17)

    u_i = −q_i,   (E15.2a)
    q̇_i = ωi − Σ_{j=1}^n b_ij (q_i − q_j),   (E15.2b)
where q_i ∈ R is a controller state for each oscillator i ∈ {1, . . . , n}, and the matrix B with elements b_ij is the adjacency matrix of an undirected and connected graph. Your tasks are as follows:
(i) characterize the set of equilibria (θ*, ω*, q*) of the closed-loop system (E15.1)–(E15.2);
(ii) show that all trajectories converge to the set of equilibria; and
(iii) show that the phase synchronization set {θ ∈ T^n | θi = θj for all i, j ∈ {1, . . . , n}} together with ω = q = 0_n is an equilibrium and that it is locally asymptotically stable.
Chapter 16
Virus Propagation: Basic Models
In this chapter and the next we present simple models for the diffusion and propagation of infectious diseases. The proposed models may also be relevant in the context of the propagation of information or signals in a communication network and the diffusion of innovations in competitive economic networks. Other interesting propagation phenomena include failures in power networks and wildfires in forests.
In this chapter and the next, we are interested in (1) models (lumped vs network, deterministic vs
stochastic), (2) asymptotic behaviors (vanishing infection, steady-state epidemic, full contagion), and (3) the
transient propagation of epidemics starting from small initial fractions of infected nodes (possible epidemic
outbreak as opposed to monotonically vanishing infection). In the interest of clarity, we begin with lumped
variables, i.e., variables which represent an entire well-mixed population of nodes. The next chapter will
discuss distributed variable models, i.e., network models. We study three low-dimensional deterministic
models in which nodes may be in one of two or three states; see Figure 16.1.
Figure 16.1: The three basic models SI, SIS and SIR for the propagation of an infectious disease
We say that an epidemic outbreak takes place if a small initial fraction of infected individuals leads to the contagion of a significant fraction of the population. We say the system displays an epidemic threshold if epidemic outbreaks occur when some combined value of parameters and initial conditions is above a critical value.
via the following first-order differential equation, called the susceptible–infected (SI) model:

    ẋ(t) = β s(t) x(t) = β (1 − x(t)) x(t),   (16.1)

where β > 0 is the infection rate. We will see distributed and stochastic versions of this model later in the chapter. A simple qualitative analysis of this equation can be performed by plotting ẋ versus x; see Figure 16.2.
Figure 16.2: Plot of the rate ẋ = β(1 − x)x as a function of x ∈ [0, 1].
Remark 16.1 (Heuristic modeling assumptions and derivation). Over the interval (t, t + Δt), pairwise meetings between individuals in the population take place in the following fashion: assume the population has n individuals, pick a meeting rate βm > 0, and assume that n βm Δt individuals will meet other n βm Δt individuals. Assuming meetings involve uniformly-selected individuals, over the interval (t, t + Δt) there are s(t)² n βm Δt meetings between a susceptible and another susceptible individual; these meetings, as well as meetings between infected individuals, result in no epidemic propagation. However, there will also be s(t)x(t) n βm Δt + x(t)s(t) n βm Δt meetings between a susceptible and an infected individual. We assume a fraction βi ∈ [0, 1], called the transmission rate, of these meetings results in the successful transmission of the infection:

    βi ( s(t)x(t) n βm Δt + x(t)s(t) n βm Δt ) = 2 βi βm x(t) s(t) n Δt,

and the SI model (16.1) is the limit as Δt → 0⁺, where the infection parameter β is twice the product of the meeting rate βm and the infection transmission fraction βi.
Lemma 16.2 (Dynamical behavior of the SI model). Consider the SI model (16.1). The solution from initial condition x(0) = x0 ∈ [0, 1] is

    x(t) = x0 e^{βt} / ( 1 − x0 + x0 e^{βt} ).    (16.2)

From all positive initial conditions 0 < x0 < 1, the solution x(t) is monotonically increasing and converges to the unique equilibrium 1 as t → ∞.
It is easy to see that the SI model (16.1) results in an evolution akin to a logistic curve; see Figure 16.3.
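As a quick numerical sanity check of Lemma 16.2 (an illustration of ours, not part of the original lectures; see also Exercise E16.1), the following Python sketch compares the closed-form solution (16.2) against a direct Runge-Kutta integration of (16.1). All function names and parameter values are our own choices.

```python
import math

def si_closed_form(t, x0, beta):
    # Equation (16.2): x(t) = x0 e^{beta t} / (1 - x0 + x0 e^{beta t})
    e = math.exp(beta * t)
    return x0 * e / (1 - x0 + x0 * e)

def si_rk4(x0, beta, t_end, steps=10_000):
    # Integrate xdot = beta (1 - x) x with the classical Runge-Kutta scheme.
    f = lambda x: beta * (1 - x) * x
    x, h = x0, t_end / steps
    for _ in range(steps):
        k1 = f(x); k2 = f(x + h*k1/2); k3 = f(x + h*k2/2); k4 = f(x + h*k3)
        x += h * (k1 + 2*k2 + 2*k3 + k4) / 6
    return x

beta, x0 = 1.0, 0.01
for t in (1.0, 4.0, 8.0):
    assert abs(si_rk4(x0, beta, t) - si_closed_form(t, x0, beta)) < 1e-6
# The solution increases toward the equilibrium 1, along a logistic curve:
assert si_closed_form(50.0, x0, beta) > 0.999
```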
Lectures on Network Systems, F. Bullo, Version v0.85(i) (6 Aug 2016). Draft not for circulation. Copyright 2012-16.
16.2 The SIR model

Figure 16.3: Evolution of the fraction of infected individuals in the (lumped deterministic) SI model (β = 1), from initial conditions in the range [0.001, 0.5].
The susceptible-infected-recovered (SIR) model additionally describes a recovered fraction r(t) of individuals who, at a recovery rate γ > 0, acquire permanent immunity:

    ṡ(t) = −β s(t) x(t),
    ẋ(t) = β s(t) x(t) − γ x(t),    (16.3)
    ṙ(t) = γ x(t).
Remark 16.3 (Heuristic modeling assumptions and derivation). One can show that the constant recovery rate assumption corresponds to assuming a so-called Poisson recovery process for the stochastic version of the SIR model. This is arguably not a very realistic assumption.
Lemma 16.4 (Dynamical behavior of the SIR model). Consider the SIR model (16.3). From each initial condition s(0) + x(0) + r(0) = 1 with s(0) > 0, x(0) > 0 and r(0) ≥ 0, the resulting trajectory t ↦ (s(t), x(t), r(t)) has the following properties:
(i) s(t) > 0, x(t) > 0, r(t) ≥ 0, and s(t) + x(t) + r(t) = 1 for all t ≥ 0;
(ii) t ↦ s(t) is monotonically decreasing and t ↦ r(t) is monotonically increasing;
(iii) lim_{t→∞} (s(t), x(t), r(t)) = (s∞, 0, r∞), where r∞ is the unique solution to the equality

    1 − r∞ = s(0) exp( −(β/γ)(r∞ − r(0)) );    (16.4)

(iv) if β s(0)/γ < 1, then t ↦ x(t) monotonically and exponentially decreases to zero as t → ∞;
(v) if β s(0)/γ > 1, then t ↦ x(t) first monotonically increases to a maximum value and then monotonically decreases to zero as t → ∞ (we describe this case as an epidemic outbreak, that is, an exponential growth of t ↦ x(t) for small times).
Figure 16.4: Left figure: evolution of the (lumped deterministic) SIR model from a small initial fraction of infected individuals (and zero recovered); parameters β = 2, γ = 1/4 (case (v) in Lemma 16.4). Right figure: intersection between the two curves 1 − r∞ and s(0) e^{−(β/γ) r∞} in equation (16.4), with s(0) = 0.95, r(0) = 0 and β/γ ∈ {1/4, 4}. If β/γ = 1/4, then 0.05 < r∞ < 0.1. If β/γ = 4, then 0.95 < r∞.
Equation (16.4) follows by taking the limit as t → ∞ and noting that 1 = s(t) + x(t) + r(t) for all times; in particular, 1 = s∞ + r∞. The uniqueness of the solution r∞ to equation (16.4) follows from showing there exists a unique intersection between its left and right hand sides, as illustrated in Figure 16.4. This concludes the proof of statement (iii).
Regarding statement (iv), note that s(t) being monotonically decreasing and β s(0)/γ < 1 together imply β s(t)/γ ≤ β s(0)/γ < 1 for all time t. This implies ẋ(t) = β s(t) x(t) − γ x(t) ≤ (β s(0) − γ) x(t) < 0. By the Grönwall-Bellman Comparison Lemma in Exercise E13.1, we now know that x(t) ≤ y(t), where ẏ = (β s(0) − γ) y, so that both y and x decrease exponentially fast to zero. This concludes the proof of statement (iv).
Regarding statement (v), because ẋ(t) = (β s(t) − γ) x(t) and x ∈ [0, 1], we know the sign of ẋ(t) is equal to the sign of β s(t) − γ. By assumption, we start with β s(0) − γ > 0. From statement (ii), we know t ↦ s(t) is monotonically decreasing. It remains to show that β s(t) − γ crosses the zero value in finite time. By contradiction, assume β s(t) − γ ≥ 0 for all time t. Then ẋ(t) ≥ 0 and hence x(t) ≥ x(0) for all time t. In turn, this implies that ṡ(t) ≤ −β x(0) s(t) and, via the Grönwall-Bellman Comparison Lemma in Exercise E13.1, that s(t) decreases to zero exponentially fast as time diverges. This is a contradiction and concludes the proof of statement (v).
16.3 The SIS model

In the susceptible-infected-susceptible (SIS) model, infected individuals recover at a recovery rate γ > 0 and become susceptible again, so that the infected fraction obeys

    ẋ(t) = β (1 − x(t)) x(t) − γ x(t) = (β − γ − β x(t)) x(t).    (16.5)

Figure 16.5: Phase portrait ẋ = (β − γ − βx)x of the (lumped deterministic) SIS model for β = 1 < γ = 3/2 and for β = 1 > γ = 1/2.
Lemma 16.5 (Dynamical behavior of the SIS model). For the SIS model (16.5):
(i) the closed form solution to equation (16.5) from initial condition x(0) = x0 ∈ [0, 1], for β ≠ γ, is

    x(t) = (β − γ) x0 / ( β x0 − e^{−(β−γ)t} ( γ − β(1 − x0) ) ),    (16.6)
(ii) if β ≤ γ, all trajectories converge to the unique equilibrium x* = 0 (i.e., the epidemic disappears), and
(iii) if β > γ, then, from all positive initial conditions x(0) > 0, all trajectories converge to the unique exponentially stable equilibrium x* = (β − γ)/β < 1 (epidemic outbreak and steady-state epidemic contagion).
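The closed-form solution (16.6) makes the threshold behavior of Lemma 16.5 easy to verify numerically; the following sketch (ours; see also Exercise E16.1) evaluates it in both parameter regimes.

```python
import math

def sis_closed_form(t, x0, beta, gamma):
    # Equation (16.6), valid for beta != gamma.
    return (beta - gamma) * x0 / (
        beta * x0 - math.exp(-(beta - gamma) * t) * (gamma - beta * (1 - x0)))

beta, gamma, x0 = 1.0, 0.5, 0.01
assert abs(sis_closed_form(0.0, x0, beta, gamma) - x0) < 1e-12
# Above the threshold (beta > gamma), convergence to x* = (beta - gamma)/beta:
assert abs(sis_closed_form(60.0, x0, beta, gamma) - (beta - gamma) / beta) < 1e-9
# Below the threshold (beta < gamma), the epidemic disappears:
assert sis_closed_form(60.0, x0, 0.5, 1.0) < 1e-9
```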
Figure 16.6: Evolution of the (lumped deterministic) SIS model from a small initial fraction of infected individuals; β = 1 > γ = 0.5, so that the infected fraction converges to x* = (β − γ)/β = 0.5.
16.4 Exercises
E16.1 Closed-form solutions for SI and SIS models. Verify the correctness of the closed-form solutions for SI
and SIS models given in equations (16.2) and (16.6).
E16.2 Dynamical behavior of the SIS model. Prove Lemma 16.5.
Chapter 17
Virus Propagation in Contact Networks
In this chapter we continue our discussion of the diffusion and propagation of infectious diseases. Starting from the basic lumped models discussed in Chapter 16, we now focus on network models and discuss some stochastic modeling aspects.
We borrow ideas from the lecture notes by Zampieri (2013) and from Bullo et al. (2016). A detailed
survey about infectious diseases is (Hethcote 2000); a more recent survey is (Nowzari et al. 2016). A very
early work on epidemic models over networks, the spectral radius of the adjacency matrix and the epidemic
threshold is Lajmanovich and Yorke (1976). Later works on similar models include (Wang et al. 2003)
and Van Mieghem et al. (2009); Van Mieghem (2011). Our stochastic analysis is based on the approach
in (Mei and Bullo 2014). Recent extensions and general proofs for the deterministic SIS network model are
given by Khanafer et al. (2014). A related book chapter is (Newman 2010, Chapter 17). The network SIR
model is discussed by Youssef and Scoglio (2011).
17.1 The stochastic network SI model

The stochastic network SI model, illustrated in Figure 17.1, is defined as follows:
(i) We consider a group of n individuals. The state of each individual is either S for susceptible or I for
infected.
(ii) The n individuals are in pairwise contact, as specified by an undirected graph G with adjacency matrix A (without self-loops). The edge weights represent the frequency of contact between two individuals.
(iii) Each individual in susceptible status can transition to infected as follows: given an infection rate β > 0, if a susceptible individual i is in contact with an infected individual j for a duration Δt, the probability of infection is β aij Δt. Each individual can be infected by any neighboring individual: these random events are independent.
Figure 17.1: In the stochastic network SI model, each susceptible individual (blue) becomes infected by contact with infected individuals (red) in its neighborhood, according to an infection rate β.
An approximate deterministic model We define the infection variable at time t for individual i by

    Yi(t) = 1, if node i is in state I at time t,
    Yi(t) = 0, if node i is in state S at time t,

and the expected infection of individual i, which turns out to be equal to its probability of infection, by xi(t) = E[Yi(t)] = P[Yi(t) = 1].
In what follows it will be useful to approximate P[Yi(t) = 0 | Yj(t) = 1] with P[Yi(t) = 0], that is, to require Yi and Yj to be independent for arbitrary i and j. We claim this approximation is acceptable over certain graphs with large numbers n of individuals. The final model, which we obtain below based on the Independence Approximation, is an upper bound on the true model because P[Yi(t) = 0] ≥ P[Yi(t) = 0 | Yj(t) = 1].
Definition 17.1 (Independence Approximation). For any two individuals i and j, the infection variables
Yi and Yj are independent.
Theorem 17.2 (From the stochastic to the deterministic network SI model). Consider the stochastic
network SI model with infection rate β over a contact graph with adjacency matrix A. The probabilities of infection satisfy

    (d/dt) P[Yi(t) = 1] = β Σ_{j=1}^n aij P[Yi(t) = 0, Yj(t) = 1].
Moreover, under the Independence Approximation 17.1, the probabilities of infection xi(t) = P[Yi(t) = 1], i ∈ {1, ..., n}, satisfy the (deterministic) network SI model defined by

    ẋi(t) = β (1 − xi(t)) Σ_{j=1}^n aij xj(t).
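The relation between the stochastic model and its mean-field approximation can be illustrated by simulation. The sketch below is our own illustration: the complete contact graph, the discrete-time infection scheme with step dt, and the Euler integrator are all our choices. It compares a Monte Carlo average of the stochastic network SI model with the deterministic model of Theorem 17.2, and observes empirically that the independence approximation overestimates the infection level.

```python
import random

random.seed(1)
beta = 0.3
T, steps = 2.0, 40
dt = T / steps
n = 8
# Complete contact graph (a_ij = 1 for i != j); individual 0 starts infected.
A = [[0 if i == j else 1 for j in range(n)] for i in range(n)]

def stochastic_si_trial():
    # Discrete-time scheme: over each interval dt, susceptible i is infected by
    # infected neighbor j independently with probability beta * a_ij * dt.
    Y = [1] + [0] * (n - 1)
    for _ in range(steps):
        newY = Y[:]
        for i in range(n):
            if Y[i] == 0:
                for j in range(n):
                    if Y[j] == 1 and random.random() < beta * A[i][j] * dt:
                        newY[i] = 1
                        break
        Y = newY
    return sum(Y) / n

def deterministic_si():
    # Euler integration (same dt) of the deterministic network SI model above.
    x = [1.0] + [0.0] * (n - 1)
    for _ in range(steps):
        x = [xi + dt * beta * (1 - xi) * sum(A[i][j] * x[j] for j in range(n))
             for i, xi in enumerate(x)]
    return sum(x) / n

trials = 500
mc = sum(stochastic_si_trial() for _ in range(trials)) / trials
det = deterministic_si()
# Empirically, the mean-field model overestimates the Monte Carlo average:
assert 0.4 < mc < det <= 1.0
assert det - mc < 0.3
```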
Proof. Conditioned upon the state Y(t) at time t, the probability that a susceptible individual i remains susceptible over the interval (t, t + Δt) is

    P[Yi(t + Δt) = 0 | Yi(t) = 0, Y(t)] = Π_{j=1}^n (1 − β aij Yj(t) Δt) = 1 − β Σ_{j=1}^n aij Yj(t) Δt + O(Δt²),

where O(Δt²) is a function upper bounded by a constant times Δt². The complementary probability, i.e., the probability of infection in time Δt, is

    P[Yi(t + Δt) = 1 | Yi(t) = 0, Y(t)] = β Σ_{j=1}^n aij Yj(t) Δt + O(Δt²).
We are now ready to study the random variable Yi(t + Δt) − Yi(t), given Y(t):

    E[Yi(t + Δt) − Yi(t) | Y(t)]
      = 1 · P[Yi(t + Δt) = 1, Yi(t) = 0 | Y(t)]
        + 0 · P[ (Yi(t + Δt) = Yi(t) = 0) or (Yi(t + Δt) = Yi(t) = 1) | Y(t) ]    (by def. expectation)
      = P[Yi(t + Δt) = 1 | Yi(t) = 0, Y(t)] · P[Yi(t) = 0 | Y(t)]    (by conditional prob.)
      = ( β Σ_{j=1}^n aij Yj(t) Δt + O(Δt²) ) · P[Yi(t) = 0 | Y(t)].
Averaging the previous equality over all values of the random vector Y(t) (where, for example, the first summation in the resulting expression is taken over all possible values yi that the variable Yi(t) takes), we know, in summary,

    E[Yi(t + Δt) − Yi(t)] = β Σ_{j=1}^n aij Δt P[Yi(t) = 0, Yj(t) = 1] + O(Δt²).
The final step is an immediate consequence of the Independence Approximation: P[Yi(t) = 0, Yj(t) = 1] = P[Yi(t) = 0 | Yj(t) = 1] · P[Yj(t) = 1] = (1 − P[Yi(t) = 1]) · P[Yj(t) = 1].
Figure 17.2: In the (deterministic) network SI model, each node is described by a probability of infection taking value between 0 (blue) and 1 (red). The rate at which individuals become increasingly infected is parametrized by the infection rate β.
17.2 The network SI model

Consider an undirected weighted graph G = (V, E) of order n with adjacency matrix A and degree matrix D = diag(A 1n). Let xi(t) ∈ [0, 1] denote the fraction of infected individuals at node i ∈ V at time t ∈ R≥0. The network SI model is

    ẋi(t) = β (1 − xi(t)) Σ_{j=1}^n aij xj(t).    (17.1)
Alternatively, in terms of the fractions of susceptible individuals s = 1n − x, the network SI model reads

    ṡ(t) = −β diag(s(t)) A (1n − s(t)).    (17.2)
Theorem 17.3 (Dynamical behavior of the network SI model). Consider the network SI model (17.1).
Assume G is connected so that A is irreducible; let D denote the degree matrix. The following statements hold:
(i) if x(0), s(0) ∈ [0, 1]^n, then x(t), s(t) ∈ [0, 1]^n for all t ≥ 0;
(ii) there are two equilibrium points: 0n (no epidemic), and 1n (full contagion);
(iii) the linearization of model (17.1) about the equilibrium point 0n is ẋ = β A x and it is exponentially unstable;
(iv) the linearization of model (17.2) about the equilibrium 0n is ṡ = −β D s and it is exponentially stable;
(v) each trajectory with initial condition x(0) ≠ 0n converges asymptotically to 1n, that is, the epidemic spreads to the entire network.
Proof. Statement (i) can be proved by evaluating the vector field (17.1) at the boundaries of the admissible state space, that is, for x ∈ [0, 1]^n such that at least one entry i satisfies xi ∈ {0, 1}. We leave the detailed proof of statement (i) to the reader.
We now prove statement (ii). The point x is an equilibrium point if and only if

    β (In − diag(x)) A x = 0n   ⟺   A x = diag(x) A x.

Clearly, 0n and 1n are equilibrium points. Hence we just need to show that no other points can be equilibria. First, suppose that there exists an equilibrium point x ≠ 0n with 0n ≤ x < 1n. But then In − diag(x) has strictly positive diagonal and therefore x must satisfy A x = 0n. Note that A x = 0n implies also Σ_{k=1}^{n−1} A^k x = 0n. If A is irreducible, then Σ_{k=1}^{n−1} A^k has all off-diagonal terms strictly positive. Because xi ∈ [0, 1), the only possible solution to A x = 0n is therefore x = 0n. This is a contradiction.
Next, suppose there exists an equilibrium point x = (x1, x2) with 0n1 ≤ x1 < 1n1, x2 = 1n2, and n1 + n2 = n. The equality A x = diag(x) A x implies A x = diag(x)^k A x for all k ∈ N and, in turn,

    A x = lim_{k→∞} diag(x)^k A x = [ 0n1×n1  0n1×n2 ; 0n2×n1  In2 ] A x.
By partitioning A in corresponding blocks, the previous equality implies A11 x1 + A12 x2 = 0n1. Because x2 = 1n2 we know that A12 = 0n1×n2 and, therefore, that A is reducible. This contradiction concludes the proof of statement (ii).
Statements (iii) and (iv) are straightforward computations:

    ẋ = β (In − diag(x)) A x = β A x − β diag(x) A x ≈ β A x,
    ṡ = −β diag(s) A (1n − s) = −β diag(s) A 1n + β diag(s) A s = −β D s + β diag(s) A s ≈ −β D s,

where we used the equality diag(y) z = diag(z) y for y, z ∈ R^n. Exponential stability of the linearization ṡ = −β D s is obvious, and the Perron-Frobenius Theorem 2.12 for irreducible matrices implies the existence of the unstable positive eigenvalue β λmax(A) > 0 for the linearization ẋ = β A x.
To show statement (v), consider the function V(x) = 1n^T (1n − x); this is a smooth function defined over the compact and forward invariant set [0, 1]^n (see (i)). We compute V̇ = −β 1n^T (In − diag(x)) A x and note that V̇ ≤ 0 for all x, and V̇(x) = 0 if and only if x ∈ {0n, 1n}. Because of these facts, the LaSalle Invariance Principle in Theorem 13.4 implies all trajectories converge asymptotically to either 1n or 0n. Additionally, note that 0 ≤ V(x) ≤ n for all x ∈ [0, 1]^n, that V(x) = 0 if and only if x = 1n, and that V(x) = n if and only if x = 0n. Therefore, all trajectories with x(0) ≠ 0n converge asymptotically to 1n.
Before proceeding, we review the notion of dominant eigenvector and introduce some notation. Let λmax = λmax(A) be the dominant eigenvalue of the adjacency matrix A and let vmax be the corresponding positive eigenvector normalized to satisfy 1n^T vmax = 1. (Recall that these definitions are well posed because of the Perron-Frobenius Theorem 2.12 for irreducible matrices.) Additionally, let vmax, v2, ..., vn denote an orthonormal set of eigenvectors with corresponding eigenvalues λmax > λ2 ≥ ... ≥ λn for the symmetric adjacency matrix A.
Consider now the onset of an epidemic in a large population characterized by a small initial infection x(0) = x0 ≪ 1n. So long as x(t) ≪ 1n, the system evolution is approximated by ẋ = β A x. This initial-times linear evolution satisfies

    x(t) = (vmax^T x0) e^{β λmax t} vmax + Σ_{i=2}^n (vi^T x0) e^{β λi t} vi
         = e^{β λmax t} ( (vmax^T x0) vmax + o(t) ),    (17.3)

where o(t) is a function exponentially vanishing as t → ∞. In other words, the epidemic initially experiences exponential growth with rate β λmax and with distribution among the nodes given by the eigenvector vmax.
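The dominant eigenpair (λmax, vmax) appearing in the onset approximation (17.3) can be computed by plain power iteration. The sketch below is ours; the example graph (a triangle with a pendant node) is connected and non-bipartite, so the iteration converges, and it also illustrates that the onset concentrates on the best-connected node.

```python
def dominant_eigenpair(A, iters=500):
    # Plain power iteration; adequate for small connected, non-bipartite graphs.
    n = len(A)
    v = [1.0 / n] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)              # scaling factor; converges to lambda_max
        v = [wi / lam for wi in w]
    s = sum(v)
    return lam, [vi / s for vi in v]  # normalize so that the entries sum to 1

# Triangle {0, 1, 2} plus a pendant node 3 attached to node 0.
A = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
lam, v = dominant_eigenpair(A)
assert 2.0 <= lam <= 3.0          # mean degree <= lambda_max <= max degree
assert abs(sum(v) - 1.0) < 1e-12
assert v[0] == max(v)             # the onset concentrates on the hub node
# (lam, v) is indeed an eigenpair up to numerical tolerance:
resid = [sum(A[i][j] * v[j] for j in range(4)) - lam * v[i] for i in range(4)]
assert max(abs(ri) for ri in resid) < 1e-9
```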
17.3 The network SIS model

The network SIS model extends the network SI model (17.1) with a recovery process at rate γ > 0: in components,

    ẋi(t) = β (1 − xi(t)) Σ_{j=1}^n aij xj(t) − γ xi(t),    (17.4)

or, in vector form,

    ẋ = β diag(1n − x) A x − γ x.    (17.5)

We start our analysis with useful preliminary notions. We define the monotonically-increasing functions

    f+(y) = y / (1 + y)   and   f−(z) = z / (1 − z),
for y ∈ R≥0 and z ∈ [0, 1). One can easily verify that f+(f−(z)) = z for all z ∈ [0, 1). For vector variables y ∈ R^n≥0 and z ∈ [0, 1)^n, we write F+(y) = (f+(y1), ..., f+(yn)) and F−(z) = (f−(z1), ..., f−(zn)).
Denoting Ã = β A / γ and assuming x < 1n, the model (17.5) is rewritten as

    ẋ = F(x) := γ diag(1n − x) ( Ã x − F−(x) ),

so that

    F(x) ≥ 0n   ⟺   Ã x ≥ F−(x)   ⟺   F+(Ã x) ≥ x.

Moreover, x* is an equilibrium point (F(x*) = 0n) if and only if Ã x* = F−(x*) or, equivalently, if and only if F+(Ã x*) = x*. We are now ready to present our results in two theorems.
Theorem 17.4 (Dynamical behavior of the network SIS model: below the threshold). Consider the network SIS model (17.4) over an undirected graph G with infection rate β and recovery rate γ. Assume G is connected and let A be its adjacency matrix with dominant eigenvalue λmax. If β λmax / γ < 1, then
(i) there exists a unique equilibrium point 0n,
(ii) the linearization of model (17.4) about the equilibrium 0n is ẋ = (β A − γ In) x and it is exponentially stable; and
(iii) from any initial condition x(0) ≠ 0n, the weighted average t ↦ vmax^T x(t) is monotonically and exponentially decreasing to zero.
Theorem 17.5 (Dynamical behavior of the network SIS model: above the threshold). Consider the network SIS model (17.4) over an undirected graph G with infection rate β and recovery rate γ. Assume G is connected, let A be its adjacency matrix with dominant eigenpair (λmax, vmax) and with degree vector d = A 1n. If β λmax / γ > 1, then
(i) 0n is an equilibrium point, the linearization of system (17.5) at 0n is unstable with dominant unstable eigenvalue β λmax − γ and with dominant eigenvector vmax, i.e., there will be an epidemic outbreak;
(ii) besides the equilibrium 0n, there exists a unique other equilibrium point x* such that
  a) x* > 0n,
  b) x* = ε vmax + O(ε²), for ε := β λmax / γ − 1, as ε → 0+,
  c) x* = 1n − (γ/β) diag(d)^{−1} 1n + O(γ²/β²), at fixed A as γ/β → 0+,
  d) x* = lim_{k→∞} y(k), where the monotonically-increasing sequence {y(k)}k∈Z≥0 ⊂ [0, 1]^n is defined by

      yi(k + 1) := f+( Σ_{j=1}^n ãij yj(k) ),    y(0) := (ε / (1 + ε)²) vmax,

(iii) if x(0) ≠ 0n, then x(t) → x* as t → ∞. Moreover, if x(0) < x* (resp. x(0) > x*), then t ↦ x(t) is monotonically increasing (resp. decreasing).
Note: statement (i) means that, near the onset of an epidemic outbreak, the exponential growth rate is β λmax − γ and the outbreak tends to align with the dominant eigenvector vmax, as in the discussion leading up to the approximate evolution (17.3).
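Statement (ii)d suggests a practical way to compute the endemic state x*: iterate y(k + 1) = F+(Ã y(k)). The sketch below (ours) does so on a complete graph, starting for simplicity from a small uniform positive vector rather than from (ε/(1 + ε)²) vmax; in this symmetric example the iteration converges to the same fixed point, which can also be computed in closed form.

```python
def f_plus(y):
    return y / (1 + y)

def sis_equilibrium(A, beta, gamma, iters=2_000):
    # Fixed-point iteration x(k+1) = F+((beta/gamma) A x(k)), per statement (ii)d;
    # started from a small uniform positive vector (our simplification).
    n = len(A)
    x = [1e-3] * n
    for _ in range(iters):
        x = [f_plus((beta / gamma) * sum(A[i][j] * x[j] for j in range(n)))
             for i in range(n)]
    return x

# Complete graph on 4 nodes: lambda_max = 3; beta/gamma = 1 > 1/3, above threshold.
A = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
beta, gamma = 1.0, 1.0
x = sis_equilibrium(A, beta, gamma)
# By symmetry all entries agree; the scalar fixed point solves x = 3x/(1+3x), so x = 2/3.
assert all(abs(xi - 2/3) < 1e-9 for xi in x)
# The limit satisfies the equilibrium condition F+(A~ x*) = x*:
assert all(abs(f_plus(sum(A[i][j] * x[j] for j in range(4))) - x[i]) < 1e-9
           for i in range(4))
```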
Proof of selected statements in Theorem 17.5. Statement (i) follows from the same analysis of the linearized
system as in the proof of Theorem 17.4(ii).
We next focus on statement (ii). We begin by establishing two properties of the map x ↦ F+(Ã x). First, we claim that y > z ≥ 0n implies F+(Ã y) > F+(Ã z). Indeed, note that G being connected implies that the adjacency matrix A has at least one strictly positive entry in each row. Hence, y − z > 0n implies Ã(y − z) > 0n and, since f+ is monotonically increasing, Ã y > Ã z implies F+(Ã y) > F+(Ã z).
Second, we claim that there exists an x̄ ∈ [0, 1]^n satisfying F+(Ã x̄) > x̄. Indeed, let λ̃max = λmax(Ã) = β λmax(A)/γ > 1 and compute, for any α > 0,

    F+( Ã(α vmax) )_i = f+( α λ̃max vmax,i ) > α λ̃max vmax,i (1 − α λ̃max vmax,i) = α vmax,i ( λ̃max − α λ̃max² vmax,i ),

where we used the scalar inequality y/(1 + y) > y(1 − y), for y > 0. For α = (λ̃max − 1)/λ̃max² and recalling vmax,i < 1 for each i, compute

    λ̃max − α λ̃max² vmax,i = λ̃max − (λ̃max − 1) vmax,i > λ̃max − (λ̃max − 1) = 1.

Therefore F+(Ã(α vmax)) > α vmax, so that the claim holds with x̄ = α vmax.
Regarding the statement (ii)b, we claim there exists a bounded sequence {w(k, ε)}k∈Z≥0 ⊂ R^n such that the sequence {y(k)}k∈Z≥0 satisfies y(k) = ε vmax + ε² w(k, ε). The statement x* = ε vmax + O(ε²) is then an immediate consequence of this claim and of the limit lim_{k→∞} y(k) = x*. We prove the claim by induction. Because ε/(1 + ε)² = ε − 2ε² + O(ε³), the claim is true for k = 0 with w(0, ε) = −2 vmax + O(ε). We now assume the claim is true at k and show it is true at k + 1:

    y(k + 1) = F+( Ã(ε vmax + ε² w(k, ε)) )
             = F+( (1 + ε) ε vmax + ε² Ã w(k, ε) )
             = F+( ε vmax + ε² (Ã w(k, ε) + vmax) )
             = ε vmax + ε² (Ã w(k, ε) + vmax)
               − diag( ε vmax + ε² (Ã w(k, ε) + vmax) ) ( ε vmax + ε² (Ã w(k, ε) + vmax) ) + O(ε³)
             = ε vmax + ε² ( Ã w(k, ε) + vmax − diag(vmax) vmax + O(ε) ),

where we used λ̃max = 1 + ε and the Taylor expansion F+(y) = y − diag(y) y + O(‖y‖³). Hence, the claim is true if the sequence {w(k, ε)}k∈Z≥0 defined by

    w(k + 1, ε) = Ã w(k, ε) + vmax − diag(vmax) vmax + O(ε)

is bounded, which can be shown by a spectral argument. This concludes the proof of statement (ii)b. The proof of statement (ii)c is analogous: it suffices to show the existence of a bounded sequence {w(k)} such that y(k) = 1n − (γ/β) diag(d)^{−1} 1n + (γ/β)² w(k).
To complete the proof of statement (ii) we establish the uniqueness of the equilibrium x* ∈ [0, 1]^n \ {0n}. First, we claim that an equilibrium point with an entry equal to 0 must be 0n. Indeed, assume y is an equilibrium point and assume yi = 0 for some i ∈ {1, ..., n}. The equality yi = f+( Σ_{j=1}^n ãij yj ) implies that also any node j with aij > 0 must satisfy yj = 0. Because G is connected, all entries of y must be zero. Second, by contradiction, we assume there exists another equilibrium point y > 0n distinct from x*. Without loss of generality, assume there exists i such that yi < xi*. Let α ∈ (0, 1) satisfy y − α x* ≥ 0n and yi = α xi*. Note:

    ( F+(Ã y) − y )_i = f+(Ã y)_i − α xi*
                     ≥ f+( Ã(α x*) )_i − α xi*    (because Ã ≥ 0)
                     > α f+(Ã x*)_i − α xi*       (because f+(α y) > α f+(y) for α < 1)
                     = α ( F+(Ã x*) − x* )_i = 0. (because x* is an equilibrium)

Therefore ( F+(Ã y) − y )_i > 0 and this is a contradiction, because y is an equilibrium.
Regarding statement (iii) we refer to (Lajmanovich and Yorke 1976; Fall et al. 2007; Khanafer et al. 2014)
in the interest of brevity.
17.4 The network SIR model

The network SIR model is defined, for each node i ∈ {1, ..., n}, by

    ṡi(t) = −β si(t) Σ_{j=1}^n aij xj(t),
    ẋi(t) = β si(t) Σ_{j=1}^n aij xj(t) − γ xi(t),
    ṙi(t) = γ xi(t),

where β > 0 is the infection rate and γ > 0 is the recovery rate. Note that the third equation is redundant because of the constraint si(t) + xi(t) + ri(t) = 1 and that, therefore, we regard the dynamical system as described by the first two equations and write it in vector form as

    ṡ = −β diag(s) A x,
    ẋ = β diag(s) A x − γ x.    (17.6)
Theorem 17.6 (Dynamical behavior of the network SIR model). Consider the network SIR model (17.6) over an undirected graph G with infection rate β and recovery rate γ. Assume G is connected and let A be its adjacency matrix. Let (λmax,t, vmax,t) be the dominant eigenpair for the nonnegative matrix diag(s(t)) A. The following statements hold:
(ii) the set of equilibrium points is the set of pairs (s*, 0n), for any s* ∈ [0, 1]^n, and the linearization of model (17.6) about an equilibrium point (s*, 0n) is

    ṡ = −β diag(s*) A x,
    ẋ = β diag(s*) A x − γ x;    (17.7)
(iii) (behavior above the threshold) if β λmax,0 > γ and x(0) ≠ 0n, then,
(iv) (behavior below the threshold) for all τ ≥ 0 such that β λmax,τ < γ, the weighted average t ↦ vmax,τ^T x(t), for t ≥ τ, is monotonically and exponentially decreasing to zero;
(v) each trajectory converges asymptotically to an equilibrium point, that is, lim_{t→∞} x(t) = 0n, so that the epidemic asymptotically disappears.
17.5 Exercises
E17.1 Network SI model in digraphs. Generalize Theorem 17.3 to the setting of strongly-connected directed
graphs:
(i) what are the equilibrium points?
(ii) what are their convergence properties?
E17.2 Initial evolution of network SIS model. Consider the network SIS model with initial fraction x(0) = ε x0, where we take x0 ≤ 1n and ε ≪ 1. Show that in the time scale t(ε) = ln(1/ε)/(β λmax − γ), the linearized evolution satisfies

    lim_{ε→0+} x( t(ε) ) = (vmax^T x0) vmax.
Chapter 18
Lotka-Volterra Population Dynamics
The Lotka-Volterra model is one of the simplest frameworks for modeling the dynamics of interacting
populations in mathematical ecology. These equations were originally developed in (Lotka 1920; Volterra
1928). Our treatment is based on (Goh 1980; Takeuchi 1996; Baigent 2010). We refer to (Baigent 2010) for additional results on conservative Lotka-Volterra models (Hamiltonian structure and existence of periodic orbits) and on competitive and monotone models. We refer to (Hofbauer and Sigmund 1998; Sandholm 2010) for comprehensive discussions about the connection with evolutionary game dynamics.
(a) Common Clownfish (Amphiprion ocellaris) near Magnificent Sea Anemones (Heteractis magnifica) on the Great Barrier Reef, Australia. Clownfish and anemones provide an example of ecological mutualism in that each species benefits from the activity of the other.
(b) The Canadian Lynx (Lynx canadensis) is a major predator of the Snowshoe Hare (Lepus americanus). Historical records of animal captures indicate that the lynx and hare numbers rise and fall periodically; see Odum (1959). Photo source: Rudolfos Usenet Animal Pictures Gallery (no longer in existence).
(c) Subadult male lion (Panthera leo) and spotted hyena (Crocuta crocuta) compete for the same resources in the Maasai Mara National Reserve in Narok County, Kenya. (Picture "Hyänen und Löwe im Morgenlicht" by lubye134 is licensed under CC BY 2.0.)
18.1 The Lotka-Volterra population model: setup

Single-species constant growth model In a simplest model, one may assume ẋ/x is equal to a constant growth rate r. This assumption however leads to exponential growth or decay x(t) = x(0) e^{rt}, depending upon whether r is positive or negative. Of course, exponential growth may be reasonable only for short periods of time and violates a reasonable assumption of bounded resources for large times.
Single-species logistic growth model In large populations it is natural to assume that resources would diminish with the growing size of the population. In a simplest model, one may assume ẋ/x = r(1 − x/K), where r > 0 is the intrinsic growth rate and K > 0 is called the carrying capacity. This assumption leads to the so-called logistic equation

    ẋ = r x (1 − x/K).    (18.1)
This dynamical system has the following behavior:
(iii) all solutions with 0 < x(0) < K are monotonically increasing and converge asymptotically to K,
(iv) all solutions with K < x(0) are monotonically decreasing and converge asymptotically to K.
The reader is invited to show these facts and related ones in Exercise E18.1. The evolution of the logistic
equation from multiple initial values is illustrated in Figure 18.2.
    x(t) = K x(0) e^{rt} / ( K + x(0) (e^{rt} − 1) ).

Figure 18.2: Evolution of the logistic equation (18.1) from multiple initial values, over the time span [0, 5/r].
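The closed-form logistic solution above can be checked directly; the short sketch below (ours) verifies the monotonicity and convergence properties listed as (iii) and (iv).

```python
import math

def logistic(t, x0, r, K):
    # Closed-form solution x(t) = K x0 e^{rt} / (K + x0 (e^{rt} - 1)).
    e = math.exp(r * t)
    return K * x0 * e / (K + x0 * (e - 1))

r, K = 1.0, 10.0
assert abs(logistic(0.0, 2.0, r, K) - 2.0) < 1e-12
# Solutions starting below K increase monotonically toward K:
assert 2.0 < logistic(1.0, 2.0, r, K) < logistic(2.0, 2.0, r, K) < K
# Solutions starting above K decrease monotonically toward K:
assert K < logistic(2.0, 15.0, r, K) < logistic(1.0, 15.0, r, K) < 15.0
assert abs(logistic(50.0, 15.0, r, K) - K) < 1e-9
```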
Multi-species Lotka-Volterra model with signed interactions Finally, we consider the case of n ≥ 2 interacting species. We assume a logistic growth model for each species with an additional term due to the interaction with the other species. Specifically, we write the growth rate for species i ∈ {1, ..., n} as

    ẋi / xi = ri + aii xi + Σ_{j=1, j≠i}^n aij xj,    (18.2)
where the first two terms are the logistic equation (so that aii is typically negative because of bounded resources and the carrying capacity is Ki = −ri/aii), and the third term is the combined effect of the pairwise interactions with all other species.
The vector r is called the intrinsic growth rate, the matrix A = [aij] is called the interaction matrix, and the ordinary differential equations (18.2) are called the Lotka-Volterra model for n ≥ 2 interacting species. This model is written in vector form as

    ẋ = diag(x) (A x + r) =: fLK(x).    (18.3)
For any two species i and j, the sign of aij and aji in the interaction matrix A is determined by which
of the following three possible types of interaction is being modelled:
(+, +) = mutualism: for aij > 0 and aji > 0, the two species are in symbiosis and cooperation. The
presence of species i has a positive effect on the growth of species j and vice versa.
(+,−) = predation: for aij > 0 and aji < 0, the species are in a predator-prey or host-parasite relationship. In other words, the presence of a prey (or host) species j favors the growth of the predator (or parasite) species i, whereas the presence of the predator species has a negative effect on the growth of the prey.
(−,−) = competition: for aij < 0 and aji < 0, the two species compete for a common resource of sorts and therefore have a negative effect on each other.
Note: the typical availability of bounded resources suggests it is ecologically meaningful to assume that
the interaction matrix A is Hurwitz and that, to model the setting in which species live in isolation, the
diagonal entries aii are negative.
18.2 Two-species model and analysis

18.2.1 Mutualism
Here we assume inter-species mutualism, that is, we assume both inter-species coefficients a12 and a21
are positive. We identify two distinct parameter ranges corresponding to distinct dynamic behavior and
illustrate them in Figure 18.3.
Case I: a12 > 0, a21 > 0, a12 a21 < a11 a22. There exists a unique positive equilibrium point. All trajectories starting in R2>0 converge to the equilibrium point.
Case II: a12 > 0, a21 > 0, a12 a21 > a11 a22. There exists no positive equilibrium point. All trajectories starting in R2>0 diverge.
Figure 18.3: Two possible cases of mutualism in the two-species Lotka-Volterra system
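Case I can be illustrated numerically. The sketch below is ours, with a hand-rolled RK4 integrator and an example interaction matrix satisfying a12 a21 = 1 < a11 a22 = 4; it confirms convergence to the unique positive equilibrium from several initial conditions.

```python
def lv_rk4(x0, A, r, t_end, steps=20_000):
    # RK4 integration of xdot_i = x_i (r_i + sum_j a_ij x_j).
    n = len(x0)
    def f(x):
        return [x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(n)))
                for i in range(n)]
    h = t_end / steps
    x = list(x0)
    for _ in range(steps):
        k1 = f(x)
        k2 = f([xi + h * ki / 2 for xi, ki in zip(x, k1)])
        k3 = f([xi + h * ki / 2 for xi, ki in zip(x, k2)])
        k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

# Case I mutualism: a12 a21 = 1 < a11 a22 = 4; unique positive equilibrium (1, 1).
A = [[-2.0, 1.0], [1.0, -2.0]]
r = [1.0, 1.0]
for x0 in ([0.2, 3.0], [0.05, 0.05], [4.0, 4.0]):
    x = lv_rk4(x0, A, r, 40.0)
    assert abs(x[0] - 1.0) < 1e-6 and abs(x[1] - 1.0) < 1e-6
```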
18.2.2 Competition
Here we assume inter-species competition, that is, we assume both inter-species coefficients a12 and a21
are negative. We identify four distinct parameter ranges corresponding to distinct dynamic behavior and
illustrate them in Figure 18.4.
Figure 18.4: Four possible cases of competition in the two-species Lotka-Volterra system
Lemma 18.1 (Basic properties). For n ≥ 2, the Lotka-Volterra system (18.3) is a positive system, i.e., x(0) ≥ 0 implies x(t) ≥ 0 for all subsequent t. Moreover, if xi(0) = 0, then xi(t) = 0 for all subsequent t.
Therefore, without loss of generality we can assume that all initial conditions are positive vectors, that is, in Rn>0. In other words, the best we can hope for is to establish that an equilibrium point is globally asymptotically stable on Rn>0. We are now ready to state the main result of this section, due to (Goh 1979).
Theorem 18.2 (Sufficient conditions). For the Lotka-Volterra system (18.3) with interaction matrix A and intrinsic growth rate r, assume
(A1) A is diagonally stable, i.e., there exists a positive vector d such that diag(d) A + A^T diag(d) is negative definite, and
(A2) the unique equilibrium point x* = −A^{−1} r is positive.
Then x* is globally asymptotically stable on Rn>0.
Proof. Note that A diagonally stable implies A Hurwitz and invertible. For K > 0, define the linear-minus-logarithmic function Vlin-log,K : R>0 → R by

    Vlin-log,K(x) = x − K − K log(x/K).

Define V : Rn>0 → R≥0 by

    V(x) = Σ_{i=1}^n di Vlin-log,xi*(xi) = Σ_{i=1}^n di ( xi − xi* − xi* log(xi/xi*) ).
The reader is invited to show in Exercise E18.2 that the function Vlin-log,K is continuously differentiable, takes nonnegative values, and satisfies Vlin-log,K(x) = 0 if and only if x = K. Moreover, this function is unbounded in the limits as x → ∞ and x → 0+. Therefore, V is globally positive-definite about x* and is radially unbounded.
Next, we compute the Lie derivative of V along the flow of the Lotka-Volterra vector field fLK (x) =
diag(x)(Ax + r). First, compute dxd i Vxi (xi ) = (xi xi )/xi , so that
n
X xi xi
LfLK V (x) = di (fLK (x))i .
xi
i=1
Figure 18.5: The function $V_{\text{lin-log},K}(x) = x - K - K \log(x/K)$ studied in Exercise E18.2
This implies that $\mathcal{L}_{f_{\text{LV}}} V(x) \leq 0$ with equality if and only if $x = x^*$. Therefore, $\mathcal{L}_{f_{\text{LV}}} V$ is globally negative-definite about $x^*$. According to the Lyapunov Theorem 13.3, $x^*$ is globally asymptotically stable on $\mathbb{R}^n_{>0}$.
Note: Assumption (A2) is not critical and, via a more complex treatment, a more general theorem can be obtained. Under Assumption (A1), that is, $A$ diagonally stable, (Takeuchi 1996, Theorem 3.2.1) shows the existence of a unique nonnegative and globally stable equilibrium point $x^*$ for each $r \in \mathbb{R}^n$; the existence and uniqueness of $x^*$ is established via a linear complementarity problem.
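To get a concrete feel for Theorem 18.2, the following sketch checks Assumptions (A1) and (A2) on an illustrative two-species example (the matrix $A$, rate $r$, and candidate vector $d$ are hypothetical choices) and then verifies numerically that the Lyapunov function $V$ decreases along a forward-Euler trajectory.

```python
import math

# A small numerical check of Theorem 18.2 on a two-species example.
# All numbers are illustrative choices, not taken from the text.

A = [[-2.0, 1.0],
     [0.5, -1.0]]          # interaction matrix
r = [1.0, 0.5]
d = [1.0, 2.0]             # candidate vector for diagonal stability

# (A1): Q = diag(d) A + A^T diag(d) must be negative definite
# (2x2 test: Q_00 < 0 and det Q > 0).
Q00 = 2 * d[0] * A[0][0]
Q01 = d[0] * A[0][1] + d[1] * A[1][0]
Q11 = 2 * d[1] * A[1][1]
neg_def = Q00 < 0 and Q00 * Q11 - Q01 ** 2 > 0

# (A2): x* = -A^{-1} r must be positive (2x2 inverse by hand).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
xstar = [-(A[1][1] * r[0] - A[0][1] * r[1]) / det,
         -(-A[1][0] * r[0] + A[0][0] * r[1]) / det]

def V(x):
    """V(x) = sum_i d_i (x_i - x_i* - x_i* log(x_i / x_i*))."""
    return sum(d[i] * (x[i] - xstar[i] - xstar[i] * math.log(x[i] / xstar[i]))
               for i in range(2))

# Integrate with forward Euler and record V along the trajectory:
# it should decrease toward 0.
x, dt = [0.1, 2.0], 1e-3
values = [V(x)]
for _ in range(20000):
    x = [x[i] + dt * x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(2)))
         for i in range(2)]
    values.append(V(x))

print(neg_def, xstar)          # both assumptions hold; xstar = [1.0, 1.0]
print(values[0], values[-1])   # V decays essentially to zero
```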
Lemma 18.3 (Unbounded evolutions for unstable Metzler matrices). If $A$ is Metzler and has a positive dominant eigenvalue, then the Lotka-Volterra system has unbounded solutions in $\mathbb{R}^n_{>0}$.
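This lemma, too, is easy to observe numerically: for a nonnegative (hence Metzler) matrix with a positive dominant eigenvalue, a forward-Euler simulation blows past any bound in short order. All values below are illustrative.

```python
# Illustrating Lemma 18.3: for a Metzler A with positive dominant
# eigenvalue, Lotka-Volterra trajectories can grow without bound.
# The 2x2 example below is an illustrative choice.

A = [[0.5, 0.2],
     [0.3, 0.4]]         # nonnegative, hence Metzler; dominant eigenvalue > 0
r = [0.1, 0.1]

x, dt = [1.0, 1.0], 1e-3
steps = 0
while max(x) < 1e6 and steps < 10**6:
    x = [x[i] + dt * x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(2)))
         for i in range(2)]
    steps += 1

print(max(x), steps)     # the threshold 1e6 is crossed after relatively few steps
```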
If the dominant eigenvalue is negative, then the Metzler matrix is Hurwitz; this case was studied in
Theorem 9.5. We here provide a useful extension of that theorem.
Theorem 18.4 (Properties of Hurwitz Metzler matrices: continued). For a Metzler matrix $A$, the following statements are equivalent:

(i) $A$ is Hurwitz,
(ii) $A$ is invertible and $-A^{-1} \geq 0$,
(iii) for all $b \geq 0_n$, there exists $x \geq 0_n$ solving $Ax + b = 0_n$,
(iv) $A$ is negative diagonally dominant, i.e., there exists $\xi > 0_n$ such that $A\xi < 0_n$, and
(v) $A$ is diagonally stable, i.e., there exists a positive-definite diagonal matrix $P$ such that $A^\top P + P A < 0$.
Proof. The equivalence between statements (i), (ii), and (iii) is established in Theorem 9.5.

(ii) $\implies$ (iv): Set $\xi = -A^{-1} \mathbb{1}_n$. Because $-A^{-1} \geq 0$ is invertible, it can have no rows identically equal to zero. Hence $\xi = -A^{-1} \mathbb{1}_n > 0_n$. Moreover, $A\xi = -\mathbb{1}_n < 0_n$.

(iv) $\implies$ (i): We follow the steps in (Baigent 2010, Lemma 6). Let $\lambda$ be an eigenvalue of $A$ with eigenvector $v$. Define $w \in \mathbb{R}^n$ by $w_i = v_i/\xi_i$, for $i \in \{1, \dots, n\}$, where $\xi$ is as in statement (iv). We have therefore $\lambda \xi_i w_i = \sum_{j=1}^n a_{ij} \xi_j w_j$. If $\ell$ is the index satisfying $|w_\ell| = \max_i |w_i| > 0$, then
$$ \lambda \xi_\ell = a_{\ell\ell} \xi_\ell + \sum_{j=1, j \neq \ell}^n a_{\ell j} \xi_j \frac{w_j}{w_\ell}, $$
so that, since $a_{\ell j} \geq 0$ for $j \neq \ell$ and $|w_j / w_\ell| \leq 1$,
$$ |\lambda - a_{\ell\ell}| \, \xi_\ell \leq \sum_{j=1, j \neq \ell}^n a_{\ell j} \xi_j < -a_{\ell\ell} \, \xi_\ell, $$
where the last inequality follows from the $\ell$-th row of the inequality $A\xi < 0_n$. Therefore, $|\lambda - a_{\ell\ell}| < -a_{\ell\ell}$. This inequality implies that the eigenvalue $\lambda$ must belong to an open disk in the complex plane with center $a_{\ell\ell} < 0$ and radius $|a_{\ell\ell}|$. Hence $\lambda$, together with all other eigenvalues of $A$, must have negative real part.

(iv) $\implies$ (v): From statement (iv) applied to $A$ and $A^\top$, let $\xi > 0_n$ satisfy $A\xi < 0_n$ and $\eta > 0_n$ satisfy $A^\top \eta < 0_n$. Define $P = \operatorname{diag}(\eta_1/\xi_1, \dots, \eta_n/\xi_n)$ and consider the symmetric matrix $A^\top P + P A$. This matrix is Metzler and satisfies $(A^\top P + P A)\xi = A^\top \eta + P A \xi < 0_n$. Hence, $A^\top P + P A$ is negative diagonally dominant and, because (iv) $\implies$ (i), Hurwitz. In summary, $A^\top P + P A$ is symmetric and Hurwitz, hence, it is negative definite.
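The constructions in this proof are easy to reproduce numerically. The following sketch traces the chain (ii) $\implies$ (iv) $\implies$ (v) on an illustrative 2x2 Metzler and Hurwitz matrix, using hand-rolled 2x2 linear algebra to stay self-contained.

```python
# Sanity check of (ii) => (iv) => (v) in Theorem 18.4 on an
# illustrative 2x2 Metzler and Hurwitz matrix.

A = [[-3.0, 1.0],
     [2.0, -2.0]]        # Metzler (off-diagonals >= 0); trace < 0, det > 0 => Hurwitz

def solve2(M, b):
    """Solve the 2x2 linear system M y = b by Cramer's rule."""
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / d,
            (M[0][0] * b[1] - b[0] * M[1][0]) / d]

# (ii) => (iv): xi = -A^{-1} 1 is positive and satisfies A xi = -1 < 0.
xi = solve2(A, [-1.0, -1.0])
# The same construction for A^T gives eta with A^T eta < 0.
At = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
eta = solve2(At, [-1.0, -1.0])

# (iv) => (v): P = diag(eta_i / xi_i) makes A^T P + P A negative definite.
P = [eta[0] / xi[0], eta[1] / xi[1]]
Q00 = 2 * P[0] * A[0][0]
Q01 = P[0] * A[0][1] + P[1] * A[1][0]
Q11 = 2 * P[1] * A[1][1]
neg_def = Q00 < 0 and Q00 * Q11 - Q01 ** 2 > 0

print(xi, eta, neg_def)   # xi = [0.75, 1.25], eta = [1.0, 1.0], True
```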
Theorem 18.5 (Global convergence for cooperative Lotka-Volterra). For the Lotka-Volterra system (18.3), assume the interaction matrix $A$ is Metzler and Hurwitz and the intrinsic growth rate satisfies $r > 0_n$. Then there exists a unique interior equilibrium point $x^*$, and $x^*$ is globally attractive on $\mathbb{R}^n_{>0}$.

Proof. We leave it to the reader to verify that Assumptions (A1) and (A2) of Theorem 18.2 are satisfied, so that its consequences hold.
Note: In (Baigent 2010), Theorem 18.5 is established via the Lyapunov function $V(x) = \max_{i \in \{1,\dots,n\}} |x_i - x_i^*|/\xi_i$, where $x^*$ is the equilibrium point and $\operatorname{diag}(\xi_1, \dots, \xi_n)$ is the diagonal Lyapunov matrix for $A$.
18.5 Exercises
E18.1 Logistic ordinary differential equation. For $r > 0$ and $K > 0$, consider the logistic equation (18.1) defined by
$$ \dot{x} = r x (1 - x/K), $$
for $x \in \mathbb{R}_{\geq 0}$. Show that
(i) there are two equilibrium points, $0$ and $K$,
(ii) the solution is
$$ x(t) = \frac{K x(0) e^{rt}}{K + x(0)(e^{rt} - 1)}, $$
(iii) all solutions with $0 < x(0) < K$ are monotonically increasing and converge asymptotically to $K$,
(iv) all solutions with $K < x(0)$ are monotonically decreasing and converge asymptotically to $K$, and
(v) if $x(0) < K/2$, then the solution $x(t)$ has an inflection point when $x(t) = K/2$.
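The closed-form solution in item (ii) can be cross-checked against a direct numerical integration; the parameter values below are arbitrary illustrative choices.

```python
import math

# Checking the closed-form logistic solution of Exercise E18.1 against
# a direct forward-Euler integration (illustrative parameter values).

r, K = 1.0, 2.0

def x_exact(t, x0):
    """x(t) = K x(0) e^{rt} / (K + x(0)(e^{rt} - 1))."""
    e = math.exp(r * t)
    return K * x0 * e / (K + x0 * (e - 1.0))

def x_euler(t, x0, steps=200000):
    """Forward-Euler integration of x' = r x (1 - x/K)."""
    dt, x = t / steps, x0
    for _ in range(steps):
        x += dt * r * x * (1.0 - x / K)
    return x

x0, t = 0.1, 5.0
print(x_exact(t, x0), x_euler(t, x0))   # the two values nearly agree
print(x_exact(50.0, x0))                # approaches the carrying capacity K
```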
E18.2 The linear-minus-logarithmic function. For $K > 0$, define the function $V_{\text{lin-log},K} \colon \mathbb{R}_{>0} \to \mathbb{R}$ by
$$ V_{\text{lin-log},K}(x) = x - K - K \log\frac{x}{K}. $$
Show that
(i) $V_{\text{lin-log},K}$ is continuously differentiable and $\frac{d}{dx} V_{\text{lin-log},K}(x) = (x - K)/x$,
(ii) $V_{\text{lin-log},K}(x) = 0$ if and only if $x = K$,
(iii) $V_{\text{lin-log},K}(x) > 0$ for all $x > 0$, $x \neq K$, and
(iv) $\lim_{x \to 0^+} V_{\text{lin-log},K}(x) = \lim_{x \to \infty} V_{\text{lin-log},K}(x) = +\infty$.
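Before proving these properties, one can spot-check them numerically; $K = 3$ below is an arbitrary illustrative choice.

```python
import math

# Numerically spot-checking the properties of V_{lin-log,K} listed in
# Exercise E18.2 (K = 3 is an arbitrary illustrative choice).

K = 3.0

def V(x):
    return x - K - K * math.log(x / K)

def dV(x):                 # the claimed derivative (x - K)/x
    return (x - K) / x

# V vanishes at x = K and is positive at all other sample points
samples = [0.1, 1.0, 2.9, 3.1, 10.0, 100.0]
positive = all(V(x) > 0 for x in samples)

# central finite-difference check of the derivative formula
h = 1e-6
fd_ok = all(abs((V(x + h) - V(x - h)) / (2 * h) - dV(x)) < 1e-6
            for x in samples)

print(V(K), positive, fd_ok)
print(V(1e-9), V(1e9))     # both limits blow up toward +infinity
```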
Bibliography
J. A. Acebrón, L. L. Bonilla, C. J. P. Vicente, F. Ritort, and R. Spigler. The Kuramoto model: A simple paradigm for synchronization phenomena. Reviews of Modern Physics, 77(1):137–185, 2005.
D. Acemoglu and A. Ozdaglar. Opinion dynamics and learning in social networks. Dynamic Games and Applications, 1(1):3–49, 2011.
D. Acemoglu, G. Como, F. Fagnani, and A. Ozdaglar. Opinion fluctuations and disagreement in social networks. Mathematics of Operations Research, 38(1):1–27, 2013.
R. P. Agaev and P. Y. Chebotarev. The matrix of maximum out forests and its applications. Automation and
Remote Control, 61(9):14241450, 2000.
B. D. O. Anderson, C. Yu, B. Fidan, and J. M. Hendrickx. Rigid graph control architectures for autonomous
formations. IEEE Control Systems Magazine, 28(6):4863, 2008.
M. Arcak. Passivity as a design tool for group coordination. IEEE Transactions on Automatic Control, 52(8):
13801390, 2007.
L. Asimow and B. Roth. The rigidity of graphs, II. Journal of Mathematical Analysis and Applications, 68(1):
171190, 1979.
H. Bai, M. Arcak, and J. Wen. Cooperative Control Design, volume 89. Springer, 2011. ISBN 1461429072.
S. Baigent. Lotka-Volterra Dynamics: An Introduction. Preprint, Mar. 2010. University College London.
P. Barooah. Estimation and Control with Relative Measurements: Algorithms and Scaling Laws. PhD thesis,
University of California at Santa Barbara, July 2007.
P. Barooah and J. P. Hespanha. Estimation from relative measurements: Algorithms and scaling laws. IEEE
Control Systems Magazine, 27(4):5774, 2007.
D. Bauso and G. Notarstefano. Distributed n-player approachability and consensus in coalitional games. IEEE Transactions on Automatic Control, 60(11):3107–3112, 2015.
M. Benzi, G. H. Golub, and J. Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1137,
2005.
A. R. Bergen and D. J. Hill. A structure preserving model for power system stability analysis. IEEE
Transactions on Power Apparatus and Systems, 100(1):2535, 1981.
A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. SIAM, 1994. ISBN
978-0-89871-321-3.
D. S. Bernstein. Matrix Mathematics. Princeton University Press, 2 edition, 2009. ISBN 0691140391.
N. Biggs. Algebraic Graph Theory. Cambridge University Press, 2 edition, 1994. ISBN 0521458978.
V. D. Blondel and A. Olshevsky. How to decide consensus? A combinatorial necessary and sufficient condition and a proof that consensus is decidable but NP-hard. SIAM Journal on Control and Optimization, 52(5):2707–2726, 2014.
S. Bolognani, S. Del Favero, L. Schenato, and D. Varagnolo. Consensus-based distributed sensor calibration
and least-square parameter identification in WSNs. International Journal of Robust and Nonlinear
Control, 20(2):176193, 2010.
P. Bonacich. Technique for analyzing overlapping memberships. Sociological Methodology, 4:176185, 1972a.
P. Bonacich. Factoring and weighting approaches to status scores and clique identification. Journal of
Mathematical Sociology, 2(1):113120, 1972b.
S. Boyd, P. Diaconis, and L. Xiao. Fastest mixing Markov chain on a graph. SIAM Review, 46(4):667689,
2004.
S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah. Randomized gossip algorithms. IEEE Transactions on
Information Theory, 52(6):25082530, 2006.
U. Brandes. Centrality: concepts and methods. Slides, May 2006. The International Workshop/School and
Conference on Network Science.
U. Brandes and T. Erlebach. Network Analysis: Methodological Foundations. Springer, 2005. ISBN 3540249796.
L. Breiman. Probability, volume 7 of Classics in Applied Mathematics. SIAM, 1992. ISBN 0-89871-296-3.
Corrected reprint of the 1968 original.
S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks, 30:
107117, 1998.
E. Brown, P. Holmes, and J. Moehlis. Globally coupled oscillator networks. In E. Kaplan, J. E. Marsden,
and K. R. Sreenivasan, editors, Perspectives and Problems in Nonlinear Science: A Celebratory Volume in
Honor of Larry Sirovich, pages 183215. Springer, 2003.
A. M. Bruckstein, N. Cohen, and A. Efrat. Ants, crickets, and frogs in cyclic pursuit. Technical Re-
port CIS 9105, Center for Intelligent Systems, Technion, Haifa, Israel, July 1991. Available at
http://www.cs.technion.ac.il/tech-reports.
J. Buck. Synchronous rhythmic flashing of fireflies. II. Quarterly Review of Biology, 63(3):265289, 1988.
F. Bullo, J. Cortés, and S. Martínez. Distributed Control of Robotic Networks. Princeton University Press, 2009. ISBN 978-0-691-14195-4. URL http://www.coordinationbook.info.
F. Bullo, W. Mei, S. Mohagheghib, and S. Zampieri. Nonlinear propagation models in contact networks. To
be submitted, 2016.
R. Carli, F. Fagnani, A. Speranzon, and S. Zampieri. Communication constraints in the average consensus
problem. Automatica, 44(3):671684, 2008.
R. Carli, F. Garin, and S. Zampieri. Quadratic indices for the analysis of consensus algorithms. In Information
Theory and Applications Workshop, pages 96104, San Diego, CA, USA, Feb. 2009.
H. Caswell. Matrix Population Models. Sinauer Associates, 2 edition, 2006. ISBN 087893121X.
N. D. Charkes, P. T. M. Jr, and C. Philips. Studies of skeletal tracer kinetics. I. Digital-computer solution of a five-compartment model of [18F] fluoride kinetics in humans. Journal of Nuclear Medicine, 19(12):1301–1309, 1978.
S. Chatterjee and E. Seneta. Towards consensus: Some convergence theorems on repeated averaging.
Journal of Applied Probability, 14(1):8997, 1977.
A. Cherukuri and J. Cortés. Asymptotic stability of saddle points under the saddle-point dynamics. In American Control Conference, Chicago, IL, USA, July 2015. To appear.
R. Cogburn. The ergodic theory of Markov chains in random environments. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 66(1):109–128, 1984.
G. Como, K. Savla, D. Acemoglu, M. A. Dahleh, and E. Frazzoli. Robust distributed routing in dynamical
networks Part I: locally responsive policies and weak resilience. IEEE Transactions on Automatic
Control, 58(2):317332, 2013.
S. Coogan and M. Arcak. A compartmental model for traffic networks and its dynamical behavior. IEEE
Transactions on Automatic Control, 60(10):26982703, 2015.
E. Cristiani, B. Piccoli, and A. Tosin. Multiscale Modeling of Pedestrian Dynamics. Springer, 2014. ISBN
978-3-319-06619-6.
S. M. Crook, G. B. Ermentrout, M. C. Vanier, and J. M. Bower. The role of axonal delay in the synchronization
of networks of coupled cortical oscillators. Journal of Computational Neuroscience, 4(2):161172, 1997.
H. Daido. Quasientrainment and slow relaxation in a population of oscillators with random and frustrated
interactions. Physical Review Letters, 68(7):10731076, 1992.
P. J. Davis. Circulant Matrices. American Mathematical Society, 2 edition, 1994. ISBN 0828403384.
T. A. Davis and Y. Hu. The University of Florida sparse matrix collection. ACM Transactions on Mathematical
Software, 38(1):125, 2011.
M. H. DeGroot. Reaching a consensus. Journal of the American Statistical Association, 69(345):118121, 1974.
P. M. DeMarzo, D. Vayanos, and J. Zwiebel. Persuasion bias, social influence, and unidimensional opinions.
The Quarterly Journal of Economics, 118(3):909968, 2003.
R. Diestel. Graph Theory, volume 173 of Graduate Texts in Mathematics. Springer, 2 edition, 2000. ISBN
3642142788.
F. Dörfler and F. Bullo. On the critical coupling for Kuramoto oscillators. SIAM Journal on Applied Dynamical Systems, 10(3):1070–1099, 2011. doi: 10.1137/10081530X.
F. Dörfler and F. Bullo. Exploring synchronization in complex oscillator networks, Sept. 2012. Extended version including proofs. Available at http://arxiv.org/abs/1209.1335.
F. Dörfler and F. Bullo. Synchronization in complex networks of phase oscillators: A survey. Automatica, 50(6):1539–1564, 2014. doi: 10.1016/j.automatica.2014.04.012.
F. Dörfler and B. Francis. Formation control of autonomous robots based on cooperative behavior. In European Control Conference, pages 2432–2437, Budapest, Hungary, Aug. 2009.
F. Dörfler and B. Francis. Geometric analysis of the formation problem for autonomous robots. IEEE Transactions on Automatic Control, 55(10):2379–2384, 2010.
F. Fagnani. Consensus dynamics over networks. Winter School on Complex Networks, INRIA, Jan. 2014.
F. Fagnani and S. Zampieri. Randomized consensus algorithms over large scale networks. IEEE Journal on
Selected Areas in Communications, 26(4):634649, 2008.
A. Fall, A. Iggidr, G. Sallet, and J.-J. Tewa. Epidemiological models and Lyapunov functions. Mathematical
Modelling of Natural Phenomena, 2(1):6268, 2007.
L. Farina and S. Rinaldi. Positive Linear Systems: Theory and Applications. John Wiley & Sons, 2000. ISBN
0471384569.
D. Fife. Which linear compartmental systems contain traps? Mathematical Biosciences, 14(3):311315, 1972.
D. M. Foster and J. A. Jacquez. Multiple zeros for eigenvalues and the multiplicity of traps of a linear
compartmental system. Mathematical Biosciences, 26(1):8997, 1975.
B. A. Francis and M. Maggiore. Flocking and Rendezvous in Distributed Robotics. Springer, 2016. ISBN
978-3-319-24727-4.
P. Frasca. Quick convergence proof for gossip consensus. Personal communication, 2012.
P. Frasca, R. Carli, F. Fagnani, and S. Zampieri. Average consensus on networks with quantized communica-
tion. International Journal of Robust and Nonlinear Control, 19(16):17871816, 2009.
N. E. Friedkin. Theoretical foundations for centrality measures. American Journal of Sociology, 96(6):
14781504, 1991.
N. E. Friedkin and E. C. Johnsen. Social influence networks and opinion change. In E. J. Lawler and M. W.
Macy, editors, Advances in Group Processes, volume 16, pages 129. JAI Press, 1999.
N. E. Friedkin and E. C. Johnsen. Social Influence Network Theory: A Sociological Examination of Small Group
Dynamics. Cambridge University Press, 2011. ISBN 9781107002463.
N. E. Friedkin and E. C. Johnsen. Two steps to obfuscation. Social Networks, 39:1213, 2014.
P. A. Fuhrmann and U. Helmke. The Mathematics of Networks of Linear Systems. Springer, 2015. ISBN
3319166468.
C. Gao, J. Cortés, and F. Bullo. Notes on averaging over acyclic digraphs and discrete coverage control. Automatica, 44(8):2120–2127, 2008. doi: 10.1016/j.automatica.2007.12.017.
F. Garin and L. Schenato. A survey on distributed estimation and control applications using linear consensus
algorithms. In A. Bemporad, M. Heemels, and M. Johansson, editors, Networked Control Systems, LNCIS,
pages 75107. Springer, 2010.
A. K. Ghosh, B. Chance, and E. K. Pye. Metabolic coupling and synchronization of NADH oscillations in
yeast cell populations. Archives of Biochemistry and Biophysics, 145(1):319331, 1971.
D. Gleich. Spectral Graph Partitioning and the Laplacian with Matlab, Jan. 2006. URL https://www.cs.
purdue.edu/homes/dgleich/demos/matlab/spectral/spectral.html. (Last retrieved on May
30, 2016.).
C. D. Godsil and G. F. Royle. Algebraic Graph Theory, volume 207 of Graduate Texts in Mathematics. Springer,
2001. ISBN 0387952411.
B. S. Goh. Global stability in two species interactions. Journal of Mathematical Biology, 3(3-4):313318,
1976.
B.-S. Goh. Management and Analysis of Biological Populations. Elsevier, 1980. ISBN 978-0-444-41793-0.
M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 2.1. http:
//cvxr.com/cvx, Oct. 2014.
W. H. Haddad, V. Chellaboina, and Q. Hui. Nonnegative and Compartmental Dynamical Systems. Princeton
University Press, 2010. ISBN 0691144117.
F. Harary. A criterion for unanimity in French's theory of social power. In D. Cartwright, editor, Studies in Social Power, pages 168–182. University of Michigan, 1959.
Y. Hatano and M. Mesbahi. Agreement over random networks. IEEE Transactions on Automatic Control, 50
(11):18671872, 2005.
J. M. Hendrickx. Graphs and Networks for the Analysis of Autonomous Agent Systems. PhD thesis, Université Catholique de Louvain, Belgium, Feb. 2008.
J. P. Hespanha. Linear Systems Theory. Princeton University Press, 2009. ISBN 0691140219.
J. Hofbauer and K. Sigmund. Evolutionary Games and Population Dynamics. Cambridge University Press,
1998. ISBN 052162570X.
L. Hogben, editor. Handbook of Linear Algebra. Chapman and Hall/CRC, 2 edition, 2013. ISBN 1466507284.
R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1985. ISBN 0521386322.
Y. Hu. Efficient, high-quality force-directed graph drawing. Mathematica Journal, 10(1):3771, 2005.
H. Ishii and R. Tempo. The pagerank problem, multiagent consensus, and web aggregation: A systems and
control viewpoint. IEEE Control Systems Magazine, 34(3):3453, 2014.
M. O. Jackson. Social and Economic Networks. Princeton University Press, 2010. ISBN 0691148201.
J. A. Jacquez and C. P. Simon. Qualitative theory of compartmental systems. SIAM Review, 35(1):4379,
1993.
A. Jadbabaie and A. Olshevsky. On performance of consensus protocols subject to noise: role of hitting
times and network structure. arXiv preprint arXiv:1508.00036, 2015.
A. Jadbabaie, J. Lin, and A. S. Morse. Coordination of groups of mobile autonomous agents using nearest
neighbor rules. IEEE Transactions on Automatic Control, 48(6):9881001, 2003.
G. Jongen, J. Anemüller, D. Bollé, A. C. C. Coolen, and C. Perez-Vicente. Coupled dynamics of fast spins and slow exchange interactions in the XY spin glass. Journal of Physics A: Mathematical and General, 34(19):3957–3984, 2001.
L. Katz. A new status index derived from sociometric analysis. Psychometrika, 18(1):3943, 1953.
A. Khanafer, T. Başar, and B. Gharesifard. Stability properties of infected networks with low curing rates. In American Control Conference, pages 3579–3584, Portland, OR, USA, June 2014.
G. Kirchhoff. Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird. Annalen der Physik und Chemie, 148(12):497–508, 1847.
I. Z. Kiss, Y. Zhai, and J. L. Hudson. Emerging coherence in a population of chemical oscillators. Science,
296(5573):16761678, 2002.
M. S. Klamkin and D. J. Newman. Cyclic pursuit or "the three bugs problem". American Mathematical
Monthly, 78(6):631639, 1971.
D. J. Klein, P. Lee, K. A. Morgansen, and T. Javidi. Integration of communication and control using discrete
time Kuramoto models for multivehicle coordination over broadcast networks. IEEE Journal on Selected
Areas in Communications, 26(4):695705, 2008.
G. Kozyreff, A. G. Vladimirov, and P. Mandel. Global coupling with time delay in an array of semiconductor
lasers. Physical Review Letters, 85(18):38093812, 2000.
J. Kunegis. KONECT: the Koblenz network collection. In International Conference on World Wide Web
Companion, pages 13431350, 2013.
Y. Kuramoto. Chemical Oscillations, Waves, and Turbulence. Springer, 1984. ISBN 0387133224.
P. H. Leslie. On the use of matrices in certain population mathematics. Biometrika, 33(3):183–212, 1945.
Z. Lin, B. Francis, and M. Maggiore. Necessary and sufficient graphical conditions for formation control of
unicycles. IEEE Transactions on Automatic Control, 50(1):121127, 2005.
Z. Lin, B. Francis, and M. Maggiore. State agreement for continuous-time coupled nonlinear systems. SIAM
Journal on Control and Optimization, 46(1):288307, 2007.
C. Liu, D. R. Weaver, S. H. Strogatz, and S. M. Reppert. Cellular construction of a circadian clock: period
determination in the suprachiasmatic nuclei. Cell, 91(6):855860, 1997.
S. Łojasiewicz. Sur les trajectoires du gradient d'une fonction analytique. Seminari di Geometria 1982-1983, pages 115–117, 1984. Istituto di Geometria, Dipartimento di Matematica, Università di Bologna, Italy.
A. J. Lotka. Analytical note on certain rhythmic relations in organic systems. Proceedings of the National
Academy of Sciences, 6(7):410415, 1920.
E. Lovisari, F. Garin, and S. Zampieri. Resistance-based performance analysis of the consensus algorithm
over geometric graphs. SIAM Journal on Control and Optimization, 51(5):39183945, 2013.
D. G. Luenberger. Introduction to Dynamic Systems: Theory, Models, and Applications. John Wiley & Sons,
1979. ISBN 0471025941.
J. R. Marden, G. Arslan, and J. S. Shamma. Joint strategy fictitious play with inertia for potential games.
IEEE Transactions on Automatic Control, 54(2):208220, 2009.
J. A. Marshall, M. E. Broucke, and B. A. Francis. Formations of vehicles in cyclic pursuit. IEEE Transactions
on Automatic Control, 49(11):19631974, 2004.
A. Mauroy, P. Sacré, and R. J. Sepulchre. Kick synchronization versus diffusive synchronization. In IEEE Conf. on Decision and Control, pages 7171–7183, Maui, HI, USA, Dec. 2012.
W. Mei and F. Bullo. Modeling and analysis of competitive propagation with social conversion. In IEEE
Conf. on Decision and Control, pages 62036208, Los Angeles, CA, USA, Dec. 2014.
R. Merris. Laplacian matrices of a graph: A survey. Linear Algebra and its Applications, 197:143176, 1994.
M. Mesbahi and M. Egerstedt. Graph Theoretic Methods in Multiagent Networks. Princeton University Press,
2010. ISBN 9781400835355.
C. D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, 2001. ISBN 0898714540.
B. Mohar. The Laplacian spectrum of graphs. In Y. Alavi, G. Chartrand, O. R. Oellermann, and A. J. Schwenk,
editors, Graph Theory, Combinatorics, and Applications, volume 2, pages 871898. John Wiley & Sons,
1991. ISBN 0471532452.
L. Moreau. Stability of continuous-time distributed consensus algorithms. In IEEE Conf. on Decision and
Control, pages 39984003, Nassau, Bahamas, 2004.
L. Moreau. Stability of multiagent systems with time-dependent communication links. IEEE Transactions
on Automatic Control, 50(2):169182, 2005.
Z. Néda, E. Ravasz, T. Vicsek, Y. Brechet, and A.-L. Barabási. Physics of the rhythmic applause. Physical Review E, 61(6):6987–6992, 2000.
A. Nedić, A. Olshevsky, A. Ozdaglar, and J. N. Tsitsiklis. On distributed averaging algorithms and quantization effects. IEEE Transactions on Automatic Control, 54(11):2506–2517, 2009.
C. Nowzari, V. M. Preciado, and G. J. Pappas. Analysis and control of epidemics: A survey of spreading
processes on complex networks. IEEE Control Systems Magazine, 36(1):2646, 2016.
I. Noy-Meir. Desert ecosystems. I. Environment and producers. Annual Review of Ecology and Systematics,
pages 2551, 1973.
K.-K. Oh, M.-C. Park, and H.-S. Ahn. A survey of multi-agent formation control. Automatica, 53:424440,
2015.
R. Olfati-Saber. Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Transactions on
Automatic Control, 51(3):401420, 2006.
R. Olfati-Saber, E. Franco, E. Frazzoli, and J. S. Shamma. Belief consensus and distributed hypothesis testing
in sensor networks. In P. J. Antsaklis and P. Tabuada, editors, Network Embedded Sensing and Control.
(Proceedings of NESC05 Worskhop), Lecture Notes in Control and Information Sciences, pages 169182.
Springer, 2006. ISBN 3540327940.
A. Olshevsky and J. N. Tsitsiklis. On the nonexistence of quadratic Lyapunov functions for consensus
algorithms. IEEE Transactions on Automatic Control, 53(11):26422645, 2008.
R. W. Owens. An algorithm to solve the Frobenius problem. Mathematics Magazine, 76(4):264275, 2003.
L. Page. Method for node ranking in a linked database, Sept. 2001. US Patent 6,285,999.
D. A. Paley, N. E. Leonard, R. Sepulchre, D. Grunbaum, and J. K. Parrish. Oscillator models and collective
motion. IEEE Control Systems Magazine, 27(4):89105, 2007.
J. Pantaleone. Stability of incoherence in an isotropic gas of oscillating neutrinos. Physical Review D, 58(7):
073002, 1998.
G. Piovan, I. Shames, B. Fidan, F. Bullo, and B. D. O. Anderson. On frame and orientation localization for
relative sensing networks. Automatica, 49(1):206213, 2013. doi: 10.1016/j.automatica.2012.09.014.
V. H. Poor. An Introduction to Signal Detection and Estimation. Springer, 2 edition, 1998. ISBN 0387941738.
R. Potrie and P. Monzón. Local implications of almost global stability. Dynamical Systems, 24(1):109–115, 2009.
V. Rakočević. On continuity of the Moore-Penrose and Drazin inverses. Matematički Vesnik, 49(3-4):163–172, 1997.
C. Ravazzi, P. Frasca, R. Tempo, and H. Ishii. Ergodic randomized algorithms and dynamics over networks.
IEEE Transactions on Control of Network Systems, 2(1):7887, 2015.
W. Ren. On consensus algorithms for double-integrator dynamics. IEEE Transactions on Automatic Control,
53(6):15031509, 2008a.
W. Ren. Synchronization of coupled harmonic oscillators with local interaction. Automatica, 44:31963200,
2008b.
W. Ren and W. Atkins. Second-order consensus protocols in multiple vehicle systems with local interactions.
In AIAA Guidance, Navigation, and Control Conference and Exhibit, pages 1518, San Francisco, CA,
USA, Aug. 2005.
W. Ren and R. W. Beard. Consensus seeking in multi-agent systems under dynamically changing interaction
topologies. IEEE Transactions on Automatic Control, 50(5):655661, 2005.
W. Ren and R. W. Beard. Distributed Consensus in Multi-vehicle Cooperative Control. Communications and
Control Engineering. Springer, 2008. ISBN 978-1-84800-014-8.
W. Ren, R. W. Beard, and E. M. Atkins. Information consensus in multivehicle cooperative control: Collective
group behavior through local interaction. IEEE Control Systems Magazine, 27(2):7182, 2007.
W. H. Sandholm. Population Games and Evolutionary Dynamics. MIT Press, 2010. ISBN 0262195879.
P. Santesso and M. E. Valcher. On the zero pattern properties and asymptotic behavior of continuous-time
positive system trajectories. Linear Algebra and its Applications, 425(2):283302, 2007.
L. Schenato and F. Fiorentin. Average TimeSynch: A consensus-based protocol for clock synchronization in
wireless sensor networks. Automatica, 47(9):18781886, 2011.
R. Sepulchre, D. A. Paley, and N. E. Leonard. Stabilization of planar collective motion: All-to-all communi-
cation. IEEE Transactions on Automatic Control, 52(5):811824, 2007.
J. W. Simpson-Porco, F. Drfler, and F. Bullo. Synchronization and power sharing for droop-controlled
inverters in islanded microgrids. Automatica, 49(9):26032611, 2013. doi: 10.1016/j.automatica.2013.05.
018.
S. L. Smith, M. E. Broucke, and B. A. Francis. A hierarchical cyclic pursuit scheme for vehicle networks.
Automatica, 41(6):10451053, 2005.
A. Tahbaz-Salehi and A. Jadbabaie. A necessary and sufficient condition for consensus over random
networks. IEEE Transactions on Automatic Control, 53(3):791795, 2008.
Y. Takeuchi. Global Dynamical Properties of Lotka-Volterra Systems. World Scientific Publishing, 1996. ISBN
9810224710.
H. G. Tanner, A. Jadbabaie, and G. J. Pappas. Flocking in fixed and switching networks. IEEE Transactions
on Automatic Control, 52(5):863868, 2007.
P. A. Tass. A model of desynchronizing deep brain stimulation with a demand-controlled coordinated reset
of neural subpopulations. Biological Cybernetics, 89(2):8188, 2003.
B. Touri and A. Nedić. Product of random stochastic matrices. IEEE Transactions on Automatic Control, 59(2):437–448, 2014.
P. Van Mieghem. The N -intertwined SIS epidemic network model. Computing, 93(2-4):147169, 2011.
P. Van Mieghem, J. Omic, and R. Kooij. Virus spread in networks. IEEE/ACM Transactions on Networking, 17
(1):114, 2009.
F. Varela, J. P. Lachaux, E. Rodriguez, and J. Martinerie. The brainweb: Phase synchronization and large-scale
integration. Nature Reviews Neuroscience, 2(4):229239, 2001.
T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen, and O. Shochet. Novel type of phase transition in a system of self-driven particles. Physical Review Letters, 75(6-7):1226–1229, 1995.
V. Volterra. Variations and fluctuations of the number of individuals in animal species living together. ICES
Journal of Marine Science, 3(1):351, 1928.
T. J. Walker. Acoustic synchrony: two mechanisms in the snowy tree cricket. Science, 166(3907):891894,
1969.
G. G. Walter and M. Contreras. Compartmental Modeling with Networks. Birkhäuser, 1999. ISBN 0817640193.
J. Wang and N. Elia. Control approach to distributed optimization. In Allerton Conf. on Communications,
Control and Computing, pages 557561, Monticello, IL, USA, 2010.
Y. Wang, D. Chakrabarti, C. Wang, and C. Faloutsos. Epidemic spreading in real networks: An eigenvalue
viewpoint. In IEEE Int. Symposium on Reliable Distributed Systems, pages 2534, Oct. 2003.
A. Watton and D. W. Kydon. Analytical aspects of the N -bug problem. American Journal of Physics, 37(2):
220221, 1969.
J. T. Wen and M. Arcak. A unifying passivity framework for network flow control. IEEE Transactions on
Automatic Control, 49(2):162174, 2004.
A. T. Winfree. Biological rhythms and the behavior of populations of coupled oscillators. Journal of
Theoretical Biology, 16(1):1542, 1967.
W. Xia and M. Cao. Sarymsakov matrices and asynchronous implementation of distributed coordination
algorithms. IEEE Transactions on Automatic Control, 59(8):22282233, 2014.
L. Xiao and S. Boyd. Fast linear iterations for distributed averaging. Systems & Control Letters, 53:6578,
2004.
L. Xiao, S. Boyd, and S. Lall. A scheme for robust distributed sensor fusion based on average consensus.
In Symposium on Information Processing of Sensor Networks, pages 6370, Los Angeles, CA, USA, Apr.
2005.
L. Xiao, S. Boyd, and S.-J. Kim. Distributed average consensus with least-mean-square deviation. Journal of
Parallel and Distributed Computing, 67(1):3346, 2007.
R. A. York and R. C. Compton. Quasi-optical power combining using mutually synchronized oscillator
arrays. IEEE Transactions on Microwave Theory and Techniques, 39(6):10001009, 2002.
M. Youssef and C. Scoglio. An individual-based approach to SIR epidemics in contact networks. Journal of
Theoretical Biology, 283(1):136144, 2011.
W. Yu, G. Chen, and M. Cao. Some necessary and sufficient conditions for second-order consensus in
multi-agent dynamical systems. Automatica, 46(6):10891095, 2010.
S. Zampieri. Lecture Notes on Dynamics over Networks. Minicourse at UC Santa Barbara, Apr. 2013.
D. Zelazo. Graph-Theoretic Methods for the Analysis and Synthesis of Networked Dynamic Systems. PhD
thesis, University of Washington, 2009.
D. Zelazo and M. Mesbahi. Edge agreement: Graph-theoretic performance bounds and passivity analysis.
IEEE Transactions on Automatic Control, 56(3):544555, 2011.
Y. Zhang and Y. P. Tian. Consentability and protocol design of multi-agent systems with stochastic switching
topology. Automatica, 45:11951201, 2009.
J. Zhu, Y. Tian, and J. Kuang. On the general consensus protocol of multi-agent systems with double-
integrator dynamics. Linear Algebra and its Applications, 431(5-7):701715, 2009.