

Model Order Reduction

Partial Report
Mateus Antunes Oliveira Leite

Partial report containing the theory of model order reduction

1. Topics in Linear Algebra

In order to introduce the concepts of model order reduction in a self-contained manner, some
important topics in linear algebra are presented.

Vector basis
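Any vector in an n-dimensional space can be represented as a linear combination of n linearly independent base vectors. In Equation (1), $b_i$ represents the base vector with index i, $\alpha_i$ is its coefficient and $v$ is the vector being represented.
$$v = \alpha_1 b_1 + \alpha_2 b_2 + \dots + \alpha_n b_n \tag{1}$$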


In matrix notation, this relation can be expressed as in Equation (2). The columns of the matrix $B$ are the base vectors and the vector $\alpha$ contains the coefficients of each of the base vectors. The vector $v$ is said to be in the column space of $B$.
$$v = B \alpha \tag{2}$$


It must be pointed out that vectors are geometric objects and thus are independent of the basis one chooses to represent them. The vectors $v$ and $\alpha$ can be regarded as the same geometric vector expressed in different bases. Therefore, Equation (2) can be regarded as a change of basis. Since the columns of $B$ are linearly independent, the matrix is nonsingular and its inverse exists. This allows the inverse mapping to be expressed as in Equation (3).
$$\alpha = B^{-1} v \tag{3}$$


Projection
In this section, a simple 3-dimensional example of projection is presented. However, the derived results are general and can be applied directly to an arbitrary number of dimensions.



Figure 1 - Projection into a 2-dimensional space

Figure 1 represents the projection of the vector $v$ into the subspace $V$. The resulting, or projected, vector is $p$, and $e$ is the difference between these two quantities. The subspace is represented by the matrix $V$, which can be constructed by arranging the base vectors $v_1$ and $v_2$ as its columns. Thus, any vector lying in this subspace can be represented by a relation similar to Equation (2). The relation among the vectors $v$, $p$ and $e$ is given by Equation (4).
$$v = p + e \tag{4}$$


The fact that $p$ is in the column span of $V$, that is, $p = V c$ for some coefficient vector $c$, allows writing Equation (4) as in Equation (5).
$$v = V c + e \tag{5}$$


To determine the vector $c$, one should choose a subspace $W$ orthogonal to $e$, so that $W^T e = 0$. Multiplying Equation (5) by $W^T$ then gives an explicit expression for $c$, as shown in Equation (6).

$$c = (W^T V)^{-1} W^T v \tag{6}$$


Therefore, the projected vector can be obtained by Equation (7).

$$p = V (W^T V)^{-1} W^T v \tag{7}$$


The linear operator representing the oblique projection of $v$ into $V$, whose error is orthogonal to $W$, can be extracted directly from this relation as expressed in Equation (8).

$$P_{V,W} = V (W^T V)^{-1} W^T \tag{8}$$
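The projector of Equation (8) can be checked numerically. The following is a minimal NumPy sketch, not part of the original report; the matrices `V` and `W`, the vector `v` and the helper name `oblique_projector` are hypothetical choices made only for illustration.

```python
import numpy as np

def oblique_projector(V, W):
    """Return P = V (W^T V)^{-1} W^T, as in Equation (8)."""
    # Solve (W^T V) X = W^T instead of forming the inverse explicitly.
    return V @ np.linalg.solve(W.T @ V, W.T)

# Hypothetical data: V spans the x-y plane of a 3-dimensional space.
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
v = np.array([1.0, 2.0, 3.0])

P_oblique = oblique_projector(V, W)
p_oblique = P_oblique @ v          # lies in span(V)
e_oblique = v - p_oblique          # orthogonal to the columns of W

P_orth = oblique_projector(V, V)   # W = V gives the orthogonal projector
p_orth = P_orth @ v
```

In this example the oblique error satisfies `W.T @ e_oblique = 0`, while the orthogonal projection simply drops the third component of `v`.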


If the subspaces $V$ and $W$ are the same, this projection is called an orthogonal projection. This kind of projection has a very important relation to the following optimization problem: find a vector $p$ lying in the subspace $V$ that best approximates the vector $v$. In the current context, the best approximation is obtained when the square of the length of the error is minimized. Using the law of cosines, the squared length of the error vector represented in Figure 1 is given by Equation (9).
$$\|e\|^2 = v^T v - 2 v^T p + p^T p \tag{9}$$


Using the fact that $p = V c$ is in the column space of $V$, the above relation can be written as in Equation (10).
$$\|e\|^2 = v^T v - 2 v^T V c + (V c)^T V c \tag{10}$$


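Taking the gradient of the right-hand side with respect to $c$ and imposing the condition that it is zero, one can isolate the vector $c$. This process is shown in (11). See the appendix for the properties used.

$$\begin{aligned}
\nabla_c \left( v^T v - 2 v^T V c + (V c)^T V c \right) &= 0 \\
-\nabla_c \left( 2 v^T V c \right) + \nabla_c \left[ (V c)^T V c \right] &= 0 \\
-2 V^T v + 2 V^T V c &= 0 \\
V^T V c &= V^T v \\
c &= (V^T V)^{-1} V^T v
\end{aligned} \tag{11}$$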
Therefore, the vector that minimizes the squared modulus of the error is given by Equation (12).

$$p = V (V^T V)^{-1} V^T v \tag{12}$$


Comparing Equations (8) and (12), one can conclude that the optimal approximation vector is just the orthogonal projection into the subspace $V$.
A very important application of this fact, used extensively in model order reduction, is the approximate solution of overdetermined systems. As an example, imagine that one wants to solve the linear system of Equation (13) with $A \in \mathbb{R}^{n \times m}$, $x \in \mathbb{R}^m$, $b \in \mathbb{R}^n$ and n > m.
$$A x = b \tag{13}$$


This problem is overdetermined and can be interpreted geometrically as trying to find the representation of the vector $b$ using the columns of $A$ as the base vectors. It is very unlikely that this task is possible, since n base vectors are needed to span the totality of $\mathbb{R}^n$ but only m base vectors are available. Therefore, the available basis represents a subspace of dimension m embedded in a higher, n-dimensional space. One approach to solve this problem approximately is to project the vector $b$ into the column span of $A$. This results in the relation shown in Equation (14).

$$A x = A (A^T A)^{-1} A^T b \tag{14}$$


One can multiply both sides of the above equation by $A^T$ to obtain Equation (15).
$$A^T A x = A^T b \tag{15}$$


Therefore, it is possible to conclude that to obtain the closest approximation to Equation (13), in the
sense of least squares, it is only necessary to multiply the system of equations by the transpose of
the coefficient matrix.
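As an illustration of Equation (15), the sketch below solves a small overdetermined system through the normal equations and compares the result with NumPy's least-squares routine. The data in `A` and `b` are hypothetical; for ill-conditioned problems a QR or SVD based solver is usually preferred over forming $A^T A$.

```python
import numpy as np

# Hypothetical overdetermined system: 3 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations, Equation (15): A^T A x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reference solution from the library least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# The residual is orthogonal to the column span of A.
residual = b - A @ x_normal
```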

Similarity Transformation
Matrices may be used to represent linear operators; an example of this is the projection operator. If two bases for the same linear space are related by a change of coordinates as shown by Equation (16), it is natural to wonder how to transform the operator between coordinate systems. One may start with Equation (17), which represents the application of a transformation $A$ to a vector $x$.
$$x = T \hat{x}, \qquad y = T \hat{y} \tag{16}$$
$$y = A x \tag{17}$$



Substituting Equation (16) into (17), the relation for $\hat{y}$ and the transformed operator $B$ may be obtained. This is shown in Equation (18).

$$\hat{y} = T^{-1} A T \hat{x} = B \hat{x} \tag{18}$$


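This type of relation has a very important property: the eigenvalues of A are the same as those of B. This can be easily shown by the development in (19).
$$|T^{-1} A T - \lambda I| = |T^{-1} A T - \lambda T^{-1} T| = |T^{-1}| \, |A - \lambda I| \, |T| = |A - \lambda I| \tag{19}$$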


In a similar fashion, one may wonder what happens to the eigenvalues under a transformation of the kind indicated in Equation (20).
$$B = V^T A V \tag{20}$$


For this analysis the focus is not to show that the eigenvalues are the same, which they are not, but to demonstrate that their signs are preserved. This is very important because, as will be explained later in this document, the sign of the eigenvalues determines whether the system is stable.
If A is symmetric and the system is stable, then A is negative definite and can be decomposed as written in Equation (21), with $L$ nonsingular.
$$A = -L L^T \tag{21}$$


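This allows proving that the reduced matrix is negative definite, as shown in (22) for any $x \neq 0$ when $V$ has full column rank. For a general A matrix, there is no guarantee that the resulting reduced system will be stable.
$$x^T B x = x^T V^T A V x = -x^T V^T L L^T V x = -(L^T V x)^T (L^T V x) = -\|L^T V x\|^2 < 0 \tag{22}$$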
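Both properties can be observed numerically. The sketch below is only an illustration under assumed data: `A` is a randomly generated symmetric negative definite matrix, `T` is an arbitrary nonsingular matrix and `V` is an arbitrary full-column-rank matrix; none of them come from the report.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 6, 3

# Symmetric negative definite A (a stable system in the sense used above).
M = rng.standard_normal((n, n))
A = -(M @ M.T) - np.eye(n)

# Similarity transformation with a nonsingular T (unit upper triangular).
T = np.eye(n) + np.triu(np.ones((n, n)), k=1)
B_sim = np.linalg.solve(T, A @ T)            # T^{-1} A T
eig_A = np.sort(np.linalg.eigvalsh(A))
eig_B = np.sort(np.linalg.eigvals(B_sim).real)

# Reduction of the kind V^T A V with a full-column-rank V.
V = rng.standard_normal((n, q))
A_red = V.T @ A @ V
eig_red = np.linalg.eigvalsh(A_red)          # all negative, but not eigenvalues of A
```

The similarity transformation reproduces the spectrum of `A`, while the reduced matrix only keeps the sign of the eigenvalues.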


Singular Value Decomposition

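Any matrix $A \in \mathbb{C}^{m \times n}$ may be decomposed as a product of two unitary matrices (square matrices with orthonormal columns) and a diagonal matrix, as shown in Equation (23).
$$A = U \Sigma V^* \tag{23}$$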


To determine these matrices, one may write the product of Equation (24).
$$A A^* = U \Sigma V^* V \Sigma^* U^* = U \Sigma \Sigma^* U^* \tag{24}$$


This leads to the conclusion that the matrix U is built from the eigenvectors of the matrix AA*, and each of the diagonal entries of $\Sigma$ is the square root of a nonzero eigenvalue of AA*. A similar argument shows that the matrix V is composed of the eigenvectors of the matrix A*A. The first r columns of the matrix U, where r is the rank of A (r = n when A has full column rank), are an orthonormal basis for the range of A.

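This decomposition allows writing any matrix as a sum of rank-1 products, where $\sigma_i$ are the singular values and $u_i$, $v_i$ are the columns of U and V. This is written in Equation (25). It is possible to show that truncating this series after the r-th term leads to the best rank-r approximation of the matrix A.
$$A = \sigma_1 u_1 v_1^* + \sigma_2 u_2 v_2^* + \dots + \sigma_r u_r v_r^* + \dots \tag{25}$$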
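The truncation of Equation (25) can be sketched with `numpy.linalg.svd`. The matrix below is random and purely illustrative. The check at the end uses the known fact that the spectral-norm error of the rank-r truncation equals the first discarded singular value.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))      # hypothetical data matrix

# Economy-size SVD: A = U diag(s) Vh, with singular values in descending order.
U, s, Vh = np.linalg.svd(A, full_matrices=False)

r = 2
# Sum of the first r rank-1 terms sigma_i u_i v_i^*.
A_r = (U[:, :r] * s[:r]) @ Vh[:r, :]

# Spectral-norm error of the truncation.
err = np.linalg.norm(A - A_r, 2)
```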


2. Model Order Reduction by Projection

State Space Representation
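The representation of linear time-invariant systems used in this document is given by Equations (26) and (27). In these equations, $E, A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times m}$, $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ and $y \in \mathbb{R}^p$. The vector x is the state vector, y is the output vector and u is the input vector.
$$E \dot{x} = A x + B u \tag{26}$$
$$y = C x + D u \tag{27}$$
It must be pointed out that Equations (26) and (27) are not coupled: the output y does not appear in the state equation.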


For now, the matrix D is considered to be zero. If this is not the case in a particular application, the reasoning presented below can be easily adapted.
If the matrix E is not singular, it is possible to write the system in a simpler form as shown in (28).
$$\dot{x} = E^{-1} A x + E^{-1} B u = \tilde{A} x + \tilde{B} u \tag{28}$$


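Many textbooks present this formulation as the standard form for the state space system. However, problems arise when E is singular. This indicates the presence of algebraic states that have to be treated explicitly. The first step is to write Equation (26) as in Equation (29).

$$\begin{bmatrix} E_{11} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u \tag{29}$$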

To obtain this form, permutations of the rows and columns of the E matrix should be followed by equivalent permutations of A and B. To achieve this, one may use a permutation matrix P to create the transformation of Equation (30).
$$x = P \tilde{x} \tag{30}$$


Accordingly, the transformed system is given by Equation (31).

$$P^T E P \dot{\tilde{x}} = P^T A P \tilde{x} + P^T B u \tag{31}$$


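At this stage, provided that $A_{22}$ is nonsingular, it is possible to apply a Kron reduction to Equation (29) to obtain (32).
$$E_{11} \dot{x}_1 = (A_{11} - A_{12} A_{22}^{-1} A_{21}) x_1 + (B_1 - A_{12} A_{22}^{-1} B_2) u \tag{32}$$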


This representation is a reduced state space equation that can be brought directly to the standard form. Thus, writing the system as in Equation (26) or as in Equation (28) is interchangeable, and whichever formulation is more convenient may be used.
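The Kron reduction of Equation (32) can be sketched as follows. The function name `kron_reduce` and the tiny two-state example are hypothetical; the sketch assumes the system is already permuted into the block form of Equation (29) and that the block $A_{22}$ is nonsingular.

```python
import numpy as np

def kron_reduce(E11, A, B, n1):
    """Eliminate the algebraic states x2 of a system in the form of Equation (29).

    n1 is the number of dynamic states; A22 must be nonsingular.
    """
    A11, A12 = A[:n1, :n1], A[:n1, n1:]
    A21, A22 = A[n1:, :n1], A[n1:, n1:]
    B1, B2 = B[:n1], B[n1:]
    A_red = A11 - A12 @ np.linalg.solve(A22, A21)
    B_red = B1 - A12 @ np.linalg.solve(A22, B2)
    return E11, A_red, B_red

# Hypothetical example with one dynamic and one algebraic state.
E11 = np.array([[2.0]])
A = np.array([[-1.0, 1.0],
              [1.0, -2.0]])
B = np.array([[1.0],
              [2.0]])

E_red, A_red, B_red = kron_reduce(E11, A, B, n1=1)

# Standard form of Equation (28), since E11 is nonsingular.
A_std = np.linalg.solve(E_red, A_red)
B_std = np.linalg.solve(E_red, B_red)
```

For this example the algebraic equation gives $x_2 = (x_1 + 2u)/2$, and substituting it by hand into the first row reproduces the reduced coefficients.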

Observability and Reachability

Two quantities are very important for characterizing the system. Suppose a system written as in Equation (28), with state matrix $A$ and input matrix $B$; its response to an arbitrary input signal is given by Equation (33).

$$x(t) = e^{A (t - t_0)} x(t_0) + \int_{t_0}^{t} e^{A (t - \tau)} B u(\tau) \, d\tau \tag{33}$$


If at the beginning of the simulation the system has zero initial conditions, it is natural to ask which input signal u(t) may be used to drive the system to a given state. There may be an infinite number of signals that accomplish this task. To reduce the number of choices, one may require the signal to be optimal in the sense that it drives the system to the desired state using the minimum amount of energy. For simplicity, but without any loss of generality, the initial time is set to zero. The equation that describes the state evolution of this system is given by (34).

$$x(t) = \int_0^t e^{A (t - \tau)} B u(\tau) \, d\tau \tag{34}$$


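If one chooses the input function as in Equation (35), the above identity is satisfied with the least amount of energy [1]. The matrix $W_c$, which is called the controllability Gramian, is defined by Equation (36). Some authors use $t = \infty$ to define this quantity and they call it the infinite Gramian.

$$u(\tau) = B^T e^{A^T (t - \tau)} W_c^{-1}(t) \, x(t) \tag{35}$$

$$W_c(t) = \int_0^t e^{A (t - \tau)} B B^T e^{A^T (t - \tau)} \, d\tau = \int_0^t e^{A \tau} B B^T e^{A^T \tau} \, d\tau \tag{36}$$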


The energy in this signal can be written as in (37), where $x_0 = x(t)$ denotes the state to be reached. Its physical interpretation leads to the conclusion that the controllability Gramian measures how hard it is to reach a certain state. In the framework of model order reduction, states that are difficult to reach are good candidates for truncation.

$$E_c = \int_0^t u^T(\tau) u(\tau) \, d\tau = x_0^T W_c^{-1} x_0 \tag{37}$$


The above discussion involved solely the states of the system and ignored its output. A dual concept called observability can be derived if one analyses the amount of energy that a certain state delivers to the output if no input signal is present. The system response in this scenario, for an initial state $x_0$, is given by Equation (38).
$$y(t) = C e^{A t} x_0 \tag{38}$$


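The energy of this signal, the integral of $y^T y$, can be calculated in the same way as in Equation (37). The result is given by Equation (39), which is also the definition of the observability Gramian $W_o$.

$$E_o = x_0^T \left( \int_0^t e^{A^T \tau} C^T C e^{A \tau} \, d\tau \right) x_0 = x_0^T W_o x_0 \tag{39}$$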


Controllability and observability are dual concepts and their relation is shown in Table 1.
Table 1 - Duality relation between controllability and observability

Controllability    Observability
$A$                $A^T$
$B$                $C^T$
$W_c$              $W_o$
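The Gramians and the two energies can be illustrated for a system whose state matrix is diagonal, because then $e^{A\tau}$ is simply a diagonal matrix of exponentials. The sketch below is an assumption-laden example: the two-state system, the helper `finite_gramian` and the trapezoidal quadrature are illustrative choices, and the observability Gramian is obtained by the substitutions $A \to A^T$, $B \to C^T$.

```python
import numpy as np

a = np.array([-1.0, -2.0])          # diagonal of a hypothetical stable A
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5]])

def finite_gramian(a, M, t_end, steps=20001):
    """Trapezoidal approximation of int_0^t_end e^{A tau} M M^T e^{A^T tau} dtau
    for A = diag(a)."""
    tau = np.linspace(0.0, t_end, steps)
    # e^{A tau} M for diagonal A: row i of M is scaled by exp(a_i * tau).
    F = np.exp(np.outer(tau, a))[:, :, None] * M[None, :, :]
    integrand = F @ np.transpose(F, (0, 2, 1))
    dt = tau[1] - tau[0]
    return dt * (integrand[0] / 2 + integrand[1:-1].sum(axis=0) + integrand[-1] / 2)

# A long horizon approximates the infinite Gramians.
Wc = finite_gramian(a, B, t_end=40.0)
Wo = finite_gramian(a, C.T, t_end=40.0)   # dual system: A is diagonal, so A^T = A

x0 = np.array([1.0, 0.0])
E_c = x0 @ np.linalg.solve(Wc, x0)   # minimum energy needed to reach x0
E_o = x0 @ Wo @ x0                   # output energy released from x0
```

For this diagonal case the infinite Gramians have the closed form $-(b_i b_j)/(a_i + a_j)$, which gives a convenient check of the quadrature.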

Reduction by Projection
If n is large, the burden of solving Equation (26) is very high or even prohibitive. Therefore, a reduction of the system may be used to allow a faster solution with an acceptable loss of accuracy. As presented in Equation (2), any two bases for $\mathbb{R}^n$ can be related by a transformation matrix. The transformation matrix can be partitioned into two smaller matrices, allowing one to obtain Equation (40). V contains the first q base vectors and U the last n - q.
$$x = V x_r + U x_u \tag{40}$$


Then, Equation (26) may be written as in (41).

$$E V \dot{x}_r = A V x_r + B u + A U x_u - E U \dot{x}_u \tag{41}$$


One may choose the basis in a way that the vectors of V are more important in representing the dynamics of the system than the vectors in U. Thus, the last two terms of the right-hand side of (41) can be interpreted as an error term $\varepsilon$.
$$E V \dot{x}_r = A V x_r + B u + \varepsilon \tag{42}$$


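The above system in $x_r$ is overdetermined: it has n equations but only q unknowns. The same process used in Equations (13), (14) and (15) can be used to allow the solution of this equation, multiplying it by the transpose of a matrix $W \in \mathbb{R}^{n \times q}$. This is shown in (43).
$$W^T E V \dot{x}_r = W^T A V x_r + W^T B u + W^T \varepsilon \tag{43}$$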


Until this point, the relation is exact and does not lead to any computational improvement. However, if the last term of the right-hand side of (43) is neglected, one can write the equations for the reduced system. This is shown in Equations (44) and (45).
$$W^T E V \dot{x}_r = W^T A V x_r + W^T B u \tag{44}$$
$$y = C V x_r \tag{45}$$



In the light of projection matrices, the above equations represent the projection of the system into the subspace spanned by V, with an error orthogonal to W. The above set of equations allows the solution of an approximation to the original system but with a reduced order.
For simplicity, a relation between the original system and reduced system matrices can be made. This allows writing the reduced system in standard form. This relation is presented in Table 2.
Table 2 - Reduced System Coefficient Matrices

Original System    Reduced System
$E$                $E_r = W^T E V$
$A$                $A_r = W^T A V$
$B$                $B_r = W^T B$
$C$                $C_r = C V$

The reduced system can be written as in Equations (46) and (47).

$$E_r \dot{x}_r = A_r x_r + B_r u \tag{46}$$
$$y = C_r x_r \tag{47}$$



To deal with initial conditions, one could use the projection of the initial condition vector into the
span of the reduced basis.
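The construction of the reduced matrices and the projection of an initial condition can be sketched as below. Everything in this block is an assumed example: the system is a random single-input, single-output system with a diagonal stable A, and the basis V is an arbitrary orthonormal matrix, chosen only to show the mechanics rather than to give a good approximation.

```python
import numpy as np

def reduce_system(E, A, B, C, V, W):
    """Reduced matrices W^T E V, W^T A V, W^T B and C V."""
    return W.T @ E @ V, W.T @ A @ V, W.T @ B, C @ V

def transfer(E, A, B, C, s):
    """Evaluate C (sE - A)^{-1} B at a single complex frequency s."""
    return C @ np.linalg.solve(s * E - A, B)

rng = np.random.default_rng(2)
n, q = 6, 2
E = np.eye(n)
A = -np.diag(np.arange(1.0, n + 1))        # hypothetical stable system
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

# Arbitrary orthonormal basis, used for both V and W.
V, _ = np.linalg.qr(rng.standard_normal((n, q)))
W = V

Er, Ar, Br, Cr = reduce_system(E, A, B, C, V, W)
H_red = transfer(Er, Ar, Br, Cr, 1.0j)     # reduced transfer function at s = j

# Projection of an initial condition into the span of V.
x0 = rng.standard_normal(n)
xr0 = np.linalg.solve(V.T @ V, V.T @ x0)
```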

Invariance of the Transfer Function

In the frequency domain, the transfer function of the system is given by Equation (48). This is valid for both the full and the reduced system if one keeps in mind the relations of Table 2.
$$H(s) = C (s E - A)^{-1} B \tag{48}$$


If, instead of using the matrices V and W for the projection, one uses different matrices with the same column span, the resulting transfer function is unchanged. If the matrices K and L are nonsingular, the matrices $\tilde{V}$ and $\tilde{W}$, defined in Equations (49) and (50), have the same column span as V and W, respectively.
$$\tilde{V} = V K^{-1} \tag{49}$$
$$\tilde{W} = W L^{-1} \tag{50}$$



Substituting these matrices into Equation (48), Equation (51) is obtained.

$$\tilde{H}_r(s) = C V K^{-1} \left( s (W L^{-1})^T E V K^{-1} - (W L^{-1})^T A V K^{-1} \right)^{-1} (W L^{-1})^T B \tag{51}$$


Developing the above expression, the factors $L^{-T}$ and $K^{-1}$ can be pulled out of the inverse, where they cancel against the outer factors, so the transfer function of the transformed reduced system is found to be identical to that of the original reduced system. Therefore, it may be concluded that the transfer function is invariant under a change of basis. The important aspect of the matrices V and W is their column span.
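This invariance is easy to verify numerically. The sketch below uses randomly generated matrices and two fixed nonsingular matrices `K` and `L`, all of which are hypothetical; it evaluates the reduced transfer function at one frequency with both pairs of projection matrices.

```python
import numpy as np

def reduced_transfer(E, A, B, C, V, W, s):
    """Transfer function of the system projected with V and W, at frequency s."""
    Er, Ar = W.T @ E @ V, W.T @ A @ V
    return (C @ V) @ np.linalg.solve(s * Er - Ar, W.T @ B)

rng = np.random.default_rng(3)
n, q = 6, 3
E = np.eye(n) + 0.1 * rng.standard_normal((n, n))
A = -np.diag(np.arange(1.0, n + 1)) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))
V = rng.standard_normal((n, q))
W = rng.standard_normal((n, q))

# Fixed nonsingular matrices (determinants 9 and 1).
K = np.array([[2.0, 1.0, 0.0], [0.0, 2.0, 1.0], [1.0, 0.0, 2.0]])
L = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]])
V_tilde = V @ np.linalg.inv(K)     # same column span as V
W_tilde = W @ np.linalg.inv(L)     # same column span as W

s = 2.0j
H_original = reduced_transfer(E, A, B, C, V, W, s)
H_transformed = reduced_transfer(E, A, B, C, V_tilde, W_tilde, s)
```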

Balanced Truncation
The observability and reachability of each state depend on the system realization. If one applies a change of basis $x = T \tilde{x}$, represented by the matrix T, the transformed Gramians are given by (52).
$$\tilde{W}_c = T^{-1} W_c T^{-T}, \qquad \tilde{W}_o = T^T W_o T \tag{52}$$


It is possible to choose the matrix T such that both Gramians are equal and diagonal. This can be
achieved by a singular value decomposition of the product (53).


In order to reduce the system, the states that have small associated diagonal values are truncated. The drawback of this method is the cost of calculating the Gramians, which is of the order of $n^3$ [2]. The standard way to do this is to solve the Lyapunov equations given by (54). This fact can be verified by direct substitution of the definition of the infinite Gramians.
$$A W_c + W_c A^T + B B^T = 0$$
$$A^T W_o + W_o A + C^T C = 0 \tag{54}$$
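A small balanced truncation can be sketched with NumPy alone. This is an illustrative sketch under several assumptions that are not taken from the report: E is the identity, the Lyapunov equations are solved with Kronecker products (only reasonable for small n), the balancing transformation follows the common square-root construction based on the SVD of the product of Cholesky factors of the Gramians, and the four-state system is hypothetical.

```python
import numpy as np

def solve_lyapunov(A, Q):
    """Solve A X + X A^T + Q = 0 with Kronecker products (small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A, I) + np.kron(I, A)
    X = np.linalg.solve(M, -Q.reshape(-1)).reshape(n, n)
    return (X + X.T) / 2               # symmetrize against round-off

def balance(A, B, C):
    """Return the Gramians, the Hankel singular values and T, T^{-1}."""
    Wc = solve_lyapunov(A, B @ B.T)
    Wo = solve_lyapunov(A.T, C.T @ C)
    Lc = np.linalg.cholesky(Wc)        # Wc = Lc Lc^T
    Lo = np.linalg.cholesky(Wo)        # Wo = Lo Lo^T
    U, hsv, Zh = np.linalg.svd(Lo.T @ Lc)
    T = (Lc @ Zh.T) / np.sqrt(hsv)     # Lc Z Sigma^{-1/2}
    Tinv = (U / np.sqrt(hsv)).T @ Lo.T # Sigma^{-1/2} U^T Lo^T
    return Wc, Wo, hsv, T, Tinv

# Hypothetical stable, controllable and observable system.
A = -np.diag([1.0, 2.0, 3.0, 4.0])
B = np.ones((4, 1))
C = np.array([[1.0, -1.0, 1.0, -1.0]])

Wc, Wo, hsv, T, Tinv = balance(A, B, C)

q = 2                                  # number of states kept
Ar = (Tinv @ A @ T)[:q, :q]
Br = (Tinv @ B)[:q]
Cr = (C @ T)[:, :q]

def transfer(A, B, C, s):
    return (C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)).item()

error = abs(transfer(A, B, C, 1.0j) - transfer(Ar, Br, Cr, 1.0j))
bound = 2.0 * hsv[q:].sum()            # classical a priori error bound
```

In the balanced coordinates both Gramians equal the diagonal matrix of Hankel singular values, and the error at any frequency stays below twice the sum of the discarded values.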


Proper Orthogonal Decomposition

The development of this section is largely inspired by [3]. The interested reader may consult this work for further discussion.
The proper orthogonal decomposition aims at finding a projection operator $\Pi$ of fixed rank that minimizes the quadratic error created by the simulation of a lower order system. For the continuous-time case, with the state observed over the interval $[0, t_f]$, the quadratic error is given by Equation (55). We already know that one may truncate Equation (25) to solve this problem directly. However, the following development gives considerable insight into the problem structure.

$$J = \int_0^{t_f} \| x(t) - \Pi x(t) \|^2 \, dt \tag{55}$$


The integrand is the squared norm of the projection of the vector x into the space that is orthogonal to the projected one. As any vector can be written as the sum of its projection into a subspace and its projection into the orthogonal subspace, $\|x\|^2 = \|\Pi x\|^2 + \|x - \Pi x\|^2$, and minimizing the error is equivalent to maximizing the projection into the subspace itself. The quantity to be maximized is shown in Equation (56).

$$M = \int_0^{t_f} \| \Pi x(t) \|^2 \, dt \tag{56}$$


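Using Equation (12) and imposing the orthonormality of the basis, $V^T V = I$, so that $\Pi = V V^T$, the development in (57) is possible. In this equation, tr represents the trace.

$$M = \int_0^{t_f} \| \Pi x(t) \|^2 \, dt = \int_0^{t_f} x^T(t) V V^T x(t) \, dt = \operatorname{tr} \left( V^T \left( \int_0^{t_f} x(t) x^T(t) \, dt \right) V \right) \tag{57}$$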


This motivates the definition of the POD kernel given by Equation (58).

$$K = \int_0^{t_f} x(t) x^T(t) \, dt \tag{58}$$


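This development can be cast into an optimization problem whose Lagrangian is given by (59), where $v_i$ is the i-th column of V and $\delta_{ij}$ is the Kronecker delta.
$$\mathcal{L}(V, \lambda) = \operatorname{tr}(V^T K V) - \sum_{i,j} \lambda_{ij} \left( v_i^T v_j - \delta_{ij} \right) \tag{59}$$

It is possible to cast the orthogonality restriction as in Equation (60) if one defines the matrix $\Lambda$ as in (61).
$$\mathcal{L}(V, \Lambda) = \operatorname{tr}(V^T K V) - \operatorname{tr}(\Lambda^T V^T V) + \operatorname{tr}(\Lambda) \tag{60}$$

$$\Lambda_{i,j} = \lambda_{ij} \tag{61}$$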



Using the properties in the appendix, one may calculate the first order optimality condition with respect to V. Since K is symmetric and $\Lambda$ may be taken symmetric, this gives Equation (62).
$$K V = V \Lambda \tag{62}$$


This condition implies that V must span a subspace that is invariant under the application of the linear operator K. The first order optimality condition for the multipliers $\lambda_{ij}$ leads to the obvious restriction given by Equation (63).
$$V^T V = I \tag{63}$$


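Both of these conditions are satisfied if V is built with the eigenvectors of K. To calculate K in a digital computer, a discretization is necessary. Equation (64) shows a way to approximate the kernel from N samples of the state taken with time step $\Delta t$. The S matrix is defined in (65).

$$K = \int_0^{t_f} x(t) x^T(t) \, dt \approx \sum_{i=1}^{N} \left( \sqrt{\Delta t} \, x(t_i) \right) \left( \sqrt{\Delta t} \, x(t_i) \right)^T = S S^T \tag{64}$$

$$S = \sqrt{\Delta t} \, [\, x(t_1) \;\; x(t_2) \;\; \dots \;\; x(t_N) \,] \tag{65}$$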


As the kernel is an n-by-n matrix, the burden of computing its eigenvectors may be very high. However, it is possible to avoid this by applying the singular value decomposition $S = U \Sigma V^T$, where V here denotes the right singular vectors of S. This is shown in Equation (66).
$$K = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma^2 U^T \tag{66}$$


Therefore, the eigenvectors that we are looking for are the columns of the U matrix of the singular value decomposition of S. To compute U in an efficient way, the economy-size version of the SVD may be used. Mathematically, this is equivalent to computing the right singular vectors V by the eigenvalue decomposition of the product $S^T S$, which is much smaller than the kernel when there are far fewer samples than states, and then using Equation (67) to get only some eigenvectors. This does not alter the problem since the column span of the product $S V$ is unchanged by the multiplication by $\Sigma^{-1}$.
$$U = S V \Sigma^{-1} \tag{67}$$


It must be pointed out that this is only the mathematical description of the solution. In a computer implementation this procedure may be replaced by more efficient ones.
In this method, only the system states are taken into account. No information is obtained from the system output. This can be addressed by using a second system that satisfies the relations in Table 1, called the dual system. Using information from these two systems allows approximating the balanced truncation method at a lower cost.
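The two ways of obtaining the POD basis can be compared in a short sketch. The snapshot data below are synthetic and hypothetical: two fixed spatial shapes with time-varying amplitudes, so the snapshot matrix has rank two. The names `X`, `S`, `V_pod` and `U_snap` are illustrative only.

```python
import numpy as np

# Hypothetical snapshots: n states sampled at N instants with step dt.
n, N, dt = 50, 200, 1e-2
t = dt * np.arange(N)
grid = np.linspace(0.0, 1.0, n)
X = (np.outer(np.sin(np.pi * grid), np.exp(-t))
     + 0.3 * np.outer(np.sin(2 * np.pi * grid), np.cos(5 * t)))

S = np.sqrt(dt) * X                  # scaled snapshot matrix, so that K ~ S S^T

# Route 1: economy-size SVD of S, the POD basis is the leading columns of U.
U, sigma, Zh = np.linalg.svd(S, full_matrices=False)
q = 2
V_pod = U[:, :q]

# Route 2: eigen-decomposition of the small N-by-N matrix S^T S,
# followed by U = S Z Sigma^{-1} for the q largest eigenvalues.
lam, Z = np.linalg.eigh(S.T @ S)
idx = np.argsort(lam)[::-1][:q]
U_snap = (S @ Z[:, idx]) / np.sqrt(lam[idx])

# Relative projection error of the snapshots onto the POD subspace.
rel_err = np.linalg.norm(X - V_pod @ (V_pod.T @ X)) / np.linalg.norm(X)
```

Individual basis vectors may differ by a sign between the two routes, so the comparison is best made through the projectors `V_pod @ V_pod.T` and `U_snap @ U_snap.T`.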

Moment Matching
The transfer function in Equation (48) can be rewritten as in Equation (68).
$$H(s) = -C (I - s A^{-1} E)^{-1} A^{-1} B \tag{68}$$


The central term can be expanded to obtain a series representation of the transfer function. This is shown in (69).

$$H(s) = -\sum_{j=0}^{\infty} C (A^{-1} E)^j A^{-1} B \, s^j \tag{69}$$



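The negatives of the coefficients of the powers of s in this series, $m_j = C (A^{-1} E)^j A^{-1} B$, are called the moments of the transfer function. These moments are centered at zero frequency. To obtain the moments around another frequency $s_0$, the transfer function can be written as in Equation (70).

$$H(s) = C \left( (s - s_0) E - (A - s_0 E) \right)^{-1} B \tag{70}$$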


Direct comparison of Equations (48) and (70) allows determining the equivalences pointed out in Table 3.
Table 3 - Equivalence for decentered moments

Centered at Zero    Centered at $s_0$
$s$                 $s - s_0$
$A$                 $A - s_0 E$

Using these relations, one can directly write the moments for any frequency. This is written in Equation (71). This relation can be used to extend other results obtained for the zero-centered expression to an arbitrarily placed frequency.

$$H(s) = -\sum_{j=0}^{\infty} C \left[ (A - s_0 E)^{-1} E \right]^j (A - s_0 E)^{-1} B \, (s - s_0)^j \tag{71}$$



The moment matching technique aims at choosing the columns of V and W in a way that some of the moments of the transfer function are exactly matched. The zeroth moment of the reduced model is given by Equation (72).

$$m_{r0} = C V (W^T A V)^{-1} W^T B \tag{72}$$


If $A^{-1} B$ is in the column span of V, there exists a vector $r_0$ as shown in Equation (73).
$$V r_0 = A^{-1} B \tag{73}$$


Equation (72) can be written as in (74). Therefore, the zeroth moment of the reduced system is equal to the zeroth moment of the full system.

$$m_{r0} = C V (W^T A V)^{-1} W^T A V r_0 = C V r_0 = C A^{-1} B \tag{74}$$


The first moment for the reduced system is given by Equation (75).

$$m_{r1} = C V (W^T A V)^{-1} W^T E V (W^T A V)^{-1} W^T B \tag{75}$$


Using (73), Equation (75) can be reduced to (76).

$$m_{r1} = C V (W^T A V)^{-1} W^T E A^{-1} B \tag{76}$$


If $A^{-1} E A^{-1} B$ is in the column span of V, there exists a vector $r_1$ as shown in Equation (77).
$$V r_1 = A^{-1} E A^{-1} B \tag{77}$$


This relation can be used to write (76) as in Equation (78).

$$m_{r1} = C V (W^T A V)^{-1} W^T A V r_1 = C V r_1 = C A^{-1} E A^{-1} B \tag{78}$$


Therefore, if $A^{-1} B$ and $A^{-1} E A^{-1} B$ are in the column span of V, the zeroth and the first moments match. This process can be continued to match the successive moments. To express this result in a simple way, one may introduce the Krylov subspaces, defined in Equation (79).
$$\mathcal{K}(M, R, q) = \operatorname{span} [\, R, \; M R, \; M^2 R, \; \dots, \; M^{q-1} R \,] \tag{79}$$


Using this definition, q moments of the transfer function of the full system are matched if the columns of the matrix V span the subspace given by Equation (80).
$$V = \mathcal{K}(A^{-1} E, \; A^{-1} B, \; q) \tag{80}$$


It must be pointed out that nothing was imposed on the matrix W. A simple choice for this matrix is to make it equal to V. However, a very similar argument shows that W may be chosen in order to match even more moments: if W is chosen in accordance with Equation (81), then another q moments of the full system are matched.
$$W = \mathcal{K}(A^{-T} E^T, \; A^{-T} C^T, \; q) \tag{81}$$


In some special systems, e.g. impedance probing, both subspaces are the same. If A and E are symmetric and $C^T = B$, there is no need to calculate both subspaces.
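One-sided moment matching around zero frequency can be sketched as follows. The system is a hypothetical single-input, single-output example with a tridiagonal A, the helper names `krylov_basis` and `moments` are illustrative, and the Krylov vectors are orthonormalized with a QR factorization, which changes the basis but not its span. Production codes usually build the basis with an Arnoldi or Lanczos process instead.

```python
import numpy as np

def krylov_basis(M, R, q):
    """Orthonormal basis of span[R, M R, ..., M^(q-1) R]."""
    cols = [R]
    for _ in range(q - 1):
        cols.append(M @ cols[-1])
    Q, _ = np.linalg.qr(np.hstack(cols))
    return Q

def moments(E, A, B, C, count):
    """Moments C (A^{-1} E)^j A^{-1} B for j = 0 .. count-1 (SISO system)."""
    out = []
    X = np.linalg.solve(A, B)
    for _ in range(count):
        out.append((C @ X).item())
        X = np.linalg.solve(A, E @ X)
    return np.array(out)

# Hypothetical system: symmetric negative definite tridiagonal A.
n, q = 8, 3
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
E = np.diag(np.linspace(1.0, 2.0, n))
B = np.zeros((n, 1))
B[0, 0] = 1.0
C = np.ones((1, n)) / n

V = krylov_basis(np.linalg.solve(A, E), np.linalg.solve(A, B), q)
W = V                                   # simple one-sided choice

Er, Ar, Br, Cr = W.T @ E @ V, W.T @ A @ V, W.T @ B, C @ V

m_full = moments(E, A, B, C, q)
m_red = moments(Er, Ar, Br, Cr, q)      # agrees with m_full for the first q moments
```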

3. Circuit Simulation
This section is largely related to the work in [4].
Before introducing the linear equations for circuit simulation, some notation and important concepts are introduced. One way to model a circuit in the computer is by an abstraction called a graph. A graph, denoted by G(V, E), is a pair of sets: the set of vertices (V) and the set of edges (E), containing tuples of elements of V.
For the case of circuit representation, an edge can be thought of as a circuit component (resistor, capacitor, current source, etc.) and the nodes are the connection points. In this particular analysis, only resistors, capacitors, inductors and current sources will be considered. These components and their orientations are illustrated in Figure 2.

Figure 2 - Circuit elements with voltage and current orientations

Suppose the existence of an oriented graph representing the circuit. In order to translate this structure into matrix notation, one utilizes an edge-node incidence matrix $\bar{A} \in \mathbb{R}^{|V| \times |E|}$ with its elements satisfying $\bar{a}_{ij} \in \{-1, 0, 1\}$. Each column of this matrix is directly associated with an edge of the underlying graph and each row with a node. To build the matrix, each column must contain exactly one entry 1, one entry -1 and |V| - 2 zero entries. The nonzero elements must be placed according to the incidence of the edges on the nodes. The graph orientation is used to determine the sign of the entry.
From this matrix, it is possible to derive a reduced matrix by excluding the row representing the node of known potential, usually the ground node. This new matrix is denoted by $A$. If the edges of this matrix are arranged so that components of the same kind are side by side, some submatrices can be identified as shown in (82).
$$A = [\, A_s \;\; A_r \;\; A_c \;\; A_l \,] \tag{82}$$


Using these submatrices, it is possible to write Kirchhoff's current law for the circuit as shown in Equation (83). The vectors $i_r$, $i_c$, $i_l$ and $i_s$ contain the currents of each of the different types of circuit elements.
$$A_r i_r + A_c i_c + A_l i_l = A_s i_s \tag{83}$$


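Direct application of the terminal relations of resistors and capacitors leads to Equation (84), where $R$ and $C$ are the diagonal matrices of resistances and capacitances and $v_r$, $v_c$ are the element voltages.
$$A_r R^{-1} v_r + A_c C \dot{v}_c + A_l i_l = A_s i_s \tag{84}$$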


The relation between the voltage in each of the elements and the potential of each of the nodes of the circuit, $v_n$, is given by Equation (85). The index i is a placeholder and can be replaced by r, c, l or s.
$$A_i^T v_n = v_i \tag{85}$$


Using this relation, Equation (84) can be rewritten as in Equation (86).

$$A_r R^{-1} A_r^T v_n + A_c C A_c^T \dot{v}_n + A_l i_l = A_s i_s \tag{86}$$


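Writing the terminal voltages of the inductors, with $L$ the matrix of inductances, and using Equation (85), it is possible to write Equation (87). This allows obtaining a unified equation with the node voltages and the currents in the inductors as unknowns. This is presented in Equation (88).
$$A_l^T v_n = L \dot{i}_l \tag{87}$$

$$\begin{bmatrix} A_c C A_c^T & 0 \\ 0 & L \end{bmatrix} \frac{d}{dt} \begin{bmatrix} v_n \\ i_l \end{bmatrix} = \begin{bmatrix} -A_r R^{-1} A_r^T & -A_l \\ A_l^T & 0 \end{bmatrix} \begin{bmatrix} v_n \\ i_l \end{bmatrix} + \begin{bmatrix} A_s \\ 0 \end{bmatrix} i_s \tag{88}$$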


If formulated in this fashion, the right and left Krylov subspaces are equal. The same phenomenon happens with the original and dual systems in the Proper Orthogonal Decomposition. This allows calculating only one subspace while obtaining the precision achieved by two subspaces in the general case.
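The assembly of the system matrices from incidence matrices can be sketched for a very small circuit. The circuit below is hypothetical and is not the ladder of the next section: a current source feeding node 1, a resistor from node 1 to ground, an inductor from node 1 to node 2 and a capacitor from node 2 to ground. The sign convention (+1 at the first node of each edge, with source edges starting at the node that receives the current) is an assumption, since Figure 2 is not reproduced here.

```python
import numpy as np

def reduced_incidence(edges, n_nodes):
    """Incidence matrix without the ground row.

    Node 0 is ground; each edge is a pair (start, end) of node numbers.
    """
    A = np.zeros((n_nodes, len(edges)))
    for k, (start, end) in enumerate(edges):
        if start > 0:
            A[start - 1, k] = 1.0
        if end > 0:
            A[end - 1, k] = -1.0
    return A

n_nodes = 2                                   # non-ground nodes
A_r = reduced_incidence([(1, 0)], n_nodes)    # resistor: node 1 to ground
A_c = reduced_incidence([(2, 0)], n_nodes)    # capacitor: node 2 to ground
A_l = reduced_incidence([(1, 2)], n_nodes)    # inductor: node 1 to node 2
A_s = reduced_incidence([(1, 0)], n_nodes)    # source injecting into node 1

R = np.diag([50.0])       # ohm
Cap = np.diag([1e-12])    # farad
Ind = np.diag([1e-9])     # henry

n_l = A_l.shape[1]
E = np.block([[A_c @ Cap @ A_c.T, np.zeros((n_nodes, n_l))],
              [np.zeros((n_l, n_nodes)), Ind]])
A = np.block([[-A_r @ np.linalg.solve(R, A_r.T), -A_l],
              [A_l.T, np.zeros((n_l, n_l))]])
B = np.vstack([A_s, np.zeros((n_l, A_s.shape[1]))])
C_out = B.T               # voltage at the source node (impedance probing)

# Impedance seen by the source at zero frequency.
Z_dc = (C_out @ np.linalg.solve(-A, B)).item()
```

At zero frequency the inductor is a short circuit and the capacitor an open circuit, so the impedance seen by the source is the resistor value. The matrix E of this example is singular because node 1 has no capacitor attached, which is the situation handled by the Kron reduction discussed earlier.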

4. Application Example
The formulation developed in the last section is used to build an electric circuit as shown in Figure 3. This circuit consists basically of a current source in parallel with a capacitor that feeds a ladder of adjustable size containing a pattern of inductances, resistances and capacitances. The output is an ordinary resistor whose value is chosen independently of the others.

Figure 3 - RLC ladder

For this experiment a pattern as indicated in Table 4 was chosen. The Bode diagram for the
impedance measured at the terminals of the current source is plotted in Figure 4.

Table 4 - Value patterns for the RLC ladder

Input capacitor      1 pF
Output resistor
Resistor pattern
Inductor pattern     1 nH, 10 nH, 1H
Capacitor pattern    1 pF, 1 nF, 10 nF

Figure 4 - Bode diagram of the system

The resulting reduced-order model of order 20 is shown for the POD method in Figure 5. The sampling time was one nanosecond. Figure 6 shows the result for the Krylov method with 12 logarithmically distributed expansion points for the same system. Finally, Figure 7 shows the result for the Balanced Truncation method.

Figure 5 - POD Approximation

Figure 6 - Krylov Approximation

Figure 7 - Balanced Truncation

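5. Appendix - Mathematical development

Gradient property

For vector functions $a(x)$ and $b(x)$, where $\nabla_x a^T$ denotes the matrix with entries $\partial a_j / \partial x_i$:

$$\nabla_x (a^T b) = (\nabla_x a^T) \, b + (\nabla_x b^T) \, a$$

First gradient of the trace

$$\operatorname{tr}(V^T K V) = \sum_{k} \sum_{i,j} V_{ik} K_{ij} V_{jk}$$

$$\frac{\partial \operatorname{tr}(V^T K V)}{\partial V_{ab}} = \sum_{j} K_{aj} V_{jb} + \sum_{i} K_{ia} V_{ib} = (K V)_{ab} + (K^T V)_{ab}$$

$$\nabla_V \operatorname{tr}(V^T K V) = K V + K^T V$$

Second gradient of the trace

$$\operatorname{tr}(\Lambda^T V^T V) = \sum_{i,j} \sum_{k} \Lambda_{ij} V_{ki} V_{kj}$$

$$\frac{\partial \operatorname{tr}(\Lambda^T V^T V)}{\partial V_{ab}} = \sum_{j} \Lambda_{bj} V_{aj} + \sum_{i} \Lambda_{ib} V_{ai} = (V \Lambda^T)_{ab} + (V \Lambda)_{ab}$$

$$\nabla_V \operatorname{tr}(\Lambda^T V^T V) = V \Lambda + V \Lambda^T$$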