
Sparse Linear Algebra:

LU Factorization
Kristin Davies
Peter He
Feng Xie
Hamid Ghaffari

April 4, 2007
Outline
 Introduction to LU Factorization (Kristin)

 LU Transformation Algorithms (Kristin)

 LU and Sparsity (Peter)

 Simplex Method (Feng)

 LU Update (Hamid)
Introduction – Transformations – Sparsity – Simplex – Implementation

Introduction – What is LU Factorization?

 Matrix decomposition into the product of a lower and an upper triangular matrix:

A = LU

 Example for a 3x3 matrix:

\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix}
Introduction – LU Existence
 An invertible matrix has an LU factorization if and only if all of its leading principal minors are non-zero
 Recall: A is invertible if there exists B s.t. AB = BA = I
 Recall: the leading principal minors of a 3x3 matrix are

\det(a_{11}), \qquad \det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad \det\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}

Introduction – LU Unique Existence

 Imposing the requirement that the diagonal of either L or U must consist of ones results in a unique LU factorization:

\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix}

or

\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{pmatrix}

Introduction – Why LU Factorization?


 LU factorization is useful in numerical analysis
for:
– Solving systems of linear equations (AX=B)
– Computing the inverse of a matrix

 LU factorization is advantageous when there is


a need to solve a set of equations for many
different values of B

Transformation Algorithms
 Modified form of Gaussian elimination

 Doolittle factorization – L has 1’s on its diagonal
 Crout factorization – U has 1’s on its diagonal
 Cholesky factorization – U=LT or L=UT (requires A symmetric positive definite)

 Solution to AX=B is found as follows:
– Construct the matrices L and U (if possible)
– Solve LY=B for Y using forward substitution
– Solve UX=Y for X using back substitution

Transformations – Doolittle
 Doolittle factorization – L has 1’s on its diagonal
 General algorithm – determine rows of U from top to bottom; determine columns of L from left to right

for i=1:n
  for j=i:n
    Σk LikUkj = Aij gives row i of U
  end
  for j=i+1:n
    Σk LjkUki = Aji gives column i of L
  end
end
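The two nested loops above can be written directly in Python (a minimal sketch, no pivoting; NumPy is assumed, and the implicit sum over k becomes a dot product):

```python
import numpy as np

def doolittle(A):
    """Doolittle LU factorization without pivoting: L has 1's on its diagonal."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    for i in range(n):
        # Row i of U: solve sum_k L[i,k] U[k,j] = A[i,j], using L[i,i] = 1
        for j in range(i, n):
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        # Column i of L: solve sum_k L[j,k] U[k,i] = A[j,i]
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    return L, U

# The 3x3 example used on the following slides:
A = np.array([[ 2., -1., -2.],
              [-4.,  6.,  3.],
              [-4., -2.,  8.]])
L, U = doolittle(A)
```

On this example the sketch reproduces the factors derived on the next slides: L = [[1,0,0],[-2,1,0],[-2,-1,1]] and U = [[2,-1,-2],[0,4,-1],[0,0,3]].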

Transformations – Doolittle example

\begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix} \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix} = \begin{pmatrix} 2 & -1 & -2 \\ -4 & 6 & 3 \\ -4 & -2 & 8 \end{pmatrix}

Row 1 of U (i=1, j=i:n: Σk L1kUkj = A1j):

(1 \;\; 0 \;\; 0) \begin{pmatrix} u_{11} \\ 0 \\ 0 \end{pmatrix} = 2 \;\Rightarrow\; u_{11} = 2

Similarly, u_{12} = -1 and u_{13} = -2.

Transformations – Doolittle example

\begin{pmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{pmatrix} \begin{pmatrix} 2 & -1 & -2 \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix} = \begin{pmatrix} 2 & -1 & -2 \\ -4 & 6 & 3 \\ -4 & -2 & 8 \end{pmatrix}

Column 1 of L (i=1, j=i+1:n: Σk LjkUk1 = Aj1):

(l_{21} \;\; 1 \;\; 0) \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} = -4 \;\Rightarrow\; 2 l_{21} = -4 \;\Rightarrow\; l_{21} = -2

Similarly, l_{31} = -2.

Transformations – Doolittle example

Continuing alternately with rows of U and columns of L:

2nd row of U: u_{22} = 4, u_{23} = -1
2nd column of L: l_{32} = -1
3rd row of U: u_{33} = 3

Final factorization:

\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -2 & -1 & 1 \end{pmatrix} \begin{pmatrix} 2 & -1 & -2 \\ 0 & 4 & -1 \\ 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 2 & -1 & -2 \\ -4 & 6 & 3 \\ -4 & -2 & 8 \end{pmatrix}

Transformations – Doolittle example

 Execute the algorithm for our example (n=3):

i=1:
  j=1: Σ L1k Uk1 = A11 gives row 1 of U
  j=2: Σ L1k Uk2 = A12 gives row 1 of U
  j=3: Σ L1k Uk3 = A13 gives row 1 of U
  j=1+1=2: Σ L2k Uk1 = A21 gives column 1 of L
  j=3: Σ L3k Uk1 = A31 gives column 1 of L
i=2:
  j=2: Σ L2k Uk2 = A22 gives row 2 of U
  j=3: Σ L2k Uk3 = A23 gives row 2 of U
  j=2+1=3: Σ L3k Uk2 = A32 gives column 2 of L
i=3:
  j=3: Σ L3k Uk3 = A33 gives row 3 of U

Transformations – Crout
 Crout factorization – U has 1’s on its diagonal
 General algorithm – determine columns of L from left to right; determine rows of U from top to bottom (note the symmetry with Doolittle!)

for i=1:n
  for j=i:n
    Σk LjkUki = Aji gives column i of L
  end
  for j=i+1:n
    Σk LikUkj = Aij gives row i of U
  end
end
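Mirroring the Doolittle sketch, the Crout loops can be written as (a minimal sketch, no pivoting; NumPy assumed):

```python
import numpy as np

def crout(A):
    """Crout LU factorization without pivoting: U has 1's on its diagonal."""
    n = A.shape[0]
    L = np.zeros((n, n))
    U = np.eye(n)
    for i in range(n):
        # Column i of L: solve sum_k L[j,k] U[k,i] = A[j,i], using U[i,i] = 1
        for j in range(i, n):
            L[j, i] = A[j, i] - L[j, :i] @ U[:i, i]
        # Row i of U: solve sum_k L[i,k] U[k,j] = A[i,j]
        for j in range(i + 1, n):
            U[i, j] = (A[i, j] - L[i, :i] @ U[:i, j]) / L[i, i]
    return L, U

A = np.array([[ 2., -1., -2.],
              [-4.,  6.,  3.],
              [-4., -2.,  8.]])
L, U = crout(A)
```

The only change from Doolittle is which factor carries the unit diagonal, hence which triangular solve includes the division.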

Transformations – Crout example

\begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & -1 & -2 \\ -4 & 6 & 3 \\ -4 & -2 & 8 \end{pmatrix}

Column 1 of L (i=1, j=i:n: Σk LjkUk1 = Aj1):

(l_{11} \;\; 0 \;\; 0) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = 2 \;\Rightarrow\; l_{11} = 2

Similarly, l_{21} = -4 and l_{31} = -4.

Transformations – Solution
 Once the L and U matrices have been found, we can easily solve our system:

AX = B with A = LU  ⇒  solve LY = B, then UX = Y

\begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \qquad \begin{pmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}

Forward Substitution:

y_1 = \frac{b_1}{l_{11}}, \qquad y_i = \frac{b_i - \sum_{j=1}^{i-1} l_{ij} y_j}{l_{ii}} \quad \text{for } i = 2, \ldots, n

Backward Substitution:

x_n = \frac{y_n}{u_{nn}}, \qquad x_i = \frac{y_i - \sum_{j=i+1}^{n} u_{ij} x_j}{u_{ii}} \quad \text{for } i = (n-1), \ldots, 1
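The two substitution formulas translate directly into code (a sketch assuming NumPy arrays; note that forward substitution divides by l_ii and backward substitution by u_ii):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower-triangular L."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_substitution(U, y):
    """Solve U x = y for upper-triangular U."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x
```

Given A = LU, each new right-hand side B costs only one forward and one backward substitution, which is why the factorization pays off when solving for many values of B.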

References - Intro & Transformations


 Module for Forward and Backward Substitution.
http://math.fullerton.edu/mathews/n2003/BackSubstitutionMod.hml
 Forward and Backward Substitution.
www.ac3.edu/papers/Khoury-thesis/node13.html
 LU Decomposition.
http://en.wikipedia.org/wiki/LU_decomposition
 Crout Matrix Decomposition.
http://en.wikipedia.org/wiki/Crout_matrix_decomposition
 Doolittle Decomposition of a Matrix.
www.engr.colostate.edu/~thompson/hPage/CourseMat/Tutorials/C
ompMethods/doolittle.pdf

Definition and Storage of Sparse Matrix

 sparse … many elements are zero
   (for example: diag(d1,…,dn) for n >> 0)

 dense … few elements are zero

 To reduce the memory burden, we introduce some kinds of matrix storage format

Definition and Storage of
Sparse Matrix
 Regular (triplet) storage structure:

        row    1 1 2 2 4 4
 list = column 3 5 3 4 2 3
        value  3 4 5 7 2 6

Definition and Storage of
Sparse Matrix

– A standard sparse matrix format: compressed column storage

    1 -3  0 -1  0
    0  0 -2  0  3
A = 2  0  0  0  0
    0  4  0 -4  0
    5  0 -5  0  6

Subscripts: 1 2 3  4 5  6  7  8  9 10 11
Colptr:     1 4 6  8 10 12
Rowind:     1 3 5  1 4  2  5  1  4  2  5
Value:      1 2 5 -3 4 -2 -5 -1 -4  3  6
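The Colptr/Rowind/Value triple above is compressed-column storage; a small Python sketch builds it from the 5×5 example (0-based indices here, whereas the slide is 1-based):

```python
import numpy as np

dense = np.array([[1., -3.,  0., -1., 0.],
                  [0.,  0., -2.,  0., 3.],
                  [2.,  0.,  0.,  0., 0.],
                  [0.,  4.,  0., -4., 0.],
                  [5.,  0., -5.,  0., 6.]])

colptr, rowind, value = [0], [], []
for j in range(dense.shape[1]):
    for i in range(dense.shape[0]):
        if dense[i, j] != 0:
            rowind.append(i)          # row index of this nonzero
            value.append(dense[i, j])
    colptr.append(len(rowind))        # column j lives in value[colptr[j]:colptr[j+1]]

def csc_get(i, j):
    """Read back entry (i, j) from the compressed-column arrays."""
    for k in range(colptr[j], colptr[j + 1]):
        if rowind[k] == i:
            return value[k]
    return 0.0
```

Only the nonzeros and one pointer per column are stored, so memory is O(nnz + n) instead of O(n²).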

Structure Decomposition of
Sparse Matrix (square)

 A ↔ G(A) ↔ adj(G), where the adjacency matrix satisfies adj(G)_{ij} ∈ {0, 1}

 For a permutation matrix P, G(A) = G(P^T A P), so

opt{ P^T A P : G(P^T A P) = G(A) }  ⇔  opt{ P^T adj(G) P : G(P^T adj(G) P) = G(A) }

Structure Decomposition of
Sparse Matrix (square)

 Case 1: there is no loop in G(A)

Step 1. No loop; set t = 1.
Step 2. Find the output vertices
S_t = { i_1^t, i_2^t, …, i_{n_t}^t } ⊂ V(G)
Step 3. Remove the corresponding rows and columns from the adjacency matrix:
adj(G) = adj(G) \ adj(G)( i_1^t … i_{n_t}^t ; i_1^t … i_{n_t}^t )

Structure Decomposition of
Sparse Matrix (square)

Set t = t + 1 and return to Step 2.

 After p passes, {1, 2, …, n} = S_1 ∪ S_2 ∪ … ∪ S_p (a separation of the set). There exists P = (P_{S_1}^T, P_{S_2}^T, …, P_{S_p}^T)^T such that P^T adj(G) P is a lower triangular block matrix. Therefore P^T A P is a lower triangular block matrix.
 Case 2: there is a loop in G(A).
If the graph is not strongly connected, there are zero entries in the reachability matrix R of adj(A).
Step 1. Choose the j-th column, set t = 1, and let
S_t(j) = { i : (R ∩ R^T)_{ij} = 1, 1 ≤ i ≤ n } = { i_1^t, i_2^t, …, i_{n_t}^t }
(j ∈ S_t ⊂ {1, 2, …, n}; S_t is closed under strong connection.)
Step 2. Choose the j_1-th column (j_1 ≠ j), set j = j_1, t = t + 1, and return to Step 1.
 After p passes, {1, 2, …, n} = S_1 ∪ S_2 ∪ … ∪ S_p

Structure Decomposition of
Sparse Matrix (square)

 There exists P = (P_{S_1}^T, P_{S_2}^T, …, P_{S_p}^T)^T such that P^T adj(G) P is a lower triangular block matrix. Therefore P^T A P is a lower triangular block matrix.

 Note on case 2 above:

P^T adj(G) P = \begin{pmatrix} \hat A_{11} & \hat A_{12} & \cdots & \hat A_{1p} \\ \hat A_{21} & \hat A_{22} & \cdots & \hat A_{2p} \\ \vdots & & \ddots & \vdots \\ \hat A_{p1} & \cdots & \cdots & \hat A_{pp} \end{pmatrix}

Structure Decomposition of
Sparse Matrix (square)

Define B = (b_{ij}) ∈ R^{p×p} by

b_{ij} = \begin{cases} 0 & i = j \\ 1 & \hat A_{ij} \neq 0 \\ 0 & \hat A_{ij} = 0 \end{cases}

Structure Decomposition of
Sparse Matrix (square)

There is no loop in G(B). By an analysis similar to case 1, there is a permutation (j_1 j_2 … j_p) such that

P(j_1 j_2 \ldots j_p)^T \, B \, P(j_1 j_2 \ldots j_p) = \begin{pmatrix} \times & & & \\ \cdot & \times & & \\ \times & \cdot & \times & \\ \times & \times & \cdot & \times \end{pmatrix} \in R^{p \times p}

Therefore, P^T A P is a lower triangular block matrix.

Structure Decomposition of
Sparse Matrix (square)

 Reduces the computational burden of solving large-scale linear systems
 Generally speaking, for a large-scale system, stability can be judged easily in the decomposed form
 The method is used in hierarchical optimization of large-scale dynamic systems

Structure Decomposition of
Sparse Matrix (General)

Dulmage-Mendelsohn Decomposition
 ∃ P, Q such that

P^T A Q = \begin{pmatrix} A_h & \times & \times \\ & A_s & \times \\ & & A_v \end{pmatrix}

Structure Decomposition of
Sparse Matrix (General)

 Further, fine decomposition is needed:
A_h → the block diagonal form
A_v → the block diagonal form
A_s → the block upper triangular form
 The D-M decomposition is described in reference [3].
 Computation and storage: minimize fill-in; the details can be seen in reference [2].

LU Factorization Method:
Gilbert/Peierls

 Left-looking: the k-th stage computes the k-th column of L and U

1. L = I
2. U = I
3. for k = 1:n
4.   s = L \ A(:,k)
5.   (partial pivoting on s)
6.   U(1:k,k) = s(1:k)
7.   L(k:n,k) = s(k:n) / U(k,k)
8. end
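A dense Python sketch of the left-looking loop (pivoting omitted for clarity; the sparse Gilbert/Peierls implementation restricts the triangular solve in step 4 to the reachable nonzero pattern, which is what gives the O(flops) bound):

```python
import numpy as np

def left_looking_lu(A):
    """Left-looking LU: stage k computes column k of U and column k of L."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.eye(n)
    for k in range(n):
        s = np.linalg.solve(L, A[:, k])    # s = L \ A(:,k)
        U[:k+1, k] = s[:k+1]               # U(1:k,k) = s(1:k)
        L[k+1:, k] = s[k+1:] / U[k, k]     # L(k+1:n,k) = s(k+1:n) / U(k,k)
    return L, U

A = np.array([[ 2., -1., -2.],
              [-4.,  6.,  3.],
              [-4., -2.,  8.]])
L, U = left_looking_lu(A)
```

Each stage only reads columns of L computed in earlier stages, hence "left-looking".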

LU Factorization Method:
Gilbert/Peierls

 THEOREM (Gilbert/Peierls). The entire algorithm for LU factorization of A with partial pivoting can be implemented to run in O(flops(LU) + m) time on a RAM, where m is the number of nonzero entries of A.
 Note: The theorem says that the LU factorization will run in time within a constant factor of the best possible, but it does not say what the constant is.

Sparse lower triangular solve, x=L\b

x = b;
for j = 1:n
  if (x(j) ≠ 0)
    x(j+1:n) = x(j+1:n) - L(j+1:n,j) * x(j);
  end
end
---Total time: O(n + flops)
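The same loop in Python (a dense-array sketch; L is assumed unit lower triangular, as produced by Doolittle-style factorizations, and the `x[j] != 0` test is what lets the solve skip work for zero entries):

```python
import numpy as np

def unit_lower_solve(L, b):
    """Solve L x = b for unit lower-triangular L, skipping columns where x[j] == 0."""
    x = b.astype(float).copy()
    n = len(x)
    for j in range(n):
        if x[j] != 0:
            x[j+1:] -= L[j+1:, j] * x[j]   # eliminate x(j) from later equations
    return x
```

With a sparse b, most iterations hit the `x[j] == 0` case and do no arithmetic at all.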

Sparse lower triangular solve, x=L\b

 x_j ≠ 0 ∧ l_{ij} ≠ 0 ⇒ x_i ≠ 0
 b_i ≠ 0 ⇒ x_i ≠ 0
 Let G(L) have an edge j → i for each l_{ij} ≠ 0
 Let β = { i : b_i ≠ 0 } and χ = { i : x_i ≠ 0 }
 Then χ = Reach_{G(L)}(β)

---Total time: O(flops)
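The reach set χ = Reach_{G(L)}(β) can be computed by a depth-first search before any arithmetic is done (a sketch; the nonzero pattern of L is assumed given as an adjacency dict mapping j to the rows i with l_ij ≠ 0):

```python
def reach(L_adj, beta):
    """chi = Reach_{G(L)}(beta): all vertices reachable from the nonzeros of b."""
    chi = set()
    stack = list(beta)
    while stack:
        j = stack.pop()
        if j not in chi:
            chi.add(j)
            stack.extend(L_adj.get(j, []))   # follow edges j -> i (l_ij != 0)
    return chi
```

The solve then touches only the columns in χ (in topological order), which is how the total time is bounded by the arithmetic actually performed.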


References

 J. R. Gilbert and T. Peierls, Sparse partial pivoting in


time proportional to arithmetic operations, SIAM J.
Sci. Statist. Comput., 9 (1988), pp. 862-874
 Mihalis Yannakakis’ website:
http://www1.cs.columbia.edu/~mihalis
 A. Pothen and C. Fan, Computing the block triangular
form of a sparse matrix, ACM Trans. On Math. Soft.
Vol. 18, No.4, Dec. 1990, pp. 303-324

Simplex Method
– Problem size

min { c^T x : Ax = b, x ≥ 0, A ∈ R^{m×n} }

 Problem size is determined by A
 On average, 5~10 nonzeros per column
 (Example: sparsity pattern of greenbea.mps from NETLIB)

Simplex Method
– Computational Form, Basis

min { c^T x : Ax = b, x ≥ 0, A ∈ R^{m×n} }

Note: A is of full row rank and m < n

Basis (of R^m): m linearly independent columns of A, e.g. with basic variables x_1, x_2:

\begin{matrix} x_1 & x_2 & x_3 & x_4 \end{matrix}
\begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 1 & 2 & 1 \end{pmatrix}

Simplex Method
– Notations

β : index set of basic variables
γ : index set of non-basic variables
B := A_β : basis
R := A_γ : non-basis (columns)

A = [ B | R ], \qquad x = \begin{pmatrix} x_\beta \\ x_\gamma \end{pmatrix}, \qquad c = \begin{pmatrix} c_\beta \\ c_\gamma \end{pmatrix}

Simplex Method
– Basic Feasible Solution

Ax = b  ⇔  B x_β + R x_γ = b  ⇒  x_β = B^{-1} ( b − R x_γ )

Basic feasible solution:  x_γ = 0,  x ≥ 0,  Ax = b

If a problem has an optimal solution, then there is a basic solution which is also optimal.

Simplex Method
– Checking Optimality

Objective value:  c^T x = c_β^T x_β + c_γ^T x_γ
                        = c_β^T B^{-1} b + ( c_γ^T − c_β^T B^{-1} R ) x_γ
                          (constant c_0)   (reduced costs d^T)
                        = c_0 + d^T x_γ

Optimality condition:  d_j ≥ 0 for all j ∈ γ

Simplex Method
– Improving Basic Feasible Solution

 Choose x_q (incoming variable) s.t. d_q < 0  (objective value: c_0 + d^T x_γ)
 Increase x_q as much as possible while the basic variables remain feasible
 If the basic variables remain feasible even as x_q → ∞, the objective value is unbounded (unbounded solution)
 Otherwise, some basic variable x_p decreases to 0 first: x_p is the outgoing variable, and the result is a neighboring, improving basis

Simplex Method
– Basis Updating

Neighboring bases:  B = [ b_1, …, b_p, …, b_m ],  B̄ = [ b_1, …, a, …, b_m ]

Simplex Method
– Basis Updating

Neighboring bases:  B = [ b_1, …, b_p, …, b_m ],  B̄ = [ b_1, …, a, …, b_m ]

Write a = Σ_{i=1}^{m} v^i b_i = Bv  (v = B^{-1} a), i.e. a as a linear combination of the basis vectors.

Solving for b_p (v^p ≠ 0: the pivot element):

b_p = \frac{1}{v^p} a − \sum_{i \neq p} \frac{v^i}{v^p} b_i

b_p = B̄η,  where  η = \left( −\frac{v^1}{v^p}, \cdots, −\frac{v^{p−1}}{v^p}, \frac{1}{v^p}, −\frac{v^{p+1}}{v^p}, \cdots, −\frac{v^m}{v^p} \right)^T

B = B̄E,  where  E = [ e_1, …, e_{p−1}, η, e_{p+1}, …, e_m ]  (elementary transformation matrix)

B̄^{-1} = E B^{-1}

Simplex Method
– Basis Updating

E = [ e_1, …, e_{p−1}, η, e_{p+1}, …, e_m ]   (ETM: Elementary Transformation Matrix)

E = \begin{pmatrix} 1 & & \eta^1 & & \\ & \ddots & \vdots & & \\ & & \eta^p & & \\ & & \vdots & \ddots & \\ & & \eta^m & & 1 \end{pmatrix}

i.e. the identity matrix with its p-th column replaced by η; E can be factored further into eta matrices that each carry part of the column η.

Simplex Method
– Basis Updating
The basis inverse tends to get denser after each update ( B̄^{-1} = E B^{-1} ):

E w = \begin{pmatrix} 1 & & \eta^1 & & \\ & \ddots & \vdots & & \\ & & \eta^p & & \\ & & \vdots & \ddots & \\ & & \eta^m & & 1 \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_p \\ \vdots \\ w_m \end{pmatrix} = \begin{pmatrix} w_1 + w_p \eta^1 \\ \vdots \\ w_p \eta^p \\ \vdots \\ w_m + w_p \eta^m \end{pmatrix}

E w = w  if  w_p = 0
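Applying an eta matrix costs nothing when w_p = 0, which is exactly the sparsity the product form exploits; a minimal sketch on plain lists (the helper name `apply_eta` is illustrative, not from the slides):

```python
def apply_eta(eta, p, w):
    """Compute E w, where E is the identity with column p replaced by eta.
    (E w)_i = w_i + w_p * eta_i for i != p, and (E w)_p = w_p * eta_p."""
    if w[p] == 0:
        return list(w)                 # E w = w: no work, no fill-in
    wp = w[p]
    out = [wi + wp * ei for wi, ei in zip(w, eta)]
    out[p] = wp * eta[p]               # diagonal position gets w_p * eta_p only
    return out
```

Only the position and entries of η need storing, so each update costs O(m) at worst and O(1) when w_p = 0.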


Simplex Method
– Algorithm

Steps                                  Major ops
1. Find an initial feasible basis B
2. Initialization                      B^{-1} b
3. Check optimality                    c_β^T B^{-1}
4. Choose incoming variable x_q        B^{-1} a_q
5. Choose outgoing variable x_p        (pivot step)
6. Update basis                        E B^{-1}

Simplex Method
– Algorithm

Choice of pivot (numerical considerations):

 fewer resulting fill-ins
 a large pivot element

These goals sometimes conflict; in practice, a compromise is made.

Simplex Method
– Typical Operations in Simplex Method
Typical operations:  B^{-1} w,  w^T B^{-1}

Challenge: the sparsity of B^{-1} can be destroyed by basis updates

Need a proper way to represent B^{-1}. Two ways:
 Product form of the inverse ( B^{-1} = E_k E_{k−1} ⋯ E_1 ) (obsolete)
 LU factorization

Simplex Method
– LU Factorization
 Reduce complexity using LU updates
( B = B̄E, B̄^{-1} = E B^{-1} )

Side effect: more and more LU factors accumulate

 Refactorization
(reinstates efficiency and numerical accuracy)
Sparse LU Updates in Simplex Method

Hamid R. Ghaffari

April 10, 2007


Outline

LU Update Methods
Preliminaries
Bartels-Golub LU Update
Sparse Bartels-Golub Method
Reid’s Method
The Forrest-Tomlin Method
Suhl-Suhl Method
More Details on the Topic
Revised Simplex Algorithm

Simplex Method                              Revised Simplex Method

Determine the current basis, d              d = B^{-1} b
Choose x_q to enter the basis based on      c̄ = c_N^T − c_B^T B^{-1} N,
the greatest cost contribution              { q | c̄_q = min_t (c̄_t) }
If x_q cannot decrease the cost,            c̄_q ≥ 0, d is the optimal solution
d is the optimal solution
Determine x_p that leaves the basis         w = B^{-1} A_q,
(becomes zero) as x_q increases             { p | d_p/w_p = min_t { d_t/w_t : w_t > 0 } }
If x_q can increase without causing         If w_i ≤ 0 for all i, the solution is
another variable to leave the basis,        unbounded.
the solution is unbounded
Update dictionary.                          Update B^{-1}

Note: In general we do not compute the inverse.
Problems with Revised Simplex Algorithm

I The physical limitations of a computer can become a factor.

I Round-off error and significant-digit loss are common problems in matrix manipulations (ill-conditioned matrices).

I Numerical stability also becomes an issue.

I It takes m²(m − 1) multiplications and m(m − 1) additions, a total of m³ − m floating-point (real-number) calculations.

Many variants of the Revised Simplex Method have been designed to reduce this O(m³)-time algorithm as well as improve its accuracy.
Introducing Spike

I If A_q is the entering column, B the original basis and B̄ the new basis, then we have

B̄ = B + (A_q − B e_p) e_p^T,

I Given the LU decomposition B = LU, we have

L^{-1} B̄ = U + (L^{-1} A_q − U e_p) e_p^T,

i.e. U with its p-th column replaced by the "spike" L^{-1} A_q.

I How to deal with this?

The various implementations and variations of the Bartels-Golub method generally diverge at the next step: reduction of the spiked upper-triangular matrix back to an upper-triangular matrix. (Chvátal, p. 150)
Bartels-Golub Method
Illustration

The first variant of the Revised Simplex Method was the Bartels-Golub
Method.
Bartels-Golub Method
Algorithm

Revised Simplex Method                      Bartels-Golub

d = B^{-1} b                                d = U^{-1} L^{-1} b
c̄ = c_N^T − c_B^T B^{-1} N,                c̄ = c_N^T − c_B^T U^{-1} L^{-1} N,
{ q | c̄_q = min_t (c̄_t) }                 { q | c̄_q = min_t (c̄_t) }
c̄_q ≥ 0, d is optimal solution             c̄_q ≥ 0, d is optimal solution
w = B^{-1} A_q,                             w = U^{-1} L^{-1} A_q,
{ p | d_p/w_p = min_t { d_t/w_t : w_t > 0 } }  { p | d_p/w_p = min_t { d_t/w_t : w_t > 0 } }
If w_i ≤ 0 for all i, the solution is       If w_i ≤ 0 for all i, the solution is
unbounded.                                  unbounded.
Update B^{-1}                               Update U^{-1} and L^{-1}
Bartels-Golub Method
Characteristics

I It significantly improved numerical accuracy.

I Can we do better? In the sparse case, yes.

Sparse Bartels-Golub Method
eta matrices

First take a look at the following facts:

Column-eta factorization of triangular matrices:

\begin{pmatrix} 1 & & & \\ l_{21} & 1 & & \\ l_{31} & l_{32} & 1 & \\ l_{41} & l_{42} & l_{43} & 1 \end{pmatrix} = \begin{pmatrix} 1 & & & \\ l_{21} & 1 & & \\ l_{31} & & 1 & \\ l_{41} & & & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & & & \\ & 1 & & \\ & l_{32} & 1 & \\ & l_{42} & & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & l_{43} & 1 \end{pmatrix}

Single-entry eta decomposition (each column-eta factors further):

\begin{pmatrix} 1 & & & \\ l_{21} & 1 & & \\ l_{31} & & 1 & \\ l_{41} & & & 1 \end{pmatrix} = \begin{pmatrix} 1 & & & \\ l_{21} & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & & & \\ & 1 & & \\ l_{31} & & 1 & \\ & & & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ l_{41} & & & 1 \end{pmatrix}

So L can be expressed as the product of single-entry eta matrices, and hence L^{-1} is the product of the same matrices, in reverse order, with the off-diagonal entries negated.
Sparse Bartels-Golub Method
Algorithm

Bartels-Golub Method                        Sparse Bartels-Golub Method

d = U^{-1} L^{-1} b                         d = U^{-1} Π_t η_t b
c̄ = c_N^T − c_B^T U^{-1} L^{-1} N,         c̄ = c_N^T − c_B^T U^{-1} Π_t η_t N,
{ q | c̄_q = min_t (c̄_t) }                 { q | c̄_q = min_t (c̄_t) }
c̄_q ≥ 0, d is optimal solution             c̄_q ≥ 0, d is optimal solution
w = U^{-1} L^{-1} A_q,                      w = U^{-1} Π_t η_t A_q,
{ p | d_p/w_p = min_t { d_t/w_t : w_t > 0 } }  { p | d_p/w_p = min_t { d_t/w_t : w_t > 0 } }
If w_i ≤ 0 for all i, the solution is       If w_i ≤ 0 for all i, the solution is
unbounded.                                  unbounded.
Update U^{-1} and L^{-1}                    Update U^{-1} and create any
                                            necessary eta matrices.
                                            If there are too many eta matrices,
                                            completely refactor the basis.
Sparse Bartels-Golub Method Advantages

I It is no more complex than the Bartels-Golub Method.

I Instead of just L and U, the factors become the lower-triangular eta matrices and U.

I The eta matrices are reduced to single-entry eta matrices.

I Instead of having to store an entire matrix, it is only necessary to store the location and value of the off-diagonal element for each eta matrix.

I Refactorizations occur less often than once every m iterations, so the complexity improves significantly to O(m²).
Sparse Bartels-Golub Method Disadvantages

I Eventually, the number of eta matrices will become so large that it


becomes cheaper to decompose the basis.

I Such a refactorization may occur prematurely in an attempt to


promote stability if noticeable round-off errors begin to occur.

I In practice, in solving large sparse problem, the basis is refactorized


quite frequently, often after every twenty iterations of so. (Chvátal,
p. 111)

I If the spike always occurs in the first column and extends to the
bottom row, the Sparse Bartels-Golub Method becomes worse than
the Bartels-Golub Method.
I The upper-triangular matrix will always be fully-decomposed
resulting in huge amounts of fill-in;
I Large numbers of eta matrices;
Sparse Bartels-Golub Method Disadvantages

I Eventually, the number of eta matrices becomes so large that it is
cheaper to refactorize the basis.
I Such a refactorization may occur prematurely, in an attempt to
promote stability, if noticeable round-off errors begin to occur.
I In practice, in solving large sparse problems, the basis is refactorized
quite frequently, often after every twenty iterations or so (Chvátal,
p. 111).
I If the spike always occurs in the first column and extends to the
bottom row, the Sparse Bartels-Golub Method becomes worse than
the Bartels-Golub Method:
– The upper-triangular matrix is always fully decomposed,
resulting in huge amounts of fill-in;
– Large numbers of eta matrices;
– An O(n³)-cost decomposition.
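The refactorization policy described above reduces to simple bookkeeping: keep appending eta factors until their number crosses a threshold, then rebuild the factorization from scratch. A hedged sketch; the `EtaFile` class and its methods are hypothetical, and a real implementation would recompute L and U from the basis inside `refactorize` (and might also trigger early on observed round-off):

```python
class EtaFile:
    """Accumulates single-entry eta factors and signals when a full
    refactorization of the basis would be cheaper or safer."""

    def __init__(self, max_etas=20):
        # Chvátal reports refactorizing roughly every twenty iterations.
        self.max_etas = max_etas
        self.etas = []              # list of (i, j, alpha) triples
        self.refactorizations = 0

    def add(self, i, j, alpha):
        self.etas.append((i, j, alpha))
        if len(self.etas) >= self.max_etas:
            self.refactorize()

    def refactorize(self):
        # A real code would recompute L and U from the current basis
        # here; this sketch only models the bookkeeping.
        self.etas.clear()
        self.refactorizations += 1
```

Over 100 simplex iterations with the default threshold, this policy triggers five refactorizations, matching the "once every twenty iterations or so" rule of thumb.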
Reid’s Suggestion on the Sparse Bartels-Golub Method

Rather than completely refactorizing the basis, apply LU-decomposition
only to the part of U that has remained upper-Hessenberg.
Reid’s Method

Task: find a way to reduce the bump before attempting to decompose it.

Row singleton: any row of the bump that has only one non-zero entry.

Column singleton: any column of the bump that has only one non-zero entry.

Method:
I When a column singleton is found in a bump, it is moved to the top-left
corner of the bump.
I When a row singleton is found in a bump, it is moved to the bottom-right
corner of the bump.
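The singleton rotations above can be sketched as a purely symbolic reduction on the bump's non-zero pattern. Everything here is a hypothetical illustration: it works on a boolean pattern only, ignores numerical values, and (as noted below) makes no allowance for stability:

```python
import numpy as np

def reid_reduce(pattern):
    """Peel column singletons off to the top-left and row singletons
    off to the bottom-right of a bump's non-zero pattern. Returns the
    front pivots, the (rows, cols) of the middle part that still needs
    LU-decomposition, and the back pivots."""
    pattern = np.asarray(pattern, dtype=bool)
    rows = list(range(pattern.shape[0]))
    cols = list(range(pattern.shape[1]))
    front, back = [], []              # pivot pairs peeled off each end
    changed = True
    while changed:
        changed = False
        sub = pattern[np.ix_(rows, cols)]
        # Column singleton: a column of the bump with one non-zero entry.
        for k, c in enumerate(cols):
            if sub[:, k].sum() == 1:
                r = rows[int(np.flatnonzero(sub[:, k])[0])]
                front.append((r, c))          # rotate to top-left
                rows.remove(r); cols.remove(c)
                changed = True
                break
        if changed:
            continue
        # Row singleton: a row of the bump with one non-zero entry.
        for k, r in enumerate(rows):
            if sub[k, :].sum() == 1:
                c = cols[int(np.flatnonzero(sub[k, :])[0])]
                back.append((r, c))           # rotate to bottom-right
                rows.remove(r); cols.remove(c)
                changed = True
                break
    return front, (rows, cols), back
```

On many sparse bumps the middle part shrinks to nothing, which is why Reid's Method so rarely needs a full decomposition of the bump.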
Reid’s Method
Column Rotation
Reid’s Method
Row Rotation
Reid’s Method
Characteristics

Advantages:
I It significantly reduces the growth of the number of eta matrices in
the Sparse Bartels-Golub Method, so the basis should not need to be
decomposed nearly as often.
I The use of LU-decomposition on any remaining bump still allows
some attempt to maintain stability.

Disadvantages:
I The rotations make no allowance for stability whatsoever, so Reid’s
Method remains numerically less stable than the Sparse
Bartels-Golub Method.
The Forrest-Tomlin Method
The Forrest-Tomlin Method
Bartels-Golub Method:
  d = U^-1 L^-1 b
  c̄ = c_N' − c_B' U^-1 L^-1 N,  choose q such that c̄_q = min_t c̄_t
  If c̄_q ≥ 0, d is the optimal solution.
  w = U^-1 L^-1 A_q,  choose p such that d_p / w_p = min{ d_t / w_t : w_t > 0 }
  If w_i ≤ 0 for all i, the solution is unbounded.
  Update U^-1 and L^-1.

Forrest-Tomlin Method:
  d = U^-1 (∏_t R_t) L^-1 b
  c̄ = c_N' − c_B' U^-1 (∏_t R_t) L^-1 N,  choose q such that c̄_q = min_t c̄_t
  If c̄_q ≥ 0, d is the optimal solution.
  w = U^-1 (∏_t R_t) L^-1 A_q,  choose p such that d_p / w_p = min{ d_t / w_t : w_t > 0 }
  If w_i ≤ 0 for all i, the solution is unbounded.
  Update U^-1, creating a row factor as necessary. If there are too many
  factors, completely refactorize the basis.
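The only change in the Forrest-Tomlin column is the product of row-eta factors ∏_t R_t inserted between L^-1 and U^-1. A row-eta matrix is the identity with one row replaced, so applying one costs a single dot product. A minimal sketch, with dense NumPy inverses standing in for the real sparse factors and hypothetical function names:

```python
import numpy as np

def apply_row_eta(v, p, row):
    """Return R v, where R is the identity with row p replaced by
    `row`; only component p of the vector changes."""
    out = v.copy()
    out[p] = row @ v
    return out

def ftran(Linv, Uinv, row_etas, b):
    """Compute d = U^-1 (prod_t R_t) L^-1 b, the Forrest-Tomlin form
    of the forward transformation; `row_etas` is the accumulated list
    of (p, row) factors, applied in order."""
    v = Linv @ b
    for p, row in row_etas:
        v = apply_row_eta(v, p, row)
    return Uinv @ v
```

With an empty list of row factors this reduces to the Bartels-Golub expression d = U^-1 L^-1 b, which is exactly the correspondence the comparison above shows.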
The Forrest-Tomlin Method Characteristics

Advantages:
I At most one row-eta matrix factor occurs per iteration, where an
unpredictable number occurred before.
I The code can take advantage of this knowledge to predict the
necessary storage space and calculations.
I Fill-in should also be relatively slow, since fill-in can only occur
within the spiked column.

Disadvantages:
I The Sparse Bartels-Golub Method allows LU-decomposition to pivot for
numerical stability, but the Forrest-Tomlin Method makes no such
allowance.
I Therefore, severe calculation errors due to near-singular matrices are
more likely to occur.
Suhl-Suhl Method
This method is a modification of the Forrest-Tomlin Method.
For More Detail

Leena M. Suhl, Uwe H. Suhl.
A fast LU update for linear programming.
Annals of Operations Research 43 (1993) 33-47.

Steven S. Morgan.
A Comparison of Simplex Method Algorithms.
University of Florida, 1997.

Vašek Chvátal.
Linear Programming.
W.H. Freeman & Company, September 1983.
Thanks