Anna C. Gilbert
Department of Mathematics
University of Michigan
Goals: represent signals accurately (mathematics), concisely (statistics), and efficiently (algorithms).
Redundancy
If one orthonormal basis is good, surely two (or more) are better...
Dictionary
Definition
A dictionary $\mathcal{D}$ in $\mathbb{R}^n$ is a collection $\{\varphi_\ell\}_{\ell=1}^{d} \subset \mathbb{R}^n$ of unit-norm vectors: $\|\varphi_\ell\|_2 = 1$ for all $\ell$.
Elements $\varphi_\ell$ are called atoms.
If $\operatorname{span}\{\varphi_\ell\} = \mathbb{R}^n$, the dictionary is complete.
Matrix representation
Form a matrix
$$\Phi = [\,\varphi_1 \;\; \varphi_2 \;\; \cdots \;\; \varphi_d\,]$$
so that
$$\Phi c = \sum_{\ell} c_\ell \varphi_\ell .$$
Example: Fourier-Dirac
$$\Phi = [\,F \mid I\,]$$
$$\varphi_\ell(t) = \frac{1}{\sqrt{n}}\, e^{2\pi i \ell t/n}, \qquad \ell = 1, 2, \ldots, n$$
$$\varphi_\ell(t) = \delta_{\ell - n}(t), \qquad \ell = n+1, n+2, \ldots, 2n$$
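A minimal NumPy sketch (mine, not the slides') that builds $\Phi = [F \mid I]$ and checks that the atoms are unit-norm:

```python
import numpy as np

def fourier_dirac(n):
    """Build the Fourier-Dirac dictionary Phi = [F | I] with 2n atoms."""
    t = np.arange(n)
    # Fourier atoms: (1/sqrt(n)) * exp(2*pi*i*l*t/n)
    F = np.exp(2j * np.pi * np.outer(t, t) / n) / np.sqrt(n)
    # Dirac atoms: spikes (standard basis)
    return np.hstack([F, np.eye(n)])

Phi = fourier_dirac(8)
print(Phi.shape)                                    # (8, 16): n x 2n
print(np.allclose(np.linalg.norm(Phi, axis=0), 1))  # unit-norm atoms -> True
```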
Sparse Problems
Exact. Given a vector $x \in \mathbb{R}^n$ and a complete dictionary $\Phi$, solve
$$\min_c \|c\|_0 \quad \text{s.t.} \quad x = \Phi c .$$
Error. Solve
$$\min_c \|c\|_0 \quad \text{s.t.} \quad \|x - \Phi c\|_2 \le \epsilon .$$
Sparse. Solve
$$\min_c \|x - \Phi c\|_2 \quad \text{s.t.} \quad \|c\|_0 \le k .$$
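These problems are combinatorial. As an illustration (my own sketch, not from the slides), a brute-force solver for Sparse must examine every size-$k$ support, at a cost of $\binom{d}{k}$ least-squares solves:

```python
import numpy as np
from itertools import combinations

def sparse_brute_force(Phi, x, k):
    """Solve min ||x - Phi c||_2 s.t. ||c||_0 <= k by exhaustive search
    over all size-k supports (exponential cost; for intuition only)."""
    d = Phi.shape[1]
    best_err, best_c = np.inf, None
    for S in combinations(range(d), k):
        S = list(S)
        cS, *_ = np.linalg.lstsq(Phi[:, S], x, rcond=None)
        err = np.linalg.norm(x - Phi[:, S] @ cS)
        if err < best_err:
            best_err = err
            best_c = np.zeros(d, dtype=cS.dtype)
            best_c[S] = cS
    return best_c, best_err
```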
NP-hardness
Theorem
Given an arbitrary redundant dictionary $\Phi$ and a signal $x$, it is NP-hard to solve the sparse representation problem Sparse. [Natarajan 95, Davis 97]
Corollary
Error is NP-hard as well.
Corollary
It is NP-hard to determine if the optimal error is zero for a given
sparsity level k.
Proposition
Any instance of X3C (Exact Cover by 3-Sets) is reducible in polynomial time to Sparse.
[Figure: the reduction from X3C. The sets $X_1, X_2, X_3, \ldots, X_N$ become the atoms of a redundant dictionary; the universe to be covered becomes the input signal $x$.]
dictionary Φ            signal x                hardness
arbitrary               arbitrary               NP-hard
fixed                   fixed                   depends on choice of Φ
random (distribution?)                          compressive sensing
                        random (distribution?)  random signal model
When atoms are (nearly) parallel, we can't tell which one is best.
Coherence
Definition
The coherence of a dictionary $\Phi$ is
$$\mu = \max_{j \ne \ell} |\langle \varphi_j, \varphi_\ell \rangle| .$$
Small coherence (good), e.g. $\mu = 1/\sqrt{n}$; large coherence (bad), e.g. $\mu = 1/2$.
[Figure: example dictionaries with small vs. large coherence.]
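A small NumPy check (my own sketch): the coherence is the largest off-diagonal entry of the Gram-matrix magnitudes $|\Phi^* \Phi|$, and for the Fourier-Dirac dictionary it comes out to $1/\sqrt{n}$:

```python
import numpy as np

def coherence(Phi):
    """mu = max over j != l of |<phi_j, phi_l>|."""
    G = np.abs(Phi.conj().T @ Phi)   # Gram matrix magnitudes
    np.fill_diagonal(G, 0)           # drop <phi_l, phi_l> = 1
    return G.max()

n = 8
t = np.arange(n)
F = np.exp(2j * np.pi * np.outer(t, t) / n) / np.sqrt(n)
Phi = np.hstack([F, np.eye(n)])          # Fourier-Dirac dictionary
print(coherence(Phi), 1 / np.sqrt(n))    # both ~ 0.3536
```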
Greedy algorithms
[Mallat93, Davis97]
At each step, select the atom with the largest inner product against the current residual, form the new approximation $\Phi c$ (for OMP, by least squares over the selected atoms), and update the residual.
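A compact NumPy sketch of this loop for OMP (my own illustration, not the slides' pseudocode; assumes $k \ge 1$):

```python
import numpy as np

def omp(Phi, x, k, tol=1e-10):
    """Orthogonal Matching Pursuit (k >= 1): greedy atom selection with
    least-squares re-fitting of the coefficients at every step."""
    r = x.astype(Phi.dtype, copy=True)
    support, c = [], np.zeros(Phi.shape[1], dtype=Phi.dtype)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(Phi.conj().T @ r)))
        if idx not in support:
            support.append(idx)
        # project x onto the span of the selected atoms
        cS, *_ = np.linalg.lstsq(Phi[:, support], x, rcond=None)
        r = x - Phi[:, support] @ cS
        if np.linalg.norm(r) < tol:
            break
    c[support] = cS
    return c, r
```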
Convergence of OMP
Theorem
Suppose $\Phi$ is a complete dictionary for $\mathbb{R}^n$. For any vector $x$, the residual after $t$ steps of OMP satisfies
$$\|x - \Phi c_t\|_2 \le \frac{C}{\sqrt{t}} .$$
[DeVore-Temlyakov]
Exact Recovery Condition (ERC): OMP recovers a representation supported on $\Lambda$ whenever
$$\max_{\ell \notin \Lambda} \|\Phi_\Lambda^+ \varphi_\ell\|_1 < 1 ,$$
where $A^+ = (A^* A)^{-1} A^*$. [Tropp04]
Theorem
The ERC holds whenever $k < \frac{1}{2}\left(1 + \frac{1}{\mu}\right)$. Therefore, OMP can recover any sufficiently sparse signal. [Tropp04]
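As a hedged numerical illustration (my own sketch; `erc_constant` is a name I made up), the ERC quantity $\max_{\ell \notin \Lambda} \|\Phi_\Lambda^+ \varphi_\ell\|_1$ can be checked directly:

```python
import numpy as np

def erc_constant(Phi, Lambda):
    """max_{l not in Lambda} ||pinv(Phi_Lambda) @ phi_l||_1.
    The ERC holds for the support Lambda when this value is < 1."""
    pinv = np.linalg.pinv(Phi[:, Lambda])    # (A* A)^{-1} A*
    outside = [j for j in range(Phi.shape[1]) if j not in Lambda]
    return max(np.abs(pinv @ Phi[:, j]).sum() for j in outside)

rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)           # unit-norm atoms
print(erc_constant(Phi, [0, 1, 2]))          # typically < 1 for small supports
```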
OMP selects a correct ("good") atom at the current residual $r$ exactly when the greedy selection ratio is less than one:
$$\rho(r) \;=\; \frac{\max_{\ell \notin \Lambda} |\langle r, \varphi_\ell \rangle|}{\max_{\ell \in \Lambda} |\langle r, \varphi_\ell \rangle|} \;=\; \frac{\text{max inner product over bad atoms}}{\text{max inner product over good atoms}} \;<\; 1 .$$
Sparse
Theorem
Assume $k \le \frac{1}{3\mu}$. For any vector $x$, the approximation $\hat{x}$ after $k$ steps of OMP satisfies
$$\|x - \hat{x}\|_2 \le \sqrt{1 + 6k}\; \|x - x_k\|_2 ,$$
where $x_k$ is the best $k$-term approximation to $x$. [Tropp04]
Theorem
Assume $4\mu k \le 1$. Then
$$\|x - \hat{x}\|_2^2 \;\le\; 3\,\|x - x_k\|_2^2 \;+\; \frac{2k\mu^2}{(1 - 2k\mu)^2}\,\|x - x_k\|_2^2 .$$
Convex relaxation: replace the $\ell_0$ quasi-norm with the $\ell_1$ norm:
$$\min_c \|c\|_1 \quad \text{s.t.} \quad x = \Phi c$$
$$\min_c \|c\|_1 \quad \text{s.t.} \quad \|x - \Phi c\|_2 \le \epsilon .$$
Constrained minimization:
$$\arg\min_c \|c\|_1 \quad \text{s.t.} \quad \|x - \Phi c\|_2 \le \epsilon .$$
Unconstrained minimization:
$$\text{minimize } L(c; \lambda, x) = \frac{1}{2}\|x - \Phi c\|_2^2 + \lambda \|c\|_1 .$$
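One standard solver for the unconstrained problem is iterative soft thresholding (ISTA); this is my own sketch of that method, not an algorithm from the slides:

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(Phi, x, lam, n_iter=500):
    """Minimize L(c) = 0.5 ||x - Phi c||_2^2 + lam ||c||_1 by
    proximal gradient descent (iterative soft thresholding)."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # 1/L with L = ||Phi||_2^2
    c = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ c - x)           # gradient of the smooth term
        c = soft_threshold(c - step * grad, step * lam)
    return c
```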
Connection between sparse approximation and regression [Girosi 96]: in $x = \Phi c$, the signal $x$ plays the role of the vector of responses, and the atoms, as columns $X_j = (X_{1j}, \ldots, X_{Nj})^T$ of a design matrix, play the role of the predictor variables.
Connection between sparse approximation and compressive sensing: interchange the roles of the two sides, so that the measurements/coefficients equal $\Phi$ applied to the data/image.
Assume $x$ has low complexity: $x$ is $k$-sparse (with noise). We want $m$ as small as possible. Construct
a matrix $\Phi : \mathbb{R}^n \to \mathbb{R}^m$ and
a decoding algorithm $D$
such that, given $\Phi x$ for any signal $x \in \mathbb{R}^n$, we can, with high probability, quickly recover $\hat{x}$ with
$$\|x - \hat{x}\|_p \;\le\; (1 + \epsilon) \min_{y \; k\text{-sparse}} \|x - y\|_q \;=\; (1 + \epsilon)\,\|x - x_k\|_q .$$
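A self-contained toy experiment in this setup (assumptions mine: Gaussian $\Phi$, noiseless $k$-sparse $x$, OMP as the decoder $D$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 256, 64, 5

# k-sparse signal x in R^n
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

# random Gaussian measurement matrix Phi : R^n -> R^m, m << n
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x

# decode with OMP (same greedy loop as in the earlier sketch)
r, support = y.copy(), []
for _ in range(k):
    support.append(int(np.argmax(np.abs(Phi.T @ r))))
    cS, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    r = y - Phi[:, support] @ cS

x_hat = np.zeros(n)
x_hat[support] = cS
print(np.linalg.norm(x - x_hat))   # ~ 0: exact recovery with high probability
```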
Analogy: root-finding. To approximate a root $p$ of $f$, we may ask for $\hat{p}$ with $|\hat{p} - p| \le \epsilon$ (error in the root itself) or with $|f(\hat{p}) - 0| \le \epsilon$ (error in the function value); the two guarantees differ, just as the norms $p$ and $q$ in the recovery guarantee above do.
Parameters
1. Number of measurements m
2. Recovery time
3. Approximation guarantee (norms, mixed)
4. One matrix vs. distribution over matrices
5. Explicit construction
6. Universal matrix (for any basis, after measuring)
7. Tolerance to measurement noise
Applications
* Error-correcting codes: the code is $\{y \in \mathbb{R}^n \mid Ay = 0\}$; $x$ is the error vector and $Ax$ is the syndrome. Decoding recovers an approximation to the error vector $x$.
Two approaches
Geometric
Combinatorial [Gilbert-Guha-Indyk-Kotidis-Muthukrishnan-Strauss 02]
Summary