Compressive Sampling

and Frontiers in Signal Processing


Emmanuel Candès

New directions short course, IMA, University of Minnesota, June 2007

Lecture 4: The uniform uncertainty principle and its implications


The uniform uncertainty principle (UUP)
The UUP and general signal recovery from undersampled data
Examples of measurements obeying the UUP
Gaussian measurements
Binary measurements
Random orthonormal projections
Bounded orthogonal systems

So far...

Last time: exact recovery of sparse signals



We need to deal with compressible signals (not exactly sparse)

We need to deal with noise


We deal with the first issue today
We will deal with the second issue next time

Last time: uncertainty relation

Key by-product of signal recovery result:


$\Phi \in \mathbb{R}^{m \times n}$ (sensing matrix)
$T$ arbitrary set of size $S$
Near isometry: for all $x$ supported on $T$
$$\frac{m}{2n}\,\|x\|_{\ell_2}^2 \le \|\Phi x\|_{\ell_2}^2 \le \frac{3m}{2n}\,\|x\|_{\ell_2}^2$$

We interpreted this as an uncertainty relation; e.g.
If $x$ is supported on $T$
Then the energy of $\hat{x}$ on a set of $m$ frequencies is just about proportional to $m$

The uniform uncertainty principle

Definition (Restricted isometry constant $\delta_S$)
For each $S = 1, 2, \ldots$, $\delta_S$ is the smallest quantity such that
$$(1 - \delta_S)\,\|x\|_{\ell_2}^2 \le \|\Phi x\|_{\ell_2}^2 \le (1 + \delta_S)\,\|x\|_{\ell_2}^2$$
for all $S$-sparse $x$

Or equivalently, for all $T$ with $|T| \le S$
$$1 - \delta_S \le \lambda_{\min}(\Phi_T^* \Phi_T) \le \lambda_{\max}(\Phi_T^* \Phi_T) \le 1 + \delta_S$$
where $\Phi_T \in \mathbb{R}^{m \times |T|}$ has the columns with indices in $T$, $|T| \le S$

Sparse subsets of column vectors are approximately orthonormal
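Computing $\delta_S$ exactly would require checking all $\binom{n}{S}$ supports; as a quick illustration (not from the slides, all sizes arbitrary), one can lower-bound $\delta_S$ empirically by sampling random supports $T$ and computing the extreme eigenvalues of $\Phi_T^* \Phi_T$ for a Gaussian $\Phi$, in MATLAB:

m = 64; n = 256; S = 8; ntrials = 500;
Phi = randn(m,n)/sqrt(m);           % Gaussian sensing matrix (see examples below)
delta = 0;
for t = 1:ntrials
    p = randperm(n); T = p(1:S);    % random support of size S
    lam = eig(Phi(:,T)'*Phi(:,T));  % spectrum of the S x S Gram matrix
    delta = max([delta, 1 - min(lam), max(lam) - 1]);
end
fprintf('empirical lower bound on delta_S: %.3f\n', delta);

Sampling supports only yields a lower bound on $\delta_S$; certifying an upper bound is computationally hard in general.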

Why is this an uncertainty principle?

Suppose $\Phi = \sqrt{n/m}\; R_\Omega F$
$F$ is the $n \times n$ Fourier isometry
$\Omega$ is a set of $m$ frequencies

Suppose $\delta_S = 1/2$
Arbitrary support $T$ with $|T| \le S$
Arbitrary signal $x$ supported on $T$:
$$\frac{m}{2n}\,\|x\|_{\ell_2}^2 \le \|1_\Omega\, \hat{x}\|_{\ell_2}^2 \le \frac{3m}{2n}\,\|x\|_{\ell_2}^2$$

Uniform because it holds for all $T$
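A quick numerical check of this relation (not from the slides; the $1/\sqrt{n}$ factor below makes MATLAB's fft the Fourier isometry $F$, and the sizes are arbitrary):

n = 512; m = 128; S = 10;
Omega = randperm(n); Omega = Omega(1:m);  % random set of m frequencies
p = randperm(n);
x = zeros(n,1); x(p(1:S)) = randn(S,1);   % signal on an arbitrary support of size S
xhat = fft(x)/sqrt(n);                    % Fourier isometry: norm(xhat) = norm(x)
e = norm(xhat(Omega))^2/norm(x)^2;        % fraction of the energy captured on Omega
fprintf('energy fraction %.3f, m/n = %.3f\n', e, m/n);
% when delta_S <= 1/2 holds for this Omega, e lies in [m/(2n), 3m/(2n)]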

Foundational result of CS?

$$\min \|s\|_{\ell_1} \quad \text{s.t.} \quad \Phi s = y \ (= \Phi x)$$

$x_S$: best $S$-term approximation of $x$ ($S$ largest entries)

Theorem (C., Tao (2004); the statement below is due to C. (2007))
Assume $\delta_{2S} < \sqrt{2} - 1 \approx .414$. Then
$$\|x^\star - x\|_{\ell_1} \lesssim \|x - x_S\|_{\ell_1}, \qquad \|x^\star - x\|_{\ell_2} \lesssim \frac{\|x - x_S\|_{\ell_1}}{\sqrt{S}}$$

Deterministic: nothing is random here!

Exact if $x$ is $S$-sparse
Otherwise, essentially reconstructs the $S$ largest entries of $x$
Powerful if $S$ is close to $m$

The condition $\delta_{2S} < 1$ is necessary

Exact recovery if $x$ is $S$-sparse
With $\delta_{2S} = 1$, one can have $h \in \mathbb{R}^n$, $2S$-sparse, with $\Phi h = 0$
Decompose $h = x - x'$, each $S$-sparse, with
$$\Phi(x - x') = 0 \quad \Longleftrightarrow \quad \Phi x = \Phi x'$$

Two $S$-sparse signals produce the same measurements $\Rightarrow$ need $\delta_{2S} < 1$

$\ell_1$ recovery condition is only slightly stronger: $\delta_{2S} < \sqrt{2} - 1$

Formal equivalence

Suppose there is an $S$-sparse solution to $y = \Phi x$

Combinatorial optimization problem
$$(P_0) \qquad \min_{s \in \mathbb{R}^n} \|s\|_{\ell_0}, \qquad \Phi s = y$$

Convex optimization problem (Linear Program)
$$(P_1) \qquad \min_{s \in \mathbb{R}^n} \|s\|_{\ell_1}, \qquad \Phi s = y$$

Equivalence
If $\delta_{2S} < 1$, the solution to $(P_0)$ is unique
If $\delta_{2S} < \sqrt{2} - 1$, the solutions to $(P_0)$ and $(P_1)$ are unique and the same!
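Since $(P_1)$ is a linear program, any LP solver applies after the standard split $-u \le s \le u$, minimizing $\mathbf{1}^T u$. A minimal MATLAB sketch (assuming the Optimization Toolbox's linprog; all sizes arbitrary):

m = 80; n = 200; S = 8;
Phi = randn(m,n)/sqrt(m);                 % Gaussian sensing matrix
p = randperm(n);
x = zeros(n,1); x(p(1:S)) = randn(S,1);   % S-sparse signal
y = Phi*x;                                % measurements
f   = [zeros(n,1); ones(n,1)];            % objective: sum of u, variables z = [s; u]
A   = [eye(n) -eye(n); -eye(n) -eye(n)];  % s - u <= 0 and -s - u <= 0
b   = zeros(2*n,1);
Aeq = [Phi zeros(m,n)]; beq = y;          % Phi*s = y
z   = linprog(f, A, b, Aeq, beq);
xhat = z(1:n);                            % at the optimum, u = |s|, so 1'*u = ||s||_1
fprintf('recovery error: %.2e\n', norm(xhat - x));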

Proof of general $\ell_1$ recovery

$$\min \|s\|_{\ell_1} \quad \text{s.t.} \quad y = \Phi s \ (= \Phi x)$$

Proof is elementary but nontrivial

Solution $x^\star = x + h$, $\Phi h = 0$
$x^\star$ solution gives $\|x + h\|_{\ell_1} \le \|x\|_{\ell_1}$
$T_0$: location of the $S$ largest coefficients of $x$

$$\|x + h\|_{\ell_1} = \sum_{i \in T_0} |x_i + h_i| + \sum_{i \in T_0^c} |x_i + h_i| \ge \sum_{i \in T_0} \big(|x_i| - |h_i|\big) + \sum_{i \in T_0^c} \big(|h_i| - |x_i|\big)$$
$$= \|x\|_{\ell_1} + \|h\|_{\ell_1(T_0^c)} - \|h\|_{\ell_1(T_0)} - 2\,\|x\|_{\ell_1(T_0^c)}$$

Implication
$$\|h\|_{\ell_1(T_0^c)} \le \|h\|_{\ell_1(T_0)} + 2\,\|x\|_{\ell_1(T_0^c)} = \|h\|_{\ell_1(T_0)} + 2\,\|x - x_S\|_{\ell_1}$$

Uniform uncertainty principle implies
$$\|h\|_{\ell_1(T_0)} \le \rho\,\|h\|_{\ell_1(T_0^c)}, \qquad \rho < 1 \qquad (\star)$$

This gives
$$\|h\|_{\ell_1(T_0^c)} \le \frac{2}{1 - \rho}\,\|x - x_S\|_{\ell_1}$$

and
$$\|x^\star - x\|_{\ell_1} \le 2\,\frac{1 + \rho}{1 - \rho}\,\|x - x_S\|_{\ell_1}$$

We are done! Well, almost: we still need to check $(\star)$

By Cauchy-Schwarz
$$\|h\|_{\ell_1(T_0)} \le \sqrt{S}\,\|h\|_{\ell_2(T_0)}$$

Divide $T_0^c$ into subsets of size $S_0$: enumerate $T_0^c$ as $k_1, k_2, \ldots, k_{n-S}$ in decreasing order of magnitude of $h_{T_0^c}$ and set
$$T_j = \{k_\ell :\ (j-1)S_0 + 1 \le \ell \le jS_0\}$$

$T_1$: indices of the $S_0$ largest coefficients of $h_{T_0^c}$
$T_2$: indices of the next $S_0$ largest coefficients
and so on...

Now $\Phi h = 0$ gives
$$\Phi h_{T_0 \cup T_1} = -\Phi h_{(T_0 \cup T_1)^c} = -\sum_{j \ge 2} \Phi h_{T_j}$$

This is interesting because
$$\sqrt{1 - \delta_{S+S_0}}\;\|h\|_{\ell_2(T_0 \cup T_1)} \le \|\Phi h_{T_0 \cup T_1}\|_{\ell_2} \le \sum_{j \ge 2} \|\Phi h_{T_j}\|_{\ell_2} \le \sqrt{1 + \delta_{S_0}}\,\sum_{j \ge 2} \|h_{T_j}\|_{\ell_2}$$

For each $j \ge 2$,
$$\|h_{T_j}\|_{\ell_2} \le \sqrt{S_0}\,\|h_{T_j}\|_{\ell_\infty} \le \frac{1}{\sqrt{S_0}}\,\|h_{T_{j-1}}\|_{\ell_1}$$

and thus
$$\sum_{j \ge 2} \|h_{T_j}\|_{\ell_2} \le \frac{1}{\sqrt{S_0}}\,\big(\|h_{T_1}\|_{\ell_1} + \|h_{T_2}\|_{\ell_1} + \ldots\big) \le \frac{\|h_{T_0^c}\|_{\ell_1}}{\sqrt{S_0}}$$

In short
$$\|h\|_{\ell_2(T_0 \cup T_1)} \le \sqrt{\frac{1 + \delta_{S_0}}{1 - \delta_{S+S_0}}}\;\frac{\|h_{T_0^c}\|_{\ell_1}}{\sqrt{S_0}}$$

It then follows that
$$\|h\|_{\ell_1(T_0)} \le \sqrt{S}\,\|h\|_{\ell_2(T_0)} \le \sqrt{S}\,\|h\|_{\ell_2(T_0 \cup T_1)} \le \rho\,\|h_{T_0^c}\|_{\ell_1}, \qquad \rho^2 = \frac{S}{S_0}\,\frac{1 + \delta_{S_0}}{1 - \delta_{S+S_0}}$$

Pick $S_0 = 2S$; then
$$\rho^2 = \frac{1}{2}\,\frac{1 + \delta_{2S}}{1 - \delta_{3S}} < 1 \quad \text{iff} \quad \delta_{2S} + 2\,\delta_{3S} < 1$$

Slight improvement

Lemma
$$\|h_{T_0}\|_{\ell_1} \le \frac{\sqrt{2}\,\delta_{2S}}{1 - \delta_{2S}}\,\|h_{T_0^c}\|_{\ell_1}$$

A slight improvement

Definition (Restricted orthogonality numbers)
$\theta_{S,S'}$: smallest quantity for which
$$|\langle \Phi x, \Phi x' \rangle| \le \theta_{S,S'}\,\|x\|_{\ell_2}\,\|x'\|_{\ell_2}$$
for all $x, x'$ supported on disjoint subsets $T, T' \subset \{1, \ldots, n\}$ with $|T| \le S$, $|T'| \le S'$

Suppose $x$ and $x'$ are unit vectors as above. Then $x \pm x'$ is $(S+S')$-sparse with $\|x \pm x'\|_{\ell_2}^2 = 2$, so
$$2\,(1 - \delta_{S+S'}) \le \|\Phi x + \Phi x'\|_{\ell_2}^2 \le 2\,(1 + \delta_{S+S'})$$
$$2\,(1 - \delta_{S+S'}) \le \|\Phi x - \Phi x'\|_{\ell_2}^2 \le 2\,(1 + \delta_{S+S'})$$

Parallelogram identity
$$|\langle \Phi x, \Phi x' \rangle| = \frac{1}{4}\,\Big|\|\Phi x + \Phi x'\|_{\ell_2}^2 - \|\Phi x - \Phi x'\|_{\ell_2}^2\Big| \le \delta_{S+S'}$$

Therefore $\theta_{S,S'} \le \delta_{S+S'}$

Go back to the proof of the recovery result and set $S_0 = S$:
$$\Phi h_{T_0 \cup T_1} = -\Phi h_{(T_0 \cup T_1)^c} = -\sum_{j \ge 2} \Phi h_{T_j}$$

gives
$$\|\Phi h_{T_0 \cup T_1}\|_{\ell_2}^2 = \Big\langle \Phi h_{T_0 \cup T_1},\, -\sum_{j \ge 2} \Phi h_{T_j} \Big\rangle = -\Big\langle \Phi h_{T_0}, \sum_{j \ge 2} \Phi h_{T_j} \Big\rangle - \Big\langle \Phi h_{T_1}, \sum_{j \ge 2} \Phi h_{T_j} \Big\rangle$$
$$\le \theta_{S,S}\,\big(\|h_{T_0}\|_{\ell_2} + \|h_{T_1}\|_{\ell_2}\big)\,\sum_{j \ge 2} \|h_{T_j}\|_{\ell_2}$$

In other words, and since $\|h_{T_0}\|_{\ell_2} + \|h_{T_1}\|_{\ell_2} \le \sqrt{2}\,\|h_{T_0 \cup T_1}\|_{\ell_2}$,
$$(1 - \delta_{2S})\,\|h_{T_0 \cup T_1}\|_{\ell_2}^2 \le \|\Phi h_{T_0 \cup T_1}\|_{\ell_2}^2 \le \sqrt{2}\,\theta_{S,S}\,\|h_{T_0 \cup T_1}\|_{\ell_2} \sum_{j \ge 2} \|h_{T_j}\|_{\ell_2}$$

and thus
$$\|h_{T_0 \cup T_1}\|_{\ell_2} \le \rho_0 \sum_{j \ge 2} \|h_{T_j}\|_{\ell_2}, \qquad \rho_0 = \frac{\sqrt{2}\,\theta_{S,S}}{1 - \delta_{2S}}$$

We finish as before and obtain
$$\|h_{T_0}\|_{\ell_1} \le \rho_0\,\|h_{T_0^c}\|_{\ell_1}$$

About the constant

Recall $h = x^\star - x$ and
$$\|h\|_{\ell_1(T_0)} \le \rho\,\|h_{T_0^c}\|_{\ell_1} \ \Longrightarrow\ \|h\|_{\ell_1} \le 2\,\frac{1 + \rho}{1 - \rho}\,\|x - x_S\|_{\ell_1}$$

Since $\rho \le \dfrac{\sqrt{2}\,\delta_{2S}}{1 - \delta_{2S}}$, we have
$$\frac{1 + \rho}{1 - \rho} \le \frac{1 + (\sqrt{2} - 1)\,\delta_{2S}}{1 - (1 + \sqrt{2})\,\delta_{2S}}$$

and
$$\|x^\star - x\|_{\ell_1} \le 2\,\frac{1 + (\sqrt{2} - 1)\,\delta_{2S}}{1 - (1 + \sqrt{2})\,\delta_{2S}}\,\|x - x_S\|_{\ell_1}$$

If $\delta_{2S} \le 1/4$, then
$$\|x^\star - x\|_{\ell_1} \le 5.6\,\|x - x_S\|_{\ell_1}!$$
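A quick numeric check of this constant (not on the slides):

delta = 1/4;                        % RIC value delta_{2S} from the slide
rho   = sqrt(2)*delta/(1 - delta);  % bound rho <= sqrt(2) delta/(1 - delta)
C     = 2*(1 + rho)/(1 - rho);      % l1 error constant
fprintf('C = %.2f\n', C);           % prints C = 5.57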

What if one is interested in the $\ell_2$ error?

The $k$th largest value of $h_{T_0^c}$ obeys
$$|h_{T_0^c}|_{(k)} \le \frac{\|h_{T_0^c}\|_{\ell_1}}{k}$$

and, therefore, choosing $S_0 = S$ gives
$$\|h_{(T_0 \cup T_1)^c}\|_{\ell_2}^2 \le \sum_{k=S+1}^{n} \frac{\|h_{T_0^c}\|_{\ell_1}^2}{k^2} \le \frac{\|h_{T_0^c}\|_{\ell_1}^2}{S}$$

With $\rho \le \dfrac{\sqrt{2}\,\delta_{2S}}{1 - \delta_{2S}}$, the previous analysis gives
$$\|h_{T_0 \cup T_1}\|_{\ell_2} \le \rho\,\frac{\|h_{T_0^c}\|_{\ell_1}}{\sqrt{S}}, \qquad \|h_{T_0^c}\|_{\ell_1} \le \frac{2}{1 - \rho}\,\|x - x_S\|_{\ell_1}$$

And thus
$$\|h\|_{\ell_2} \le (1 + \rho)\,\frac{\|h_{T_0^c}\|_{\ell_1}}{\sqrt{S}} \le 2\,\frac{1 + \rho}{1 - \rho}\,\frac{\|x - x_S\|_{\ell_1}}{\sqrt{S}}$$

This is exactly the same constant!

Examples of matrices obeying the UUP

Gaussian matrices
Random projections
Binary matrices
Bounded orthogonal systems

Gaussian matrices

Phi = randn(m,n)/sqrt(m); indep. normal entries with mean 0 and std. $1/\sqrt{m}$

Condition holds if
$$m \gtrsim S \log(n/S)$$

Random projections

X = randn(n,m); (sample m Gaussian vectors in R^n)
[Q,R] = qr(X,0); Phi = Q'; (orthonormalize the columns and transpose)
Phi = sqrt(n/m)*Phi; (renormalize by sqrt(n/m), as for the partial Fourier matrix)

Condition holds if
$$m \gtrsim S \log(n/S)$$

Binary matrices

$\Phi$ with symmetric i.i.d. entries $P\big(\Phi_{ij} = \pm 1/\sqrt{m}\big) = 1/2$

Condition holds if
$$m \gtrsim S \log(n/S)$$
DeVore et al., Pajor et al.
Same result is true for large classes of distributions, e.g. sub-Gaussian distributions

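A one-line MATLAB construction of such a matrix (a sketch; sizes arbitrary):

m = 64; n = 256;
Phi = sign(randn(m,n))/sqrt(m);  % P(Phi_ij = +-1/sqrt(m)) = 1/2: the sign of a Gaussian is +-1 with equal probability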
Bounded orthogonal systems

$U$: $n \times n$ orthogonal matrix; $U^* U = I$
$\Phi$ obtained by selecting $m$ rows from $U$ uniformly at random (and renormalizing by $\sqrt{n/m}$)
Recall coherence
$$\mu(U) = \sqrt{n}\,\sup_{j,k} |U_{j,k}|$$

Theorem (C. and Tao (2004))
Condition holds if
$$m \gtrsim \mu^2(U)\, S\, (\log n)^6$$

Rudelson and Vershynin improved this and got $(\log n)^5$ instead
This is not trivial at all
Certainly not optimal
Improvements will be hard
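As a concrete instance (not from the slides), the randomly subsampled Fourier matrix of the earlier slides fits this framework with $\mu(U) = 1$, the smallest possible value; a MATLAB sketch:

n = 512; m = 100;
U = fft(eye(n))/sqrt(n);               % n x n Fourier isometry, U'*U = I
mu = sqrt(n)*max(abs(U(:)));           % coherence mu(U); equals 1 for the DFT
rows = randperm(n); rows = rows(1:m);  % m rows selected uniformly at random
Phi = sqrt(n/m)*U(rows,:);             % renormalized partial Fourier sensing matrix
fprintf('mu(U) = %.2f\n', mu);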

Last time: incoherence

Sparse in basis $\Psi_1$
Undersample in basis $\Psi_2$
System
$$U = \Psi_2^* \Psi_1$$
Coherence
$$\mu(U) = \sqrt{n}\,\sup_{j,k} |U_{j,k}| = \sqrt{n}\,\sup_{j,k} \big|\langle \psi_j^{(1)}, \psi_k^{(2)} \rangle\big| = \mu(\Psi_1, \Psi_2)$$

Condition holds if
$$m \gtrsim \mu^2(\Psi_1, \Psi_2)\, S\, (\log n)^5$$
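As a toy computation (not from the slides; the basis pair is an arbitrary choice), the coherence of a random orthonormal sparsity basis against the spike (identity) measurement basis, which is nearly as small as it gets:

n = 256;
[Psi1, R] = qr(randn(n));     % random orthonormal sparsity basis
Psi2 = eye(n);                % spike (identity) measurement basis
U  = Psi2'*Psi1;
mu = sqrt(n)*max(abs(U(:)));  % of order sqrt(log n), far below the worst case sqrt(n)
fprintf('mu(Psi1,Psi2) = %.2f\n', mu);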

What is Compressed Sensing?

Classical viewpoint
Measure everything (all the pixels, all the coefficients)
Keep the largest coefficients: distortion is $\|f - f_S\|$

Compressed sensing
Take $m$ incoherent (e.g. Gaussian) measurements: $y = \Phi f$
Reconstruct by linear programming
$$f^\star = \arg\min \|\tilde{f}\|_{\ell_1} \quad \text{subject to} \quad \Phi \tilde{f} = y$$

Same performance!
(Sketch) If $m \gtrsim S \log(n/S)$,
$$\|f^\star - f\| \asymp \|f - f_S\|$$
Incoherent measurements $\approx$ compressed version of the object

Simultaneous signal acquisition and compression!
All that is needed is to decompress...

"This suggests the possibility of compressed data acquisition protocols which perform as if it were possible to directly acquire just the important information about the image of interest."
Dennis Healy, DARPA
