Outline
• Logistics
• Motivation
• Convolution
• Filtering
Waitlist
• I think 1-way lectures are boring (and such content can easily be found elsewhere). Discussions are way more fun! I encourage you to come to class.
• I hate PowerPoint. I'd rather write on the board, but this room is not conducive to it. I still encourage you to take notes.
• If you are going to come and check e-mail / Facebook, I'd rather you drop now to make room for someone else who'd get more out of lecture.
Outline
• Logistics
• Motivation
• Convolution
• Filtering
Computational perspective
Credited with early computational approach for vision
Characterizing image transformations

F[i,j] → T → G[i,j]

1-D example: F[i] = [5 4 2 3 7 4 6 5 3 6] → T → G[i]

G = T(F)
G[i] = T(F[i])

T(αF1[i] + F2[i]) = αG1[i] + G2[i]

(Abuse of notation: this does not mean the transformation is applied at each pixel separately)

How do we characterize image processing operations?
Properties of “nice” functional transformations

Additivity: T(F1 + F2) = T(F1) + T(F2)
Scaling: T(αF) = αT(F)
Shift invariance: G[i − j] = T(F[i − j])
Impulse δ[i] [also called the delta function]

Any function can be written as a linear combination of shifted and scaled impulses:

F[i] = F[0]δ[i] + F[1]δ[i − 1] + ... = Σ_u F[u]δ[i − u]

Representing signals with impulses. Any signal can be expressed as a sum of scaled and shifted unit impulses. We begin with the pulse or “staircase” approximation to a continuous signal, as illustrated in Fig. 1. Conceptually, this is trivial: for each discrete sample of the original signal, we make a pulse signal. Then we add up all these pulse signals to make up the approximate signal. Each of these pulse signals can in turn be represented as a standard pulse scaled by the appropriate value and shifted to the appropriate place. (As the pulse width approaches zero, the approximation becomes better and better; the summation approaches an integral, and the pulse approaches the unit impulse.) In mathematical notation, for an additive, scaling, shift-invariant T:

T(F[i]) = Σ_u F[u] T(δ[i − u])

G[i] = Σ_u F[u] H[i − u],  where H[i] = T(δ[i]) and G[i] = T(F[i])

H is called the impulse response, filter, or kernel. Therefore,

G = F ∗ H
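The impulse decomposition above is straightforward to check numerically. A minimal sketch in Python/numpy (a stand-in for the MATLAB used elsewhere in the deck), using the lecture's running example signal:

```python
import numpy as np

# The 1-D signal used in the lecture's running example
F = np.array([5, 4, 2, 3, 7, 4, 6, 5, 3, 6])
n = len(F)

def impulse(i, n):
    """Unit impulse delta[. - i]: 1 at position i, 0 elsewhere."""
    d = np.zeros(n)
    d[i] = 1.0
    return d

# F[i] = sum_u F[u] * delta[i - u]: sum of scaled, shifted impulses
reconstruction = sum(F[u] * impulse(u, n) for u in range(n))
print(reconstruction)  # identical to F
```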
January 20, 2015
Example

G[i] = F[i] ∗ H[i] = Σ_u F[u] H[i − u]
     = H[i] ∗ F[i] = Σ_u H[u] F[i − u]

H = [1 2 3]  (indices 0 1 2)
F = [5 4 2 3 7 4 6 5 3 6]  (indices 0 ... 9)

G[0] = ?
G[1] = ?

Cross-correlation, for contrast:
G[i] = F[i] ⊗ H[i] = Σ_u H[u] F[i + u]

In 2-D:
G[i,j] = F ∗ H = Σ_u Σ_v F[u,v] H[i − u, j − v]
G[i,j] = F ∗ H = H ∗ F = Σ_u Σ_v H[u,v] F[i − u, j − v]
Example

Flip H and slide it over F:

H (flipped) = 3 2 1
F = 5 4 2 3 7 4 6 5 3 6  (indices −3 ... 9 with padding)

G[0] = 5×1 = 5
G[1] = 5×2 + 4×1 = 14
G[2] = 5×3 + 4×2 + 2×1 = 25
…
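The worked example can be checked with numpy's `convolve` (a stand-in for MATLAB's `conv`):

```python
import numpy as np

F = np.array([5, 4, 2, 3, 7, 4, 6, 5, 3, 6])
H = np.array([1, 2, 3])

# Full convolution: G[i] = sum_u F[u] * H[i - u]
G = np.convolve(F, H)
print(G[:3])  # [ 5 14 25], matching G[0], G[1], G[2] above
```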
Preview of 2D

[figure: a 2-D kernel h slides over an image f]
Properties of convolution

F ∗ H = H ∗ F   (commutative)
(F ∗ H) ∗ G = F ∗ (H ∗ G)   (associative)
(F ∗ G) + (H ∗ G) = (F + H) ∗ G   (distributive)

Output sizes for a length-N signal and length-M filter:
>> conv(F,H,'full')   % length N+M−1
>> conv(F,H,'valid')  % length N−M+1
>> conv(F,H,'same')   % length N
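The three output sizes can be checked in numpy, whose `mode` argument mirrors MATLAB's `conv` shape option:

```python
import numpy as np

F = np.arange(10)          # N = 10
H = np.array([1, 2, 3])    # M = 3

# numpy's modes mirror MATLAB's conv(F, H, shape) options
print(len(np.convolve(F, H, 'full')))   # 12 = N + M - 1
print(len(np.convolve(F, H, 'valid')))  # 8  = N - M + 1
print(len(np.convolve(F, H, 'same')))   # 10 = N
```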
Deva Ramanan, January 14, 2015

A simpler approach
H = [1 2 3]  (indices −1 0 1)
F = [5 4 2 3 7 4 6 5 3 6]  (indices 0 ... 9)

Instead of convolution,
G[i,j] = F ∗ H = Σ_u Σ_v H[u,v] F[i − u, j − v]

scan the original F rather than a flipped version:
G[i,j] = F ⊗ H = Σ_u Σ_v H[u,v] F[i + u, j + v]

What's the math?
(Cross) correlation

G[i,j] = F ⊗ H = Σ_u Σ_v H[u,v] F[i + u, j + v]

Scan the original F instead of a flipped version. In 1-D, for a filter of half-width k:

F[i] ⊗ H[i] = Σ_{u=−k}^{k} H[u] F[i + u]
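The relationship between the two operations — correlation scans the original F, convolution scans a flipped copy — can be sketched as:

```python
import numpy as np

F = np.array([5, 4, 2, 3, 7, 4, 6, 5, 3, 6])
H = np.array([1, 2, 3])

# Correlating with H equals convolving with H reversed: F (x) H = F * H[-i]
corr = np.correlate(F, H, 'full')
conv_flipped = np.convolve(F, H[::-1], 'full')
print(np.array_equal(corr, conv_flipped))  # True
```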
Properties

Convolution:
G[i] = F[i] ∗ H[i] = Σ_u F[u] H[i − u]
     = H[i] ∗ F[i] = Σ_u H[u] F[i − u]   (commutative property)

Cross-correlation:
G[i] = F[i] ⊗ H[i] = Σ_u H[u] F[i + u]
     = F[i] ∗ H[−i]   (exercise for reader!)

In 2-D:
G[i,j] = F ∗ H = Σ_u Σ_v F[u,v] H[i − u, j − v]
Image filtering (2-D correlation)

G[i,j] = F ⊗ H = Σ_{u=−k}^{k} Σ_{v=−k}^{k} H[u,v] F[i + u, j + v]

Example box kernel h[⋅,⋅]:
1 1 1
1 1 1
1 1 1

applied to an image f[.,.] to produce an output g[.,.].

Gaussian filtering

A Gaussian kernel gives less weight to pixels further from the center of the window.
[Worked example: a binary image of 0s and 90s is filtered with the kernel below; the output values (e.g. 0 10 20 30 30 30, 0 20 40 60 60 60, ...) are shown alongside. Slide by Steve Seitz]

1 2 1
2 4 2
1 2 1

This kernel is an approximation of a Gaussian function.
Convolution vs correlation (2-D)

Convolution:
G[i,j] = F ∗ H = Σ_u Σ_v F[u,v] H[i − u, j − v]
       = H ∗ F = Σ_u Σ_v H[u,v] F[i − u, j − v]
>> conv2(H,F)

Correlation:
G[i,j] = F ⊗ H = Σ_{u=−k}^{k} Σ_{v=−k}^{k} H[u,v] F[i + u, j + v]
>> filter2(H,F)

In 1-D: F[i] ⊗ H[i] = F[i] ∗ H[−i]
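A minimal hand-rolled sketch of the 2-D definitions above, in numpy as a stand-in for conv2/filter2 (the function names `correlate2d_valid` and `convolve2d_valid` are mine, not a library API):

```python
import numpy as np

def correlate2d_valid(F, H):
    """2-D cross-correlation, 'valid' region only:
    G[i,j] = sum_{u,v} H[u,v] * F[i+u, j+v]."""
    fh, fw = F.shape
    hh, hw = H.shape
    out = np.zeros((fh - hh + 1, fw - hw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(H * F[i:i + hh, j:j + hw])
    return out

def convolve2d_valid(F, H):
    """2-D convolution = correlation with the filter flipped in both axes."""
    return correlate2d_valid(F, H[::-1, ::-1])

F = np.arange(25.0).reshape(5, 5)
H = np.array([[0., 1., 0.], [1., 2., 1.], [0., 1., 0.]])

# This H is symmetric under flipping, so convolution == correlation here
print(np.array_equal(convolve2d_valid(F, H), correlate2d_valid(F, H)))  # True
```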
Border padding

Borders!

[figure: the image f is padded with border values g on all sides before filtering]
Examples of correlation

Linear filters: examples

1 1 1
1 1 1
1 1 1

Original → Blur (with a mean filter)

Source: D. Lowe
Examples of correlation

Practice with linear filters

0 0 0
0 1 0   →   ?
0 0 0

Original → Filtered (no change)

Source: D. Lowe
Examples of correlation

Practice with linear filters

0 0 0
0 0 1   →   ?
0 0 0

Original → Shifted left by 1 pixel

What would this look like for convolution?

Source: D. Lowe
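A small numpy sketch of the shift example above, using a 1-D impulse one sample off center (the signal values are made up for illustration):

```python
import numpy as np

F = np.arange(10.0)
H = np.array([0., 0., 1.])   # impulse one pixel right of center (indices -1, 0, 1)

# Correlation scans F directly: the output is F shifted LEFT by one pixel...
corr = np.correlate(F, H, 'same')
# ...while convolution flips the filter first, shifting F RIGHT by one pixel.
conv = np.convolve(F, H, 'same')

print(corr)  # [1. 2. ... 9. 0.]
print(conv)  # [0. 0. 1. ... 8.]
```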
Examples of correlation

Practice with linear filters

0 0 0        1 2 1
0 2 0   −  ( 2 4 2 ) /16   →   ?
0 0 0        1 2 1

Source: D. Lowe
Examples of correlation

Practice with linear filters

Sharpen filter = unit impulse + (unit impulse − blur):

(  0 0 0        1 2 1             0 0 0
   0 1 0   −  ( 2 4 2 ) /16  ) +  0 1 0
   0 0 0        1 2 1             0 0 0

i.e., filtered = original image + (original image − blurred image)

Unsharp filter

Source: D. Lowe
Examples: image rotation

[figure: an image f[m,n] is correlated (⊗) with a horizontal box filter h[m,n] = [1 1 1] (normalized), producing g[m,n]; the same filtering is shown on a rotated copy of the image]
Practice with linear filters

Question: what happens as we repeatedly convolve an image F with filter H?

F, F∗H, F∗H∗H, ...

Source: D. Lowe

Aside for the probability junkies: the PDF of the sum of two random variables = the convolution of their PDFs. Repeated convolutions ⇒ repeated sums ⇒ CLT.
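The CLT aside can be illustrated numerically: repeatedly convolving a small box filter with itself yields an increasingly Gaussian-shaped kernel. A sketch:

```python
import numpy as np

box = np.ones(3) / 3.0   # a small box (mean) filter

# Repeated self-convolution: by the central limit theorem the
# result approaches a Gaussian shape.
kernel = box.copy()
for _ in range(10):
    kernel = np.convolve(kernel, box)

print(np.isclose(kernel.sum(), 1.0))           # mass is preserved
print(np.allclose(kernel, kernel[::-1]))       # symmetric
print(kernel.argmax() == len(kernel) // 2)     # peaked at the center
```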
Gaussian

         1 2 1
(1/16) ( 2 4 2 )
         1 2 1
Gaussian filters

         1 2 1
(1/16) ( 2 4 2 )
         1 2 1

σ = 2 with 30×30 kernel    σ = 5 with 30×30 kernel

• Standard deviation σ: determines extent of smoothing

Source: K. Grauman
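A sketch of building a normalized Gaussian kernel for a given σ, plus a check that the 1 2 1 / 2 4 2 / 1 2 1 kernel above is the outer product of the binomial filter [1 2 1]/4 (the helper name `gaussian_kernel` is mine):

```python
import numpy as np

def gaussian_kernel(sigma, width):
    """Sampled, normalized 1-D Gaussian; a 2-D kernel is its outer product."""
    x = np.arange(width) - (width - 1) / 2.0
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

g1 = gaussian_kernel(sigma=2.0, width=9)
G2 = np.outer(g1, g1)             # separable 2-D Gaussian
print(np.isclose(G2.sum(), 1.0))  # True: weights sum to 1

# The 1 2 1 / 2 4 2 / 1 2 1 kernel is the analogous outer product
# of the binomial filter [1 2 1]/4:
b = np.array([1., 2., 1.]) / 4.0
print(np.outer(b, b) * 16)        # [[1. 2. 1.], [2. 4. 2.], [1. 2. 1.]]
```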
Finite-support filters: choosing kernel width

• The Gaussian function has infinite support, but discrete filters use finite kernels
Gaussian Pyramid: filter (Gaussian) + subsample (to exploit redundancy in output)

Figure 1: Depicted are four levels of the Gaussian pyramid, levels 0 to 3 presented from left to right.

Burt & Adelson 83
http://persci.mit.edu/pub_pdfs/pyramid83.pdf
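A minimal 1-D sketch of the blur-then-subsample recipe (the binomial kernel [1 4 6 4 1]/16 is my choice of Gaussian approximation, not necessarily the one used in the paper):

```python
import numpy as np

def smooth(F, kernel=np.array([1., 4., 6., 4., 1.]) / 16.0):
    """Blur a 1-D signal with a small binomial (approximately Gaussian) filter."""
    return np.convolve(F, kernel, 'same')

def pyramid(F, levels=4):
    """Gaussian pyramid: repeatedly blur, then keep every other sample."""
    out = [F]
    for _ in range(levels - 1):
        F = smooth(F)[::2]   # subsampling exploits the redundancy left by blurring
        out.append(F)
    return out

levels = pyramid(np.arange(32.0))
print([len(l) for l in levels])  # [32, 16, 8, 4]
```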
Smoothing

Gaussian filters vs edge filters

How should filters behave on a flat region with value ‘v’?
Template matching

Goal: find a template H[i,j] in an image.

Main challenge: What is a good similarity or distance measure between two patches?
• Correlation
• Zero-mean correlation
• Sum Square Difference
• Normalized Cross Correlation

Slide by Derek Hoiem
Attempt 1: correlate with eye patch

Matching with filters. Goal: find the template in the image.

Method 0: filter the image with the eye patch

G[i,j] = Σ_{u=−k}^{k} Σ_{v=−k}^{k} H[u,v] F[i + u, j + v]
       = H · F_ij = ||H|| ||F_ij|| cos θ_ij,   H, F_ij ∈ R^{(2k+1)²}

h[m,n] = Σ_{k,l} g[k,l] f[m + k, n + l]    (f = image, g = filter)

What went wrong?
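A toy 1-D version of what went wrong: under plain correlation, a bright flat region can outscore the true match. The template and signal values here are made up for illustration:

```python
import numpy as np

H = np.array([1., 2., 1.])                          # template
F = np.array([0., 1., 2., 1., 0., 9., 9., 9., 0.])  # exact match, then bright patch

scores = np.correlate(F, H, 'valid')
print(scores)

match_score  = scores[1]            # window [1, 2, 1]: the true match
bright_score = scores[5]            # window [9, 9, 9]: just bright
print(bright_score > match_score)   # True: that's what went wrong
```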
Attempt 1.5: correlate with zero-mean eye patch

Matching with filters. Goal: find the template in the image.

Method 1: filter the image with a zero-mean eye patch

G[i,j] = Σ_{u=−k}^{k} Σ_{v=−k}^{k} (H[u,v] − H̄) F[i + u, j + v]
       = Σ_{u=−k}^{k} Σ_{v=−k}^{k} H[u,v] F[i + u, j + v]  −  H̄ Σ_{u=−k}^{k} Σ_{v=−k}^{k} F[i + u, j + v]

h[m,n] = Σ_{k,l} (f[k,l] − f̄) g[m + k, n + l]    (f̄ = mean of f)

True detections / False detections
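The same toy example with a zero-mean template: constant windows now score zero, since the template's weights sum to zero:

```python
import numpy as np

H = np.array([1., 2., 1.])
F = np.array([0., 1., 2., 1., 0., 9., 9., 9., 0.])

# Subtracting the template's mean makes flat windows score zero,
# because sum(H - mean(H)) = 0.
Hz = H - H.mean()
scores = np.correlate(F, Hz, 'valid')

print(np.isclose(scores[5], 0.0))   # True: the bright flat patch no longer wins
print(scores[1] > scores[5])        # True: the true match stands out
```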
Attempt 2: SSD

Matching with filters. Goal: find the template in the image.

Method 2: SSD

SSD[i,j] = ||H − F_ij||² = (H − F_ij)ᵀ (H − F_ij)

h[m,n] = Σ_{k,l} ( g[k,l] − f[m + k, n + l] )²

Can this be implemented with filtering?

True detections
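The same toy example under SSD: an exact match scores 0, making the minimum easy to find:

```python
import numpy as np

H = np.array([1., 2., 1.])
F = np.array([0., 1., 2., 1., 0., 9., 9., 9., 0.])

# SSD[i] = ||H - F_i||^2 over each window; an exact match gives 0
ssd = np.array([np.sum((H - F[i:i + len(H)])**2)
                for i in range(len(F) - len(H) + 1)])
print(ssd)
print(ssd.argmin())  # 1: the window [1, 2, 1] matches the template exactly
```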
What will SSD find here? What's the potential downside of SSD?

Method 2: SSD
h[m,n] = Σ_{k,l} ( g[k,l] − f[m + k, n + l] )²

Input → 1 − sqrt(SSD)

Slide by Derek Hoiem

Input → Normalized X-Correlation → Thresholded Image
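Normalized cross-correlation zero-means both patches and divides by their norms, making the score invariant to the gain and offset changes that fool SSD. A sketch (the helper name `ncc` and the values are made up):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two patches: zero-mean, unit-norm dot product."""
    a = a - a.mean()
    b = b - b.mean()
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

H = np.array([1., 2., 1., 0.])
patch = 2.0 * H + 5.0            # same pattern, different gain and offset

# SSD is fooled by the brightness/contrast change, but NCC still scores 1
print(np.sum((H - patch)**2))          # 146.0: large
print(np.isclose(ncc(H, patch), 1.0))  # True
```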
Modern filter banks

The above approaches to filtering were largely hand designed. This is partly due to limitations in computing power and lack of access to large datasets in the 80s and 90s. In modern approaches to image recognition, the convolution kernels/filtering operations are often learned from huge amounts of training data.

Convolutional Neural Nets (CNNs), LeCun et al 98: learn filters from training data to look for low-, mid-, and high-level features.