
Digital Image Acquisition, Sampling and Quantization
Outline

• Image Acquisition
• Sampling and Quantization
• Image Sensors
Image Acquisition

Image description

f(x,y): intensity/brightness of the image at spatial coordinates (x,y)

0 < f(x,y) < ∞, determined by two factors:
  illumination component i(x,y): amount of source light incident on the scene
  reflectance component r(x,y): fraction of light reflected by objects

f(x,y) = i(x,y) · r(x,y)

where
  0 < i(x,y) < ∞: determined by the light source
  0 < r(x,y) < 1: determined by the characteristics of the objects
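A minimal MATLAB sketch of the product model with synthetic, assumed
data (the sinusoidal illumination and random reflectance are purely
illustrative):

[x, y] = meshgrid(1:256, 1:256);
illum = 0.5 + 0.5*sin(2*pi*x/256);   % i(x,y): smooth, source-dependent
refl = rand(256);                    % r(x,y) in (0,1), object-dependent
f = illum .* refl;                   % f(x,y) = i(x,y) * r(x,y)
imshow(f, [])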

Sampling and Quantization

Sampling and Quantization
Sampling: digitization of the spatial coordinates (x,y)
Quantization: digitization in amplitude (also called gray-level quantization)

8-bit quantization: 2⁸ = 256 gray levels (0: black, 255: white)
Binary (1-bit quantization): 2 gray levels (0: black, 1: white)

Commonly used numbers of samples (resolution):

Digital still cameras: 640×480, 1024×1024, 4064×2704, and so on
Digital video cameras: 640×480 at 30 frames/second
1920×1080 at 60 frames/second (HDTV)

Sampling and Quantization

An M × N digital image is expressed as the matrix

         f(0,0)     f(0,1)     ...   f(0,N-1)
         f(1,0)     f(1,1)     ...   f(1,N-1)
f(x,y) =   .          .        ...      .
           .          .        ...      .
         f(M-1,0)   f(M-1,1)   ...   f(M-1,N-1)

N: number of columns
M: number of rows
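A minimal MATLAB sketch of this matrix view (cameraman.tif is assumed
available as a built-in test image; MATLAB indices start at 1, so
f(0,0) in the notation above corresponds to f(1,1) here):

f = imread('cameraman.tif');
[M, N] = size(f);        % M rows, N columns
topleft  = f(1, 1);      % f(0,0) in the slide notation
botright = f(M, N);      % f(M-1, N-1)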
Sampling and Quantization
 Image coordinate convention (not valid for MATLAB!)

There is no universally accepted convention or notation. Always check carefully!

Sampling and Quantization

Digital Images

Digital images are 2D arrays (matrices) of numbers:

Sampling

Sampling

Effect of Sampling and Quantization

Panel captions: 250×210, 125×105, 50×42, and 25×21 samples;
16 gray levels, 8 gray levels, 4 gray levels, binary image


Useful MATLAB functions

imread() – reads an image with any of the common file extensions

imresize() – resizes an image to a given size

figure – opens a new graphics window

subplot(#rows, #cols, location) – shows several plots/images in one graphics window

imshow() – displays an image
Sampling
% Read the image and convert to grayscale (obelix.jpg assumed RGB)
im = rgb2gray(imread('obelix.jpg'));
im1 = imresize(im, [1024 1024]);        % resample to 1024x1024
im2 = imresize(im1, [1024 1024]/2);     % 512x512
im3 = imresize(im1, [1024 1024]/4);     % 256x256
im4 = imresize(im1, [1024 1024]/8);     % 128x128
im5 = imresize(im1, [1024 1024]/16);    % 64x64
im6 = imresize(im1, [1024 1024]/32);    % 32x32

% Show each sampled version in its own window
figure; imshow(im1)
figure; imshow(im2)
figure; imshow(im3)
figure; imshow(im4)
figure; imshow(im5)

% Or all six side by side in one window
figure;
subplot(2,3,1); imshow(im1); subplot(2,3,2); imshow(im2)
subplot(2,3,3); imshow(im3); subplot(2,3,4); imshow(im4)
subplot(2,3,5); imshow(im5); subplot(2,3,6); imshow(im6)
Quantization
% Read, convert to grayscale, and fix the size
im = rgb2gray(imread('obelix.jpg'));
im1 = imresize(im, [1024 1024]);
% Requantize to progressively fewer gray levels
im2 = gray2ind(im1, 2^7);   % 128 levels
im3 = gray2ind(im1, 2^6);   % 64 levels
im4 = gray2ind(im1, 2^5);   % 32 levels
im5 = gray2ind(im1, 2^4);   % 16 levels
im6 = gray2ind(im1, 2^3);   % 8 levels
im7 = gray2ind(im1, 2^2);   % 4 levels
im8 = gray2ind(im1, 2^1);   % 2 levels
figure;
subplot(2,4,1); imshow(im1,[]); subplot(2,4,2); imshow(im2,[])
subplot(2,4,3); imshow(im3,[]); subplot(2,4,4); imshow(im4,[])
subplot(2,4,5); imshow(im5,[]); subplot(2,4,6); imshow(im6,[])
subplot(2,4,7); imshow(im7,[]); subplot(2,4,8); imshow(im8,[])
RGB (color) Images

Image acquisition

Single imaging sensor

Line sensor

Array sensor

Image Sensors

 Goal: to convert EM energy into electrical signals that can be
processed, displayed, and/or interpreted as images.

 Common technologies:
– CCD (charge-coupled device)
– CMOS (complementary metal-oxide semiconductor)
Digital Camera Technologies

CCD Array Cameras


A CCD sensor is made up of an array of light-sensitive cells called
photosites, manufactured in silicon, each of which produces a voltage
proportional to the intensity of light falling on it.
Every element in the array is linked (charge coupled) to its
neighboring elements.
Charges are read out of the array serially, by shifting charge from
one element to the next.
Digital Camera Technologies

CMOS Array Cameras


Built on a standard semiconductor production line
Active-pixel architecture
The photodetector and amplifier are both fabricated inside each pixel.
Digital camera technologies comparison
CCD (Charge-Coupled Device)
– Specialized fabrication techniques, so an expensive technology
– Larger size
– Higher power consumption because of the capacitive architecture
– Always has to read out the whole image
– Resolution is limited by sensor element size
– Less on-chip circuitry, so lower dark currents and noise

CMOS (Complementary Metal-Oxide Semiconductor)
– Cheaper technology
– Smaller size
– Low power consumption
– Readout of a selective area of the image is possible
– Amplifier and additional circuitry can be fabricated inside each pixel
– Higher resolution possible
– Stronger noise due to higher dark currents from the extra on-chip circuitry
Acquisition of color images
Single sensor assembly (for still scenes)
Three sensors with prisms
Sensor arrays:
  a. Stripe filter pattern
  b. Bayer filter pattern
Foveon X3 imager

This sensor uses 3 layers of CMOS imagers.
Each layer absorbs a different colour of light at a different depth.
A single-shot camera for three colors.
Various commercial sensor sizes

"Name"    Aspect Ratio   Width (mm)   Height (mm)
1/3.6"    4:3            4.0          3.0
1/3.2"    4:3            4.5          3.4
1/3"      4:3            4.8          3.6
1/2.7"    4:3            5.3          4.0
1/2"      4:3            6.4          4.8
1/1.8"    4:3            7.2          5.3
2/3"      4:3            8.8          6.6
1"        4:3            12.8         9.6
4/3"      4:3            18.0         13.5
EOS 10D   3:2            22.0         15.0

(Figure: relative size of various digital camera sensors)
Basic relationships and distance
measures between pixels
Pixel Neighborhood

The pixels surrounding a given pixel. Most neighborhoods used in
image processing algorithms are small square arrays with an odd
number of pixels.
Neighbors of a Pixel

         f(0,0) f(0,1) f(0,2) f(0,3) f(0,4) ...
         f(1,0) f(1,1) f(1,2) f(1,3) f(1,4) ...
f(x,y) = f(2,0) f(2,1) f(2,2) f(2,3) f(2,4) ...
         f(3,0) f(3,1) f(3,2) f(3,3) f(3,4) ...
           .      .      .      .      .
Neighbors of a Pixel

 A pixel p at coordinates (x, y) has 4 horizontal and vertical
neighbors, the set N4(p).

 Their coordinates are given by:
(x+1, y), (x-1, y), (x, y+1) & (x, y-1)

In the array above, the 4-neighbors of f(1,1) are:
f(2,1), f(0,1), f(1,2) & f(1,0)
Neighbors of a Pixel

 A pixel p at coordinates (x, y) has 4 diagonal neighbors, the set ND(p).

 Their coordinates are given by:
(x+1, y+1), (x+1, y-1), (x-1, y+1) & (x-1, y-1)

In the array above, the diagonal neighbors of f(1,1) are:
f(2,2), f(2,0), f(0,2) & f(0,0)

 This set of pixels is called the diagonal neighbors of p.
Adjacency, Connectivity

Adjacency: Two pixels are adjacent if they are neighbors and their
intensity levels satisfy a specified criterion of similarity, i.e.,
both values belong to a set V.

e.g. V = {1}
V = {0, 2}
Binary image: values in {0, 1}
Grayscale image: values in {0, 1, 2, ..., 255}

In a binary image, two pixels are adjacent if they are neighbors and
both have intensity values from V (e.g., both 1).

In a grayscale image, V typically contains a subset of the gray-level
values in the range 0 to 255.
Adjacency, Connectivity

4-adjacency: Two pixels p and q with the values from set ‘V’ are
4-adjacent if q is in the set of N4(p).

e.g. V = { 0, 1}

1 1 0
1 1 0
1 0 1
(p shown in red and the candidate pixels q shown in green on the original slide)
Adjacency, Connectivity

8-adjacency: Two pixels p and q with the values from set ‘V’ are
8-adjacent if q is in the set of N8(p).

e.g. V = { 1, 2}

0 1 1
0 2 0
0 0 1
(p shown in red and the candidate pixels q shown in green on the original slide)
Adjacency, Connectivity

m-adjacency: Two pixels p and q with values from set V are
m-adjacent if
(i) q is in N4(p), OR
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose
values are from V.

e.g. V = {1}

0a 1b 1c
0d 1e 0f
0g 0h 1i
Adjacency, Connectivity

m-adjacency examples, with V = {1} and the array

0a 1b 1c
0d 1e 0f
0g 0h 1i

(i) b & c: c is in N4(b). Soln: b & c are m-adjacent.

(ii) b & e: e is in N4(b). Soln: b & e are m-adjacent.

(iii) e & i: i is in ND(e), and N4(e) ∩ N4(i) = {f, h} contains no
pixels with values from V. Soln: e & i are m-adjacent.

(iv) e & c: c is in ND(e), but N4(e) ∩ N4(c) contains b, whose value
is from V. Soln: e & c are NOT m-adjacent.
Adjacency, Connectivity

Connectivity: Two pixels are said to be connected if there
exists a path between them.

Let S represent a subset of pixels in an image.

Two pixels p & q are said to be connected in S if there exists a
path between them consisting entirely of pixels in S.

For any pixel p in S, the set of pixels that are connected to it in
S is called a connected component of S.
Paths

Path: A path from pixel p with coordinates (x, y) to pixel q with
coordinates (s, t) is a sequence of distinct pixels with coordinates
(x0, y0), (x1, y1), ..., (xn, yn), where

(x, y) = (x0, y0) and (s, t) = (xn, yn),

and pixels (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.
Here n is the length of the path.

Closed path: (x0, y0) = (xn, yn)


Paths

Example # 1: Consider the image segment shown below.
Compute the lengths of the shortest 4-, 8-, and m-paths
between pixels p and q, where V = {1, 2}.

4 2 3 2q
3 3 1 3
2 3 2 2
p2 1 2 3
Paths

Example # 1:

V = {1, 2}
4 2 3 2q
3 3 1 3
2 3 2 2
p2 1 2 3

Shortest-4 path: q has no 4-neighbors with values in V, so the path
can never reach q. So, no 4-path exists.

Shortest-8 path: p → (3,1) → (2,2) → (1,2) → q is a valid 8-path
(coordinates given as (row, col), with p at (3,0) and q at (0,3)).
So, shortest-8 path = 4.

Shortest-m path: the diagonal step (3,1) → (2,2) is not m-adjacent,
because the N4 sets of the two pixels share a pixel with a value in V,
so the m-path must take one extra step:
p → (3,1) → (3,2) → (2,2) → (1,2) → q.
So, shortest-m path = 5.


Regions & Boundaries
Region: Let R be a subset of pixels in an image. Two regions Ri and Rj are
said to be adjacent if their union forms a connected set.

Regions that are not adjacent are said to be disjoint.

We consider 4- and 8-adjacency when referring to regions.

The regions below are adjacent only if 8-adjacency is used:

1 1 1
1 0 1 Ri
0 1 0
0 0 1
1 1 1 Rj
1 1 1
Regions & Boundaries

Boundary (border or contour): the set of pixels in the region that
have one or more neighbors that are not in R.
0 0 0 0 0
0 1 1 0 0
0 1 1 0 0
0 1 1 1 0
0 1 1 1 0
0 0 0 0 0

The highlighted 1 (shown in red on the slide) is NOT a member of the
border if 4-connectivity is used between region and background; it is
if 8-connectivity is used.
Distance Measures

City-Block Distance: The D4 distance between p = (x, y) and q = (s, t)
is defined as

D4(p, q) = |x - s| + |y - t|

Pixels having D4 distance from (x, y) less than or equal to some value
r form a diamond centered at (x, y). For example, pixels with D4
distance ≤ 2 form the following contour of constant distance:

    2
  2 1 2
2 1 0 1 2
  2 1 2
    2
Distance Measures

Chessboard Distance: The D8 distance between p = (x, y) and q = (s, t)
is defined as

D8(p, q) = max(|x - s|, |y - t|)

Pixels having D8 distance from (x, y) less than or equal to some value
r form a square centered at (x, y). For example, pixels with D8
distance ≤ 2 form the following contour of constant distance:

2 2 2 2 2
2 1 1 1 2
2 1 0 1 2
2 1 1 1 2
2 2 2 2 2
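A minimal MATLAB sketch of both distance measures, on an assumed pair
of pixels:

p = [3 4];  q = [6 2];                       % (x, y) coordinates
D4 = abs(p(1)-q(1)) + abs(p(2)-q(2))         % city-block distance: 5
D8 = max(abs(p(1)-q(1)), abs(p(2)-q(2)))     % chessboard distance: 3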
References
• Lecture Slides from Paresh Kambe
Color Image Processing
Color Perception

Color perception is a psychophysical phenomenon that combines
two main components:

1. The physical properties of light sources and the absorption and
reflectance capabilities of surfaces.

2. The physiological and psychological aspects of the human
visual system (HVS).
Importance of color images

1. Color is a powerful descriptor that can be used in automated
image analysis such as segmentation, object detection,
tracking, and identification.

2. Humans can distinguish thousands of colors/intensities,
compared to only about two dozen shades of gray.
Two major areas of color image processing

1. Full-color processing
Images are acquired with a full-color sensor such as a TV camera
or a color scanner.

2. Pseudo-color processing
Pseudo-color images are generated by assigning shades of color
to monochrome intensity images.
Color Spectrum
In the mid-seventeenth century, Sir Isaac Newton discovered that
light passed through a glass prism emerges as a spectrum of colors
ranging from violet to red. The color spectrum can be divided into
six broad regions.
Characterization of Light
 Achromatic Light
– Intensity measured by gray levels

 Chromatic Light
There are three basic quantities to describe the quality of a chromatic light.

– Radiance: total amount of energy that flows from the light source (measured in watts)

– Luminance: the amount of energy an observer perceives (measured in lumens)

– Brightness: a subjective measure, analogous to the achromatic notion of
intensity; it describes the sensation of a color.
Primary Colors of Light

Primary Colors (RGB)


Red, Green, Blue

Additive model
Secondary Colors
 Secondary Colors (CMY)
Cyan (green + blue)
Magenta (red + blue)
Yellow (red + green)

An RGB to CMY conversion:

[C]   [1]   [R]
[M] = [1] - [G]
[Y]   [1]   [B]
Pigment Primaries
Color Distinguishing Characteristics
 The following THREE attributes are used to distinguish one color from
another:
1. Intensity(Brightness): Chromatic notion of intensity, used to
describe color sensation.
2. Hue: associated with the dominant wavelength in a mixture of
light waves. It represents dominant color as perceived by an
observer.
3. Saturation: refers to the relative purity or the amount of white
light mixed with a hue ( Spectrum colors are fully saturated).
 Hue & saturation taken together are called chromaticity
Trichromatic coefficients

 Tristimulus values: the amounts of red, green, and blue required to
form a particular color, denoted X, Y, and Z.

 Trichromatic coefficients: the proportions of X, Y, and Z:

x = X / (X + Y + Z),  y = Y / (X + Y + Z),  z = Z / (X + Y + Z)

so that x + y + z = 1.
Chromaticity Diagram

• The chromaticity diagram shows in which proportions the primary
colors must be mixed to obtain any other color.
Color Models/Spaces
 Color models specify a coordinate system and subspace within that system, where each
color is represented by a single point.
 Several color models have been proposed to specify colors for different purposes (e.g.,
photography, physical measurements of light, color mixtures, etc.).
 Some prominent color models are as follows:
• RGB
• CMY
• CMYK
• HSI/HSV
• YCbCr
• L*a*b
Analysis of Color Spaces
RGB

Analysis of Color Spaces
RGB
Properties
• Default color space and primary colors of light
• Based on a Cartesian coordinate system (the RGB color cube)
• An elementary, simple representation in terms of the
wavelengths of R, G & B in white light
• Device-independent
Applications
• Used in various color image processing applications, including
boundary extraction based on color, and segmentation and
recognition of objects in color images
• The hardware-oriented models most commonly use RGB space
• RGB is supported by various devices such as CRT monitors,
cameras, etc.
Analysis of Color Spaces
RGB
Pros
• Close to real colors
• Simple and easy to implement
• Matches nicely with the human visual system (the eye is strongly perceptive to the R, G & B primaries)
• Ideal for image color generation
• R, G & B components with equal contributions produce shades of gray
• If all components contribute at their maximum equal strengths, white is produced (black if no
contribution)
Cons
• Not intuitive to human perception
• High correlation between the channels
• Mixing of chrominance and luminance data
• Significant perceptual non-uniformity
Analysis of Color Spaces
CMY
Properties
• The secondary colors of light, derived from RGB by
subtracting one primary component (C = 1−R, M = 1−G, Y = 1−B)
Applications
• Used in most devices that deposit colored pigments on
paper, such as printers and copiers, which require CMY data.
Pros
• The maximum equal contribution of all CMY components
produces black
Cons
• Not suitable for describing colors in terms that are
practical for human interpretation
Analysis of Color Spaces
CMYK
• In CMYK, K is a fourth color, black: equal amounts of C, M, and Y
produce only a muddy-looking black, and since black is the predominant
color in printing, a true, pure black is needed.

• Conversion from CMY to CMYK:

K = min(C, M, Y)

if K == 1 then (c, m, y, k) = (0, 0, 0, 1), else

(c, m, y, k) = ((C - K) / (1 - K), (M - K) / (1 - K), (Y - K) / (1 - K), K)
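A hedged MATLAB sketch of this conversion (peppers.png is assumed
available as a built-in color image; implicit expansion, R2016b+, is
assumed):

rgb = im2double(imread('peppers.png'));
CMY = 1 - rgb;                          % C = 1-R, M = 1-G, Y = 1-B
K = min(CMY, [], 3);                    % K = min(C, M, Y) per pixel
denom = 1 - K;  denom(denom == 0) = 1;  % avoid 0/0 where K = 1
CMYK = cat(3, (CMY - K) ./ denom, K);   % ((C-K)/(1-K), ..., K)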
Analysis of Color Spaces
HSI

 RGB and CMY are not well suited for describing colors for human
interpretation.

 Hue (H), saturation (S), and intensity (I):

– Hue: color attribute that describes a pure color (e.g., pure yellow or red)

– Saturation: a measure of the degree to which a pure color is diluted
by white light

– Intensity: can be decoupled from the color information (H and S)

 HSI is ideal for processing color images based on the color-sensing
properties of the human visual system.
Analysis of Color Spaces
HSV/HSI
Characteristics
• Used in various computer graphics applications
• Also used in image processing applications such as
image enhancement by manipulating only the intensity
channel
• Since the intensity channel can be handled separately,
brightness can be controlled by adjusting this channel
alone, without affecting the chromaticity information
• Medical image segmentation using the HSI model
Analysis of Color Spaces
YUV (YCbCr) color
YUV is used worldwide for TV and video encoding standards.
• YUV separates the color information from the luminance component
• Converting an RGB signal to YUV requires just a linear
transform, which is easy to implement

Y: luminance
U, Cb: chroma channel, U axis, blue component
V, Cr: chroma channel, V axis, red component

[Y]   [ 0.299  0.587  0.114] [R]
[U] = [-0.147 -0.289  0.436] [G]
[V]   [ 0.615 -0.515 -0.100] [B]

In MATLAB: rgb2ycbcr and ycbcr2rgb
Analysis of Color Spaces
YCbCr

Pros
• Useful if we need to change Brightness/Contrast component of a
color image without altering the color component of that image
Cons
• Not an absolute color space
• Some of its signals cannot be represented in the corresponding RGB
domain, which causes some difficulty in determining how to correctly
interpret and display them.
Analysis of Color Spaces
L*a*b
Properties
• The L*a*b* space follows a CIE specification that attempts to make
the luminance scale more perceptually uniform. It is derived from the
CIE XYZ space; L* is a nonlinear scaling of luminance.
• Device-independent
Applications
• Used in creative art, advertising, graphic art, digitized or
animated paintings, etc.
• Applied to image coding for printing
• Color calibration
• Medical imaging, especially dentistry
• Segmentation of natural scenes
Analysis of Color Spaces
L*a*b
Pros
• L* values relate linearly to the human perception of brightness.
• Useful in imaging systems where exact perceptual reproduction of a color image across the entire
system is of primary concern
• Useful if we need to change the brightness/contrast component of a color image without altering
its color component
Cons
• A disadvantage in real-time processing, where computational resources matter
Most information is in the intensity

(Panels: only color shown – constant intensity; only intensity shown – constant color; original image)
Exercise

 [SR, Ex. 2.8] Skin Color Detection: Test your program


on your own picture and then on some others as well.

 Hint: May use the combination of HSV and YCbCr color spaces

 Relevant Readings and References:

1. http://www.hamradio.si/~s51kq/V-CIV.HTM
2. http://research.ijais.org/volume2/number2/ijais12-450264.pdf
Image Enhancement

• Pixel/Intensity Transformation
• Histogram Processing
• Spatial Filtering
Image Enhancement
 Image enhancement is a process that makes the result more suitable than the
original image for a specific application
 Image enhancement is subjective (problem/application oriented)
 Spatial domain: direct manipulation of the pixels of an image (on the image
plane)
 Frequency domain: processing the image by modifying its Fourier
transform
 Many techniques are based on various combinations of methods from
these two categories
Image Enhancement
 Goals:
– To improve the subjective quality of an image for human
viewing (the image needs improvement)

– To modify the image in such a way as to make it more suitable
for further analysis and automatic extraction of its contents
(low-level feature extraction)

Image Enhancement Process
Image Enhancement

Types of image enhancement operations

Point/pixel operations: the output value at specific coordinates (x,y) depends
only on the input value at (x,y)
Local operations: the output value at (x,y) depends on the input values in
a neighborhood of (x,y)
Global operations: the output value at (x,y) depends on all the values in
the input image
Basic concepts
Spatial domain enhancement methods can be generalized as
g(x,y) = T[f(x,y)]
f(x,y): input image
g(x,y): processed (output) image
T[·]: an operator on f (or a set of input images), defined
at point (x,y) or over a neighborhood of (x,y)
Neighborhood of (x,y): a square or rectangular sub-image
area centered at (x,y)
Basic Concepts

3x3 neighborhood about (x,y)

Basic concepts
 Pixel/Point operation:
Neighborhood of size 1x1: g depends only on f at (x,y)
T: a gray-level/intensity transformation/mapping function
Let r = f(x,y) and s = g(x,y);
r and s represent the gray levels of f and g at (x,y).
Then s = T(r)
 Local operations: g depends on the predefined number of neighbors of f at
(x,y) implemented by using mask processing or filtering
 Masks (filters, windows, kernels, templates) : a small (e.g. 3×3) 2-D array,
in which the values of the coefficients determine the nature of the process

Common pixel operations
• Image negatives

• Log transformations

• Power-law
transformations

• Contrast stretching

Image Negative

Reverses the gray-level order.

For L gray levels the transformation function is:

s = (L – 1) – r, where r ∈ [0, L-1].

Here L = 2⁸ = 256 for an 8-bit image.

>> f = imread('xray.tif');
>> g = imcomplement(f);
>> imshow(f), figure, imshow(g)
Image negatives
s = T(r) = (L-1) - r

Input image (X-ray image) Output image (negative)

Log transformations
Transformation function: s = c·log(1 + r)

Log transformations
Properties of log transformations
– For lower amplitudes of input image the range of gray levels is
expanded
– For higher amplitudes of input image the range of gray levels is
compressed
Application:
– Dynamic range of a processed image far exceeds the capability
of the display device
• (e.g. display of the Fourier spectrum of an image)
– Also called “dynamic-range compression / expansion”
Log transformations

Left: Fourier spectrum with values in the range 0 to 1.5×10⁶, scaled linearly.
Right: the result of applying the log transformation, c = 1.
Log transformation

Used to compress the dynamic range of values within an image.

s = c log(1 + double(r)), where c is a constant

>> f = imread('spectrum.tif');
>> % here c = 1
>> g = im2uint8(mat2gray(log(1 + double(f))));
>> imshow(f), figure, imshow(g)
Power-law Transformation
Basic form:

s = c·r^γ,
where c and γ are positive constants

The plot shows s = c·r^γ for various values of γ (c = 1).
Power-law Transformation
For γ < 1: expands the values of dark pixels, compresses the values of brighter pixels
For γ > 1: compresses the values of dark pixels, expands the values of brighter pixels
If γ = 1 & c = 1: identity transformation (s = r)
A variety of devices (image capture, printing, display) respond according to
a power law and need to be corrected:
Gamma (γ) correction
The process used to correct this power-law response phenomenon
Power-law Transformation: Example
Original satellite image, and the results of applying the power-law
transformation s = c·r^γ with c = 1 and γ = 3.0, 4.0, and 5.0.
Power-Law Transformation Example

Useful for brightening the darkest regions (0 < γ < 1)
or darkening the brightest regions of an image (γ > 1).

s = r^γ

Syntax:
g = imadjust(f, [low_in high_in], [low_out high_out], gamma);

>> f = imread('xray.tif');
>> g = imadjust(f, [ ], [ ], 0.5);
>> imshow(f), figure, imshow(g)
Contrast stretching
Goal:
• Increase the dynamic range of the gray levels for low contrast images
• Low-contrast images can result from
– poor illumination
– lack of dynamic range in the imaging sensor
– wrong setting of a lens aperture during image acquisition

Contrast-Stretching

Contrast-Stretching
Auto Contrast
Contrast-Stretching
Auto Contrast Example
I = imread('salzburg_before.png');
f = double(I);
fmin = min(f(:))
fmax = max(f(:))
g = uint8(255*(f - fmin)/(fmax - fmin));
imshow(g)
Contrast-Stretching
s = T(r) = 1 / (1 + (m/r)^E), where m is a user-specified gray value
(e.g., the image mean) and E controls the slope

Output values are scaled to the range [0, 1]

g = 1./(1 + (m./(double(f) + eps)).^E)

>> f = imread('pollen.tif');
>> fd = im2double(f);
>> m = mean2(fd)
>> g = 1./(1+(m./(fd+eps)).^4);
>> imshow(f), figure, imshow(g)
Exercise
Identify which intensity transformation was used on liftingbody.png to
create each of the four results below. Write a script to reproduce the
results using the intensity transformation functions.

(Panels: original liftingbody.png; Results 1–4)
What is a Histogram?
An image histogram is a gray-scale value distribution showing the frequency
of occurrence of each gray-level value.
Histogram Processing
Generating & Plotting Image Histograms:

h(rk) = nk

where rk is the kth intensity value,


nk is the number of pixels in the image with
intensity rk.

Normalized Histogram:

p(rk) = h(rk)/n = nk/n  (relative frequency)

where n = M×N is the size of the image

>> imhist(f);
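A minimal MATLAB sketch of the normalized histogram (pollen.tif, used
elsewhere in these slides, is assumed available):

f = imread('pollen.tif');
h = imhist(f);            % h(k+1) = n_k, the count of pixels with value k
p = h / numel(f);         % normalized histogram: sum(p) = 1
bar(0:255, p)             % plot p(rk) against gray level rk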
Histogram Equalization
>> f = imread('pollen.tif');
>> h = histeq(f);
>> subplot(2,2,1), imshow(f),
subplot(2,2,2), histogram(f),
subplot(2,2,3), imshow(h),
subplot(2,2,4), histogram(h)
Exercise
Write a program which can read an image as an input and do the
following automatically. Show the results of all steps.
a) Find the type of image: binary, gray or RGB.
b) Find the issue in image, over dark, over bright, low contrast, or
normal. (Hint: can use histogram).
c) Resolve the issue if any and show the final image after
enhancement.
d) Test your program on following images:
Neighborhood Processing
Spatial Filtering

Mask (w)      Image (f)     Output (g)
w1 w2 w3      f1 f2 f3      g1 g2 g3
w4 w5 w6      f4 f5 f6      g4 g5 g6
w7 w8 w9      f7 f8 f9      g7 g8 g9

g5 = f1·w1 + f2·w2 + f3·w3 + f4·w4 + f5·w5 + f6·w6 + f7·w7 + f8·w8 + f9·w9


Spatial Filtering - Without Padding
Mask (Filter)    Given Image    Filtered Image

 0 -1  0         2 3 5 7 2
-1  4 -1         3 4 7 5 2      -3  ?  ?
 0 -1  0         5 6 5 3 1       ?  ?  ?
                 2 6 2 2 1       ?  ?  ?
                 7 3 3 2 0

2·0 + 3·(-1) + 5·0 + 3·(-1) + 4·4 + 7·(-1) + 5·0 + 6·(-1) + 5·0 = -3


Spatial Filtering - With Zero Padding

 0 -1  0        2 3 5 7 2
-1  4 -1        3 4 7 5 2
 0 -1  0        5 6 5 3 1
                2 6 2 2 1
                7 3 3 2 0

Sliding the mask over the zero-padded image produces one output value
at a time. The first row of the filtered image comes out as
2  1  3  16  -1, the second row begins 1  -3, and so on.
Spatial Filtering - Padding Options
>> f = [1 2; 3 4]
f =
     1     2
     3     4

>> fp = padarray(f, [2 2], 0)
fp =
     0     0     0     0     0     0
     0     0     0     0     0     0
     0     0     1     2     0     0
     0     0     3     4     0     0
     0     0     0     0     0     0
     0     0     0     0     0     0

>> fp = padarray(f, [2 2], 'replicate')
fp =
     1     1     1     2     2     2
     1     1     1     2     2     2
     1     1     1     2     2     2
     3     3     3     4     4     4
     3     3     3     4     4     4
     3     3     3     4     4     4

>> fp = padarray(f, [2 2], 'symmetric')
fp =
     4     3     3     4     4     3
     2     1     1     2     2     1
     2     1     1     2     2     1
     4     3     3     4     4     3
     4     3     3     4     4     3
     2     1     1     2     2     1

>> fp = padarray(f, [2 2], 'circular')
fp =
     1     2     1     2     1     2
     3     4     3     4     3     4
     1     2     1     2     1     2
     3     4     3     4     3     4
     1     2     1     2     1     2
     3     4     3     4     3     4
Basic Types of Spatial Filtering
 In spatial filtering (vs. frequency-domain filtering), the output image is
computed directly by simple calculations on the pixels of the input image.
 Spatial filtering can be either linear or non-linear.
 For each output pixel, some neighborhood of input pixels is used in the
computation.
 In general, linear filtering of an image f of size M×N with a filter mask of
size m×n is given by

g(x, y) = Σ (s = -a..a) Σ (t = -b..b) w(s, t) · f(x + s, y + t)

where a = (m-1)/2 and b = (n-1)/2
 This concept is called convolution. Filter masks are sometimes called
convolution masks or convolution kernels.
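A minimal MATLAB sketch of this formula using imfilter, which computes
correlation with zero padding by default, applied to the mask and image
from the earlier worked example:

w = [0 -1 0; -1 4 -1; 0 -1 0];                            % 3x3 mask
f = [2 3 5 7 2; 3 4 7 5 2; 5 6 5 3 1; 2 6 2 2 1; 7 3 3 2 0];
g = imfilter(double(f), w)   % g(2,2) = -3, matching the hand computation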
Basic Types of Spatial Filtering

Nonlinear spatial filtering usually uses a neighborhood
too, but some other mathematical operations are applied.
These can include conditional operations (if ..., then ...),
statistics (sorting pixel values in the neighborhood,
mean, median), etc.
Nonlinear Spatial Filtering - Example
>> f = imread('cktboard.tif');
>> fn = imnoise(f, 'salt & pepper', 0.2);
>> gm = medfilt2(fn);
>> gms = medfilt2(fn, 'symmetric');
>> imshow(f), figure, imshow(fn), figure, imshow(gm), figure, imshow(gms)

(a) original, (b) noisy, (c) medfilt2 with zero padding, (d) medfilt2 with symmetric padding
Segmentation of Water in Bottles

b = imread('bottles.jpg');
bL = im2bw(b, 0.2);
bH = im2bw(b, 0.7);
L = (bL - bH);
L1 = medfilt2(L, [9 9], 'symmetric');
subplot(2,3,1), imshow(b), title('Bottles'),
subplot(2,3,2), imshow(bL), title('Liquid White'),
subplot(2,3,3), imshow(bH), title('Liquid Black'),
Morphological Image Processing

• Erosion
• Dilation
• Opening
• Closing
Morphology
Morphological image processing (or morphology)
describes a range of image processing techniques that
deal with the shape (or morphology) of features in an
image
Morphological operations are typically applied to
remove imperfections introduced during segmentation,
and are used for boundary extraction, pruning, thinning,
skeletonization, etc.
What is mathematical morphology?

• Basic principle: the extraction of geometrical and


topological information from an unknown set (an image)
through transformations using another, well-defined, set,
known as structuring element (SE).

• In morphological image processing, the design of SEs,


their shape and size, is crucial to the success of the
morphological operations that use them.
• Mathematical foundation: set theory
Fundamental concepts and operations
 Basic set operations:
– Complement

– Difference

– Translation

– Reflection

Fundamental concepts and operations

(Figure 13.1: basic set operations — (a) set A; (b) translation of A by
x = (x1, x2); (c) set B; (d) reflection of B; (e) set A and its
complement Ac; (f) set difference A − B.)

Fundamental concepts and operations

 Logical equivalents of set theory operations:
– Intersection ~ logical AND (&)
– Union ~ logical OR (|)
– Complement ~ logical NOT (~)
– Difference (A − B) ~ A AND (NOT B), i.e., A & ~B

The equivalent expression for intersection in conventional image
processing notation is:

C(x, y) = 1 if A(x, y) and B(x, y) are both 1, and 0 otherwise    (13.6)

This leads quite easily to single MATLAB statements that perform each
set operation using the logical operators above. (Following the IPT
convention, foreground (1-valued) pixels are shown as white against a
black background.)
Fundamental concepts and operations
 Logical equivalents of set theory operations

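A minimal MATLAB sketch of these logical equivalents on binary images
(text.png is assumed available as a built-in binary image; the shifted
copy merely serves as an assumed second set B):

A = imread('text.png');       % binary image: set A
B = circshift(A, [5 5]);      % set B (shifted copy of A)
AandB   = A & B;              % intersection
AorB    = A | B;              % union
notA    = ~A;                 % complement
AminusB = A & ~B;             % set difference A - B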
The structuring element

The structuring element (SE) is the basic neighborhood


structure associated with morphological image
operations.
It is usually represented as a small matrix, whose shape
and size impact the results of applying a certain
morphological operator to an image.
Although a structuring element can have any shape, its
implementation requires that it be converted to a
rectangular array.
For each array, the shaded squares correspond to the
members of the SE whereas the empty squares are used
for padding only.
The structuring element

 Examples:

square cross

 MATLAB functions
– strel
– getsequence

The structuring element
 Examples:

Structuring Elements, Hits & Fits

Fit: all ON pixels in the structuring element
cover ON pixels in the image

Hit: any ON pixel in the structuring element
covers an ON pixel in the image

Miss: no ON pixel in the structuring element
covers an ON pixel in the image

(In the slide figure, a structuring element is tested at image
locations A, B, and C.)

All morphological processing operations are based on these simple ideas.


Fitting & Hitting

(Figure: a 12-column binary image with probe locations A, B, and C,
tested against Structuring Element 1, a 3×3 square of 1s, and
Structuring Element 2, a 3×3 cross, illustrating fit, hit, and miss.)
Structuring Elements
Structuring elements can be any size and make any shape
However, for simplicity we will use rectangular
structuring elements with their origin at the middle pixel

1 1 1      0 1 0      0 0 1 0 0
1 1 1      1 1 1      0 1 1 1 0
1 1 1      0 1 0      1 1 1 1 1
                      0 1 1 1 0
                      0 0 1 0 0
Erosion

• Erosion of image f by structuring element s is written f ⊖ s

• The structuring element s is positioned with its origin at (x, y)
and the new pixel value is determined using the rule:

g(x, y) = 1 if s fits f, 0 otherwise

Erosion shrinks objects
Erosion Example
Original Image

Processed Image With Eroded Pixels

Structuring Element
Erosion

• The value of the output pixel is the minimum value of all


the pixels in the input pixel's neighborhood.
• In a binary image, if any of the pixels is set to 0, the
output pixel is set to 0.

Dilation and erosion
 Erosion – geometrical interpretation

Dilation

• Dilation of image f by structuring element s is written f ⊕ s

• The structuring element s is positioned with its origin at (x, y)
and the new pixel value is determined using the rule:

g(x, y) = 1 if s hits f, 0 otherwise

Dilation enlarges objects
Erosion

 Erosion – MATLAB example

a = [0 0 0 0 0; 0 1 1 1 0; 1 1 1 0 0; 0 1 1 1 1; 0 0 0 0 0]
se1 = strel('square', 3)
b = imerode(a, se1)

se2 = strel('rectangle', [1 3])
c = imerode(a, se2)
Dilation Example

Original Image Processed Image

Structuring Element
Dilation

– The effect of dilation is to "grow" or "thicken" objects in a
binary image.
• Its simplest application is "bridging gaps".
• The extent and direction of this thickening is controlled by the
size and shape of the structuring element.
• Mathematically, dilation is defined by the rule given earlier.
Dilation
 The value of the output pixel is the maximum value of
all the pixels in the input pixel's neighborhood.
 In a binary image, if any of the pixels is set to the value
1, the output pixel is set to 1.

Dilation
 Dilation – geometrical interpretation

Dilation
 Dilation – Example

Dilation
 Dilation – MATLAB example
a = [0 0 0 0 0;
     0 1 1 0 0;
     0 1 1 0 0;
     0 0 1 0 0;
     0 0 0 0 0]

se1 = strel('square', 3)
b = imdilate(a, se1)

se2 = strel('rectangle', [1 3])
c = imdilate(a, se2)
 Dilation and Erosion – geometrical interpretation

Compound Morphological Operations
More interesting morphological operations can be
performed by performing combinations of erosions and
dilations
The most widely used of these compound operations are:

 Opening: f ∘ s = (f ⊖ s) ⊕ s

 Closing: f • s = (f ⊕ s) ⊖ s
Compound Operations (Opening & Closing)

 Opening: erosion followed by dilation

 Mathematically: f ∘ s = (f ⊖ s) ⊕ s

 In MATLAB: imopen
Compound Operations (Opening & Closing)

 Closing: dilation followed by erosion

 Mathematically: f • s = (f ⊕ s) ⊖ s

 In MATLAB: imclose
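A minimal MATLAB sketch contrasting the two compound operations
(circles.png is assumed available as a built-in binary image; the disk
radius is an arbitrary choice):

bw = imread('circles.png');
se = strel('disk', 5);
bwo = imopen(bw, se);     % erosion then dilation: removes small protrusions
bwc = imclose(bw, se);    % dilation then erosion: fills small gaps/holes
subplot(1,3,1), imshow(bw),  title('Original')
subplot(1,3,2), imshow(bwo), title('Opened')
subplot(1,3,3), imshow(bwc), title('Closed')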
Compound Operations (Example)

Compound Operations (Example)

Compound Operations (Example)

Operations
supported by
bwmorph

Basic Morphological Algorithms
 Boundary Extraction
 Hole Filling
 Extraction of Connected Components
 Thinning
 Thickening
 Skeletons
 Pruning
 Reconstruction

Segmentation of Water in Bottles
b = imread('bottles.jpg');
t = double(multithresh(b, 2))/256
bL = im2bw(b, t(1));
bH = im2bw(b, t(2));
L = bL - bH;
se5 = strel('disk', 5);
se10 = strel('disk', 10);
L1 = imopen(L, se5);
L2 = imclose(L1, se10);
subplot(2,3,1), imshow(b), title('Bottles'),
subplot(2,3,2), imshow(bL), title('Liquid White'),
subplot(2,3,3), imshow(bH), title('Liquid Black'),
subplot(2,3,4), imshow(L), title('Liquid with noise'),
subplot(2,3,5), imshow(L1), title('Liquid with less noise'),
subplot(2,3,6), imshow(L2), title('Pure Liquid');
Exercise

 Collect some bottles at your home, partially and completely filled
with some liquid. Take pictures of these bottles. Write a program
which labels only the incompletely filled bottles, and test it on
these pictures.
Image Segmentation

[SR] Szeliski R., Computer Vision: Algorithms and Applications, Springer, 2011 (Ch. 4, 5).
[GR1] Gonzalez R. C., Woods R. E., Eddins S. L., Digital Image Processing Using MATLAB, Pearson Education, 2nd edition, 2009 (Ch. 10).
[GR2] Gonzalez R. C., Woods R. E., Digital Image Processing, Pearson Education, 3rd edition, 2007 (Ch. 10).
Image Segmentation

 Image segmentation is the process of splitting/subdividing an image
into different meaningful regions according to a specific application.
 Generally, image segmentation is an initial step in many machine
vision applications.
 For example:
detecting the movement of different vehicles on the road
Image Segmentation
(Pipeline: Problem domain → Image acquisition → Preprocessing →
Segmentation → Feature extraction → Classification / Regression →
Result; supported by a Knowledge base, with Model generation
producing the Models used for classification.)
Image Segmentation Approaches
 There are two different approaches for Image segmentation:
1. Similarity-based approach
• Thresholding
• Region growing
• Region splitting and merging
2. Discontinuity-based approach
• Identification of Lines
• Identification of Edges
• Identification of isolated points
Thresholding

 Segmentation into two classes/groups


 Foreground (Objects)
 Background

Thresholding

g(x, y) = 1 if f(x, y) > T
          0 if f(x, y) ≤ T

Objects & Background


Thresholding

 GLOBAL
 ADAPTIVE/LOCAL
Global Thresholding

 Single threshold value for entire image


 Fixed ?
 Automatic
 Based on the histogram of an image,
partition the image histogram using a single
global threshold
 The success of this technique very strongly
depends on how well the histogram can be
partitioned
Images taken from Gonzalez & Woods, Digital Image Processing (2002)

Global Thresholding
Global Thresholding Algorithm
 The global threshold, T, is calculated as follows:
1. Select an initial estimate for T (typically the average grey
level in the image)
2. Segment the image using T to produce two groups of pixels:
o G1 consisting of pixels with grey levels >T and
o G2 consisting of pixels with grey levels ≤ T
3. Compute the average grey levels of pixels in G1 to give μ1
and G2 to give μ2
Global Thresholding Algorithm

4. Compute a new threshold value:

T = (μ1 + μ2) / 2
5. Repeat steps 2 to 4 until: abs(Ti – Ti-1)<epsilon

 This algorithm works very well for finding


thresholds when the histogram is suitable
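A minimal MATLAB sketch of this iterative algorithm (coins.png is
assumed available as a built-in test image; epsilon is an arbitrary
choice, and both groups are assumed to stay non-empty):

f = im2double(imread('coins.png'));
T = mean(f(:));                    % step 1: initial estimate of T
epsilon = 1e-4;  dT = Inf;
while dT > epsilon
    mu1 = mean(f(f > T));          % step 3: mean of G1 (levels > T)
    mu2 = mean(f(f <= T));         % step 3: mean of G2 (levels <= T)
    Tnew = (mu1 + mu2) / 2;        % step 4: new threshold
    dT = abs(Tnew - T);            % step 5: change between iterations
    T = Tnew;
end
g = f > T;                         % final segmentation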
Thresholding using Otsu Method

We start by treating the normalized histogram as a
discrete PDF:

pr(rq) = nq / n,   q = 0, 1, 2, ..., L-1

where n is the number of pixels in the image, nq is
the number of pixels that have intensity level rq and
L is the total number of possible intensity levels in
the image.
Suppose that a threshold k is chosen s.t. C0 is the
set of pixels with levels [0, 1, ..., k-1] and C1 is the
set of pixels with levels [k, k+1, ..., L-1].
Thresholding using Otsu Method

 Otsu's method chooses the threshold value k that maximizes the
between-class variance

σB² = ω0·(μ0 − μT)² + ω1·(μ1 − μT)²

where

ω0 = Σ (q = 0..k-1) pr(rq)              ω1 = Σ (q = k..L-1) pr(rq)

μ0 = (1/ω0) Σ (q = 0..k-1) q·pr(rq)     μ1 = (1/ω1) Σ (q = k..L-1) q·pr(rq)

μT = Σ (q = 0..L-1) q·pr(rq)
Thresholding using Otsu Method

T = graythresh(f)
takes an image, computes its histogram, and then finds the threshold
value that maximizes the between-class variance.
The threshold is returned as a normalized value between 0.0 and 1.0,
where f is the input image and T is the resulting threshold.
Thresholding using Otsu Method: Example

>> I = imread('coins.png');
>> imhist(I)
>> level = graythresh(I)

level =

    0.4941

>> BW = im2bw(I, level);
>> imshow(BW)

(Figure: histogram of coins.png.)
Problems With Single Value Thresholding

 Single-value thresholding only works for bimodal histograms.
 Images with other kinds of histograms need more than a single
threshold.
 Uneven illumination can really upset a single-value thresholding
scheme.

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)
Problems With Single Value Thresholding (cont…)

Let's say we want to isolate the contents of the bottles.
Think about what the histogram for this image would look like.
What would happen if we used a single threshold value?

>> b = imread('bottles.jpg');
>> imhist(b);

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)
Segmentation of Water in Bottles

b = imread('bottles.jpg');
t = double(multithresh(b, 2))/256
bL = im2bw(b, t(1));
bH = im2bw(b, t(2));
L = (bL - bH);
L1 = medfilt2(L, [9 9], 'symmetric');
subplot(2,3,1), imshow(b), title('Bottles'),
subplot(2,3,2), imshow(bL), title('Liquid White'),
subplot(2,3,3), imshow(bH), title('Liquid Black'),
subplot(2,3,4), imshow(L), title('Liquid with noise'),
subplot(2,3,5), imshow(L1), title('Pure Liquid')

t =

    0.2070    0.6758
The Role of Uneven Illumination

Therefore, either balance the illumination or apply adaptive
thresholding.

(Panels: normal image; intensity ramp in the range [0.2, 0.6];
product of the normal image and the intensity ramp)
Adaptive/Local Thresholding

An approach to handling situations in which single-value
thresholding will not work is to divide the image into sub-images
and threshold each individually.

Since the threshold for each pixel depends on its location within
the image, this technique is called adaptive/local thresholding.
Adaptive/Local Thresholding Example

 The image shows an example of using adaptive thresholding on the
image shown previously.
 As can be seen, success is mixed.
 But we can further subdivide the troublesome sub-images for more
success.
 With the troublesome part further subdivided, successful
thresholding is achieved.
The Role of Noise in Image Thresholding

Therefore, the noise should be removed before thresholding.

(Panels: noiseless image; image with additive Gaussian noise of mean 0
and standard deviation 10; image with additive Gaussian noise of mean 0
and standard deviation 50)
Region-Based Segmentation

 Divide the image into regions R1, R2, ..., RN.
 The following properties must hold:
(a) the union of all regions covers the entire image;
(b) each Ri is a connected set;
(c) the regions are disjoint: Ri ∩ Rj = ∅ for i ≠ j;
(d) P(Ri) = TRUE for every region Ri;
(e) P(Ri ∪ Rj) = FALSE for adjacent regions Ri and Rj.
Region-Based Segmentation

 Region Growing
 Region growing: groups pixels or subregions into larger regions.

 Pixel aggregation: starts with a set of "seed" points and from these
grows regions by appending to each seed point those neighboring pixels
that have similar properties (such as gray level); see the sketch
after this list.
1. Choose the seed pixel(s).
2. Check the neighboring pixels and add them to the region if they
are similar to the seed.
3. Repeat step 2 for each of the newly added pixels; stop when no
more pixels can be added.

Predicate: for example, abs(zj - seed) < epsilon
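A minimal MATLAB sketch of this idea. grayconnected (R2016b+)
implements exactly this seed-based growth; coins.png, the seed
location, and the tolerance are all arbitrary assumptions here.

f = im2double(imread('coins.png'));
seed_row = 50;  seed_col = 100;           % assumed seed inside an object
epsilon = 0.1;                            % similarity tolerance
region = grayconnected(f, seed_row, seed_col, epsilon);
imshowpair(f, region, 'montage')          % grown region next to the input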


Region-Based Segmentation

 Example
Region-Based Segmentation

 Region Splitting
 Region growing starts from a set of seed points.
 Region splitting starts with the whole image as a single region and
subdivides the regions that do not satisfy a condition:
 Image = one region R
 Select a predicate P (on gray values, etc.)
 Successively divide each region into smaller and smaller quadrant
regions so that P(Ri) = TRUE for every resulting region Ri.
Region-Based Segmentation

 Region Splitting

Problem? Adjacent regions could end up with the same properties.
Solution? Allow merging.
 https://www.youtube.com/watch?v=0kUGpgIrZIw

 https://www.youtube.com/watch?v=ZAXjI9CFvDU
Region-Based Segmentation

 Region Merging
 Region merging is the opposite of region splitting.

 Merge adjacent regions Ri and Rj for which:

P(Ri ∪ Rj) = TRUE

 Region Splitting/Merging
 Stop when no further split or merge is possible
Region-Based Segmentation

 Example

1. Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE

2. Merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE

3. Stop when no further merging or splitting is possible

A sketch of the splitting step follows.
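A hedged MATLAB sketch of the splitting step via quadtree decomposition
(qtdecomp). liftingbody.png, used earlier in these slides, is assumed
available; its 512×512 power-of-two size satisfies qtdecomp's
requirement, and the 0.27 threshold is an arbitrary choice. The merging
pass is not shown.

f = im2double(imread('liftingbody.png'));
S = qtdecomp(f, 0.27);           % split a block if max - min > 0.27
blocks = full(S);                % nonzero entries mark block top-left corners
[r, c] = find(blocks > 0);       % corner coordinates of the final blocks
imshow(f), hold on
plot(c, r, 'y.', 'MarkerSize', 4), hold off   % rough view of the splits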


Summary

• In this lecture we have begun looking at


segmentation, and in particular thresholding, and
region-based segmentation.
• We saw the basic global thresholding algorithm
and its shortcomings
• We observed how noise and uneven illumination
affect image segmentation
• We also saw a simple way to overcome some of
these limitations using adaptive thresholding
Image Segmentation Part II

 Discontinuity-based approach

• Identification of Lines

• Identification of Edges

• Identification of isolated points


The Segmentation Problem

Segmentation attempts to partition the pixels of an image into groups
that strongly correlate with the objects in the image.
It is typically the first step in any automated computer vision
application.

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)

Segmentation Examples
Detection of Discontinuities
Detection of Discontinuities

There are three basic types of gray-level discontinuities that we
tend to look for in digital images:
 Points
 Lines
 Edges

We typically find discontinuities using masks and correlation.
Detection of Discontinuities
• The most common way to look for discontinuities
is to scan a small mask over the image.
• The mask determines which kind of discontinuity
to look for.
R = w1·z1 + w2·z2 + ... + w9·z9 = Σ (i = 1..9) wi·zi
Point Detection

Point detection can be achieved simply using a Laplacian-style mask
(shown in the MATLAB example below).

Points are detected at those pixels in the subsequent filtered
image that are above a set threshold.
Point Detection (cont…)
(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)

Panels: X-ray image of a turbine blade; result of point detection;
result of thresholding.
Point Detection (cont…)
>> f = imread('point.tif');
>> imshow(f);
>> w = [-1 -1 -1; -1 8 -1; -1 -1 -1]

w =

    -1    -1    -1
    -1     8    -1
    -1    -1    -1

>> g = abs(imfilter(double(f), w));
>> T = max(g(:))

T =

   391

>> g = g >= T;
>> imshow(g);
Point Detection (cont…)
>> g = abs(imfilter(double(f), w)) >= 10;
>> imshow(g);
Line Detection

The next level of complexity is to try to detect lines


The masks below will extract lines that are one pixel
thick and running in a particular direction
Line Detection (cont…)

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)

Panels: binary image of a wire-bond mask; result after processing with
a -45° line detector; result of thresholding the filtered image.
Line Detection (cont…)

>> f = imread('wirebond.tif');
>> imshow(f); % Fig 10.4(a)
>> w = [2 -1 -1; -1 2 -1; -1 -1 2]

w =

     2    -1    -1
    -1     2    -1
    -1    -1     2

>> g = imfilter(double(f), w);
>> imshow(g, [ ]); % Fig 10.4(b)
>> gtop = g(1:120, 1:120);
>> figure, imshow(gtop, [ ]) % Fig 10.4(c)
>> gbot = g(end-119:end, end-119:end);
>> figure, imshow(gbot, [ ]) % Fig 10.4(d)
Line Detection (cont…)

>> g = abs(g);
>> figure, imshow(g, [ ]); % Fig 10.4(e)
>> T = max(g(:))

T =

   1530

>> g = g >= T;
>> figure, imshow(g); % Fig 10.4(f)
Edge Detection

An edge is a set of connected pixels that lie on the boundary
between two regions.

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)
Edges & Derivatives

We have already spoken about how derivatives are used to find
discontinuities.

The 1st derivative tells us where an edge is: look for a large
absolute value.

The sign of the 2nd derivative can be used to determine whether an
edge pixel lies on the dark or light side of an edge: look for a sign
change (zero crossing).

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)
Derivatives & Noise

Derivative-based edge detectors are extremely sensitive to noise.
We need to keep this in mind.

(Panels: original, 1st derivative, 2nd derivative. Images taken from
Gonzalez & Woods, Digital Image Processing, 2002)
Detection of Discontinuities
Gradient Operators

 First-order derivatives:
– The gradient of an image is the directional change in intensity or
color. The gradient of an image f(x,y) at location (x,y) is defined
as the vector:

∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ

– The magnitude of this vector:

∇f = mag(∇f) = (Gx² + Gy²)^(1/2)

– The direction of this vector:

α(x, y) = tan⁻¹(Gy / Gx)
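A minimal MATLAB sketch computing the gradient magnitude and direction
with Sobel masks (building.tif, used later in these slides, is assumed
available):

f = im2double(imread('building.tif'));
sx = [-1 -2 -1; 0 0 0; 1 2 1];      % Sobel mask for Gx
sy = sx';                           % Sobel mask for Gy
Gx = imfilter(f, sx, 'replicate');
Gy = imfilter(f, sy, 'replicate');
mag = sqrt(Gx.^2 + Gy.^2);          % gradient magnitude
alpha = atan2(Gy, Gx);              % gradient direction
imshow(mag, [])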
Detection of Discontinuities
Gradient Operators
Common Edge Detectors

Given a 3×3 region of an image, the following edge-detection filters
can be used:
Common Edge Detectors

Prewitt masks for detecting diagonal edges

Sobel masks for detecting diagonal edges
Edge Detection Example
(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)

Panels: original image; horizontal gradient component; vertical
gradient component; combined edge image.
Edge Detection Problems

• Often, problems arise in edge detection because there is too much
detail, for example, the brickwork in the previous example.
• One way to overcome this is to smooth the image prior to edge
detection.
Edge Detection Example With Smoothing
(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)

Panels: original image; horizontal gradient component; vertical
gradient component; combined edge image.
Detection of Discontinuities
Edge Detection Using Matlab

Syntax:

[g, t] = edge(f, 'method', parameters)
Detection of Discontinuities
Edge Detection Using Matlab

>> f = imread('building.tif');
>> imshow(f); % Fig. 10.6(a)
>> [gv, t] = edge(f, 'sobel', 'vertical');
>> imshow(gv); % Fig. 10.6(b)
>> t

t =

    0.0516

>> gv = edge(f, 'sobel', 0.15, 'vertical');
>> imshow(gv); % Fig. 10.6(c)
>> gboth = edge(f, 'sobel', 0.15);
>> imshow(gboth); % Fig. 10.6(d)
Detection of Discontinuities
Edge Detection Using Matlab

>> w45 = [-2 -1 0; -1 0 1; 0 1 2]

w45 =

    -2    -1     0
    -1     0     1
     0     1     2

>> g45 = imfilter(double(f), w45, 'replicate');
>> T = 0.3*max(abs(g45(:)))

T =

   211.8000

>> g45 = g45 >= T;
>> figure, imshow(g45); % Fig. 10.6(e)
Detection of Discontinuities
Edge Detection Using Matlab

>> wm45 = [0 1 2; -1 0 1; -2 -1 0]

wm45 =

     0     1     2
    -1     0     1
    -2    -1     0

>> gm45 = imfilter(double(f), wm45, 'replicate');
>> T = 0.3*max(abs(gm45(:)))

T =

   206.7000

>> gm45 = gm45 >= T;
>> figure, imshow(gm45); % Fig. 10.6(f)
Detection of Discontinuities
Gradient Operators

 Second-order derivatives: the Laplacian

The Laplacian of a 2D function f(x, y) is defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

Two discrete forms (masks) are in practical use [GR, Ch. 3].
Laplacian Edge Detection

We have already encountered the 2nd-order-derivative-based Laplacian
filter.

The Laplacian is typically not used by itself, as it is too sensitive
to noise. When used for edge detection, the Laplacian is usually
combined with a smoothing Gaussian filter.
Detection of Discontinuities
Laplacian Of Gaussian

 Consider the Gaussian function

h(r) = e^(−r² / 2σ²)

where r² = x² + y² and σ is the standard deviation.

 The Laplacian of h is

∇²h(r) = −[(r² − σ²) / σ⁴] · e^(−r² / 2σ²)

This is the Laplacian of a Gaussian (LoG).

 The Laplacian of a Gaussian is sometimes called the Mexican hat
function. It can also be computed by smoothing the image with the
Gaussian smoothing mask, followed by application of the Laplacian mask.
Detection of Discontinuities
Laplacian Of Gaussian

The Laplacian of Gaussian (or Mexican hat) filter uses the Gaussian
for noise removal and the Laplacian for edge detection.

(Images taken from Gonzalez & Woods, Digital Image Processing, 2002)

Laplacian Of Gaussian Example

LoG using MATLAB

>> I = imread('building.tif');
>> imshow(I);
>> ILoG = edge(I, 'log');
>> imshow(ILoG);
Canny Edge Detection
The Canny edge detector is an edge detection
operator that uses a multi-stage algorithm to
detect a wide range of edges in images. It was
developed by John F. Canny in 1986.
Canny Edge Detection: Algorithm Steps
o Apply Gaussian filter to smooth the image in order to remove the
noise
o Find the intensity gradients of the image
o Apply non-maximum suppression to get rid of spurious response to
edge detection (take the point that is local maximum of the gradient
magnitude)
o Apply double threshold to determine potential edges
o Track edge by hysteresis: Finalize the detection of edges by
suppressing all the other edges that are weak and not connected to
strong edges.
Canny Edge Detection

>> I = imread('building.tif');
>> imshow(I);
>> IC = edge(I,'canny');
>> imshow(IC);
Summary

In this lecture we have continued looking at segmentation,
and in particular edge detection.
Edge detection is massively important, as it is in many
cases the first step toward object recognition.
References
1. http://www.cse.unr.edu/~bebis/CS485/
2. http://www.comp.dit.ie/bmacnamee/gaip.htm
