
MID-LEVEL VISION:

ARTIFICIAL
Computer Vision 2018
What is the purpose of Artificial Mid-Level
Vision?
• These slides are about segmentation: how to group similar elements of an
image together so that we can better understand the image.
IMAGE FEATURES
What are the image features we consider
in computer vision and how do they relate
to the Gestalt laws?
Image feature                        Gestalt law
Location                             Gestalt law of proximity
Colour/intensity                     Gestalt law of similarity
Texture                              Gestalt law of similarity
Size                                 Gestalt law of similarity
Depth                                None
Motion                               Gestalt law of common fate
Not separated by contour             Gestalt law of common region
Form a known shape when assembled    Top-down
What is an element of an image?
• An element of an image could be a pixel or a group of pixels.
What is a feature space?
• A feature space is a multidimensional map where all the elements of an image are
located in terms of their features.
What is the purpose of a feature space?
• The purpose of the feature space is to see which elements are similar so that we
can group them together. We do this by determining the distance between each
pair of points in the feature space.
Give an example of a feature space.
• An example would be a feature space of the RGB colour intensities: feature 1 = Red,
feature 2 = Green, feature 3 = Blue.
The feature space is used for determining
the similarity between elements. What
are some measures for similarity?
• Similarity measures: largest resulting value when a and b are very similar.
• Distance measures: smallest when a = b or they are very similar.
• Some of the measures use the mean values of a and b.
• The method for calculating distance and similarity will affect the
segmentation that is produced.
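As a minimal sketch of one distance measure and one similarity measure, the following assumes feature vectors are plain Python lists. Euclidean distance and cosine similarity are common choices picked here for illustration; they are not necessarily the measures shown on the slide, and the RGB values are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Distance measure: 0 when a == b, grows as the vectors differ."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Similarity measure: 1 when a and b point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical RGB feature vectors for three image elements.
red = [200, 10, 10]
dark_red = [100, 5, 5]
blue = [10, 10, 200]

print(euclidean_distance(red, red))       # identical elements
print(euclidean_distance(red, dark_red))  # same hue, different brightness
print(cosine_similarity(red, dark_red))   # same hue, different brightness
print(cosine_similarity(red, blue))       # different hue
```

Note how the two measures disagree about `red` and `dark_red`: the distance is large, yet the cosine similarity is at its maximum. This is one way the choice of measure changes the resulting segmentation.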
What are the two purposes of scaling
features?
1. Some features may be more important than others in certain applications. E.g.
colour might be more important than intensity of colour in one application.
2. When the features have different scales, some will automatically have a greater
weighting than others. Solution: express each feature as a percentage of its range
instead of using the actual numbers.
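Both purposes can be sketched together. The function below is an illustration under assumptions, not the method from the slides: it rescales every feature to a fraction of its observed range (min-max scaling) and then multiplies by an optional importance weight. The name `scale_features` and the sample data are hypothetical.

```python
def scale_features(vectors, weights=None):
    """Rescale each feature to the range 0..1, then apply optional weights."""
    n_features = len(vectors[0])
    weights = weights or [1.0] * n_features
    lows = [min(v[i] for v in vectors) for i in range(n_features)]
    highs = [max(v[i] for v in vectors) for i in range(n_features)]
    scaled = []
    for v in vectors:
        row = []
        for i in range(n_features):
            span = highs[i] - lows[i]
            # A constant feature carries no information; map it to 0.
            fraction = (v[i] - lows[i]) / span if span else 0.0
            row.append(weights[i] * fraction)
        scaled.append(row)
    return scaled

# Hypothetical elements: (intensity in 0..255, x-position in 0..1000).
elements = [[0, 0], [255, 500], [51, 1000]]
print(scale_features(elements))                      # equal weighting
print(scale_features(elements, weights=[2.0, 1.0]))  # intensity counts double
```

Without the rescaling, the x-position would dominate any distance calculation simply because its numbers are larger.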
THRESHOLDING
What is the feature space of thresholding?
• It is one-dimensional, i.e. solely intensity
What is thresholding?
• Regions can be defined by differences in brightness. Only the elements with a
brightness above a certain level are included in the final segmented
image.
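A minimal sketch of thresholding, assuming the image is a list of rows of greyscale values from 0 to 255. The sample image and the level of 128 are hypothetical.

```python
def threshold(image, level):
    """Return a binary image: 1 where brightness > level, else 0."""
    return [[1 if pixel > level else 0 for pixel in row] for row in image]

# Hypothetical 3x4 greyscale image (0 = black, 255 = white).
image = [
    [10,  20, 200, 210],
    [15, 180, 220,  30],
    [12,  25,  35, 190],
]
for row in threshold(image, 128):
    print(row)
```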
How can the foreground and background
change in the figure-ground
segmentation image with different
threshold values? What does this imply
about setting the thresholding value?
• Example: in the image shown, the best result is obtained with the lowest threshold value.
• This means that the best threshold value differs from image to image. There is no
universal value for the threshold.
What are the methods for choosing a
threshold value?
• Canny method
Why might there be a problem with using
just one threshold value in an image?
• The example image contains a difference in illumination, so a single threshold
value does not suit every part of the image.
What is local/block thresholding?
• This method splits an image into rectangular blocks and a different threshold is
applied to each block (quadrant).
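A minimal sketch of block thresholding. The slides do not say how each block's threshold is chosen, so using the mean brightness of the block is an assumption made here for illustration; the sample image is hypothetical.

```python
def block_threshold(image, block_size):
    """Threshold each block_size x block_size block at its own mean."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            rows = range(top, min(top + block_size, height))
            cols = range(left, min(left + block_size, width))
            values = [image[r][c] for r in rows for c in cols]
            level = sum(values) / len(values)  # per-block threshold (assumed rule)
            for r in rows:
                for c in cols:
                    result[r][c] = 1 if image[r][c] > level else 0
    return result

# Hypothetical image: the left half is dimly lit, the right half brightly lit.
image = [
    [10, 40, 110, 200],
    [40, 10, 200, 110],
]
for row in block_threshold(image, 2):
    print(row)
```

A single global threshold of 128 would mark the whole dim left half as background, whereas the per-block thresholds recover the same pattern in both halves. One known weakness of the mean rule is that a block with no real figure in it still gets split in two.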
What are the drawbacks of thresholding?
Thresholding often results in:
1. Missing parts in the figure (e.g. edges with missing pixels)
2. Unwanted parts in the figure (e.g. extra pixels that are not part of any edge)
What are Morphological Operations used
for?
• Morphological operations are used to clean up the results of thresholding.
What are the two morphological
operations?
Methods:
1. Dilation
2. Erosion
What is dilation?
• Dilation – the process of expanding the area of the foreground pixels (1s) in a
binary image.
• Method:
1. Start with the original binary image
2. All background pixels that have a neighbouring foreground pixel are changed from 0
to 1.
• Neighbourhood definition: defined by a “structuring element” – e.g. a 3x3 pixel
square structuring element defines a neighbourhood where each pixel’s
neighbours are the 8 adjacent pixels horizontally, vertically and diagonally.
• Example:
What is erosion?
• Erosion – the process of shrinking the area of the foreground pixels (1s) in a binary
image.
• Method:
1. Start with the original binary image
2. All foreground pixels that have a neighbouring background pixel are changed from 1
to 0.

• Neighbourhood definition: defined by a “structuring element” – e.g. a 3x3 pixel
square structuring element defines a neighbourhood where each pixel’s
neighbours are the 8 adjacent pixels horizontally, vertically and diagonally.
• Example:
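Dilation and erosion with a 3x3 square structuring element can be sketched as follows, assuming a binary image stored as a list of rows of 0s and 1s. Pixels outside the image are simply ignored, which is an assumption about border handling that the slides do not specify.

```python
def neighbours(image, r, c):
    """Values of the (up to 8) pixels adjacent to (r, c)."""
    height, width = len(image), len(image[0])
    return [
        image[rr][cc]
        for rr in range(max(r - 1, 0), min(r + 2, height))
        for cc in range(max(c - 1, 0), min(c + 2, width))
        if (rr, cc) != (r, c)
    ]

def dilate(image):
    """Background pixels with any foreground neighbour become 1."""
    return [
        [1 if pixel == 1 or 1 in neighbours(image, r, c) else 0
         for c, pixel in enumerate(row)]
        for r, row in enumerate(image)
    ]

def erode(image):
    """Foreground pixels with any background neighbour become 0."""
    return [
        [1 if pixel == 1 and 0 not in neighbours(image, r, c) else 0
         for c, pixel in enumerate(row)]
        for r, row in enumerate(image)
    ]

# A single foreground pixel in the middle of a 5x5 image.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1

for row in dilate(image):
    print(row)
```

Dilating the single pixel produces a 3x3 square of 1s, and eroding that square shrinks it back to the single pixel.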
What processes would you use to get a
clear text image when the text has gaps?
Processes:
1. Dilate the image – this is done first to fill the gaps; however, the text will also
increase in size.
2. Erode the image – this is done second to return the text to the same size as in the
original image
What processes would you use to get a
clear text image when the image contains noise?
Process:
1. Erode the image – this is done first to get rid of the noise in the image, but it also
shrinks the size of the text
2. Dilate the image – this is done second to expand the text so that it is the same
size as in the original image
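The two orderings can be sketched as below. Dilating then eroding is usually called closing, and eroding then dilating is usually called opening; those names, the helper `morph`, and the test shape are not from the slides. The sketch assumes a 3x3 square structuring element and ignores pixels outside the image, so the shape is kept two pixels away from the border.

```python
def morph(image, grow):
    """3x3 dilation if grow is True, 3x3 erosion otherwise."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
            ]
            # Dilation: any 1 in the window. Erosion: only 1s in the window.
            row.append(max(window) if grow else min(window))
        out.append(row)
    return out

def close_gaps(image):
    """Dilate, then erode: fills small gaps in the figure."""
    return morph(morph(image, grow=True), grow=False)

def remove_noise(image):
    """Erode, then dilate: removes small specks of noise."""
    return morph(morph(image, grow=False), grow=True)

# Hypothetical "stroke": a 3x5 block of foreground inside a 7x9 image.
stroke = [[1 if 2 <= r <= 4 and 2 <= c <= 6 else 0 for c in range(9)]
          for r in range(7)]
broken = [row[:] for row in stroke]
broken[3][4] = 0          # a missing pixel inside the stroke
noisy = [row[:] for row in stroke]
noisy[0][0] = 1           # an isolated noise pixel

print(close_gaps(broken) == stroke)
print(remove_noise(noisy) == stroke)
```

Applying the wrong ordering makes things worse: eroding the broken stroke first enlarges the gap, and dilating the noisy image first enlarges the speck.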
REGION-BASED SEGMENTATION
What is the main drawback of
thresholding?
• A fundamental drawback of thresholding is that it does not take into account
spatial information about the image. Spatial information means where each pixel
is located, and therefore which pixels are its neighbours.
What is region based segmentation?
Region based segmentation methods take into account the location of each pixel as
well as its intensity, or colour, or texture (or other features or a combination of
features that are being used to define similarity).
The feature space can be multi-dimensional (the following example has a 3-D feature
space: colour).
What are the three main methods for
region-based segmentation?
Three methods:
1. Region growing
2. Region merging
3. Region splitting and merging
What is the region growing method?
Region growing method:
1. Start with one “seed” pixel, chosen arbitrarily
I. Give this pixel a label (defining the region it belongs to)
II. Examine all the unlabelled pixels neighbouring labelled pixels
III. If they are within the similarity threshold, give them the same region label
2. Repeat steps II and III until the region stops growing, then choose another seed
pixel which does not yet belong to any region and start again
3. Repeat the whole process until all pixels have been assigned to a region
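The steps above can be sketched as follows for a greyscale image. Several details are assumptions made for illustration: seeds are taken in scan order rather than arbitrarily, neighbours are the 4 horizontally and vertically adjacent pixels, and a pixel is "similar" when its intensity is within `threshold` of the labelled neighbour it is reached from.

```python
from collections import deque

def region_growing(image, threshold):
    """Return a label image; each region is grown from one seed pixel."""
    height, width = len(image), len(image[0])
    labels = [[None] * width for _ in range(height)]
    next_label = 0
    for seed_r in range(height):
        for seed_c in range(width):
            if labels[seed_r][seed_c] is not None:
                continue  # already belongs to a region
            labels[seed_r][seed_c] = next_label  # new seed pixel
            frontier = deque([(seed_r, seed_c)])
            while frontier:  # grow until no neighbour can be added
                r, c = frontier.popleft()
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < height and 0 <= cc < width
                            and labels[rr][cc] is None
                            and abs(image[rr][cc] - image[r][c]) <= threshold):
                        labels[rr][cc] = next_label
                        frontier.append((rr, cc))
            next_label += 1
    return labels

# Hypothetical image with a dark patch, a bright patch and a mid-grey patch.
image = [
    [10, 12, 90, 91],
    [11, 13, 92, 93],
    [50, 52, 95, 94],
]
for row in region_growing(image, 5):
    print(row)
```

Comparing each candidate with the neighbouring pixel, rather than with the seed or the region mean, lets a region drift across a slow gradient. Other similarity rules give different segmentations.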
Give an example of the region growing
method.
• Figure: a region growing outward from a marked seed pixel.
In which situation may the region growing
method not work?
If similarity is based on intensity, then regions should stop growing at
discontinuities in intensity, i.e. edges.

However, region growth may “leak” through a single weak spot in the boundary.
Example:
What is the region merging method?
Region merging method:
1. Start by assigning a unique label to each pixel or group (e.g. 2x2, 3x3, 4x4) of
pixels.
I. A region’s properties are compared with those of an adjacent region
II. If they match, they are merged into a larger region and the properties of the new
region are computed (i.e. average of the feature vectors is computed)

2. Continue merging adjacent regions until a region cannot be merged with any of
its neighbours; it is then marked “final”
3. The whole process repeats until all image regions are marked “final”
Give an example of the region merging
method.
• We start at the top left-hand corner and merge.
What is the drawback of the region
merging method?
The result of region merging usually depends on the order in which regions are
merged. This is because the properties of a merged region are the average of the
properties of its constituent parts, i.e. you get different segmentation regions
depending on the order in which you merge the pixels.
What is the region splitting and merging
method?
Region splitting and merging method:
1. Start by labelling each pixel with the same label
2. For each region:
If the pixels are not all similar, split the region into four quadrants, each a separate
region. Continue until each region is homogeneous.
3. For each region:
Compare to neighbours and merge neighbouring regions which are similar. Continue until
no more regions can merge.
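A simplified sketch of splitting and merging on a greyscale image. The homogeneity test (maximum minus minimum intensity within `tolerance`, which must be non-negative) and the merge test (block means within `tolerance`) are assumptions. The merge step here compares the means of the original blocks rather than recomputing the mean of each growing region, so it is a simplification of step 3.

```python
def split(image, top, left, height, width, tolerance):
    """Recursively split a block into quadrants until each is homogeneous."""
    values = [image[r][c] for r in range(top, top + height)
              for c in range(left, left + width)]
    if max(values) - min(values) <= tolerance:
        return [(top, left, height, width)]
    h2, w2 = height // 2, width // 2
    blocks = []
    for t, h in ((top, h2), (top + h2, height - h2)):
        for l, w in ((left, w2), (left + w2, width - w2)):
            if h and w:  # skip empty quadrants of 1-pixel-wide blocks
                blocks += split(image, t, l, h, w, tolerance)
    return blocks

def split_and_merge(image, tolerance):
    """Return a label image after splitting and then merging similar blocks."""
    height, width = len(image), len(image[0])
    blocks = split(image, 0, 0, height, width, tolerance)
    means, owner = [], {}
    for i, (t, l, h, w) in enumerate(blocks):
        cells = [(r, c) for r in range(t, t + h) for c in range(l, l + w)]
        means.append(sum(image[r][c] for r, c in cells) / len(cells))
        for cell in cells:
            owner[cell] = i
    parent = list(range(len(blocks)))  # union-find over blocks

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (r, c), i in owner.items():
        for cell in ((r + 1, c), (r, c + 1)):  # neighbours below and to the right
            j = owner.get(cell)
            if j is not None and abs(means[i] - means[j]) <= tolerance:
                parent[find(i)] = find(j)
    return [[find(owner[(r, c)]) for c in range(width)] for r in range(height)]

# Hypothetical image: a bright square in the bottom-right corner.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
labels = split_and_merge(image, 5)
for row in labels:
    print(row)
```

The split step produces four quadrants, and the merge step then joins the three dark quadrants into one L-shaped region, which splitting alone cannot represent.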
Give an example of the region splitting
and merging method.
What are the general problems with
region-based methods?
CLUSTERING
What is the Clustering method?
Clustering tries to find natural groupings in the feature space.

Similarity is based on distance when talking about feature spaces.
What are the two main sub-classes of the
clustering algorithm?
Two main sub-classes of the clustering algorithm:
• Partitional clustering – data divided into non-overlapping subsets (clusters) such that
each data element is in exactly one subset.
• Hierarchical clustering – a set of nested clusters organized as a hierarchical tree.
What are the three main clustering
algorithms?
Clustering algorithms:
1. k-means clustering
2. Hierarchical clustering
3. Graph cutting
What is the k-means clustering method?
K-means clustering method:
1. Randomly choose k points to act as cluster centres
2. Allocate each element to the cluster with the closest centre (measure the
distance between each feature vector and each of the k cluster centres and see
which centre is closest).
3. Compute new cluster centres as the mean position of the elements in each
cluster
4. Repeat from step 2 until the cluster centres are unchanged
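A minimal sketch of these four steps, assuming the elements are tuples of numbers and using squared Euclidean distance. The `seed` parameter and the handling of an empty cluster (its centre is left where it is) are choices made for this sketch, not part of the slides.

```python
import random

def k_means(points, k, seed=0):
    """Cluster points into k groups; returns (centres, labels)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # step 1: random initial centres
    while True:
        # Step 2: assign each point to the closest centre.
        labels = [
            min(range(k),
                key=lambda j: sum((p - q) ** 2 for p, q in zip(point, centres[j])))
            for point in points
        ]
        # Step 3: move each centre to the mean position of its members.
        new_centres = []
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        # Step 4: stop once the centres no longer move.
        if new_centres == centres:
            return centres, labels
        centres = new_centres

# Hypothetical 2-D feature vectors forming two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centres, labels = k_means(points, 2)
print(centres)
print(labels)
```

Which of the two clusters is numbered 0 depends on the random initial centres, so only the grouping is meaningful, not the label values.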
Give an example of a K-means cluster
method.
• Stable state: the cluster centres (marked X) stop moving.
Does k-means clustering belong to the
partitional clustering or hierarchical clustering
family?
• K-means clustering belongs to the partitional clustering family
What are the three main problems of K-
means clustering?
Three problems of k-means clustering:
Problem 1: Results may differ depending on the location of the original cluster
centres.
Problem 2: Clustering is poor if true clusters are of different sizes
Problem 3: Clustering is poor if true clusters are not “globular” (i.e. non-spherical)
Show how the results of K-means
clustering might change if more than one
feature is considered
• 3 dimensions: red, blue, green
• 5 dimensions: red, blue, green, x-coordinate, y-coordinate
What is the Hierarchical clustering
method?
This method creates a set of nested clusters organised as a hierarchical tree:

If we set the threshold to 0.1, points 1, 3, 2 and 5 are clustered together and points
4 and 6 make another cluster.
What are the two sub-classes of hierarchical clustering methods?
Two sub-class methods:
1. Divisive clustering – the data set is regarded as a single cluster and then clusters
are recursively split along the best boundary.
2. Agglomerative clustering – each data item is regarded as a cluster, and the
clusters are recursively merged by choosing the two most similar clusters.
What is the basic Agglomerative
clustering algorithm?
Basic Agglomerative clustering algorithm:
1. Let each data point be a cluster
2. Compute the proximity matrix (i.e. calculate the distances between each
cluster)
3. Loop:
1. Merge the two closest clusters
2. Update the proximity matrix
4. End when only a single cluster remains

The key operation is the computation of the proximity of two clusters.
Different approaches to defining the distance between clusters distinguish the
different algorithms.
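The loop can be sketched as below. Two simplifications are assumed: the proximity between clusters is the shortest distance between their elements, which is only one of several possible choices, and the distances are recomputed on every pass instead of being kept in a proximity matrix. The `target_clusters` parameter is a hypothetical addition that lets the loop stop early.

```python
import math

def agglomerative(points, target_clusters=1):
    """Merge the two closest clusters until target_clusters remain."""
    clusters = [[p] for p in points]  # step 1: one cluster per data point
    merges = []
    while len(clusters) > target_clusters:
        best = None
        # Find the closest pair of clusters (shortest element-to-element distance).
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        # Merge the pair and keep every other cluster unchanged.
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters, merges

# Hypothetical 2-D data points.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
clusters, merges = agglomerative(points, target_clusters=3)
print(clusters)
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

The recorded `merges` list is the information a dendrogram displays: which clusters joined, and at what distance.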
Give an example of an agglomerative
clustering algorithm.
What is inter-cluster similarity?
It is the similarity between clusters of points
What are the algorithms for calculating
the inter-cluster similarity?
Inter-cluster algorithms:
1. Single-link clustering – distance between clusters is shortest distance between
elements (MIN distance)
2. Complete-link clustering – distance between clusters is longest distance
between elements (MAX distance)
3. Group-average clustering – distance between clusters is average of all distances
between elements (AVERAGE distance)
4. Centroid clustering – distance between clusters is the distance between the
average of the feature vectors in each cluster
Give examples of how these inter-cluster
algorithms may be calculated.
Single-Link Clustering

Solution:
P1-p3=78
P1-p4=80
P2-p3=67
P2-p4=64

Inter-cluster distance = 64 units


Complete-Link Clustering

Solution:
P1-p3=78
P1-p4=80
P2-p3=67
P2-p4=64

Inter-cluster distance = 80 units


Group-average Clustering

Solution:
P1-p3=78
P1-p4=80
P2-p3=67
P2-p4=64

Inter-cluster distance =
(78+80+67+64)/4 = 72.25 units
Centroid Clustering

Solution:
Average(p1,p2)=p5
Average(p3,p4)=p6

Inter-cluster distance = distance between p5 and p6 = 70.7 units
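The first three results can be checked with a few lines of Python, using only the four pairwise distances listed in the solutions. The centroid result is not recomputed here because it needs the point coordinates, which appear only in the figure.

```python
# Pairwise distances between clusters {p1, p2} and {p3, p4}, from the solutions.
distances = {
    ("p1", "p3"): 78,
    ("p1", "p4"): 80,
    ("p2", "p3"): 67,
    ("p2", "p4"): 64,
}

single_link = min(distances.values())                     # MIN distance
complete_link = max(distances.values())                   # MAX distance
group_average = sum(distances.values()) / len(distances)  # AVERAGE distance

print(single_link, complete_link, group_average)  # 64 80 72.25
```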
Give examples of how the results may
change depending on the clustering
algorithm chosen.
How can a feature space be considered in
the form of a graph?
What is graph cutting?
• Graph cutting splits the graph into subgraphs by cutting the links that have the
weakest similarities.
What is the cost of cutting the edges in a
graph?
• The cost of making a cut is the sum of the weights (similarities) of all the
edges that are cut, so a good cut removes only the weakest edges.
How do you overcome creating very small
subgraphs?
Give examples of segmented images
produced by the graph cutting
method.
What are the problems associated with
normalized cuts?
Finding the cut with the lowest normalized-cut cost exactly is NP-hard, so no
polynomial-time algorithm for it is known. In practice an approximate solution
is computed instead.
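The difference between a plain cut cost and a normalized cut can be sketched on a small graph. The formula used is the usual normalized-cut definition, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V), which is assumed here because the slide's own formula did not survive. The graph and its weights are hypothetical.

```python
def cut_cost(weights, group_a, group_b):
    """Sum of the weights of the edges that connect group_a to group_b."""
    return sum(
        w for (u, v), w in weights.items()
        if (u in group_a and v in group_b) or (u in group_b and v in group_a)
    )

def association(weights, group):
    """Total edge weight leaving the nodes of group (sum of their degrees)."""
    return sum(w * ((u in group) + (v in group)) for (u, v), w in weights.items())

def normalized_cut(weights, group_a, group_b):
    """Cut cost divided by each side's total connection to the whole graph."""
    cut = cut_cost(weights, group_a, group_b)
    return cut / association(weights, group_a) + cut / association(weights, group_b)

# Hypothetical similarity graph: a tight triangle a-b-c and a loose tail d-e.
weights = {
    ("a", "b"): 9, ("a", "c"): 9, ("b", "c"): 9,
    ("c", "d"): 3, ("d", "e"): 2,
}
tiny = ({"e"}, {"a", "b", "c", "d"})      # cuts off a single node
balanced = ({"a", "b", "c"}, {"d", "e"})  # separates the two groups

print(cut_cost(weights, *tiny), cut_cost(weights, *balanced))
print(normalized_cut(weights, *tiny), normalized_cut(weights, *balanced))
```

The plain cut cost prefers isolating the single node `e` (cost 2 against 3), while the normalized cut strongly prefers the balanced split. This is why normalizing discourages very small subgraphs.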
FITTING
What are fitting methods?
These are a class of methods that try to use a mathematical model to represent a
set of elements. They can be used to find the boundaries of objects, and hence,
segment the image.
What are some examples of fitting
algorithms?
Examples:
1. A model of the outline of an object that can be rotated and translated to
compare it with the image
2. If this model is found to closely fit a set of points or line segments this is unlikely
to be due to chance.
3. So we represent those elements using the model, i.e. the elements that fit the
model are grouped together.
How do we fit straight lines to a set of
points?
What is the Hough Transform and how is it
used for fitting lines?
Derive the formula: $y = x \tan\theta + \frac{r}{\cos\theta}$
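One way to obtain this formula is sketched below. It assumes that θ is the angle the line makes with the x-axis and that r is the perpendicular distance from the origin to the line; the figure defining these symbols did not survive, so this reading is an assumption. Start from the slope-intercept form with intercept c, then use the right-angled triangle formed by the origin, the y-intercept and the foot of the perpendicular.

```latex
\begin{align*}
y &= m x + c, \qquad m = \tan\theta \\
r &= c \cos\theta \quad\Longrightarrow\quad c = \frac{r}{\cos\theta} \\
y &= x \tan\theta + \frac{r}{\cos\theta}
\end{align*}
```

Multiplying through by cos θ gives the equivalent form r = y cos θ − x sin θ, which stays defined for vertical lines.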
Give examples of the Hough Transform
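A minimal sketch of line detection by voting, using the formula above rearranged to r = y cos θ − x sin θ. Each point votes for every (θ, r) cell it could lie on, and cells with many votes correspond to lines through many points. The one-degree angle step, the bin width `r_step`, and the sample points are assumptions for illustration.

```python
import math
from collections import Counter

def hough_lines(points, angle_step_deg=1, r_step=1.0):
    """Return a Counter of votes keyed by (theta in degrees, r bin index)."""
    votes = Counter()
    for x, y in points:
        for deg in range(-90, 90, angle_step_deg):
            theta = math.radians(deg)
            # Rearranged from y = x tan(theta) + r / cos(theta).
            r = y * math.cos(theta) - x * math.sin(theta)
            votes[(deg, round(r / r_step))] += 1
    return votes

# Four points on the line y = x + 2 (theta = 45 degrees) plus one outlier.
points = [(0, 2), (1, 3), (2, 4), (3, 5), (5, 0)]
votes = hough_lines(points)
print(votes[(45, 1)])       # votes for theta = 45 degrees, r close to 1.41
print(max(votes.values()))  # the largest vote count in the accumulator
```

With bins this coarse, a few cells next to the true line collect the same number of votes, so a practical implementation would use finer bins and look for local maxima rather than a single winner.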
How can the Hough Transform also be
used for finding elements along a circular
path?
How can the Hough Transform be used for
non-regular shapes?
What are the advantages and
disadvantages of using the Hough
Transform?
What are active contours (“snakes”)?
Give an example of an active contour.
What are the problems of active
contours?
SUMMARY