
Panoramic image stitching

3D Photography Initial Project Proposal


Javier Montoya

Introduction

The automatic alignment and stitching of images into seamless photo-mosaics is one of the oldest research topics in computer vision.
The goal of image alignment is to find the mathematical model that relates pixel coordinates in one image to pixel coordinates in another image.
Image alignment techniques can be broadly divided into two main categories [14]: direct (pixel-based) and feature-based techniques. In direct techniques, the images are warped or shifted relative to each other to measure how similar their pixels are. The quality of an alignment is assessed with an error metric (e.g., sum of squared differences or sum of absolute differences). Once the error metric has been chosen, the next step is to search for the transformation that optimizes the alignment cost.
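
As a toy illustration of the direct approach, the following Python sketch scores integer translations with the SSD metric; it assumes two grayscale images of equal size, and all function names are illustrative rather than part of the proposal.

import numpy as np

def ssd_cost(img1, img2):
    """Sum of squared differences over the overlapping pixels."""
    diff = img1.astype(np.float64) - img2.astype(np.float64)
    return np.sum(diff ** 2)

def best_translation(img1, img2, max_shift=16):
    """Exhaustively search integer shifts (dx, dy) and return the best one."""
    best, best_cost = (0, 0), np.inf
    h, w = img1.shape
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Crop both images to their common overlap under this shift.
            y1, y2 = max(0, dy), min(h, h + dy)
            x1, x2 = max(0, dx), min(w, w + dx)
            a = img1[y1:y2, x1:x2]
            b = img2[y1 - dy:y2 - dy, x1 - dx:x2 - dx]
            cost = ssd_cost(a, b) / a.size  # normalize by overlap area
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best

The quadratic search over shifts is exactly what makes direct techniques expensive; hierarchical or gradient-based search is used in practice [14].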
In feature-based techniques, a set of distinctive features is first extracted from each image. These features are then matched to establish a global correspondence and to estimate the geometric transformation between the images. Such approaches are robust against scene motion and are computationally less demanding than direct techniques. Their major advantage, however, is their ability to automatically discover the overlap relationships among an unordered set of images [3].
Image stitching, in turn, consists in first choosing a final compositing surface onto which the aligned images are warped and placed. Examples of compositing surfaces include flat, cylindrical, and spherical surfaces. Another important step is selecting which pixels contribute to the final composite and how to blend them so as to minimize blur, visible seams, etc. [14].

Overview of the project

The main goal of this project is to implement automated panorama stitching using feature-based techniques. The main steps of the proposal can be summarized as follows:

1. Detecting and matching features.
2. Robustly recovering homographies.
3. Warping images and compositing them.

2.1 Detecting and matching features

The first step in the panorama stitching pipeline is to extract local features from each input image. To do this, a feature detector is combined with a feature descriptor. The detector finds a set of salient points, or keypoints, in each image. These keypoints are of special interest because they can be reliably localized under varying geometric transformations, viewpoint changes, and imaging conditions. The literature offers many keypoint detectors, the Harris [8] and Hessian [2] detectors probably being the most popular ones, mainly because of their remarkable robustness to image-plane rotations, noise, and changes in illumination [13]. In recent years, efforts have been made to increase their invariance to scale changes; the improved detectors include the Laplacian-of-Gaussian (LoG) [10], the Difference-of-Gaussian (DoG) [11], and the Harris/Hessian-Laplace detectors [12]. Ideally, similar keypoints are detected in the regions of overlap between the images. The feature descriptor then encodes the interest region around each keypoint in such a way that it is suitable for later discriminative matching. The most popular choices include the SIFT descriptor [11] and the SURF descriptor [1]; it is worth mentioning that most descriptors are based on local gradient orientations.
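
A minimal sketch of a Harris-style detector [8] is given below; the smoothing scale sigma, the constant k, and the non-maximum-suppression window are illustrative choices, not values prescribed by the proposal.

import numpy as np
from scipy import ndimage

def harris_response(img, sigma=1.5, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2, for a float image."""
    # Image gradients.
    Ix = ndimage.sobel(img, axis=1, mode="reflect")
    Iy = ndimage.sobel(img, axis=0, mode="reflect")
    # Structure tensor entries, smoothed by a Gaussian window.
    Ixx = ndimage.gaussian_filter(Ix * Ix, sigma)
    Iyy = ndimage.gaussian_filter(Iy * Iy, sigma)
    Ixy = ndimage.gaussian_filter(Ix * Iy, sigma)
    det = Ixx * Iyy - Ixy ** 2
    trace = Ixx + Iyy
    return det - k * trace ** 2

def detect_keypoints(img, threshold=0.01):
    """Keep local maxima of the response above a relative threshold."""
    R = harris_response(img)
    local_max = (R == ndimage.maximum_filter(R, size=5))
    ys, xs = np.nonzero(local_max & (R > threshold * R.max()))
    return list(zip(xs, ys))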
Once the local features of a given image have been extracted, they are matched to similar-looking local features in the other images. A naïve way to find good local matches is to scan each feature in a given image and compare it to every feature in another image. This linear scan is, however, time consuming, so more sophisticated search algorithms are needed. Traditionally, tree-based algorithms (e.g., KD-trees [6]) or hashing-based algorithms (e.g., locality-sensitive hashing [7]) are used for this purpose, with the features stored in an index.
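
As an illustration, the following sketch matches descriptors with scipy's off-the-shelf KD-tree (the project itself plans a from-scratch implementation, see Section 2.4) and rejects ambiguous matches with Lowe's ratio test [11]; the 0.8 threshold is a common but assumed choice.

import numpy as np
from scipy.spatial import cKDTree

def match_descriptors(desc1, desc2, ratio=0.8):
    """Return index pairs (i, j) whose nearest neighbor passes the ratio test."""
    tree = cKDTree(desc2)
    # Query the two nearest neighbors in desc2 for every descriptor in desc1.
    dists, idxs = tree.query(desc1, k=2)
    matches = []
    for i, ((d1, d2), (j, _)) in enumerate(zip(dists, idxs)):
        if d1 < ratio * d2:  # unambiguous: best match clearly beats the runner-up
            matches.append((i, j))
    return matches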

2.2 Robustly recovering homographies

In the second step, once a set of feature matches (i.e., point correspondences) has been established, the matches are verified to determine whether they are geometrically consistent. More precisely, the goal is to find a linear geometric transformation that expresses the allowed change between two corresponding feature configurations. One such transformation is the homography, or projective transformation. Formally, a homography is a projection of one plane onto another and is defined by a 3 × 3 matrix H with 8 degrees of freedom. At least 8 linear equations are therefore needed to solve for the 8 unknowns of H; since each point correspondence contributes two equations, four correspondences suffice. One way to estimate the homography from feature correspondences is the Direct Linear Transformation (DLT) method [9]. A more robust approach is the RANdom SAmple Consensus (RANSAC) [5], which explicitly handles outlier correspondences.
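
A compact sketch of both steps is given below: a DLT estimate via SVD [9] wrapped in a simple RANSAC loop [5]. The iteration count and inlier threshold are illustrative assumptions, and the unnormalized DLT is used for brevity (in practice, normalizing the points improves conditioning [9]).

import numpy as np

def dlt_homography(src, dst):
    """Estimate H (3x3, up to scale) so that dst ~ H @ src, from >= 4 points."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def ransac_homography(src, dst, iters=1000, thresh=3.0):
    """RANSAC: fit H to random 4-point samples, keep the largest inlier set."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    best_H, best_inliers = None, []
    ones = np.ones((len(src), 1))
    for _ in range(iters):
        sample = np.random.choice(len(src), 4, replace=False)
        H = dlt_homography(src[sample], dst[sample])
        # Project all source points and measure the reprojection error.
        proj = (H @ np.hstack([src, ones]).T).T
        proj = proj[:, :2] / proj[:, 2:3]
        err = np.linalg.norm(proj - dst, axis=1)
        inliers = np.nonzero(err < thresh)[0]
        if len(inliers) > len(best_inliers):
            best_H, best_inliers = H, inliers
    # Refit on all inliers for a final, more stable estimate.
    if len(best_inliers) >= 4:
        best_H = dlt_homography(src[best_inliers], dst[best_inliers])
    return best_H, best_inliers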

2.3 Warping images and compositing them

Finally, once the homography has been recovered, the images can be composited to generate the panorama. More precisely, this last step involves selecting a final compositing surface (flat, cylindrical, spherical, etc.) and a reference image. Traditionally, one of the images is chosen as the reference and all other images are warped into its coordinate system. This step also involves selecting which pixels contribute to the final composite and how to blend them optimally to create good-looking panoramas.
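
The following sketch composites two images on a flat surface, assuming a homography H that maps the second image into the reference frame of the first and an output canvas large enough to hold both; it uses OpenCV warping and a simple distance-transform feathering, which is only one of many possible blending schemes.

import cv2
import numpy as np

def composite_flat(img1, img2, H, out_size):
    """Warp img2 by H onto a flat canvas of out_size=(w, h); blend with img1."""
    base = np.zeros((out_size[1], out_size[0], 3), np.float64)
    base[:img1.shape[0], :img1.shape[1]] = img1
    warped = cv2.warpPerspective(img2, H, out_size).astype(np.float64)
    # Per-image weights: distance to the nearest image border, so seams fade.
    m1 = np.zeros(out_size[::-1], np.uint8)
    m1[:img1.shape[0], :img1.shape[1]] = 255
    m2 = cv2.warpPerspective(np.full(img2.shape[:2], 255, np.uint8), H, out_size)
    d1 = cv2.distanceTransform(m1, cv2.DIST_L2, 3).astype(np.float64)
    d2 = cv2.distanceTransform(m2, cv2.DIST_L2, 3).astype(np.float64)
    total = d1 + d2
    total[total == 0] = 1.0  # avoid division by zero outside both images
    alpha = (d1 / total)[..., None]
    pano = alpha * base + (1.0 - alpha) * warped
    return np.clip(pano, 0, 255).astype(np.uint8)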

2.4 Summary

In this project, the feature extraction and matching will be programmed from scratch. For feature extraction, one possibility is to implement either the SIFT descriptor or the MOPS [4] descriptor. To match the features efficiently, a KD-tree will be implemented. Furthermore, to robustly recover the homographies, the RANSAC approach will be programmed. Concerning the image composition, a flat panorama is initially intended.

References
[1] H. Bay, T. Tuytelaars, and L. Van Gool, Surf: Speeded up robust features, European Conference on Computer Vision, (2006), pp. 404–417.
[2] P. R. Beaudet, Rotationally invariant image operators, Proceedings of the 4th International Joint Conference on Pattern Recognition, (1978), pp. 579–583.
[3] M. Brown and D. G. Lowe, Recognising panoramas, International Conference on Computer Vision, (2003), pp. 1218–1227.
[4] M. Brown, R. Szeliski, and S. Winder, Multi-image matching using multi-scale oriented patches, Computer Vision and Pattern Recognition, (2005), pp. 510–517.
[5] M. A. Fischler and R. C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM, 24 (1981), pp. 381–395.
[6] J. H. Friedman, J. L. Bentley, and R. A. Finkel, An algorithm for finding best matches in logarithmic expected time, ACM Trans. Math. Softw., 3 (1977), pp. 209–226.
[7] A. Gionis, P. Indyk, and R. Motwani, Similarity search in high dimensions via hashing, in Proceedings of the 25th International Conference on Very Large Data Bases, VLDB '99, 1999, pp. 518–529.
[8] C. Harris and M. Stephens, A combined corner and edge detector, Proceedings of the Fourth Alvey Vision Conference, (1988), pp. 147–151.
[9] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, New York, NY, USA, 2nd ed., 2003.
[10] T. Lindeberg, Scale-space theory: A basic tool for analysing structures at different scales, Journal of Applied Statistics, 21 (1994), pp. 224–270.
[11] D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60 (2004), pp. 91–110.
[12] K. Mikolajczyk and C. Schmid, Scale & affine invariant interest point detectors, International Journal of Computer Vision, 60 (2004), pp. 63–86.
[13] C. Schmid, R. Mohr, and C. Bauckhage, Evaluation of interest point detectors, International Journal of Computer Vision, 37 (2000), pp. 151–172.
[14] R. Szeliski, Image alignment and stitching: a tutorial, Foundations and Trends in Computer Graphics and Vision, 2 (2006), pp. 1–104.
