COMPUTER VISION: COMPUTATION OF CAMERA MATRIX
IIT Kharagpur
Computer Science and Engineering,

Indian Institute of Technology
Kharagpur.
(IIT Kharagpur) Camera Matrix Feb ’10 1 / 47

OUTLINE I
Computation of Camera matrix

Computation of camera matrix is known as resectioning.
We shall study numerical methods for estimating the camera
projection matrix from corresponding 3-space and image entities.
A 3D point X gets mapped to its image x under the unknown
camera mapping.
Given sufficiently many correspondences Xi ↔ xi the camera
matrix P can be identified.
P can also be determined from sufficiently many corresponding
world and image lines.

Given a number of point correspondences Xi ↔ xi we are required
to find the 3 × 4 camera matrix P such that xi = PXi for all i.
This problem is similar to computing the 2D projective
transformation H.
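For each correspondence Xi ↔ xi we derive the relation:
\[
\begin{bmatrix}
\mathbf{0}^T & -w_i \mathbf{X}_i^T & y_i \mathbf{X}_i^T \\
w_i \mathbf{X}_i^T & \mathbf{0}^T & -x_i \mathbf{X}_i^T \\
-y_i \mathbf{X}_i^T & x_i \mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{pmatrix} \mathbf{P}^1 \\ \mathbf{P}^2 \\ \mathbf{P}^3 \end{pmatrix} = \mathbf{0}
\]
where each $\mathbf{P}^{iT}$ is a 4-vector, the i-th row of P.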

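Since these equations are linearly dependent, we can choose only
the first 2 equations:
\[
\begin{bmatrix}
\mathbf{0}^T & -w_i \mathbf{X}_i^T & y_i \mathbf{X}_i^T \\
w_i \mathbf{X}_i^T & \mathbf{0}^T & -x_i \mathbf{X}_i^T
\end{bmatrix}
\begin{pmatrix} \mathbf{P}^1 \\ \mathbf{P}^2 \\ \mathbf{P}^3 \end{pmatrix} = \mathbf{0}
\]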
For a set of n point correspondences we obtain a 2n × 12 matrix A
by stacking up the equations for each correspondence.
The projection matrix P is computed by solving the set of
equations Ap = 0 where p is the vector containing the entries of
matrix P.
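
The stacking and solution steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `dlt_camera_matrix`, the synthetic camera `P_true` and the random points are all made up for the example, p is taken to hold the entries of P row by row, and no data normalization is applied.

```python
import numpy as np

def dlt_camera_matrix(X, x):
    """Estimate a 3x4 camera matrix from n >= 6 point correspondences.

    X: (n, 4) homogeneous world points; x: (n, 3) homogeneous image points.
    Returns (P, A), where A is the stacked 2n x 12 matrix.
    """
    zero = np.zeros(4)
    rows = []
    for Xi, (xi, yi, wi) in zip(X, x):
        rows.append(np.hstack([zero, -wi * Xi, yi * Xi]))   # first equation
        rows.append(np.hstack([wi * Xi, zero, -xi * Xi]))   # second equation
    A = np.array(rows)
    # Right singular vector for the smallest singular value; it has ||p|| = 1.
    _, _, Vt = np.linalg.svd(A)
    p = Vt[-1]
    return p.reshape(3, 4), A

# Synthetic, noise-free check with a made-up camera (assumption for the demo).
rng = np.random.default_rng(0)
P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, 20.0],
                   [0.0, 0.0, 1.0, 5.0]])
X = np.hstack([rng.uniform(-1, 1, (8, 3)), np.ones((8, 1))])
x = X @ P_true.T
P_est, A = dlt_camera_matrix(X, x)

# P is only defined up to scale, so compare reprojected points, not entries.
proj = X @ P_est.T
reproj = proj[:, :2] / proj[:, 2:3]
measured = x[:, :2] / x[:, 2:3]
print("max reprojection error:", np.abs(reproj - measured).max())
```

Because P is recovered only up to scale, the check compares reprojected image points rather than matrix entries.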

Minimal Solution Case 1
P has 12 entries but is defined only up to scale, so it has 11 degrees
of freedom (dof), and 11 equations are needed to solve for P.
Each point correspondence gives 2 equations, so 6 correspondences
suffice, with only one equation used from the sixth point. With these
11 equations the solution is exact, i.e. the space points are projected
exactly onto their measured images.
The solution is obtained by solving Ap = 0, where A is an 11 × 12 matrix
in this case.
In general A will have rank 11, and the solution vector p spans the
1-dimensional right null-space of A.

Over-determined solution Case 2
Point measurements have noise.
Number of correspondences available n ≥ 6.
An exact solution to Ap = 0 (other than p = 0) does not exist in general.
A solution for P is obtained by minimizing an algebraic or
geometric error.
Minimize ||Ap|| subject to some normalization constraint:
1) $\|\mathbf{p}\| = 1$
2) $\|\hat{\mathbf{p}}^3\| = 1$, where $\hat{\mathbf{p}}^3 = (p_{31}, p_{32}, p_{33})^T$
consists of the first 3 entries in the last row of P.
The residual Ap is known as the algebraic error.

Objective: Gold Standard Algorithm: Stage 1
Given n ≥ 6 world to image point correspondences {Xi ↔ xi }, determine
the Maximum Likelihood estimate of the camera projection matrix P, i.e. the P
which minimizes $\sum_i d(\mathbf{x}_i, P\mathbf{X}_i)^2$.
Algorithm:
(i) Linear Solution: Compute an initial estimate of P using a linear
algorithm (DLT):
(a) Normalization: Use a similarity transformation T to normalize the
image points, and a second similarity transformation U to normalize the
space points.
Normalized image points are x̃i = Txi ,
Normalized space points are X̃i = UXi .
(b) DLT: Form the 2n × 12 matrix A by stacking the equations generated
by each normalized correspondence x̃i ↔ X̃i . Write p for the vector
containing the entries of the matrix P̃. A solution of Ap = 0 is obtained
from the unit singular vector of A corresponding to the smallest singular value.

Data Normalization
As in the estimation of a 2-D homography, data normalization must be
carried out before estimating the camera matrix.
Points xi on the image plane must be translated so that their
centroid is at the origin, and scaled so that their RMS
(root-mean-squared) distance from the origin is $\sqrt{2}$.
Normalization for 3D points

The centroid of the points is translated to the origin, and the
coordinates are scaled so that their RMS distance from the origin
is $\sqrt{3}$.
This approach is suitable for a compact distribution of points.
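
As a rough sketch of this isotropic normalization, the helper below builds the similarity transform for either image points (d = 2) or space points (d = 3). The name `normalizing_similarity` and the random test data are assumptions made for illustration only.

```python
import numpy as np

def normalizing_similarity(points):
    """Similarity transform (homogeneous (d+1)x(d+1) matrix) that moves the
    centroid of the inhomogeneous points (n, d) to the origin and scales
    them so that their RMS distance from the origin is sqrt(d)."""
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    centroid = points.mean(axis=0)
    rms = np.sqrt(np.mean(np.sum((points - centroid) ** 2, axis=1)))
    s = np.sqrt(d) / rms
    T = np.eye(d + 1)
    T[:d, :d] *= s                 # isotropic scaling
    T[:d, d] = -s * centroid       # translation of the centroid to the origin
    return T

# Made-up data: 10 image points and 10 space points.
rng = np.random.default_rng(1)
img = rng.uniform(0, 640, (10, 2))
wld = rng.uniform(-5, 5, (10, 3))
T = normalizing_similarity(img)    # 3x3, acts on image points
U = normalizing_similarity(wld)    # 4x4, acts on space points
img_n = (T @ np.c_[img, np.ones(10)].T).T[:, :2]
wld_n = (U @ np.c_[wld, np.ones(10)].T).T[:, :3]
print(np.sqrt(np.mean(np.sum(img_n ** 2, axis=1))))   # RMS distance, image
print(np.sqrt(np.mean(np.sum(wld_n ** 2, axis=1))))   # RMS distance, space
```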

Data Normalization
Normalization for sparse 3D points
In the case of points at or near infinity in a plane, it is neither
reasonable nor feasible to normalize coordinates using the
isotropic (or non-isotropic) scaling schemes since the centroid and
scale are infinite or near infinite.
A method that seems to give good results is to normalize the set
of points $\mathbf{x}_i = (x_i, y_i, w_i)^T$ such that
\[
\sum_i x_i = \sum_i y_i = 0; \qquad
\sum_i x_i^2 + \sum_i y_i^2 = 2 \sum_i w_i^2; \qquad
x_i^2 + y_i^2 + w_i^2 = 1 \quad \forall i
\]

Using Line correspondences for computing P
A line in 3D may be represented by two points X0 and X1 through
which the line passes.
Suppose this line gets projected onto the image line l.
The plane formed by back-projecting from the image line l is equal
to $P^T \mathbf{l}$.
The condition that point Xj lies on this plane is then
\[
\mathbf{l}^T P \mathbf{X}_j = 0 \quad \text{for } j = 0, 1.
\]
Each choice of j gives a single linear equation in the entries of the
matrix P. Two such equations are obtained for each 3D to 2D line
correspondence.
These constraints can also be used along with the constraints
obtained by point correspondences.
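
A small sketch of how the line constraints become rows of a linear system. It assumes p holds the entries of P row by row, in which case $\mathbf{l}^T P \mathbf{X}$ equals the dot product of the Kronecker product of l and X with p. The helper name `line_constraint_rows`, the camera `P_true` and the random 3D points are made up for the example.

```python
import numpy as np

def line_constraint_rows(l, X0, X1):
    """Two rows of the linear system from l^T P X_j = 0, j = 0, 1.

    With p = P.reshape(12) (row-major), l^T P X = kron(l, X) . p.
    """
    return np.vstack([np.kron(l, X0), np.kron(l, X1)])

# Made-up camera and 8 random 3D lines, each given by two points.
rng = np.random.default_rng(2)
P_true = np.array([[1.0, 0.0, 0.2, 0.1],
                   [0.0, 1.0, -0.1, 0.3],
                   [0.0, 0.0, 1.0, 4.0]])
rows = []
for _ in range(8):
    X0, X1 = np.hstack([rng.uniform(-1, 1, (2, 3)), np.ones((2, 1))])
    l = np.cross(P_true @ X0, P_true @ X1)   # image line through both projections
    l /= np.linalg.norm(l)
    rows.append(line_constraint_rows(l, X0, X1))
A = np.vstack(rows)                          # 16 x 12

p_true = P_true.reshape(12)
_, _, Vt = np.linalg.svd(A)
P_est = Vt[-1].reshape(3, 4)                 # a unit-norm solution of A p = 0
print(np.abs(A @ p_true).max(), np.abs(A @ P_est.reshape(12)).max())
```

Rows built this way can be stacked together with the two rows per point correspondence before taking the SVD.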

What is DLT trying to minimize?
DLT minimizes a mean square error: $\sum_i d(\mathbf{x}_i, P\mathbf{X}_i)^2$
Suppose that
\[
\mathbf{X}_i = (X_i, Y_i, Z_i, 1)^T, \qquad
\mathbf{x}_i = (x_i, y_i, 1)^T, \qquad
P\mathbf{X}_i = \hat{w}_i (\hat{x}_i, \hat{y}_i, 1)^T = \hat{w}_i \hat{\mathbf{x}}_i
\]


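Consider the homography estimation problem.
We are given (measured) point correspondences $\langle \mathbf{x}_i \leftrightarrow \mathbf{x}'_i \rangle$
\[
\mathbf{x}_i = (x_i, y_i, 1)^T, \qquad
\mathbf{x}'_i = (x'_i, y'_i, w'_i)^T, \qquad
H\mathbf{x}_i = \hat{\mathbf{x}}'_i = (\hat{x}'_i, \hat{y}'_i, \hat{w}'_i)^T
\]
DLT minimizes the term $\mathbf{x}'_i \times \hat{\mathbf{x}}'_i = \mathbf{0}$, formulated as $A_i \mathbf{h}$:
\[
\mathbf{x}'_i \times \hat{\mathbf{x}}'_i =
\begin{pmatrix}
y'_i \hat{w}'_i - w'_i \hat{y}'_i \\
w'_i \hat{x}'_i - x'_i \hat{w}'_i
\end{pmatrix}
= A_i \mathbf{h} = \boldsymbol{\epsilon}_i \quad \text{(error)}
\]
The norm $\|\boldsymbol{\epsilon}_i\|$ gives the algebraic error.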

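The algebraic distance for point correspondences $\langle \mathbf{x}_i \leftrightarrow \mathbf{x}'_i \rangle$ under
homography mapping H is
\[
d_{\mathrm{algebraic}}(\mathbf{x}'_i, \hat{\mathbf{x}}'_i)^2 =
(y'_i \hat{w}'_i - w'_i \hat{y}'_i)^2 + (w'_i \hat{x}'_i - x'_i \hat{w}'_i)^2
\]
The geometric distance is given as:
\[
d_{\mathrm{geometric}}(\mathbf{x}'_i, \hat{\mathbf{x}}'_i)^2 =
\left( \frac{x'_i}{w'_i} - \frac{\hat{x}'_i}{\hat{w}'_i} \right)^2 +
\left( \frac{y'_i}{w'_i} - \frac{\hat{y}'_i}{\hat{w}'_i} \right)^2
\]
\[
\hat{w}'_i \, w'_i \, d_{\mathrm{geometric}} = d_{\mathrm{algebraic}}
\]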

The algebraic distance is related to but not the same as geometric
distance.
If $\hat{w}'_i = w'_i = 1$, then the two distances are identical.
For affine 2-D homographies, the value of $\hat{w}'_i$ will always be 1.
\[
H_A = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ 0 & 0 & 1 \end{bmatrix},
\qquad \hat{\mathbf{x}}'_i = H_A \mathbf{x}_i, \quad \text{then } \hat{w}'_i = 1 \text{ if } w_i = 1
\]
For affine homographies, geometric distances and algebraic distances
are identical. Hence geometric distances can be minimized by the linear
DLT algorithm based on algebraic distance.
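
The relation between the two distances can be checked numerically. The sketch below uses a made-up homography `H`, a made-up source point and a hypothetical measured point `x_meas`; the function names are illustrative only. Since both distances are non-negative, the check uses the absolute value of the product of the two w terms.

```python
import numpy as np

def algebraic_distance(xp, xhat):
    """Norm of the first two components of the cross product xp x xhat."""
    eps = np.array([xp[1] * xhat[2] - xp[2] * xhat[1],
                    xp[2] * xhat[0] - xp[0] * xhat[2]])
    return np.linalg.norm(eps)

def geometric_distance(xp, xhat):
    """Euclidean distance between the two points after dividing by w."""
    return np.linalg.norm(xp[:2] / xp[2] - xhat[:2] / xhat[2])

# Made-up projective homography, source point and measured target point.
H = np.array([[1.1, 0.02, 5.0],
              [-0.03, 0.95, -2.0],
              [1e-3, 2e-3, 1.0]])
x = np.array([10.0, 20.0, 1.0])
x_meas = np.array([16.5, 16.2, 1.0])

x_hat = H @ x
d_alg = algebraic_distance(x_meas, x_hat)
d_geom = geometric_distance(x_meas, x_hat)
print(d_alg, abs(x_hat[2] * x_meas[2]) * d_geom)   # the two values agree

# Affine case: last row (0, 0, 1), so the mapped point has w = 1.
H_A = H.copy()
H_A[2] = [0.0, 0.0, 1.0]
x_hat_A = H_A @ x
d_alg_A = algebraic_distance(x_meas, x_hat_A)
d_geom_A = geometric_distance(x_meas, x_hat_A)
```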

Depth of Points
We next consider what is the 3D depth of the points acquired
using a camera projection P.

Depth of Points
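Consider the camera matrix P, which projects a point X in 3-space to
the image point x.
\[
P = [M \mid \mathbf{p}_4], \qquad
\mathbf{C} = \begin{pmatrix} \tilde{\mathbf{C}} \\ 1 \end{pmatrix}, \qquad
P\mathbf{C} = \mathbf{0}
\]
\[
P = [M \mid \mathbf{p}_4] =
\begin{bmatrix} \mathbf{m}^{1T} & p_1 \\ \mathbf{m}^{2T} & p_2 \\ \mathbf{m}^{3T} & p_3 \end{bmatrix} =
\begin{bmatrix} \mathbf{P}^{1T} \\ \mathbf{P}^{2T} \\ \mathbf{P}^{3T} \end{bmatrix}
\]
\[
\mathbf{X} = (X, Y, Z, 1)^T = \begin{pmatrix} \tilde{\mathbf{X}} \\ 1 \end{pmatrix}, \qquad
\mathbf{x} = P\mathbf{X} = w\,(x, y, 1)^T, \qquad
w = \mathbf{P}^{3T}\mathbf{X}
\]
What is w?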

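Depth of Points What is w?
\[
w = \mathbf{P}^{3T}\mathbf{X} = \mathbf{P}^{3T}(\mathbf{X} - \mathbf{C}) \quad \text{since } P\mathbf{C} = \mathbf{0}
\]
\[
w = \mathbf{P}^{3T}(\mathbf{X} - \mathbf{C}) = \mathbf{m}^{3T}(\tilde{\mathbf{X}} - \tilde{\mathbf{C}})
\]
$\mathbf{m}^3$ is the principal ray direction.
$w = \mathbf{m}^{3T}(\tilde{\mathbf{X}} - \tilde{\mathbf{C}})$ can be interpreted as the dot product of the ray
from the camera centre C to the point X, with the principal ray
direction.
If the camera matrix is normalized so that $\det M > 0$
and $\|\mathbf{m}^3\| = 1$, then $\mathbf{m}^3$ is a unit vector pointing in the
positive axial direction.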

w can be interpreted as the depth of the point X from the camera
centre C in the direction of the principal ray.

The interpretation of w as the depth assumes that the camera has
been normalized by multiplying it with an appropriate factor.
It is also possible to compute the depth of a point X without having
to normalize the camera matrix:
Let $\mathbf{X} = (X, Y, Z, T)^T$ be a 3D point and $P = [M \mid \mathbf{p}_4]$ be a
camera matrix for a finite camera. Suppose $P(X, Y, Z, T)^T = w(x, y, 1)^T$. Then
\[
\operatorname{depth}(\mathbf{X}; P) = \frac{\operatorname{sign}(\det M)\, w}{T \, \|\mathbf{m}^3\|}
\]
is the depth of the point X in front of the principal plane of
the camera.

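The depth of a point X with respect to a camera with projection matrix P is
\[
\operatorname{depth}(\mathbf{X}; P) = \frac{\operatorname{sign}(\det M)\, w}{T \, \|\mathbf{m}^3\|}
\]
This gives us
\[
w = \frac{\operatorname{depth}(\mathbf{X}; P)\, T \, \|\mathbf{m}^3\|}{\operatorname{sign}(\det M)}
\]
If $\det M > 0$, $T = 1$, and $\|\mathbf{m}^3\| = 1$, then
\[
w = \operatorname{depth}(\mathbf{X}; P)
\]
Thus the value of w can be interpreted as the depth of the point X
from the camera in the direction along the principal ray, provided
the camera is normalized so that $\det M > 0$ and $\|\mathbf{m}^3\| = 1$.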
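
The depth formula can be tried on a simple example. In this sketch the calibration matrix `K`, the camera centre and the test points are made-up values, and the camera looks along the +Z axis, so the expected depth is simply the Z offset from the centre. Rescaling P or X should leave the depth unchanged.

```python
import numpy as np

def depth(X, P):
    """Depth of the homogeneous point X = (X, Y, Z, T) for a finite camera
    P = [M | p4], following depth = sign(det M) w / (T ||m^3||)."""
    X = np.asarray(X, dtype=float)
    M = P[:, :3]
    w = P[2] @ X                    # third coordinate of P X
    T = X[3]
    return np.sign(np.linalg.det(M)) * w / (T * np.linalg.norm(M[2]))

# Made-up camera: centre at (0, 0, -2), looking along +Z, no rotation.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
C = np.array([0.0, 0.0, -2.0])
P = K @ np.hstack([np.eye(3), -C[:, None]])

X = np.array([0.3, -0.2, 5.0, 1.0])
print(depth(X, P))            # distance along the principal ray: 5 - (-2)
print(depth(X, -3.0 * P))     # unchanged when P is rescaled
print(depth(2.0 * X, P))      # unchanged when X is rescaled
print(depth(np.array([0.0, 0.0, -4.0, 1.0]), P))   # negative: behind the camera
```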

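Minimizing the algebraic error
DLT minimizes a mean square error: $\sum_i d(\mathbf{x}_i, P\mathbf{X}_i)^2$
Suppose that
\[
\mathbf{X}_i = (X_i, Y_i, Z_i, 1)^T, \qquad
\mathbf{x}_i = (x_i, y_i, 1)^T, \qquad
P\mathbf{X}_i = \hat{w}_i (\hat{x}_i, \hat{y}_i, 1)^T = \hat{w}_i \hat{\mathbf{x}}_i
\]
Then
\[
\sum_i d_{\mathrm{alg}}(\mathbf{x}_i, P\mathbf{X}_i)^2
= \sum_i d_{\mathrm{alg}}(\mathbf{x}_i, \hat{w}_i \hat{\mathbf{x}}_i)^2
= \sum_i \left( \hat{w}_i \, d_{\mathrm{geom}}(\mathbf{x}_i, \hat{\mathbf{x}}_i) \right)^2
\]
What is the geometric significance of $\hat{w}_i \, d_{\mathrm{geom}}(\mathbf{x}_i, \hat{\mathbf{x}}_i)$?

\[
\hat{w} \, d = f \, \Delta
\]
Here d is the image distance $d_{\mathrm{geom}}(\mathbf{x}_i, \hat{\mathbf{x}}_i)$, f is the focal length, and
$\Delta$ is the corresponding 3D distance $d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)$.
The algebraic error $\sum_i d_{\mathrm{alg}}(\mathbf{x}_i, P\mathbf{X}_i)^2$ being minimized is:
\[
\sum_i \left( \hat{w}_i \, d_{\mathrm{geom}}(\mathbf{x}_i, \hat{\mathbf{x}}_i) \right)^2
\longrightarrow f^2 \sum_i d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)^2
\]

The error term $f^2 \sum_i d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)^2$ can be interpreted as the geometric
error.
The distance $d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)$ is the correction that needs to be made
to the measured 3D points in order to correspond precisely with
the measured image points xi .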

The correction $d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)$ must be made in the direction
perpendicular to the principal axis of the camera.
The point $\mathbf{X}'_i$ is not the closest point $\hat{\mathbf{X}}_i$ to $\mathbf{X}_i$ that maps to $\mathbf{x}_i$.
For points $\mathbf{X}_i$ not too far from the principal ray of the camera, the
distance $d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)$ is a reasonable approximation to the
distance $d_{\mathrm{geom}}(\mathbf{X}_i, \hat{\mathbf{X}}_i)$.
For points farther away from the principal ray, the distance
$d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)$ will be slightly larger than $d_{\mathrm{geom}}(\mathbf{X}_i, \hat{\mathbf{X}}_i)$.
DLT will also tend to minimize the focal length f when it minimizes
$f^2 \sum_i d_{\mathrm{geom}}(\mathbf{X}_i, \mathbf{X}'_i)^2$.

By minimizing ||Ap|| subject to the constraint
$\|\hat{\mathbf{p}}^3\| = 1$, the solution obtained is trying to
minimize the 3D geometric distances.
The interpretation of minimizing geometric
distances is not affected by similarity
transformations (e.g. translation, scaling etc.) in
either 3D space or the image space.

Objective: Gold Standard Algorithm Stage 2
Given n ≥ 6 world to image point correspondences {Xi ↔ xi }, determine
the Maximum Likelihood estimate of the camera projection matrix P, i.e. the P
which minimizes $\sum_i d(\mathbf{x}_i, P\mathbf{X}_i)^2$.
Algorithm:
(i) Linear Solution: Compute an initial estimate of P using a linear
algorithm (DLT) as given in Stage 1.
(ii) Minimize Geometric Error: Using the linear estimate as the starting
point, minimize the geometric error
\[
\sum_i d(\tilde{\mathbf{x}}_i, \tilde{P}\tilde{\mathbf{X}}_i)^2 = \sum_i d(\tilde{\mathbf{x}}_i, \hat{\mathbf{x}}_i)^2
\]
over P̃ using an iterative algorithm such as Levenberg-Marquardt.
(iii) Denormalization: The camera matrix for the original (unnormalized)
coordinates is obtained from P̃ as $P = T^{-1} \tilde{P} U$.
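
Step (ii) can be sketched with a very small damped Gauss-Newton loop in the spirit of Levenberg-Marquardt. This is a simplified illustration, not a production optimizer: the Jacobian is taken by finite differences, all 12 entries of P are used as parameters, the data are assumed to be already normalized, and the synthetic camera, noise levels and function names are made up. A library routine would normally be used instead.

```python
import numpy as np

def residuals(p, X, x):
    """Stacked image residuals x_i - x_hat_i (inhomogeneous), length 2n."""
    proj = X @ p.reshape(3, 4).T
    return (x - proj[:, :2] / proj[:, 2:3]).ravel()

def numeric_jacobian(p, X, x, h=1e-7):
    """Forward-difference Jacobian of the residual vector with respect to p."""
    r0 = residuals(p, X, x)
    J = np.empty((r0.size, p.size))
    for k in range(p.size):
        step = np.zeros_like(p)
        step[k] = h
        J[:, k] = (residuals(p + step, X, x) - r0) / h
    return J

def refine_camera(P0, X, x, iters=30, lam=1e-3):
    """Reduce the geometric image error starting from the estimate P0."""
    p = P0.ravel() / np.linalg.norm(P0)
    cost = np.sum(residuals(p, X, x) ** 2)
    for _ in range(iters):
        r = residuals(p, X, x)
        J = numeric_jacobian(p, X, x)
        delta = np.linalg.solve(J.T @ J + lam * np.eye(p.size), -J.T @ r)
        new_cost = np.sum(residuals(p + delta, X, x) ** 2)
        if new_cost < cost:            # accept the step, reduce the damping
            p, cost, lam = p + delta, new_cost, max(lam / 10, 1e-9)
        else:                          # reject the step, increase the damping
            lam *= 10
    return p.reshape(3, 4), cost

# Made-up, roughly normalized data: noisy image points, perturbed start.
rng = np.random.default_rng(4)
P_true = np.array([[1.0, 0.0, 0.0, 0.1],
                   [0.0, 1.0, 0.0, -0.2],
                   [0.0, 0.0, 1.0, 4.0]])
X = np.hstack([rng.uniform(-1, 1, (12, 3)), np.ones((12, 1))])
proj = X @ P_true.T
x = proj[:, :2] / proj[:, 2:3] + rng.normal(scale=0.005, size=(12, 2))
P0 = P_true + rng.normal(scale=0.02, size=(3, 4))   # stands in for the DLT estimate

cost0 = np.sum(residuals(P0.ravel(), X, x) ** 2)
P_ref, cost1 = refine_camera(P0, X, x)
print("geometric error before:", cost0, "after:", cost1)
```

The damping term also handles the fact that P is only defined up to scale, which makes the undamped normal equations singular.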

Geometric Error
Error only in image measurements
If the world points are known accurately, then the measurement
errors are possible only in the image measurements.
The geometric error in the image is:
\[
\sum_i d(\mathbf{x}_i, \hat{\mathbf{x}}_i)^2
\]
where xi is the measured point and x̂i is the point PXi which is the
exact image of Xi under P.

Geometric Error
Error in the world points
If the world points are not known accurately, then we may choose
to estimate P by minimizing a 3D geometric error, or an image
geometric error, or both.
The 3D geometric error for the world points:
\[
\sum_i d(\mathbf{X}_i, \hat{\mathbf{X}}_i)^2
\]
where X̂i is the closest point in space that maps onto xi via
xi = PX̂i

Geometric Error
Error in both world points and image points
We minimize a weighted sum of world and image errors.
The weights are chosen to reflect the relative accuracy of
measurements of the image and 3D points.
Image and world points are typically measured in different units.
\[
\sum_i \left[ \gamma \, d(\mathbf{X}_i, \hat{\mathbf{X}}_i)^2 + \xi \, d(\mathbf{x}_i, \hat{\mathbf{x}}_i)^2 \right]
\]

Estimation of an affine camera
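An affine camera has P with the last row $\mathbf{P}^{3T} = (0, 0, 0, 1)$.
\[
\mathbf{X}_i = (X_i, Y_i, Z_i, 1)^T, \qquad
\mathbf{x}_i = (x_i, y_i, 1)^T, \qquad
\mathbf{x}_i \times P\mathbf{X}_i = \mathbf{0}
\]
\[
A\mathbf{p} =
\begin{bmatrix}
\mathbf{0}^T & -w_i \mathbf{X}_i^T & y_i \mathbf{X}_i^T \\
w_i \mathbf{X}_i^T & \mathbf{0}^T & -x_i \mathbf{X}_i^T \\
-y_i \mathbf{X}_i^T & x_i \mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{pmatrix} \mathbf{P}^1 \\ \mathbf{P}^2 \\ \mathbf{P}^3 \end{pmatrix} = \mathbf{0}
\]

Substituting the values $\mathbf{P}^{3T} = (0, 0, 0, 1)$ and $w_i = 1$, the first two equations become:
\[
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{X}_i^T \\
\mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{pmatrix} \mathbf{P}^1 \\ \mathbf{P}^2 \end{pmatrix}
+ \begin{pmatrix} y_i \\ -x_i \end{pmatrix} = \mathbf{0}
\]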

Estimation of an affine camera
Considering all point correspondences Xi ↔ xi for an affine camera,
with last row $\mathbf{P}^{3T} = (0, 0, 0, 1)$:
\[
\sum_i d_{\mathrm{alg}}(\mathbf{x}_i, \hat{\mathbf{x}}_i)^2 = \|A\mathbf{p}\|^2
= \sum_i \left[ \left( x_i - \mathbf{P}^{1T}\mathbf{X}_i \right)^2 + \left( y_i - \mathbf{P}^{2T}\mathbf{X}_i \right)^2 \right]
\]
For affine cameras, the algebraic error and geometric error are the
same:
\[
d_{\mathrm{alg}}(\mathbf{x}_i, \hat{\mathbf{x}}_i) = d_{\mathrm{geom}}(\mathbf{x}_i, \hat{\mathbf{x}}_i)
\]
Geometric image distances can be minimized by a linear
algorithm.

Objective: Gold Standard Algorithm: Affine Camera
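Given n ≥ 4 world to image point correspondences {Xi ↔ xi }, determine
the Maximum Likelihood Estimate of the affine camera projection
matrix $P_A$, i.e. the P which minimizes $\sum_i d(\mathbf{x}_i, P\mathbf{X}_i)^2$ subject to the constraint
$\mathbf{P}^{3T} = (0, 0, 0, 1)$.
Algorithm:
(i) Normalization: Normalized image points are x̃i = Txi , Normalized
space points are X̃i = UXi ,
(ii) DLT: Form the 2n × 8 matrix $A_8$ by stacking equations generated
by each correspondence x̃i ↔ X̃i . Write $\mathbf{p}_8$ for the vector containing the
entries of the first two rows of the matrix P̃. Each correspondence contributes
\[
\begin{bmatrix}
\tilde{\mathbf{X}}_i^T & \mathbf{0}^T \\
\mathbf{0}^T & \tilde{\mathbf{X}}_i^T
\end{bmatrix}
\begin{pmatrix} \tilde{\mathbf{P}}^1 \\ \tilde{\mathbf{P}}^2 \end{pmatrix}
= \begin{pmatrix} \tilde{x}_i \\ \tilde{y}_i \end{pmatrix}
\]
to the system $A_8 \mathbf{p}_8 = \mathbf{b}$.
(iii) Solve: A solution of $A_8 \mathbf{p}_8 = \mathbf{b}$ is obtained by taking the
pseudo-inverse of $A_8$ to give $\mathbf{p}_8 = A_8^{+} \mathbf{b}$ and $\mathbf{P}^{3T} = (0, 0, 0, 1)$
(iv) Denormalization: $P_A = T^{-1} \tilde{P}_A U$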
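
Steps (ii) and (iii) can be sketched as below. For brevity the sketch skips the normalization and denormalization steps, which is an assumption made for the example; the function name `affine_camera`, the affine camera `P_true` and the random points are made up.

```python
import numpy as np

def affine_camera(X, x):
    """Linear estimate of an affine camera from n >= 4 correspondences.

    X: (n, 3) inhomogeneous world points; x: (n, 2) image points.
    """
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])
    A8 = np.zeros((2 * n, 8))
    A8[0::2, :4] = Xh              # rows (X_i^T, 0^T), matched with x_i
    A8[1::2, 4:] = Xh              # rows (0^T, X_i^T), matched with y_i
    b = x.reshape(-1)              # (x_1, y_1, x_2, y_2, ...)
    p8 = np.linalg.pinv(A8) @ b    # p8 = A8^+ b
    return np.vstack([p8.reshape(2, 4), [0.0, 0.0, 0.0, 1.0]])

# Made-up affine camera and noise-free data.
rng = np.random.default_rng(5)
P_true = np.array([[0.9, 0.1, -0.2, 3.0],
                   [0.05, 1.1, 0.3, -1.0],
                   [0.0, 0.0, 0.0, 1.0]])
X = rng.uniform(-1, 1, (6, 3))
x = (np.hstack([X, np.ones((6, 1))]) @ P_true.T)[:, :2]
P_est = affine_camera(X, x)
print(np.abs(P_est - P_true).max())
```

Unlike the projective case there is no scale ambiguity here, because the last row is fixed, so the estimate can be compared with the true matrix entry by entry.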

Restricted Camera Estimation
\[
P = K[R \mid -R\tilde{\mathbf{C}}], \qquad
K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Find the best-fit camera P subject to restrictive conditions on the
camera parameters.
The skew s is zero.
The pixels are square: $\alpha_x = \alpha_y$.
The principal point $(x_0, y_0)$ is known.
The complete camera calibration matrix K is known.
In some cases it is possible to estimate a restricted camera matrix with
a linear algorithm.
A restricted camera can be solved by minimizing either geometric or
algebraic error.

Minimizing Geometric Error Restricted Camera
Geometric error can be minimized with respect to the set of
parameters using iterative minimization like Levenberg-Marquardt.
If we want to minimize only the image errors, then the LM
minimization is minimizing a function $f : \mathbb{R}^9 \to \mathbb{R}^{2n}$.
If we want to minimize both the image errors (2D), and space
point errors (3D) then the LM minimization is minimizing a function
$f : \mathbb{R}^{3n+9} \to \mathbb{R}^{5n}$ since the 3D points must be included among the
measurements and minimization also includes estimation of the
true positions of the 3D points.

Minimizing Geometric Error Restricted Camera
Use a linear algorithm such as DLT to find an initial camera matrix.
Formulate the cost function for minimizing geometric error.
Assume that our constraints are s = 0 and αx = αy
Enforce soft constraints by adding extra terms to the cost function.
\[
\sum_i d_{\mathrm{geom}}(\mathbf{x}_i, P\mathbf{X}_i)^2 + w s^2 + w(\alpha_x - \alpha_y)^2
\]
The weights begin with low values and are increased at each
iteration of the estimation procedure.
The values of s and aspect ratio are drawn gently to their desired
values.
Finally these values may be clamped to their desired values for a
final estimation.

Minimizing Algebraic Error Restricted Camera
Minimizing algebraic error is equivalent to minimizing ||Ap||.
In the case of a restricted camera, we estimate only a subset of
parameters q
We have the map p = g(q)
Thus we minimize ||Ag(q)||
The minimization can be done using DLT.
The minimization can also be done using Levenberg-Marquardt, applied to
the function $f(\mathbf{q}) = A g(\mathbf{q})$ whose norm $\|A g(\mathbf{q})\|$ is minimized.
Clearly $f : \mathbb{R}^9 \to \mathbb{R}^{2n}$ since there are 2n constraints.
Is it possible to reduce the size of the minimization function f ?

The 2n × 12 matrix A may have a very large number of rows.
It is possible to replace A by a square 12 × 12 matrix Â such that
\[
\|A\mathbf{p}\|^2 = \mathbf{p}^T A^T A \mathbf{p} = \mathbf{p}^T \hat{A}^T \hat{A} \mathbf{p} = \|\hat{A}\mathbf{p}\|^2
\]
The matrix Â is called the reduced measurement matrix.
Using the singular value decomposition of A:
\[
A = UDV^T, \qquad
A^T A = (VDU^T)(UDV^T) = (VD)(DV^T) = \hat{A}^T \hat{A}
\]
If we define $\hat{A} = DV^T$ then minimizing ||Âp|| is equivalent to
minimizing ||Ap||.
When using Levenberg-Marquardt, the function being minimized is
$f(\mathbf{q}) = \hat{A}\mathbf{p} = \hat{A} g(\mathbf{q})$. Hence $f : \mathbb{R}^9 \to \mathbb{R}^{12}$.
The minimization problem ||Âp|| is independent of the number n of
point correspondences.

Summary:
Given a set of n correspondences Xi ↔ xi , the problem of
finding a constrained camera matrix P that minimizes the sum
of algebraic distances $\sum_i d_{\mathrm{alg}}(\mathbf{x}_i, P\mathbf{X}_i)^2$ reduces to the
minimization of a function $f : \mathbb{R}^9 \to \mathbb{R}^{12}$ independent of n.
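
The reduced measurement matrix is easy to check numerically. In the sketch below a random 40 × 12 matrix stands in for a stacked DLT matrix with n = 20; this stand-in and the random test vector are assumptions for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(40, 12))        # stands in for a 2n x 12 matrix, n = 20

# Thin SVD: U is 40x12, d holds the 12 singular values, Vt is 12x12.
U, d, Vt = np.linalg.svd(A, full_matrices=False)
A_hat = np.diag(d) @ Vt              # reduced measurement matrix, 12x12

p = rng.normal(size=12)              # any parameter vector p = g(q)
print(np.linalg.norm(A @ p), np.linalg.norm(A_hat @ p))   # equal norms
```

The 12 × 12 matrix is computed once, so each iteration of the minimization costs the same regardless of how many correspondences went into A.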

Radial Distortion
[Figure: example images taken with a short focal length lens and a long focal length lens]

Radial Distortion
For real lenses, the pin-hole camera assumption does not hold.
Because of radial distortion, straight lines do not map to straight
lines.
The error is more significant as the focal length of the lens
decreases. Lenses which do not have radial distortion are very
costly.
A camera with a lens is not a linear device.

Radial Distortion
The cure for this distortion is to correct the image measurements
to those that would have been obtained under a perfect linear
camera action.
Radial distortion is measured with respect to the centre for radial
distortion.

Radial Distortion
Suppose a 3D point X projects to an image location $(\tilde{x}, \tilde{y})$ according to
linear projection.
$(\tilde{x}, \tilde{y})$ is the ideal (correct) image point position.
$(x_d, y_d)$ is the actual image point position after radial distortion.
$\tilde{r}$ is the radial distance $\sqrt{\tilde{x}^2 + \tilde{y}^2}$ from the centre for radial
distortion.
$L(\tilde{r})$ is a distortion factor, which is a function of the radius $\tilde{r}$.
The radial (lens) distortion is modeled as:
\[
\begin{pmatrix} x_d \\ y_d \end{pmatrix} = L(\tilde{r}) \begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix}
\]

Radial Distortion Correction of distortion
\[
\begin{pmatrix} x_d \\ y_d \end{pmatrix} = L(\tilde{r}) \begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix}
\]
In pixel coordinates the correction is written as:
\[
\hat{x} = x_c + L(r)(x - x_c), \qquad \hat{y} = y_c + L(r)(y - y_c)
\]
$(\hat{x}, \hat{y})$ are the corrected coordinates.
$(x_c, y_c)$ is the centre of radial distortion.
$r^2 = (x - x_c)^2 + (y - y_c)^2$ (assuming aspect ratio as unity)
The corrected points $(\hat{x}, \hat{y})$ are related to the coordinates of the 3D
world point by a linear projective camera.

Radial Distortion Correction of distortion
Choice of the distortion function

L(r) is defined for positive values of r.
L(0) = 1
An arbitrary function L(r) can be approximated as:
\[
L(r) = 1 + \kappa_1 r + \kappa_2 r^2 + \kappa_3 r^3 + \dots
\]
The coefficients of radial correction $\{\kappa_1, \kappa_2, \kappa_3, \dots, x_c, y_c\}$ are
considered part of the interior calibration of the camera.
The principal point is often used as centre for radial distortion,
though these need not coincide exactly.
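
The pixel-coordinate correction with a polynomial L(r) can be sketched as follows. The function names, the centre (320, 240) and the coefficient values are made-up illustrations, not calibrated values.

```python
import numpy as np

def L(r, kappas):
    """Distortion factor L(r) = 1 + k1 r + k2 r^2 + ... for the given kappas."""
    r = np.asarray(r, dtype=float)
    return 1.0 + sum(k * r ** (i + 1) for i, k in enumerate(kappas))

def correct_points(xy, centre, kappas):
    """Apply x_hat = x_c + L(r)(x - x_c), y_hat = y_c + L(r)(y - y_c)."""
    centre = np.asarray(centre, dtype=float)
    d = np.asarray(xy, dtype=float) - centre
    r = np.linalg.norm(d, axis=1, keepdims=True)   # radius of each point
    return centre + L(r, kappas) * d

# Made-up example: centre of distortion and two measured points.
centre = (320.0, 240.0)
pts = np.array([[320.0, 240.0],     # at the centre: r = 0, so L = 1
                [420.0, 240.0]])    # r = 100
corrected = correct_points(pts, centre, kappas=(0.0, 1e-5))
print(corrected)    # second point moves outward, since L(100) = 1 + 1e-5 * 100**2
```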

Estimation of distortion function Approach 1:

Approach 1: The distortion function may be included as part of the
imaging process, and the parameters {κ1 , κ2 , κ3 , . . . , x c , y c }
computed together with P during the iterative minimization of the
geometric error.

Estimation of distortion function Approach 2:

Approach 2: A straight scene line should be imaged as straight
line.
A cost function is defined on the imaged lines after the corrective
mapping by L(r ). e.g. the distance between the line joining the
imaged line’s ends and its mid-point.
The cost function is iteratively minimized over the parameters
{κ1 , κ2 , κ3 , . . . , x c , y c }.
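
A possible form of the straightness cost for Approach 2 is sketched below. It assumes each imaged line is summarized by three measured points (one end, the mid-point, the other end) and sums the squared mid-point deviations over the lines. That summation, the function names and the sample points are assumptions for illustration. An optimizer would then search over the coefficients and the centre to reduce this cost.

```python
import numpy as np

def correct(xy, centre, kappas):
    """Corrective mapping x_c + L(r)(x - x_c), L(r) = 1 + k1 r + k2 r^2 + ..."""
    centre = np.asarray(centre, dtype=float)
    d = np.asarray(xy, dtype=float) - centre
    r = np.linalg.norm(d, axis=1, keepdims=True)
    factor = 1.0 + sum(k * r ** (i + 1) for i, k in enumerate(kappas))
    return centre + factor * d

def midpoint_deviation(a, m, b):
    """Distance from the point m to the straight line through a and b."""
    ab, am = b - a, m - a
    return abs(ab[0] * am[1] - ab[1] * am[0]) / np.linalg.norm(ab)

def line_cost(lines, centre, kappas):
    """Sum of squared mid-point deviations after the corrective mapping.

    lines: iterable of (3, 2) arrays holding end, mid-point, end.
    """
    total = 0.0
    for pts in lines:
        a, m, b = correct(pts, centre, kappas)
        total += midpoint_deviation(a, m, b) ** 2
    return total

# Made-up measurements: one bowed line and one perfectly straight line.
bowed = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.0]])
straight = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
cost_bowed = line_cost([bowed], centre=(0.0, 0.0), kappas=())
cost_straight = line_cost([straight], centre=(0.0, 0.0), kappas=())
print(cost_bowed, cost_straight)
```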

Image obtained after correcting for the radial distortion.
