Beruflich Dokumente
Kultur Dokumente
Handledare:
Christian Lundquist
isy, Linkpings universitet
Jacob Roll
Autoliv Electronics
David Forslund
Autoliv Electronics
Examinator:
Thomas Schn
isy, Linkpings universitet
Avdelning, Institution
Division, Department
Datum
Date
Rapporttyp
Report category
ISBN
Svenska/Swedish
Licentiatavhandling
ISRN
Engelska/English
Examensarbete
C-uppsats
D-uppsats
vrig rapport
2010-06-15
LiTH-ISY-EX--10/4386--SE
Serietitel och serienummer ISSN
Title of series, numbering
URL fr elektronisk version
http://www.control.isy.liu.se
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-57981
Titel
Title
Sammanfattning
Abstract
The goal of this work has been to improve the accuracy of a pre-existing algorithm
for vehicle pose estimation, which uses intrinsic measurements of vehicle motion
and measurements derived from far infrared images.
Estimating the pose of a vehicle, based on images from an on-board camera and
intrinsic measurements of vehicle motion, is a problem of simultanoeus localization
and mapping (SLAM), and it can be solved using the extended Kalman lter
(EKF). The EKF is a causal lter, so if the pose estimation problem is to be solved
o-line acausal methods are expected to increase estimation accuracy signicantly.
In this work the EKF has been compared with an acausal method for solving the
SLAM problem called smoothing and mapping (SAM) which is an optimization
based method that minimizes process and measurement noise.
Analyses of how improvements in the vehicle motion model, using a number
of dierent model extensions, aects accuracy of pose estimates have also been
performed.
Nyckelord
Keywords
visual odometry, smoothing and mapping, SAM, SLAM, automobile motion model,
FIR, monocular
Abstract
The goal of this work has been to improve the accuracy of a pre-existing algorithm
for vehicle pose estimation, which uses intrinsic measurements of vehicle motion
and measurements derived from far infrared images.
Estimating the pose of a vehicle, based on images from an on-board camera and
intrinsic measurements of vehicle motion, is a problem of simultanoeus localization
and mapping (SLAM), and it can be solved using the extended Kalman lter
(EKF). The EKF is a causal lter, so if the pose estimation problem is to be solved
o-line acausal methods are expected to increase estimation accuracy signicantly.
In this work the EKF has been compared with an acausal method for solving the
SLAM problem called smoothing and mapping (SAM) which is an optimization
based method that minimizes process and measurement noise.
Analyses of how improvements in the vehicle motion model, using a number
of dierent model extensions, aects accuracy of pose estimates have also been
performed.
Sammanfattning
Mlet med detta examensarbete var att frbttra precisionen hos en redan existerande algoritms skattningar av ett fordons pose (position och orientering), som
anvnder interna mtningar av fordonets rrelse samt mtningar erhllna frn
infrarda bilder.
Att skatta ett fordons pose, utifrn bilder frn en kamera ombord p farkosten
samt interna mtningar av fordonsrrelse, r ett problem av typen samtidig lokalisering och kartering (SLAM), och det kan lsas med ett utkat Kalmanlter (EKF).
EKF r ett kausalt lter, s om skattning av pose ska utfras i efterhand kan
icke-kausala metoder istllet anvndas och sdana metoder frvntas ge avsevrt
frbttrad precision i skattningarna. I detta examensarbete har EKF jmfrts med
en icke-kausal metod att lsa SLAM-problemet som kallas glttning och kartering
(SAM), en optimeringsbaserad metod som minimerar process- och mtbrus.
ven analyser av hur frbttringar av fordonets rrelsemodell, med ett antal
olika modelltillgg, pverkar precisionen i skattningarna av pose har genomfrts.
Acknowledgments
First of all I would like to thank David Forslund and Jacob Roll at Autoliv Electronics for the help, in all kinds of aspects, during this masters thesis. Of great
help has also Christian Lundquist and my examiner Thomas Schn been, constantly sharing ideas and answering questions throughout my work. My thanks
also goes to Zoran Sjanic and Martin Skoglund for giving me an introduction to
SAM.
Furthermore I want to take the opportunity to thank Thomas Karlsson and
Gran Forsling at the Mathematics department for, during all of my years at the
university, always allowing me to ask them questions, giving me helpful solutions
and explanations to mathematical problems and queries.
Last but not least, I would like to thank my family and Anna for all encouragement and support you have given me throughout the years.
vii
Contents
1 Introduction
1.1 Background . . . . . . . .
1.2 Problem Formulation . . .
1.3 Autoliv . . . . . . . . . .
1.4 Camera System Overview
1.5 Related Work . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1
1
1
2
3
4
2 System Modeling
2.1 Coordinate Frames . . . . .
2.2 Rotation Matrices . . . . .
2.3 Vehicle Process Model . . .
2.3.1 Basic Model . . . . .
2.3.2 Model Extensions . .
2.4 Landmark Parametrization
2.5 Measurement Model . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
5
5
6
7
7
9
11
13
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
15
15
18
19
19
25
4 Results
4.1 Performance Measures . . . . . . . . . . .
4.1.1 Landmark Measurement Residuals
4.1.2 Trajectory Plot in Camera Image .
4.2 SAM and Model Extensions . . . . . . . .
4.3 Feature Association Improvement . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
27
27
27
28
29
54
5 Concluding Remarks
5.1 Conclusions . . . . . . . . . . . . . . . . .
5.1.1 SAM . . . . . . . . . . . . . . . . .
5.1.2 Vehicle Process Model Extensions
5.1.3 Feature Association Improvement .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
57
57
57
57
58
ix
Contents
5.2
Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Bibliography
A Nomenclature
A.1 Mathematical Notations .
A.2 Abbreviations . . . . . . .
A.3 Landmark States . . . . .
A.4 Vehicle States . . . . . . .
A.5 Variables, Parameters and
A.6 Indices . . . . . . . . . . .
A.7 Top Notations . . . . . . .
59
61
. . . . . .
. . . . . .
. . . . . .
. . . . . .
Functions
. . . . . .
. . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
63
63
63
63
64
64
65
65
66
67
D Results
68
Chapter 1
Introduction
This section describes the problem to be solved in this work, and the context of
the problem.
1.1
Background
Autoliv Electronics has, together with the Division of Automatic Control at Linkping University, developed an o-line software tool for estimating the sequence of
poses of a vehicle (with pose meaning position and orientation), using images taken
by a far infrared (FIR) camera and measurements of vehicle acceleration, speed
and yaw rate. This tool uses sensor fusion of the vehicle measurements and FIR
images in order to perform simultaneous localization and mapping (SLAM), so it
is not only the vehicle poses that are estimated, but also positions of landmarks
seen by the camera. Figure 1.1 illustrates the SLAM problem; it shows ve consecutive vehicle poses, and measurements of landmarks from the dierent vehicle
positions. Note that although the camera only measures the direction to landmarks, it is possible to determine a landmarks position as the vehicle moves and
more measurements of the landmark are gathered.
1.2
Problem Formulation
The problem to be solved in this work is to nd out how the o-line estimation of
vehicle poses can be improved in terms of accuracy.
Before this work, pose estimation was performed using an extended Kalman
lter (EKF). Since the EKF is a causal method, and the pose sequence estimation
is performed o-line, using an acausal method was expected to improve the estimation signicantly, so the rst order of action was to implement such a method.
The problem of recovering the best possible estimate of the poses and landmark
positions is called smoothing and mapping (SAM, see for example [1]), and the
method proposed was linear least squares SAM, which means iteratively linearizing the vehicle and measurement models and nding a pose sequence estimate by
1
Introduction
Figure 1.1: Simultaneous localization (of the vehicle, illustrated by rectangles) and
mapping (of the landmarks, illustrated by circles) is performed by making some
sort of position measurements of the landmarks (e.g. bearing measurements) from
the dierent vehicle positions.
solving a linear least squares problem. Other proposed areas with potential for
increasing the estimation accuracy was improvements regarding the vehicle model,
feature detection and association algorithms (how to extract measurements from
the FIR images) and robustness against moving objects.
Since the ground truth for the vehicle poses is unknown, the estimation accuracy must be evaluated by indirect measures, so an important part of this work is
to nd good measures of estimation accuracy.
1.3
Autoliv
(a) City
(b) Countryside
Figure 1.2: These two images show comparisons between visual images and Night
Vision images
1.4
The Night Vision camera registers radiation in the far infrared (FIR) region at
30 Hz, with a resolution of 320240 pixels, and a spectral range of 814 m. An
advantage of using a FIR sensor, compared to near infrared (NIR) sensors, is that
body heat radiation from humans and animals lies within the FIR region ([13]
suggests that it is easier to detect distant pedestrian using FIR compared to using
NIR). On the other hand, non-living objects and structures are not registered so
clearly by a FIR camera, so for the sake of estimating vehicle poses the FIR camera
might not be the best choice. The camera system is passive, meaning that it does
not actively illuminate the road ahead with FIR radiation. Figure 1.3 shows the
camera when it is mounted to the car.
Figure 1.3: The images in this gure show the placement of the Night Vision
camera.
1.5
Introduction
Related Work
The EKF-based estimation of the vehicle poses that Autoliv used before this work
is described in [9]. Descriptions of a few versions of SAM can be found in [1] and
[11], where [11] describes the application of SAM to a problem similar to the one
in this work. The work in [1] is on the other hand more focused on fast execution
when using SAM, using factorizations of the information matrix, and [4] describes
how to further improve the execution time by using incremental updates of the
factorized information matrix.
In [8] it is shown how the uncertainty in the distance (depth) to landmarks
in images is more adequately represented by Gaussian noise when the distance
parametrization used is the inverse depth than when regular depth is used.
The model improvements for vehicle pitch dynamics evaluated during this work,
namely constant oset, a second order model and inuence from acceleration, are
described in [2].
Bundle adjustment (which is a computer vision term corresponding to what is
here referred to as SAM) is compared to ltering in [12]. The authors evaluate
bundle adjustment and ltering in terms of computational cost for decreasing
the robot pose estimate covariance, and found that ltering based methods only
might be the better choice when the available processing resources are small; they
conclude that bundle adjustment is superior in all other cases. Bundle adjustment
is still rather costly, and the authors of [10] state that this is because of the choice
of a single privileged coordinate frame; a choice which they avoid by taking a
relative approach to bundle adjustment. This approach uses some of the poses as
references, and for each other pose or landmark there is one reference pose which
it is related to.
In the solution developed during this work feature detection is performed using the Harris corner detector described in [3], and for feature description image
patches are used. For association of features in new images normalized crosscorrelation (NCC), see for example [7], between feature patches and the new images is used. A problem with using image patches as feature descriptors is that
they are not invariant to changes in scale and rotation, and [9] suggests using the
so-called Scale-invariant feature transform (SIFT) described in [6], instead of the
Harris detector and NCC of image patches. The SIFT algorithm detects and describes features in images and two of its advantages is that it is invariant to scale
and orientation of the features.
Chapter 2
System Modeling
The system of a vehicle moving in unknown surroundings, with measurements of
landmarks which are stationary in these surroundings, and measurements of the
vehicle motion, is modeled by a state space model. To begin with we have the
state of the vehicle (describing the current status of the position and motion of the
vehicle) at time t, denoted xvt . Based on the vehicle state xvt1 and some input ut
(in our case it is the acceleration of the vehicle), the next vehicle state xvt is given
by
(2.1a)
xvt = f (xvt1 , ut ) + wt ,
where f is the motion model function and wt is Gaussian process noise which
compensates for model errors. The position at time t of the jth landmark is
parametrized by its state xlj,t and the landmark motion model becomes
xlj,t = xlj,t1 ,
(2.1b)
since we assume that all landmarks are stationary. Then we have, at time t, the
vehicle measurements ytv according to
ytv = hv (xvt ) + evt ,
(2.1c)
(2.1d)
2.1
Coordinate Frames
There are three relevant coordinate frames for the combined vehicle and camera
system:
5
System Modeling
World (w): This is considered an inertial frame and it is xed to the surroundings of the vehicle.
Vehicle body (b): This frame is xed to the vehicle, with its origin located
in the middle of the rear axis. Coordinate frame b coincide with w at the
start of a scenario.
Camera (c): This frame is xed relative to b, and it is positioned in the
optical center of the camera.
For all three coordinate frames the x-axis is pointing forward, the y-axis is
pointing to the right and the z-axis is pointing downwards.
The yaw, pitch and roll angles describe rotations around z-, y- and x-axes
respectively, and unless stated otherwise, positive direction of angles are given by
the right-hand grip rule applied to the coordinate frame axis around which the
rotation occurs.
2.2
Rotation Matrices
1
0
0
Rx () = 0 cos sin ,
0 sin
cos
cos 0 sin
1
0 ,
Ry () = 0
sin 0 cos
cos sin 0
cos 0 .
Rz () = sin
0
0
1
(2.2a)
(2.2b)
(2.2c)
For any rotation matrix we have that R() = R()1 , and we also have
R = RT by denition.
1
If the angles yaw (), pitch () and roll () are used to describe the orientation
of coordinate frame (e.g. a vehicle) as seen from coordinate frame (e.g. the
world), the rotation matrix R for converting -coordinates to -coordinates is
given by
R = Rx ()Ry ()Rz () =
= Rx ()T Ry ()T Rz ()T .
(2.3)
(2.4)
Equation (2.5) concludes this theory section by dening a notation for the
rotation matrix, as a function of the three rotation angles:
R(, , ) = R .
2.3
(2.5)
This section describes the vehicle process models, in other words the models for
how vehicle states changes over time. The rst section is about the basic model
(also described in [9]) that was used before this work, and the remaining sections
are about the model extensions that have been tested during this work.
2.3.1
Basic Model
Before this work the used vehicle process model was the one described in this
section.
With the states described in Table 2.1, the vehicle state vector at time t is
given by
pt
vx,t
t
v
,
xt =
t
t
t
px,t
pt = py,t .
pz,t
(2.6a)
(2.6b)
(2.7)
System Modeling
State
p
vx
Description
Vehicle position in world coordinates.
Velocity of the vehicle in its forward direction.
Yaw angle (z-axis rotation), relative to the world coordinate frame.
Front wheel angle, relative to the vehicles forward direction.
Road pitch angle, relative to the world coordinate frame xy-plane.
Pitch angle of the vehicle, relative to the road.
Table 2.1: This table contains short descriptions of the vehicle states.
vx,t + T ut+1
,
f (xvt , ut+1 ) =
T vx,t
t + L tan t
t
Ct
(2.8)
B(xvt )Q B(xvt )T ,
T cos t cos t
0
T sin t cos t
0
0
B(xt ) = T sin t
,
1
0
0
I 44
vx
t
t
t =
t N (0, Q ),
t
t
vx
q
Q =
,
(2.9a)
(2.9b)
(2.9c)
(2.9d)
(2.9e)
2.3.2
Model Extensions
This section describes the vehicle process model extensions tested during this work.
Note that these model extensions can be used together in any combination.
We need to introduce some new notation: In order to describe the vehicle
process model for one of the vehicle states, we use superscripts. An example: For
the yaw angle () process model we would use the notation f .
Constant Oset in Car Pitch Angle
Since the stationary pitch angle of the camera, relative to the road, might be nonzero, due to the current load in the vehicle or misaligned camera mounting, the
state for the camera pitch oset 0 is added to the state vector. This oset state
is interpreted as the stationary car pitch angle, around which the car pitch angle
oscillates. The process model for car pitch angle and car pitch angle oset 0
becomes
f (xvt , ut+1 ) = C(t 0t ) + 0t ,
0
f (xvt , ut+1 ) = 0t ,
(2.10a)
(2.10b)
(2.11)
This model is very similar to the constant oset model for car pitch; the differences are that for acceleration no new states are required, and the oset due to
acceleration is not necessarily constant.
Note that the choice of using a linear relation between acceleration and stationary car pitch angle is mainly motivated by the fact that the springs, which are
part of the vehicle suspension, have a linear relation between displacement and
force (this is known as Hookes law).
The process noise variance for should not be changed when adding the car
pitch acceleration oset to the process model.
10
System Modeling
moment of inertia
Since the torque is unknown it will be represented by process noise and subsequently set to zero in
1
+ 2 + (2fn )2 = 0,
(2.13)
which is the time-continuous equation (without process noise) that the car pitch
process model will be based on.
Discretization and approximations of derivatives are performed according to
t = (tT ),
t+1 t
,
T
t+1 t
t = (tT
)
.
T
t = (tT
(2.14a)
(2.14b)
(2.14c)
The order of the linear ordinary dierential equation in (2.13) is two. This
means that a state space representation of the car pitch system requires two states;
is already a vehicle state, and is added to the vehicle state vector.
In the absence of process noise, we can identify t+1 as f (xvt , ut+1 ) and t+1
as f (xvt , ut+1 ), so (2.13) and (2.14) gives
f (xvt , ut+1 ) = t + T t ,
(
)
T
v
f (xt , ut+1 ) = 1 2
t T (2fn )2 t .
(2.15a)
(2.15b)
For the second order model of the car pitch to have the same characteristic
time (i.e. the same damping) as the rst order model in (2.8) we seek = (C).
The rst order model in (2.8) is a discretization of the continuous time model
+
1
= 0.
(2.16)
11
)
t .
(2.17)
T
.
1C
(2.18)
Using the second order model for car pitch, the process noise variance for is
lowered by an order of magnitude compared to the process noise variance when
using the basic model, making tightly connected to .
Then the process noise
variance for is set to approximately correspond to that of for when the basic
model is used, resulting in a value one or two orders of magnitude higher than the
basic process noise variance.
Roll Angle
For several reasons, such as that curves of country roads are banked, and that the
car may roll quite a bit when driving on uneven roads such as dirt roads, letting
the roll angle be constant at zero, which is done in the basic model where roll is
not a vehicle state, might be an inadequate approximation. The process model for
the combined roll angle of the car and road is given by
f (xvt , ut+1 ) = t .
(2.19)
Since the roll and pitch angles of automobiles have similiar behaviour in terms
of amplitude and natural frequency, the process noise variance for the roll angle
is set to be approximately the same as the basic model car pitch process noise
variance.
2.4
Landmark Parametrization
12
System Modeling
xljt (1),t
xl
jt (2),t
,
xlt =
..
.
xljt (Mt ),t
w
kj,t
w
j,t
,
xlj,t =
w
j,t
j,t
w
kj,t,x
w
w
kj,t
= kj,t,y
.
w
kj,t,z
State
kw
w
w
(2.20a)
(2.20b)
(2.20c)
Description
The position in world coordinates of the camera at the time when the
landmark was rst seen.
The azimuth angle of the landmark as seen from k w , relative to world
coordinate frame directions.
The elevation angle of the landmark as seen from k w , relative to world
coordinate frame directions, with positive angles towards the positive
z-axis.
The inverse depth (which is the inverse of the distance) from k w to the
landmark.
Table 2.2: This table contains short descriptions of the landmark states.
The reason for using inverse depth rather than regular distance is that we want
a more natural way to assign uncertainty to the estimates of distance. Since the
uncertainty of state estimates will be represented by normal distributions we get
higher uncertainty for large distances and lower uncertainty for small distances by
using inverse depth, and this is precisely what we want.
w
of landmark j
The landmark state xlj,t is a parametrization of the position lj,t
at time t, and the relationship between position and state, with landmark position
given in world coordinates, is
w
lj,t
w
cos w
cos j,t
j,t
1
w
w
cos w
= kj,t
+
.
j,t sin j,t
j,t
sin w
j,t
|
{z
}
mw
j,t
(2.21)
2.5
13
Measurement Model
(xvt )
vx,t
t
(
=
)
vx,t
,
vx,t
L tan t
(2.22)
where L is the wheel base of the vehicle. The landmark measurement model is
given by
(xvt , xlj,t )
Pn (pcj,t )
pcj,t
1
pcj,t,x
( c )
pj,t,y
,
pcj,t,z
pcj,t,x
= pcj,t,y = pc (xvt , xlj,t ) =
pcj,t,z
1 cb bw
w
b
=
R (R (j,t (kj,t
p t ) + mw
j,t ) j,t c ),
j,t
Rcb = R(c , c , c )T ,
{
R(t , t + t , t )T
bw
R =
R(t , t + t , 0)T
(2.23a)
(2.23b)
(2.23c)
if model contains ,
otherwise.
(2.23d)
where cb is the position of the camera in the vehicle body coordinate frame, and
Pn (pc ) is the so-called normalized pinhole projection of a point pc , which is given
in camera coordinates. Furthermore, Pn generates normalized camera coordinates
and c , c and c are the yaw, pitch and roll angles of the camera, relative to the
vehicle body coordinate frame.
Both of the two measurement noises evt and elj,t are independent and Gaussian,
according to
evt N (0, S v ),
elj,t N (0, S l ),
( vx
)
s
0
Sv =
,
0 s
( c
)
s
0
Sl =
,
0 sc
(2.24a)
(2.24b)
(2.24c)
(2.24d)
To translate
between
pixel
coordinates
and
normalized
camera coordi(
)T
l
nates y z , which is the kind of coordinates landmark measurements yj,t
are
given in, we use
14
System Modeling
( )
y
y ic
y
f
y
,
=
z
z ic
z
fz
(2.25)
(
)T
where yic zic denotes the image center (which is the intersection of the optical
axis and the image plane), and fy and fz are the focal lengths (given in pixels) in
y-direction and z-direction respectively.
Chapter 3
3.1
With xvt and xlt given by (2.6a) and (2.20a) respectively, the true state vector at
time t is given by
( v)
xt
,
(3.1)
xt =
xlt
and x
t|s is the estimate of xt using measurements up until time s {t 1, t}.
If the model and the initial values of x
and P are accurate, then we have
Pt|s = cov(xt x
t|s ). The model used is of course not perfectly accurate, but we
can still interpret Pt|s as a measure of how good x
t|s approximates xt . However,
because of the linearization done in the EKF, P tends to be underestimated, saying
that x
is less uncertain than it actually is, and this must be kept in mind if we are
looking at the actual values in P (e.g. plotting the condence bounds for landmark
positions). Pt|s can be decomposed according to
)
(
v
vl
Pt|s
Pt|s
Pt|s =
,
(3.2)
lv
l
Pt|s
Pt|s
v
vl
lv
where Pt|s
is the covariance of the vehicle states, Pt|s
and Pt|s
is the covariance
l
between the landmark states and the vehicle states, and Pt|s is the covariance of
the landmark states.
The vector ytl contains all the landmark measurements for time t, and hlt (xt )
is the corresponding measurement function; they are dened in (3.3). Remember
15
16
from Chapter 2 that the landmarks which are measured at time t have indices
jt (i), i {1, 2, . . . , Mt }, where Mt is the number of measured landmarks at time
t. We have that
yjl t (1),t
yl
jt (2),t
,
ytl =
..
.
yjl t (Mt ),t
l v l
t
jt (2),t )
.
hlt (xt ) =
..
.
hl (xvt , xljt (Mt ),t )
(3.3a)
(3.3b)
(3.4a)
(3.4b)
(3.4c)
We then get the time update (i.e. the prediction) of the states according to
x
vt|t1 = f (
xvt1|t1 , ut ),
x
lt|t1
Pt|t1 =
x
lt1|t1 ,
(
Ftv
0
(3.5a)
(
0
Ftv
Pt1|t1
I
0
0
I
)T
(
+
Qt
0
)
0
,
0
(3.5b)
(3.5c)
where Qt = Q(
xvt1|t1 ) (see (2.9)).
The next step is the measurement update, which adjusts the state predictions
from the time update, using measurements. Stl is the noise covariance for measurement ytl , so it is a block diagonal matrix with Mt number of S l matrices on its
diagonal. With S l and S v given by (2.24), the measurement update is given by
17
x
t|t = x
t|t1 + Ktv (ytv hv (
xvt|t1 )) + Ktl (ytl hlt (
xt|t1 )),
(
)
v
l l
v
Pt|t = Pt|t1 Kt Ht 0 Pt|t1 Kt Ht Pt|t1 ,
)1
(
)T ( ( v
)
(
)T
Ht 0 Pt|t1 Htv 0 + S v
Ktv = Pt|t1 Htv 0
,
(
)1
Ktl = Pt|t1 (Htl )T Htl Pt|t1 (Htl )T + S l
.
(3.6a)
(3.6b)
(3.6c)
(3.6d)
pt|t + Rwb cb
arctan y nl + t|t
y
x
nl =
arctan y nl t|t ,
z
0
{
R(t ,
t + t , t ) if model contains ,
Rwb =
R(t ,
t + t , 0) otherwise,
0
Pt|t 0 0
0
0 0
0
new
JT ,
Pt|t
=J
0
0 Sl
0
0
0 0 P0
(
I
nl
J = xnl
x
nl
0
0 0 xt|t 0 0
pt|t
x
new
t|t =
t|t
(3.7a)
(3.7b)
(3.7c)
(3.7d)
0 0
0
0
)
x
nl
y nl
x
nl
0
(3.7e)
where lines are added for better readability, at the same place in the dierent
matrices (which are of equal size). These lines are placed so they separate the old
covariance from the covariance of the new landmark state. Remember that cb is
the position of the camera in the vehicle body coordinate frame, and note that
the covariance for the new landmarks camera position state will be given by the
nl
covariance for the vehicle position in combination with pxt|t .
Due to the linearization of the non-linear landmark measurement function hl
it sometimes happens that the latest estimation of a landmarks inverse depth is
negative. This can happen to landmarks that are distant but with measurements
18
indicating that the current distance estimate is too low, so the distance is increased
by lowering the inverse depth estimation. To avoid this problem all inverse depth
estimates in x
lt|t are forced to be above a positive threshold min , thereby setting
a maximum allowed distance for landmarks. If min is small enough, making the
maximum allowed distance large enough, the measurement errors (for landmarks
further away than the maximum allowed distance) due to forcing the inverse depth
estimates is negligible.
v
Algorithm 1 describes the EKF algorithm used. P1|0
is the initial vehicle state
covariance, tf is the interval for when to search for new features, and M max is the
maximal number of visible landmarks.
Algorithm 1 Sensor fusion using EKF
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
3.2
Feature Detection
19
5:
6:
7:
8:
9:
10:
11:
12:
13:
3.3
Feature Association
3.4
The idea behind smoothing and mapping (SAM) is to simultaneously estimate the
vehicle states and the landmark states, utilizing both previous, present and future
measurements, as opposed to the EKF which cannot utilize future measurements.
This acausal estimation is performed by solving weighted least squares problems
iteratively, until desired accuracy of the converging solution is achieved.
Since the landmarks are stationary each landmark is represented by a timeindependent landmark state. The vehicle however is not stationary, so the vehicle
states for all the time steps t {1, 2, . . . , N } are included in the state vector of
the SAM algorithm. Mathematical descriptions of the states used in the SAM
algorithm can be found in (3.8).
20
6:
7:
8:
9:
10:
11:
12:
Not all landmarks from the EKF run are included in the SAM run; a threshold
for the minimum number of times a landmark has been visible is used to sort out
landmarks with few measurements. With M number of landmarks included in
the SAM run, the landmark index of landmark number i {1, 2, . . . , M } in the
SAM run is denoted j(i). The state vector x, containing both landmark states
and vehicle states, is given by
( v)
x
,
xl
v
x1
xv2
xv = . ,
..
x=
xvN
l
xj(1)
xl
j(2)
xl =
.. ,
.
xlj(M )
w
j
l
.
xj = w
j
j
(3.8a)
(3.8b)
(3.8c)
(3.8d)
The variable x
is the current estimate of x. During the rst SAM iteration
x
is based on the state estimate from the EKF. The vehicle states xvt are simply
21
collected from all the time steps of the EKF run. Landmark states xlj on the
other hand are based on the last estimation by the EKF of the landmark state in
question.
Instead of including the camera positions in the landmark states xlj we use the
vehicle state xvtc (j) , from the time tc (j) when landmark j was rst seen, to calculate
the camera position. This can be done since we assume that the camera is rmly
mounted in the car. We can recompose a landmark state using the function g,
which returns the complete landmark state (i.e. camera position, azimuth angle,
elevation angle and inverse depth) according to
)
ptc (j) + Rwb cb
,
=
xlj
{
R(tc (j) , tc (j) + tc (j) , tc (j) )
=
R(tc (j) , tc (j) + tc (j) , 0)
(
(3.9a)
if model contains ,
otherwise.
(3.9b)
Note that xlj,t , with time index, denotes the landmark state that includes camera
position, whereas xlj , without time index, denotes the landmark state that excludes camera position.
l
Since the number of landmark measurements yj,t
is not the same for every time
step, k is used to numerate the complete series of landmark measurements. We
introduce the notation jk for the index of the landmark associated with measurement k, and similarly tk for the time when measurement k was performed. Using
the just introduced notation, yjl k ,tk is the measurement we get from landmark
measurement number k.
Vehicle measurements ytv are performed once every time step, so no notation
other than the time t is required to numerate these measurements.
Linearization at x = x
of the process model and measurement model, and
using the fact that the input signal ut is non-variable (it is actually a measurement
although we treat it as an input signal) results in
x
vt + xvt = f (
xvt1 , ut ) + Ft xvt1 + wt ,
v
v v
xt ) + Htv xvt + evt ,
yt = h (
(3.10a)
(3.10b)
yjl k ,tk = hl (
xvtk , g(
xvtc (jk ) , x
ljk )) + Hkl xvtk +
+
where
Jk xljk
eljk ,tk ,
(3.10c)
(3.10d)
22
Ft =
Htv =
Hkl =
Hkl,c =
f (xvt1 , ut )
,
v
xvt1
xt1 =
xv
t1
hv (xvt )
,
xvt xv =xv
t
t
hl (xvt , g(
xvtc (jk ) , x
ljk ))
,
v v
xvt
xt =
xt
k
l v
v
l
h (
xtk , g(xtc , x
jk ))
,
v v
xvtc
xtc =
xt
hl (
xvtk , g(
xvtc (jk ) , xlj ))
Jk =
xlj
(3.11a)
(3.11b)
(3.11c)
(3.11d)
c (jk )
,
xlj =
xlj
(3.11e)
(3.12a)
(3.12b)
(3.12c)
we can form a least squares optimization expression, minimizing the sum of squared
weighted measurement and process noises, according to
x = arg min
x
= arg min
((
t
((
wt 2Q1
t
evt 2(S v )1
)
+
)
eljk ,tk 2(S l )1
)
Ft xvt1 xvt at 2Q1 + Htv xvt cvt 2(S v )1 + (3.13)
t
)
Hkl xvtk + Hkl,c xvtc (jk ) + Jk xljk clk 2(S l )1 ,
v0 , which we dont
where Qt = Q(
xvt1 ) (see (2.9)). Note that for t = 1 we need x
v
have, so we let F1 = 0, a1 = 0 and Q1 = P1|0 , which is the initial vehicle state
covariance for the EKF estimation. This way the resulting term xv1 2Q1 makes
1
sure that the smoothed state estimate x
s stays reasonably close to x
.
The vector norm P 1 used in (3.13) is the so called Mahalanobis distance,
and it is given by
e2P 1 = eT P 1 e = (P T /2 e)T (P T /2 e) = P T /2 e22 .
(3.14)
23
A problem with using the Mahalanobis distance is that Qt is singular (see Appendix B for a proof), so its inverse does not exist. To get around this problem
we add I, where > 0 is a small number, to Qt before the inversion.
T /2
T /2
(3.15)
(3.16)
In order to get good accuracy for the state estimate we repeat the SAM algorithm, using x
=x
s , until x becomes smaller than some predened threshold.
Example 3.1
To show the structure of the matrix A, an example scenario is used to illustrate
how A is composed of the Jacobians Htv , Hkl , Hkl,c and Jk , but with all the weight
matrices omitted for better visibility (the symbol used later on is to be interpreted as "roughly similar to"). The example scenario has:
N = 5 time steps.
M = 2 landmarks.
Measurements of landmark j = 1 at times t {1, 2, 3, 4}.
Measurements of landmark j = 2 at times t {3, 4, 5}.
This results in the following landmark measurements, enumerated by k
k
tk
jk
1
1
1
2
2
1
3
3
1
4
3
2
5
4
1
6
4
2
7
5
2
The structure of A, as generated by Algorithm 4, is described in (3.17). Remember that the subscript k of Hkl,c and Hkl denotes landmark measurement number, while the subscript t of Htv denotes time.
Note that:
A11 relates vehicle states to vehicle state prediction errors.
A21 relates vehicle states to measurement prediction errors.
24
(
A=
A11
A21
0
A22
I
F2
H1v
l,c
H + H1l
1
l,c
H2
H l,c
H5l,c
I
F3
I
F4
I
F5
I
J1
H2v
H2l
J2
H3v
H3l
l,c
H4 + H4l
H6l,c
H7l,c
J3
H4v
H5l
H6l
J5
H5v
H7l
J4
J6
(3.17)
J7
3.5
25
In order to derive FIR images measurements that contains much information about
the vehicle poses we want many measurements, but at the same time we want few
landmark states since also these are estimated from the measurements. A method
to increase the number of landmark measurements per landmark is described below.
A major disadvantage with association of features using normalized crosscorrelation of image patches is that change in scale is not handled. To overcome
this issue, without implementing a whole new feature representation, scaling of
the image patches that describes the features can be performed according to
Lj,t
Dj,t
Dj,t L0
=2
+ 1,
2
w
w
lj,t
kj,t
1
=
,
= w
w
wb
b
lj,t (
pt + R c )
j,t lj,t (
pt + Rwb cb )
(3.18a)
(3.18b)
where Lj,t is the size of the enlarged patch for landmark j at time t, L0 is the size
of the original patch (with size being the length of the quadratic patches sides).
pt , kw , j,t and lw are EKF estimates of states and variables described in Section
j,t
j,t
w
2.4, and Rwb is the same as in (3.7). Note that lj,t
is not a state and is therefore
not itself estimated, it is instead calculated using the estimated landmark states.
The quantity Dj,t is a scale factor which, based on the EKF estimate of landmark state and vehicle state, estimates how much larger landmark j will appear in
the image at time t, compared to when the landmark was rst seen and the image
patch describing the feature was stored. The scale factor is simply calculated as
the distance to the landmark when it was rst measured, divided by the current
distance to it.
Scaling the feature patches can, besides prolonging the life-time of landmarks,
also be expected to improve the quality of landmark measurements. The key to
this is that when scaling is not used, and we have a landmark patch which ought
to be enlarged, then the association of this patch with new image might become
oset in some direction. Consider for example the top of a spruce tree and lets
say that the feature patch describing this landmark is taken far away so that the
patch depicts the upper half of the tree. If we associate this feature patch with
an image taken at half the original distance to the tree it is likely that we succeed
in nding an association, but the patch will be associated with the top quarter of
the tree, so the measurement is higher up than it should be.
Note that Lj,t is Dj,t L0 rounded to the nearest odd integer. This is done to
make sure that also the enlarged patch has one pixel which represents the center
of the patch. Also note that Dj,t is an approximation of the real scale change, but
this approximation holds since the landmarks are small compared to the distance
to them, resulting in small angles.
26
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
Generate x
v from all x
vt|t .
Generate x
l from all x
lt|t , using the last estimate of the individual landmarks.
Exclude landmarks that has fewer measurements than a predened threshold.
Update yjl k ,tk (taken from the EKF algorithm) by removing measurements
belonging to landmarks removed in the previous step.
repeat
Initialize a and c as empty column vectors, and A21 and A22 as empty
matrices. (These variables will grow iteratively as new rows are added to
them.)
for t = 1 to N do
if t = 1 then
(
v
v)
v T /2
Set A11 = (P1|0
)
I 0n (N 1)n .
v
Set a = 0n 1 .
else
v
v)
T /2 ( nv (t2)nv
Append Qt
0
Ft I 0n (N t)n to A11 .
T /2
Append Qt
at to a.
end if
(
v
v)
Append (S v )T /2 02(t1)n Htv 02(N t)n to A21 .
Append a two rows of zeros to A22 .
Append (S v )T /2 cvt to c.
for all k : tk = t do
if t = tc (jk ) then (
v
v)
Append (S l )T /2 02(t1)n Hkl,c + Hkl 02(N t)n to A21 .
else
(
v
v
v)
(S l )T /2 02(tc (jk )1)n Hkl,c 02(ttc (jk )1)n Hkl 02(N t)n
is appended to A21 .
end if
(
)
l
l
Append (S l )T /2 02(jk 1)n Jk 02(M jk )n to A22 .
Append (S l )T /2 clk to c.
end for
end for (
)
A11
0
Let A =
.
(A
)21 A22
a
Let b =
.
c
Calculate x and x
s according to (3.15) and (3.16) respectively.
s
Set x
=x
.
for all j do
Make sure j min .
end for
until max abs(x ) is smaller than predened threshold
Chapter 4
Results
This chapter contains a description of the result presentation, followed by the
results of using SAM and vehicle model extensions, and lastly results of introducing
scale change compensation to the feature association.
4.1
Performance Measures
The results in terms of vehicle pose estimation accuracy of SAM and vehicle model
extensions are given in the form of two measures. The rst one, landmark measurement residuals, is an indirect but quantied measure of pose estimation accuracy,
whereas the second measure, trajectory plot in camera image, is direct but nonquantied and also to some extent subjective.
4.1.1
(
The root mean square, RMS(v), of a vector v = v1
v
u
N
u1
RMS(v) = t
v2 .
N i=1 i
. . . vN
)T
RN is
(4.1)
To measure how well image positions for landmarks are predicted, i.e. how
far o the predictions were from the measurements on average, the RMS(
cl ) of
l
the landmark measurement residuals c (given in pixels) is used. With fy and fz
as the focal lengths (given in pixels) in y-direction and z-direction respectively,
N
M k = t=1 Mt as the total number of landmark measurements and te (j) as the
end time for landmark j, i.e. the latest time that it was measured, cl is given
according to
27
28
Results
cl1
cl = ... ,
cl k
( M
)
fy 0
l
ck =
cl ,
0 fz k
{
xvtk |tk , x
ljk ,te (jk )|te (jk ) )
yjl k ,tk hl (
l
ck =
l
l v
yjk ,tk h (
xtk , g(
xvtc (jk ) , x
ljk ))
(4.2a)
(4.2b)
if EKF estimates are used,
if SAM estimates are used.
(4.2c)
4.1.2
Plotting the sequence of estimated vehicle positions (the trajectory) in the rst
FIR camera image provides an intuitive way to evaluate the accuracy of the pose
estimates.
The vehicle pose consists of three-dimensional position and three-dimensional
orientation, and the part of the pose sequence that the trajectory basically shows is
position and yaw angle (since yaw is tightly connected to the sequence of positions),
but not pitch angle or roll angle.
The obvious drawback of using the trajectory plot as a measure of pose estimation accuracy is that it is not quantitative but instead has to be assessed
subjectively.
4.2
29
We introduce the following abbreviations for the vehicle process model extensions
described in Section 2.3.2:
Oset: Constant oset in car pitch angle.
Acc: Acceleration oset in car pitch angle.
Roll: Roll angle.
2nd: Second order model for car pitch.
For each data sequence used to evaluate the SAM algorithm and model extensions proposed in this thesis, both EKF and SAM estimates has been generated
for all the 16 combinations of vehicle process model extensions, and all values of
RMS(
cl ) for the sequences are found in Appendix D. This section summarizes the
results found in Appendix D, and for each sequence a few vehicle trajectories are
plotted in the rst FIR camera image of the sequence. To illustrate the trajectory
and landmarks a map of estimated landmark positions and vehicle trajectory is
shown for each sequence.
It is important to note that not all sequences contains behaviour that the model
extensions are ment to handle. If for example the true car pitch oset is zero the
oset model extension cannot be expected to improve the pose estimates for that
particular sequence. The same goes for the acceleration oset in car pitch angle
for any sequence in which the vehicle moves at constant speed. To give an idea
about which model extension that are given a chance to improve the results for a
particular sequence, all sequences below are accompanied by a short description
of its pitch motion, roll motion and acceleration, but note that it is generally
unknown whether or not the car pitch has a constant oset.
30
Results
Sequence A
The car pitch and roll oscillates quite a bit during this sequence, but the acceleration is virtually zero. Table 4.1 presents the results for the mean landmark
measurement residuals, when using SAM compared to using only EKF, and when
using the dierent vehicle model extensions, compared to not using them. Figure
4.1 shows a top view of the estimated map and trajectory for Sequence A, while
Figures 4.2 and 4.3 show trajectory plots which compares the dierent vehicle
model extensions, and SAM versus EKF.
RMS(
cl )
2.71
0.51
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.1: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence A.
200
150
y [m]
100
50
50
0
500
1000
1500
x [m]
Figure 4.1: Sequence A: This gure show a map of the SAM estimates, using the
basic vehicle process model.
31
Figure 4.2: Sequence A: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.3: Sequence A: Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
32
Results
Sequence B
In this sequence there are moderate amounts of pitch and roll motion, but the
acceleration is very close to zero. Table 4.2 presents the results for the mean
landmark measurement residuals, when using SAM compared to using only EKF,
and when using the dierent vehicle model extensions, compared to not using them.
Figure 4.4 shows a top view of the estimated map and trajectory for Sequence B,
while Figures 4.5 and 4.6 show trajectory plots which compares the dierent vehicle
model extensions, and SAM versus EKF.
RMS(
cl )
1.18
0.46
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.2: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence B.
60
40
y [m]
20
0
20
40
60
100
200
300
400
500
600
700
x [m]
Figure 4.4: Sequence B: This gure show a map of the SAM estimates, using the
basic vehicle process model.
33
Figure 4.5: Sequence B: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.6: Sequence B: Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
34
Results
Sequence C
In this sequence the car pitch oscillates moderately, but roll angle oscillation is
low. However there is some roll motion due to road geometry, and the level of
acceleration is low. Table 4.3 presents the results for the mean landmark measurement residuals, when using SAM compared to using only EKF, and when using
the dierent vehicle model extensions, compared to not using them. Figure 4.7
shows a top view of the estimated map and trajectory for Sequence C, while Figures 4.8 and 4.9 show trajectory plots which compares the dierent vehicle model
extensions, and SAM versus EKF.
RMS(
cl )
3.70
0.65
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.3: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence C.
50
y [m]
50
100
150
0
200
400
600
800
1000
x [m]
Figure 4.7: Sequence C: This gure show a map of the SAM estimates, using the
basic vehicle process model.
35
Figure 4.8: Sequence C: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.9: Sequence C: Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
36
Results
Sequence D
This sequence contains some roll motion but not so much. The acceleration is
low and the pitch oscillates moderately. Table 4.4 presents the results for the
mean landmark measurement residuals, when using SAM compared to using only
EKF, and when using the dierent vehicle model extensions, compared to not
using them. Figure 4.10 shows a top view of the estimated map and trajectory
for Sequence D, while Figures 4.11 and 4.12 show trajectory plots which compares
the dierent vehicle model extensions, and SAM versus EKF.
RMS(
cl )
1.68
0.58
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.4: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence D.
80
60
40
y [m]
20
0
20
40
60
80
100
200
400
600
800
1000
x [m]
Figure 4.10: Sequence D: This gure show a map of the SAM estimates, using the
basic vehicle process model.
37
Figure 4.11: Sequence D: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.12: Sequence D: Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
38
Results
Sequence E
This sequence contains almost no acceleration, low roll motion, but the pitch
oscillates moderately. Table 4.5 presents the results for the mean landmark measurement residuals, when using SAM compared to using only EKF, and when
using the dierent vehicle model extensions, compared to not using them. Figure
4.13 shows a top view of the estimated map and trajectory for Sequence E, while
Figures 4.14 and 4.15 show trajectory plots which compares the dierent vehicle
model extensions, and SAM versus EKF.
RMS(
cl )
0.91
0.45
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.5: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence E.
40
20
y [m]
0
20
40
60
80
100
200
300
400
500
600
700
x [m]
Figure 4.13: Sequence E: This gure show a map of the SAM estimates, using the
basic vehicle process model.
39
Figure 4.14: Sequence E: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.15: Sequence E Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
40
Results
Sequence F
In this sequence both pitch and roll motion is rather extensive, and there is also
a little bit of acceleration. Table 4.6 presents the results for the mean landmark
measurement residuals, when using SAM compared to using only EKF, and when
using the dierent vehicle model extensions, compared to not using them. Figure
4.16 shows a top view of the estimated map and trajectory for Sequence F, while
Figures 4.17 and 4.18 show trajectory plots which compares the dierent vehicle
model extensions, and SAM versus EKF.
RMS(
cl )
1.56
0.52
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.6: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence F.
20
0
y [m]
20
40
60
80
100
200
400
600
800
x [m]
Figure 4.16: Sequence F: This gure show a map of the SAM estimates, using the
basic vehicle process model.
41
Figure 4.17: Sequence F: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.18: Sequence F Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
42
Results
Sequence G
This sequence exhitibs moderate roll motion and acceleration but extensive pitch
motion. Table 4.7 presents the results for the mean landmark measurement residuals, when using SAM compared to using only EKF, and when using the dierent
vehicle model extensions, compared to not using them. Figure 4.19 shows a top
view of the estimated map and trajectory for Sequence G, while Figures 4.20 and
4.21 show trajectory plots which compares the dierent vehicle model extensions,
and SAM versus EKF.
RMS(
cl )
2.78
0.55
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.7: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence G.
0
20
y [m]
40
60
80
100
120
140
100
200
300
400
500
600
700
x [m]
Figure 4.19: Sequence G: This gure show a map of the SAM estimates, using the
basic vehicle process model.
43
Figure 4.20: Sequence G: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.21: Sequence G Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
44
Results
Sequence H
This sequence exhibits a large constant oset in car pitch (resulting in that the
horizon is on average 15-20 pixels below the image center, this can be seen in
the images below), extensive roll motion, moderate acceleration, but not much
car pitch oscillation. Table 4.8 presents the results for the mean landmark measurement residuals, when using SAM compared to using only EKF, and when
using the dierent vehicle model extensions, compared to not using them. Figure
4.22 shows a top view of the estimated map and trajectory for Sequence H, while
Figures 4.23 and 4.24 show trajectory plots which compares the dierent vehicle
model extensions, and SAM versus EKF.
Method
EKF
SAM
RMS(
cl )
4.23
0.70
RMS(
cl )
With model
Without model
Table 4.8: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence H.
20
15
y [m]
10
5
0
5
10
0
50
100
150
200
250
300
350
x [m]
Figure 4.22: Sequence H: This gure show a map of the SAM estimates, using the
basic vehicle process model.
45
Figure 4.23: Sequence H: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.24: Sequence H Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
46
Results
Sequence I
In this sequence pitch and roll motion is moderate, and there is some acceleration.
Table 4.9 presents the results for the mean landmark measurement residuals, when
using SAM compared to using only EKF, and when using the dierent vehicle
model extensions, compared to not using them. Figure 4.25 shows a top view of
the estimated map and trajectory for Sequence I, while Figures 4.26 and 4.27 show
trajectory plots which compares the dierent vehicle model extensions, and SAM
versus EKF.
RMS(
cl )
2.16
0.44
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.9: These tables show the eect on the mean landmark measurement residuals of using SAM and vehicle model extensions for Sequence I.
80
60
40
y [m]
20
0
20
40
60
0
200
400
600
800
x [m]
Figure 4.25: Sequence I: This gure show a map of the SAM estimates, using the
basic vehicle process model.
47
Figure 4.26: Sequence I: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.27: Sequence I Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
48
Results
Sequence J
This sequence contains extensive acceleration and moderate roll and pitch motion.
Table 4.10 presents the results for the mean landmark measurement residuals,
when using SAM compared to using only EKF, and when using the dierent
vehicle model extensions, compared to not using them. Figure 4.28 shows a top
view of the estimated map and trajectory for Sequence J, while Figures 4.29 and
4.30 show trajectory plots which compares the dierent vehicle model extensions,
and SAM versus EKF.
RMS(
cl )
3.33
0.68
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.10: These tables show the eect on the mean landmark measurement
residuals of using SAM and vehicle model extensions for Sequence J.
20
10
y [m]
10
20
30
0
100
200
300
400
x [m]
Figure 4.28: Sequence J: This gure show a map of the SAM estimates, using the
basic vehicle process model.
49
Figure 4.29: Sequence J: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.30: Sequence J Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
50
Results
Sequence K
This sequence exhibits moderate roll motion, and extensive acceleration and pitch
motion. Table 4.11 presents the results for the mean landmark measurement
residuals, when using SAM compared to using only EKF, and when using the
dierent vehicle model extensions, compared to not using them. Figure 4.31 shows
a top view of the estimated map and trajectory for Sequence K, while Figures
4.32 and 4.33 show trajectory plots which compares the dierent vehicle model
extensions, and SAM versus EKF.
RMS(
cl )
3.57
0.75
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.11: These tables show the eect on the mean landmark measurement
residuals of using SAM and vehicle model extensions for Sequence K.
15
10
5
y [m]
0
5
10
15
20
25
30
0
50
100
150
200
250
300
x [m]
Figure 4.31: Sequence K: This gure show a map of the SAM estimates, using the
basic vehicle process model.
51
Figure 4.32: Sequence K: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.33: Sequence K Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
52
Results
Sequence L
In this sequence both the acceleration and roll motion are extensive, but the pitch
oscillation is moderate. Table 4.12 presents the results for the mean landmark
measurement residuals, when using SAM compared to using only EKF, and when
using the dierent vehicle model extensions, compared to not using them. Figure
4.34 shows a top view of the estimated map and trajectory for Sequence L, while
Figures 4.35 and 4.36 show trajectory plots which compares the dierent vehicle
model extensions, and SAM versus EKF.
RMS(
cl )
2.42
0.66
Method
EKF
SAM
RMS(
cl )
With model
Without model
Table 4.12: These tables show the eect on the mean landmark measurement
residuals of using SAM and vehicle model extensions for Sequence L.
0
20
y [m]
40
60
80
100
120
140
200
400
600
800
x [m]
Figure 4.34: Sequence L: This gure show a map of the SAM estimates, using the
basic vehicle process model.
53
Figure 4.35: Sequence L: Trajectory for EKF estimates (dashdotted) and SAM
estimates (solid), using the basic vehicle process model.
Figure 4.36: Sequence L Trajectory for SAM estimates, using the models oset
(dotted), acc (dashed), roll (dashdotted) and 2nd (solid).
54
Results
Smoothed Trajectory
Figure 4.37 shows a part of the trajectory of Sequence A, where it clearly can be
seen that the EKF estimated trajectory is rather rough while the SAM estimated
trajectory is smooth.
Figure 4.37: Sequence A: Part of the trajectory for EKF estimates (dashdotted)
and SAM estimates (solid), using the basic vehicle process model.
4.3
The feature patch scaling described in Section 3.5 is intended to improve the quality of the landmark measurements and to increase the life-time of the landmarks,
with life-time being the number of times a landmark has been visible (measured),
resulting in more measurements per landmark state.
For a number of sequences measures of life-time has been generated, both with
and without the patch scaling, using the basic vehicle process model, and these
results are presented in table 4.13. The results are ambiguous, showing both
improvement and degradation of the mean landmark life-time.
Mean life-time
Sequence A
Sequence B
Sequence C
Sequence D
Sequence E
Sequence F
Sequence G
Sequence H
Sequence I
Sequence J
Sequence K
Sequence L
Scaling
24.9
27.0
47.7
33.5
32.1
20.8
22.7
32.1
40.4
35.5
37.5
14.2
55
No scaling
26.6
22.6
41.2
32.8
26.9
17.6
23.2
33.2
37.7
36.0
49.1
14.2
Table 4.13: Life-time of landmarks, with and without scaling of feature patches.
Chapter 5
Concluding Remarks
This work was aimed at improving the accuracy of vehicle pose estimates, and
this chapter contains conclusions about how the dierent investigated methods
performed, based on the results in the previous chapter, and a discussion of areas
which might be interesting to investigate in order to further improve the pose
estimation accuracy.
It should be noted that this work is in no way exclusive to or dependent on
FIR images, other kind of images such as NIR images or visual images may in
principle be used instead of FIR images.
5.1
Conclusions
This section contains the conclusions, based on the results, of this work.
5.1.1
SAM
It is clear from the sequences presented in Chapter 4 that the SAM algorithm outperforms the EKF algoritm in terms of pose estimation accuracy. This conclusion
is supported both by the landmark measurement residual measure, and by the
trajectory plots, which in the cases where the EKF estimates are clearly erroneous
the SAM estimates are considerably better.
Another aspect of SAM is that the estimated trajectory is smooth, as expected
(remember that SAM literally means smoothing and mapping), while the EKF
generates estimates that are generally much less smooth, especially when we get
erroneous association of landmarks. An example of this is found in gure 4.37.
5.1.2
As can be seen in the results, the process model extensions are not able to improve
the pose estimates as much as the SAM algorithm. They do however result in some
improvements, and a summary of the results along with conclusions are given in
this section.
57
58
Concluding Remarks
5.1.3
The results show that introducing scale change correction for association of feature
patches most of the time gives landmarks with basically unchanged or slightly
59
longer life-time, but the results of sequence K shows that the scaling might decrease
the life-time, although the life-time with scaling was still high for this sequence,
so it is possible that the risk of shortened life-time is not so big for sequence in
which the life-time is low when scaling is not used. Another possibility for that
sequence is that when no scaling was used there were some long-lived landmarks
with erroneous association, and introducing scaling benecially decreased the lifetime of these landmarks by letting only the correct associations remain.
5.2
Future Work
During this work it has become more and more clear that the problem with erroneous association of features is an important issue to be dealt with, since the
measurements are the foundation on which SAM and model extensions relies. The
natural way to improve landmark association is to use some other feature descriptor, and possibly also change the feature detection algorithm, but a straightforward
approach of reducing the rate of erroneous feature association is to add logic in
the SAM algorithm that removes unstable landmarks. Such an approach might
also result in robustness to moving objects, at least to some extent. Another way
to improve the measurements derived from the camera images is to extract image
measurements of yaw, pitch and roll change rates that are not based on landmarks,
but instead uses phase correlation or some similar method.
Bibliography
[1] F. Dellaert and M. Kaess. Square root SAM: Simultaneous localization and
mapping via square root information smoothing. International Journal of
Robotics Research, 25(12):11811203, 2006.
[2] E.D. Dickmanns. Dynamic Vision for Perception and Control of Motion.
Springer, Secaucus, NJ, USA, 2007.
[3] C. Harris and M. Stephens. A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, pages 147151, Manchester, UK,
August 1988.
[4] M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental Smoothing
and Mapping. IEEE Transactions on Robotics, 24(6):13651378, 2008.
[5] Q. Lin, F. Tjrnstrm, J. Roll, and B. Wass. Developing a far infrared based
night-vision system with pedestrian detection. VDI Berichte, (2038):153158,
2008.
[6] D.G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91110, 2004.
[7] Y. Ma, S. Soatto, J. Kosecka, and S.S. Sastry. An Invitation to 3-D Vision:
From Images to Geometric Models. Springer, 2004.
[8] J.M.M. Montiel, J. Civera, and A.J. Davison. Unied inverse depth
parametrization for monocular SLAM. In Proceedings of Robotics: Science
and Systems (RSS), Philadelphia, USA, August 2006.
[9] T.B. Schn and J. Roll. Ego-motion and indirect road geometry estimation
using night vision. In IEEE Intelligent Vehicles Symposium, Proceedings,
pages 3035, Xian, Shaanxi, China, June 2009.
[10] G. Sibley, C. Mei, I. Reid, and P. Newman. Adaptive Relative Bundle Adjustment. In Proceedings of Robotics: Science and Systems (RSS), Seattle,
USA, June 2009.
[11] Z. Sjanic, M. Skoglund, T. B. Schn, and F. Gustafsson. Solving the SLAM
Problem for Unmanned Aerial Vehicles Using Smoothed Estimates. In Proceedings of the Reglermte (Swedish Control Conference), Lund, Sweden,
June 2010.
61
62
Bibliography
[12] H. Strasdat, J.M.M. Montiel, and A.J. Davison. Real-time Monocular SLAM:
Why Filter?
In Proceedings of the IEEE International Conference on
Robotics and Automation (ICRA), Anchorage, Alaska, USA, May 2010.
[13] O. Tsimhoni, J. Brgman, and M.J. Flannagan. Pedestrian detection with
near and far infrared night vision enhancement. LEUKOS - Journal of Illuminating Engineering Society of North America, 4(2):113128, 2007.
Appendix A
Nomenclature
A.1
Mathematical Notations
f
N (, )
v2
arg minf (x)
time derivative df
dt of function f
normal distribution with mean and covariance matrix
2-norm of vector v
argument which minimizes the function f
max abs(v)
x
I nn
A.2
EKF
FIR
NCC
NIR
RMS
SAM
SLAM
A.3
kw
w
w
Abbreviations
extended Kalman lter
far infrared
normalized cross-correlation
near infrared
root mean square
smoothing and mapping
simultaneous localization and mapping
Landmark States
initial camera position in three dimensions
azimuth angle
elevation angle
inverse depth
63
64
A.4
p
vx
A.5
a
A
c
cb
C
e
f
F
g
h
H
j
J
k
l
L
m
M
N
P
Pn
Q
R
S
t
T
u
w
x
y
Nomenclature
Vehicle States
position in three dimensions
velocity of the vehicle in its forward direction
yaw angle
front wheel angle
road pitch angle
vehicle pitch angle
vehicle pitch angle oset
vehicle pitch angular acceleration
roll angle
A.6 Indices
A.6
b
c
j
k
l
t
v
w
x
y
z
A.7
x
Indices
vehicle body coordinate frame
camera, camera coordinate frame
landmark index
measurement number
landmark
time
vehicle
world coordinate frame
x-coordinate
y-coordinate
z-coordinate
Top Notations
state estimate from EKF algorithm
state estimate from SAM algorithm
landmark measurements given in pixel coordinates
65
Appendix B
nv m
Q R
m < nv ,
mm
nv
(B.1a)
(B.1b)
(B.1c)
(B.1d)
which results in
rank Qt = rank(B(xvt )Q B(xvt )T )
min(rank(B(xvt )), rank(Q B(xvt )T ))
/
/
rank(B(xvt )) min(nv , m) = m
(B.2)
m.
Finally, rank Qt < nv since m < nv , and we have that Qt is not of full rank, hence
it is singular.
66
Appendix C
Derivation of Landmark
Measurement Residual Limit
Suppose that the undiscretized landmark measurement y (given in pixels) is noise
free and unbiased, and that we have the prediction yp = y. The measurement
ym is y rounded to the nearest integer, resulting in the landmark measurement
residual cl (
y ) according to
cl (
y ) = ym yp = round(
y ) y =
y + 0.5 y.
(C.1)
Note that we are considering one scalar component of a single measurement, i.e.
we are looking at either the y-component or the z-component of the measurement.
The residual is periodic, so the RMS of it can be calculated over one period,
and we assume that y is uniformly distributed.
v
u
u
u
l
RMS(
c (
y )) = t
1
0.5 (0.5)
0.5
v
u 0.5
u
u
cl (
y )2 d
y=t
(0 y)2 d
y=
0.5
0.5
v
[ ]
u 0.5
(
)
u
0.5
1
y3
1 1
u
2
y d
y=
=
+
=
=t
3 0.5
3 23
23
(C.2)
0.5
1
= 0.29.
2 3
The conclusion from this is that regardless of how good the landmark measurement predictions are (as a result of an accurate vehicle process model), RMS(
cl )
cannot be expected to be below approximately 0.29. (Note however that this limit
is not strict in a mathematical sense, since by chance we might get samples of
y resulting in a RMS(
cl ) smaller than the limit, but the probability for this to
happen decreases rapidly as the number of measurements increases.)
67
Appendix D
Results
Tables D.1-D.12 show the complete results in terms of landmark measurement
residuals for all the sequences presented in the results chapter.
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
2.50
0.55
2.91
0.54
2.49
0.55
2.73
0.55
2.54
0.53
2.90
0.54
2.88
0.50
2.88
0.53
2.47
0.49
2.53
0.53
2.65
0.40
2.88
0.49
2.91
0.56
2.78
0.50
2.56
0.39
2.77
0.52
68
69
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
1.02
0.49
1.39
0.52
1.00
0.49
0.94
0.41
0.90
0.49
1.37
0.51
1.36
0.39
1.34
0.50
0.95
0.41
0.99
0.51
1.10
0.45
1.36
0.39
1.34
0.50
1.31
0.39
1.20
0.44
1.31
0.39
RMS(
cl )
EKF SAM
3.02
0.71
6.02
0.62
3.14
0.83
3.45
0.65
3.18
0.71
4.59
0.61
2.82
0.60
5.09
0.63
3.46
0.65
2.77
0.74
3.35
0.53
2.87
0.58
5.13
0.60
3.78
0.63
3.07
0.69
3.47
0.62
70
Results
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
1.44
0.66
1.81
0.56
1.34
0.64
1.93
0.55
1.08
0.65
1.82
0.56
1.64
0.50
1.69
0.57
1.90
0.55
1.09
0.66
1.91
0.55
1.65
0.50
1.74
0.57
1.57
0.51
1.97
0.65
2.37
0.56
RMS(
cl )
EKF SAM
0.99
0.47
0.86
0.44
0.97
0.47
0.95
0.45
0.98
0.47
0.86
0.44
0.90
0.45
0.82
0.43
0.93
0.45
0.97
0.47
0.94
0.45
0.88
0.42
0.83
0.43
0.84
0.44
0.92
0.45
0.85
0.44
71
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
1.46
0.55
1.74
0.53
1.48
0.53
1.47
0.48
1.37
0.56
1.84
0.53
1.61
0.49
1.64
0.52
1.47
0.48
1.31
0.64
1.39
0.49
1.69
0.49
1.80
0.53
1.57
0.49
1.40
0.48
1.69
0.49
RMS(
cl )
EKF SAM
1.95
0.53
3.43
0.54
2.04
0.66
2.03
0.50
3.00
0.61
3.50
0.54
2.75
0.54
4.72
0.55
1.80
0.52
2.17
0.65
2.51
0.53
3.10
0.48
3.78
0.57
2.47
0.51
2.11
0.53
3.12
0.55
72
Results
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
3.91
0.75
4.49
0.67
4.20
0.87
3.91
0.87
3.48
0.97
5.89
0.65
3.50
0.42
4.20
0.65
3.18
0.48
3.63
0.95
3.63
0.69
3.01
0.41
5.59
0.80
5.65
0.68
3.47
0.64
5.86
0.68
RMS(
cl )
EKF SAM
2.10
0.51
2.66
0.55
2.19
0.51
1.87
0.36
2.12
0.53
2.35
0.53
1.83
0.36
2.35
0.42
1.66
0.36
1.97
0.43
2.91
0.53
1.81
0.36
2.58
0.43
1.76
0.36
2.70
0.46
1.74
0.35
73
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
2.84
0.75
4.04
0.73
2.83
0.69
2.99
0.72
2.58
0.77
3.49
0.67
3.70
0.57
3.96
0.67
3.00
0.65
2.96
0.73
2.93
0.83
3.65
0.63
3.73
0.64
4.25
0.61
2.41
0.66
3.85
0.56
RMS(
cl )
EKF SAM
4.26
0.84
7.54
0.72
3.41
0.75
3.16
0.72
3.14
0.88
3.64
0.67
3.25
0.63
2.96
0.81
3.24
0.66
3.91
1.08
3.33
0.87
3.10
0.59
2.81
0.76
3.48
0.59
2.58
0.75
3.24
0.66
74
Results
Vehicle process model
Oset Acc Roll 2nd
RMS(
cl )
EKF SAM
2.53
0.68
2.82
0.68
2.59
0.68
1.93
0.60
2.02
0.77
2.57
0.61
2.15
0.67
4.19
1.29
1.78
0.59
2.02
0.66
1.83
0.56
2.30
0.57
2.87
0.66
2.53
0.52
2.24
0.50
2.38
0.53