IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 2, NO. 2, APRIL 2017
GALLEGO AND SCARAMUZZA: ACCURATE ANGULAR VELOCITY ESTIMATION WITH AN EVENT CAMERA
I. INTRODUCTION
EVENT cameras [1] are biologically inspired sensors that overcome some of the limitations of traditional cameras: they have a very fast response (in the order of microseconds), very high dynamic range, and require low power and bandwidth. These advantages come from their fundamentally different principle of operation: they have independent pixels that sense and asynchronously transmit brightness changes (called "events"). Hence, their output is not a sequence of frames at a fixed rate but rather a spatially sparse, asynchronous stream of events. Event cameras offer great potential for high-speed robotics and applications with large illumination variations. However, new methods have to be designed to cope with their unconventional output.

In this paper we are interested in unlocking the high-speed capabilities of the sensor to estimate ego-motion. In particular, we focus on the restricted but important case of 3D rotational motions (i.e., estimating the angular velocity of the camera). Orientation estimation, besides being an important topic on its own, is a recurrent topic in visual odometry scenarios, where the camera might move with negligible translation with respect to the depth of the scene, potentially causing a breakdown of the system if the 3D map used for localization falls out of the field of view of the camera. Orientation estimation also finds applications in camera stabilization [2] and in panoramic image creation [3].

Contribution: This paper presents a novel method to estimate the 3D rotational motion of an event camera. The method aligns events corresponding to the same scene edge by maximizing the strength of edges obtained by aggregating motion-warped events. Our method works directly on the event stream; hence, unlike other approaches, it does not require the estimation of intermediate quantities such as image intensity or optical flow. Besides the notable accuracy and robustness of the proposed method, its most interesting insight is that it admits an intuitive formulation in terms of contrast maximization (Fig. 1), and that contrast is a basic signal statistic with broad applicability. Thus, the method carries a new design philosophy for event-based algorithms.

Manuscript received September 10, 2016; revised December 6, 2016; accepted December 27, 2016. Date of publication January 4, 2017; date of current version January 26, 2017. This paper was recommended for publication by Associate Editor E. M. Mouaddib and Editor F. Chaumette upon evaluation of the reviewers' comments. This work was supported in part by the DARPA FLA Program, in part by the National Centre of Competence in Research Robotics through the Swiss National Science Foundation, in part by the SNSF-ERC Starting Grant, and in part by the UZH Forschungskredit. The authors are with the Robotics and Perception Group, University of Zurich, Zurich 8050, Switzerland (e-mail: guillermo.gallego@ifi.uzh.ch; davide.scaramuzza@ieee.org). Digital Object Identifier 10.1109/LRA.2016.2647639

II. RELATED WORK

There are few works on 3D orientation estimation with event cameras. This may be due to the following facts: research is dominated by standard (frame-based) cameras, event cameras have been commercially available only recently [1], and they are still expensive sensors since they are at an early stage of development.

A generic message-passing algorithm within an interacting network to jointly estimate several quantities (called "maps"), such as rotational ego-motion, image intensity, and optical flow, from a stream of events was proposed by Cook et al. [4]. The algorithm is not a traditional, feed-forward pipeline but can be interpreted as a joint estimation of optical flow and image intensity from the event stream while, at the same time, enforcing that the resulting quantities (e.g., the optical flow field) are consistent with a global constraint: the estimated motion must be rotational.

More recently, Kim et al. [3] presented a parallel tracking-and-mapping filter-based system that estimated the 3D orientation of an event camera while generating high-resolution panoramas of natural scenes. The tracking thread estimated rotational motion by means of a particle filter using the event stream and a given intensity image (the panorama).

Conradt [5] presented a simple algorithm to extract optical flow information from event cameras, and, as an application, showed that it can be used for ego-motion estimation. He first computed the optical flow and then calculated the (3D) angular velocity that best explained such a 2D flow.

All previous works require an auxiliary variable such as optical flow or image intensity to estimate angular velocity. For example, [4] and [5] estimate angular velocity given, or together with, optical flow, whereas [3] requires an intensity image to provide a likelihood for the events undergoing a candidate rotational motion. In contrast, our method shows that angular velocity can be estimated directly, without having to reconstruct image intensity or optical flow.

III. METHODOLOGY

A. Intuitive Explanation of the Approach

Event cameras have independent pixels that respond asynchronously to brightness changes, which are due to moving edges, i.e., intensity gradients, in the image plane. Thus, these cameras output a sequence of asynchronous "events" [see Fig. 2(a)]. Each event is described by a tuple e_k = (x_k, y_k, t_k, p_k), where x_k = (x_k, y_k)^T and t_k are the spatio-temporal coordinates of the brightness change, and the binary event polarity p_k = ±1 is the sign of the brightness change [the color of the dots in Fig. 2(a)]. Events are time-stamped with microsecond resolution (t_k).

Fig. 2(a) shows the output of a rotating event camera over a small time interval. Looking at the events only (i.e., omitting the overlaid trajectories), it seems that the information of the moving edges that triggered the events is unintelligible. In the example, the edges moved approximately along linear trajectories in the space-time volume of the image plane [see Fig. 2(a)], and it is only when the events are observed along such trajectories that the edge structure is revealed [see Fig. 2(b)]. Moreover, Fig. 2 provides a key observation: the events along a trajectory are triggered by the same scene edge (they are corresponding events) and they all have the same polarity.(1) Thus, we can use the event polarities along trajectories to analyze the edge structure, and therefore, reveal the unknown camera motion. In particular, we just consider the sum of the polarities along each trajectory, with as many trajectories as pixels in the image plane. If we naively sum the event polarities pixelwise (along trajectories parallel to the time axis), we will generate an event "image" showing the trace of the edges in the scene as they moved through the image plane [see Fig. 1(a)]. Observe that this is analogous to the motion-blur effect in standard cameras, caused by large exposure times [see Fig. 3(a)]. The shapes of such traces provide visual cues of the motion that caused them, and once such a motion has been estimated, usually represented by a Point Spread Function (PSF), as shown in the bottom-left of Fig. 3(a), a sharp image can be obtained from the blurred one by compensating for the motion, a process known as deconvolution [see Fig. 3(b)]. Similarly, if we are able to estimate the motion of an event camera, e.g., by searching for the trajectories that satisfy the above-mentioned property of corresponding events, we may compensate for it. The resulting event image, obtained by summing the event polarities along the pixel trajectories induced by the true camera motion, does not suffer from accumulation blur [see Fig. 1(b)], and consequently, has stronger edges than those of the uncompensated one [see Fig. 1(a)]. Hence, a strategy to estimate the ego-motion of the camera is to search for the motion and scene parameters that maximize the strength of the motion-compensated edges. In the case of rotational motions the problem simplifies, since no scene parameters such as depth are needed to represent the trajectories in the image plane; the problem solely depends on the motion parameters (the angular velocity of the camera).

Here we present a method that exploits the previous ideas to estimate the motion undergone by a rotating event camera, namely by measuring the edge strength using image contrast; therefore, our method can be interpreted as motion estimation by contrast maximization. Fig. 4 summarizes our approach, which we describe in the next sections. First, we show how to create an event image (Section III-B), how to displace events according to 3D rotational motion (Section III-C), and how to measure the strength of the (motion-compensated) edges (Section III-D). Then, we discuss the maximization strategy and how to process an entire event stream (Section III-E).

(1) This holds except when the motion changes direction abruptly, which can be detected since the camera triggers no events while it is at rest between both states.
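The naive pixelwise accumulation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the event tuples and image size are made up for the example.

```python
def event_image(events, width, height):
    """Accumulate event polarities pixelwise into an 'event image'.

    events: iterable of (x, y, t, p) tuples, with polarity p = +1 or -1.
    Summing along trajectories parallel to the time axis (i.e., ignoring t)
    reproduces the naive, motion-blurred accumulation of Fig. 1(a).
    """
    img = [[0] * width for _ in range(height)]
    for x, y, t, p in events:
        if 0 <= x < width and 0 <= y < height:
            img[y][x] += p
    return img

# Toy stream: an edge seen three times at pixel (5, 2), plus one negative event.
events = [(5, 2, 0.001, +1), (5, 2, 0.002, +1), (5, 2, 0.003, +1),
          (7, 1, 0.004, -1)]
img = event_image(events, width=10, height=4)
print(img[2][5])  # corresponding events of equal polarity reinforce: 3
print(img[1][7])  # -1
```

Corresponding events of the same polarity reinforce each other in the sum, which is exactly what the contrast measure of Section III-D rewards.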
Fig. 4. Block diagram of the method: events from the camera are rotated
according to a candidate angular velocity ω, which is iteratively refined by a
maximization algorithm on the contrast of the images of rotated events I(x; ω).
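The loop of Fig. 4 can be made concrete with a deliberately simplified toy: a 1D scene with two edges translating at a constant (unknown) velocity, a brute-force search over candidate velocities standing in for the conjugate-gradient step, and variance as the contrast. The scene, warp, and search grid below are all invented for illustration; the paper's actual warp is a 3D rotation W(x; ω, t), not a translation.

```python
def variance(img):
    """Contrast of an event image: the variance of its values (cf. (5))."""
    n = len(img)
    mean = sum(img) / n
    return sum((v - mean) ** 2 for v in img) / n

def warped_image(events, v, width=41):
    """Warp each event back along a candidate velocity v, then sum polarities."""
    img = [0] * width
    for x, t, p in events:
        xw = round(x - v * t)          # motion-compensated position
        if 0 <= xw < width:
            img[xw] += p
    return img

# Toy stream: two edges (at x = 10 and x = 20) moving at v_true = 3 px/unit time.
v_true = 3.0
events = [(round(x0 + v_true * k / 10), k / 10, +1)
          for x0 in (10, 20) for k in range(10)]

# Contrast maximization: the candidate that re-aligns the events wins.
candidates = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
best_v = max(candidates, key=lambda v: variance(warped_image(events, v)))
print(best_v)  # 3.0
```

The correct candidate stacks all events of each edge into a single bin, maximizing the variance of the warped-event image; wrong candidates smear the events and lower the contrast.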
D. Measuring the Edge Strength of an Event Image

The goal of our ego-motion method is to use I(x; ω) to estimate the ω that aligns all corresponding events (those that were triggered by the same scene point) to the same (un-rotated) image point, thus effectively removing the accumulation blur. Given a subset of events, we cast the ego-motion estimation problem into an optimization one: obtain the angular velocity ω that optimizes some distinctive characteristic of motion-compensated event images. But what are such distinctive characteristics? Drawing an analogy with standard cameras, motion-compensated intensity images look sharper and have higher contrast than the blurred (uncompensated) ones, as shown in Fig. 3(b). This is intuitive, since blur is given by the convolution of the original (sharp) image with a low-pass filter, and restoration consists in inverting such an operation, that is, high-pass filtering. Sharpening is nothing but increasing the contrast along the edges, making the light side of the edge lighter and the dark side of the edge darker.

In the same way, distinctive characteristics of motion-compensated event images are that they look sharper and have higher contrast than uncompensated ones [cf. Figs. 5(a) and 5(b)]. In fact, sharpness and contrast are related. Hence, we will use contrast to measure the quality of an event image. In general, contrast quantifies the amount by which the oscillation (or difference) of a signal stands out from the average value (or background). Several contrast metrics are available (see [8]). The Weber contrast is defined locally, as C_W ≐ (I − I_b)/I_b, with I and I_b representing the uniform intensities of a small image object and its large adjacent background, respectively. The Michelson contrast [9], defined as C_M ≐ (I_max − I_min)/(I_max + I_min), with I_max and I_min representing the highest and lowest intensities, is suitable for images with periodic patterns where there is no large area of uniform intensity. We measure the contrast of an image by means of its variance, as defined in Appendix A,

$$\mathrm{Var}\big(I(\boldsymbol{\omega}, E)\big) \doteq \frac{1}{|\Omega|} \int_{\Omega} \big( I(\boldsymbol{\omega}, E)(\mathbf{x}) - \mu(I(\boldsymbol{\omega}, E)) \big)^2 \, d\mathbf{x}, \qquad (5)$$

which is a measure of the spread or concentration of the image values around the mean intensity; it does not depend on the spatial distribution of contrast in the image. Alternative contrast metrics, such as the RMS (Appendix A) or different p-norms, C_p ∝ ∫_Ω |I(ω, E)(x) − μ(I(ω, E))|^p dx, with p ≥ 1, are also possible. We opt for the variance (the 2-norm in (5)) since it performs better than other metrics.

Since event images add the polarity of the events, which are caused by scene edges, the contrast of the event image measures the strength of the edges. Corresponding events (Section III-A) have the same polarity, so a candidate ω that aligns corresponding events (i.e., compensates for motion) will sum their polarities, producing stronger edges, and therefore, increasing the contrast.

E. Ego-Motion Estimation by Contrast Maximization

The contrast (5) of the image of rotated events I(ω, E) (4) provides a measure of the goodness of fit between the event data E and a candidate angular velocity ω. Hence, we can use it in the above-mentioned optimization framework (Fig. 4): by maximizing the contrast (i.e., quality) of the image of rotated events, we will estimate the motion parameters that best compensate for the rotational motion of the camera, i.e., those that best describe the ego-motion.

The contrast (5) is a non-linear function of the unknown variable ω. It is unlikely that a closed-form solution to the contrast maximization problem

$$\max_{\boldsymbol{\omega}} \mathrm{Var}\big(I(\boldsymbol{\omega}, E)\big)$$

exists. Therefore, we use standard iterative non-linear algorithms to optimize the contrast. In particular, we use the non-linear conjugate gradient (CG) method by Fletcher and Reeves [10], CG-FR.

To process an entire stream of events, we use a temporal observation window consisting of a subset of events E_m. We process the subset (i.e., maximize contrast) and then shift the window, thus selecting more recent events. The angular velocity estimated using E_m provides an initial guess for the angular velocity of the next subset, E_{m+1}, thus effectively assuming a constant-velocity motion model. This scheme works very well in practice (in spite of the local convergence properties of standard optimization algorithms) since each subset E_m usually spans a very small time interval, and therefore, the angular velocity does not change significantly between consecutive event subsets.

The answer to the question of how to choose the number of events in each subset E_m and how to shift the window is application-dependent: the two principal strategies consist of using a fixed time interval Δt and shift Δt, or using a fixed number of events per subset N and per shift N. The first one might be the choice of applications that must provide angular velocity estimates at a fixed rate. Since event cameras are data-driven sensors, whose output depends on the amount of apparent motion, we opt for the second strategy (fixed number of events) because it preserves the data-driven nature of event cameras: the rate of ego-motion estimates will be proportional to the event rate, that is, to the apparent motion of the scene.

IV. ALGORITHM DETAILS

This section describes details of the proposed method. The reader not interested in the details can jump to Section V.

An efficient implementation of the method requires providing to the optimization algorithm not only the contrast but also its gradient with respect to ω. For completeness, such formulas are given in Appendix B.

Formula (4) is an idealized description of the image of rotated events. In practice, a digital image is synthesized, so the image domain Ω has to be discretized into pixels and the two-dimensional Dirac delta has to be replaced by a suitable approximation, as in forward mapping of spatial transformations [11, ch. 3]. The simplest one consists in a single-pixel update: the rotated event at point x_k(ω) only updates the value of the accumulated polarity at the nearest pixel. However, this is a crude approximation that produces undesirable rounding effects ("aliasing," in the terminology of line rasterization in Computer Graphics).
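A standard alternative to the single-pixel update is bilinear voting: the polarity of a warped event with non-integer coordinates is split among the four surrounding pixels, weighted by proximity. The sketch below is one common realization of such forward mapping, written for illustration; it is not the paper's own code.

```python
import math

def bilinear_vote(img, xf, yf, p):
    """Accumulate polarity p at non-integer position (xf, yf) by bilinear voting.

    The vote is split among the four surrounding pixels, weighted by proximity,
    which avoids the rounding ('aliasing') artifacts of a nearest-pixel update.
    """
    x0, y0 = int(math.floor(xf)), int(math.floor(yf))
    ax, ay = xf - x0, yf - y0
    for dx, dy, w in ((0, 0, (1 - ax) * (1 - ay)),
                      (1, 0, ax * (1 - ay)),
                      (0, 1, (1 - ax) * ay),
                      (1, 1, ax * ay)):
        xi, yi = x0 + dx, y0 + dy
        if 0 <= xi < len(img[0]) and 0 <= yi < len(img):
            img[yi][xi] += p * w

img = [[0.0] * 4 for _ in range(4)]
bilinear_vote(img, 1.25, 2.0, +1)
print(img[2][1], img[2][2])  # 0.75 0.25
```

The four weights sum to one, so the total accumulated polarity is preserved regardless of where the warped event lands within a pixel.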
Fig. 7-top shows a segment of 8 seconds duration, with rotations dominantly around each axis of the event camera: first pan, then tilt, and, finally, roll. Fig. 7-bottom shows a 4 s segment at the end of the sequence with the largest combined rotations in all 3 axes.

The plots for the same poster sequence comparing our method against the IMU are very similar, and, therefore, are not included for brevity. Instead, we provide box plots with statistics about the errors of our method against both mocap and IMU in Fig. 8. Since the sequence presents increasing motion, we split the analysis of the statistics into intervals, each of which lasts 15 seconds. Recall that the bottom and top of a box are the first and third quartiles, and the line inside the box is the second quartile (i.e., the median). The markers outside the box are the minimum and maximum. We observe the increasing trend of the motion speed also in the error box plots, suggesting that there is a relative dependency: the larger the motion, the larger the error can be. However, note that the errors are small relative to the "size" of the motion: we report standard and maximum deviations of approximately 20 and 80 deg/s, respectively, with respect to peak excursions close to 1000 deg/s, which translate into 2% and 8%, respectively.

Fig. 8. Poster sequence. Error box plots, by intervals of 15 s. Left: Estimated versus mocap. Right: Estimated versus IMU.

Figs. 9 and 10 summarize our results for the boxes and dynamic sequences, respectively. For the boxes sequence, Fig. 9(a)-top shows the comparison of our method against ground truth over the whole sequence. The estimated motion with our method is identical to ground truth at this scale. Fig. 9(a)-bottom provides a zoom into the comparison plot, during a 4-second segment with high-speed motion (angular speeds of more than ±600 deg/s). Even at this zoom level, the lines of both our method and ground truth are almost identical. A better visualization of the magnitude of the errors is provided in the box plots of Fig. 9(b). This figure also shows the comparison of our method against the IMU, and it is analogous to Fig. 8. As can be observed, our method compares favorably in both cases: ground truth and IMU.
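The box-plot statistics and relative errors quoted above reduce to a few standard computations. As a sanity check on the arithmetic (20 deg/s and 80 deg/s against peaks near 1000 deg/s), here is a sketch with made-up error samples; these are not the paper's data.

```python
import statistics

# Hypothetical angular-velocity error samples (deg/s), invented for illustration.
errors = [5, 12, 18, 22, 25, 31, 40, 55, 62, 80]

q1, q2, q3 = statistics.quantiles(errors, n=4)  # box bottom, median, box top
lo, hi = min(errors), max(errors)               # whisker markers

# Relative size of the errors w.r.t. peak excursions of ~1000 deg/s,
# matching the reported 20 deg/s -> 2% and 80 deg/s -> 8%.
peak = 1000.0
print(100 * 20 / peak)   # 2.0
print(100 * hi / peak)   # 8.0
```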
Fig. 10(a)-top and 10(a)-bottom compare our method against ground truth over the entire dynamic sequence and over a six-second segment (zoomed-in view) featuring high-speed motions of up to 500 deg/s, respectively. This sequence depicts a desk scene, with events being generated by both the static objects and a moving person. The events caused by the moving person do not fit the rotational motion model of a static scene. However, the motion of the person is slow compared to the temporal resolution of the event camera, and, most of the time, our algorithm is not affected by such motion. When a significant amount of the events (typically, 20% or more) are triggered by the moving person, as shown in half of the image plane in Fig. 11-left, the performance degrades.

Fig. 10. Dynamic sequence. Comparison of the estimated angular velocity (solid line) against ground truth from the motion capture system (mocap) (dashed line). (a) Whole sequence (top) and zoom-in of shaded region (bottom). (b) Error box plots, by intervals of 15 s. Left: Estimated versus mocap. Right: Estimated versus IMU.

Fig. 11. Dynamic sequence. Detail of a person moving close to the camera. Left: image of rotated events. Right: estimated angular velocity (solid line) versus ground truth (dashed line).

B. Computational Performance

Next, we provide some orders of magnitude of the computational effort required by the method. The algorithm was implemented in C++, without paying attention to optimizing the code for real-time performance. The core of the algorithm is the computation of the image of rotated events I(x; ω) in (4) and its derivatives (Eq. (9) in Appendix B). For a subset of 15 000 events, this takes 2.7 ms on a standard laptop with an Intel(R) Core(TM) i7-3720QM CPU @ 2.60 GHz running single-threaded. This operation is carried out multiple times within the optimization algorithm. The method may be accelerated for real-time performance.

Since the contrast (5) is computed using the image of rotated events (4), the computational complexity of the method depends on both the number of rotated events and the event image resolution. The method scales linearly with respect to both. Hence, to reduce the computational cost, one may (i) compute the contrast using fewer pixels, (ii) rotate fewer events, or (iii) apply both strategies. An appealing option is to maximize the contrast of only some regions of the image plane. It suffices to select the most informative regions (those with the largest density of events), as in direct methods for visual odometry [14] and SLAM [15]; i.e., it suffices to select the events that most likely will be rotated to the regions of the image that will present the highest contrast, which would then be tracked in time, for efficiency.

C. Discussion

During the experiments, we noticed that, in general, roll estimation is more susceptible to errors than the estimation of pan and tilt motions, because the apparent motion at the center of the image is smaller than at the periphery; therefore, there is a lack of events close to the center of rotation, with events mostly appearing far away, in the periphery of the image.

In all sequences, the event camera was moved in front of the scene, about 1.5 m away or more. The motions were hand-held and are inevitably corrupted by translation. However, the translational component of the motion was negligible with respect to the mean scene depth (e.g., the distance to the desk); that is, motion was dominantly rotational, satisfying the hypothesis of our framework. We tested the algorithm on sequences with significant translation and, as expected for any algorithm designed for rotation-only motions, the algorithm provided an incorrect motion estimate, since it tried to explain the translation using a rotation.

VI. CONCLUSION

We have developed a method that estimates the rotational motion of an event camera. We have tested the algorithm on
sequences with millions of events, and the results obtained are very accurate, with angular velocity errors of 2% (standard deviation) and 8% (maximum). Due to the high temporal resolution of the sensor, our method is able to track very high-speed motions (≈1000 deg/s). Additionally, due to the sliding-window approach, our method can provide angular velocity estimates at the rate implied by the event resolution (1 μs; sliding the window by one event). Besides the remarkable accuracy and robustness of the method, we believe that its most interesting characteristic is its design philosophy: motion estimation by means of edge alignment in terms of contrast maximization.

APPENDIX A
MOMENTS OF AN IMAGE (CONTINUOUS FRAMEWORK)

The mean and variance of an image F : Ω ⊂ R² → R are

$$\mu_F \doteq \mu(F) = \frac{1}{|\Omega|} \int_{\Omega} F(\mathbf{x}) \, d\mathbf{x}, \qquad (6)$$

$$\sigma_F^2 \doteq \mathrm{Var}(F) = \frac{1}{|\Omega|} \int_{\Omega} \big(F(\mathbf{x}) - \mu_F\big)^2 \, d\mathbf{x}, \qquad (7)$$

respectively, where |Ω| is the area of the image domain. These formulas can be obtained by interpreting the values {F(x)}, x ∈ Ω, as providing infinitely many samples of a random variable and using the moments of such a random variable to define the moments of the image. Hence, in agreement with intuition, the mean μ_F is the average value of the image F over the domain Ω, and the variance is the average spread (i.e., dispersion) of F around μ_F over the domain Ω. For more details, see [16].

The mean square of an image is defined in the usual way, in terms of the mean and the variance: MS = RMS² = μ_F² + σ_F², that is,

$$\mathrm{MS} = \frac{1}{|\Omega|} \int_{\Omega} F^2(\mathbf{x}) \, d\mathbf{x}.$$

APPENDIX B
DERIVATIVE OF THE CONTRAST METRIC

Efficient optimization schemes make use of the derivative of the objective function to search for ascent/descent directions. For the proposed contrast metric (5), the derivative is

$$\frac{\partial}{\partial \boldsymbol{\omega}} \mathrm{Var}\big(I(\mathbf{x}; \boldsymbol{\omega})\big) \overset{(7)}{=} \frac{2}{|\Omega|} \int_{\Omega} \rho(\mathbf{x}; \boldsymbol{\omega}) \, \frac{\partial \rho(\mathbf{x}; \boldsymbol{\omega})}{\partial \boldsymbol{\omega}} \, d\mathbf{x}, \qquad (8)$$

where ρ(x; ω) ≐ I(x; ω) − μ(I(x; ω)) and differentiation can be moved inside the integral since Ω is constant. The gradient of the image of rotated events is

$$\frac{\partial I(\mathbf{x}; \boldsymbol{\omega})}{\partial \boldsymbol{\omega}} \overset{(4)}{=} -\sum_{k} p_k \, \nabla\delta\big(\mathbf{x} - \mathbf{x}_k(\boldsymbol{\omega})\big)^{\top} \frac{\partial \mathbf{x}_k}{\partial \boldsymbol{\omega}}, \qquad (9)$$

where p_k is the event polarity and the derivative of each rotated event (Section III-C) is

$$\frac{\partial \mathbf{x}_k}{\partial \boldsymbol{\omega}} = \frac{\partial}{\partial \boldsymbol{\omega}} \mathrm{W}(\mathbf{x}_k; \boldsymbol{\omega}, t_k - t_0).$$

The gradient of the Dirac delta δ(x) ≐ δ(x)δ(y) is computed component-wise, ∇δ(x) = (δ′(x)δ(y), δ(x)δ′(y))^T. We apply finite differences with step h = 1 pixel to approximate the derivative: δ′(x) ≈ (δ(x + h/2) − δ(x − h/2))/h. Then, the first component of ∇δ is approximated by the difference of two 2D deltas like those in (4): δ′(x)δ(y) ≈ (δ(x − x₋) − δ(x − x₊))/h, with x₊ = (h/2, 0)^T and x₋ = (−h/2, 0)^T. A similar argument applies to the second component of ∇δ. Finally, each 2D delta is implemented by bilinear voting over four pixels, as explained in Section IV.

Also, observe that, in (8), by linearity of the integral (6) and the derivative, both operators commute:

$$\frac{\partial}{\partial \boldsymbol{\omega}} \mu\big(I(\mathbf{x}; \boldsymbol{\omega})\big) = \mu\!\left(\frac{\partial I(\mathbf{x}; \boldsymbol{\omega})}{\partial \boldsymbol{\omega}}\right),$$

so the left-hand side can be computed once the derivative image (9) has been obtained.

Finally, for the numerically better-behaved contrast that includes Gaussian smoothing (Section IV), the objective function is Var(I_σ(x; ω)), with I_σ(x; ω) = I(x; ω) ∗ G_σ(x). The objective gradient has the same form as (8), but with I and ∂I/∂ω(x; ω) replaced by I_σ and ∂I_σ/∂ω(x; ω) = (∂I/∂ω)(x; ω) ∗ G_σ(x), respectively.

ACKNOWLEDGMENT

A video showing the performance of our method on several sequences is available at: http://youtu.be/v1sXWoOAs_0

REFERENCES

[1] P. Lichtsteiner, C. Posch, and T. Delbruck, "A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor," IEEE J. Solid-State Circuits, Feb. 2008.
[2] T. Delbruck, V. Villeneuva, and L. Longinotti, "Integration of dynamic vision sensor with inertial measurement unit for electronically stabilized event-based vision," in Proc. IEEE Int. Symp. Circuits Syst., 2014, pp. 2636–2639.
[3] H. Kim, A. Handa, R. Benosman, S.-H. Ieng, and A. J. Davison, "Simultaneous mosaicing and tracking with an event camera," in Proc. British Mach. Vis. Conf., 2014, pp. 1–12.
[4] M. Cook, L. Gugelmann, F. Jug, C. Krautz, and A. Steger, "Interacting maps for fast visual interpretation," in Proc. Int. Joint Conf. Neural Netw., 2011, pp. 770–776.
[5] J. Conradt, "On-board real-time optic-flow for miniature event-based vision sensors," in Proc. IEEE Int. Conf. Robot. Biomimetics, Dec. 2015, pp. 1858–1863.
[6] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 2003.
[7] Y. Ma, S. Soatto, J. Košecká, and S. S. Sastry, An Invitation to 3-D Vision: From Images to Geometric Models. New York, NY, USA: Springer, 2004.
[8] E. Peli, "Contrast in complex images," J. Opt. Soc. Amer. A, vol. 7, no. 10, pp. 2032–2040, 1990.
[9] A. A. Michelson, Studies in Optics. New York, NY, USA: Dover, 1995.
[10] R. Fletcher and C. M. Reeves, "Function minimization by conjugate gradients," Comput. J., vol. 7, pp. 149–154, 1964.
[11] G. Wolberg, Digital Image Warping. Washington, DC, USA: IEEE Comput. Soc., 1990.
[12] C. Brandli, R. Berner, M. Yang, S.-C. Liu, and T. Delbruck, "A 240×180 130 dB 3 μs latency global shutter spatiotemporal vision sensor," IEEE J. Solid-State Circuits, vol. 49, no. 10, pp. 2333–2341, 2014.
[13] E. Mueggler, H. Rebecq, G. Gallego, T. Delbruck, and D. Scaramuzza, "The event-camera dataset and simulator: Event-based data for pose estimation, visual odometry, and SLAM," Int. J. Robot. Res., 2017, in press.
[14] C. Forster, M. Pizzoli, and D. Scaramuzza, "SVO: Fast semi-direct monocular visual odometry," in Proc. IEEE Int. Conf. Robot. Autom., 2014, pp. 15–22.
[15] J. Engel, T. Schöps, and D. Cremers, "LSD-SLAM: Large-scale direct monocular SLAM," in Proc. Eur. Conf. Comput. Vis., 2014, pp. 834–849.
[16] G. Gallego, A. Yezzi, F. Fedele, and A. Benetazzo, "Variational stereo imaging of oceanic waves with statistical constraints," IEEE Trans. Image Process., vol. 22, no. 11, pp. 4211–4223, Nov. 2013.