
Robotics and Autonomous Systems 40 (2002) 111–119

Self-localization using sporadic features


Gerhard K. Kraetzschmar, Stefan Enderle
University of Ulm, Neuroinformatics, D-89069 Ulm, Germany

Abstract

Knowing its position in an environment is an essential capability for any useful mobile robot. Monte Carlo Localization (MCL) has become a popular framework for solving the self-localization problem in mobile robots. The known methods exploit sensor data obtained from laser range finders or sonar rings to estimate robot positions and are quite reliable and robust against noise. An open question is whether comparable localization performance can be achieved using only camera images, especially if the camera images are used both for localization and object recognition. In such a situation, it is both harder to obtain suitable models for predicting sensor readings and to correlate actual with predicted sensor data. Both problems can be easily solved if localization is based on features obtained by preprocessing the images. In our image-based MCL approach, we combine visual distance features and visual landmark features, which have different characteristics. Distance features can always be computed, but have value-dependent and dramatically increasing margins for noise. Landmark features give good directional information, but are detected only sporadically. In this paper, we discuss the problems arising from these characteristics and show experimentally that MCL nevertheless works very well under these conditions. © 2002 Published by Elsevier Science B.V.
Keywords: Self-localization; Visual features; Dynamic environments

1. Introduction

A mobile robot that is expected to navigate a known environment in a goal-directed and efficient manner must have the ability to self-localize, i.e. to determine its position and orientation in the environment [6,9]. In the past few years, Monte Carlo Localization (MCL) has become a very popular framework for solving the self-localization problem in mobile robots [4,5]. This method is very reliable and robust against noise, especially if the robots are equipped with laser range finders or sonar sensors. In some environments, however, e.g. in the popular RoboCup domain [7], using a laser
Corresponding author. E-mail addresses: gkk@neuro.informatik.uni-ulm.de (G.K. Kraetzschmar), steve@neuro.informatik.uni-ulm.de (S. Enderle).

scanner on each robot may be difficult, impossible, or too costly. Also, sonar data is extremely noisy due to the highly dynamic environment. Thus, enhancing the existing localization methods such that they can use other sensory channels, like uni- or omni-directional vision systems, is an interesting open problem. In this work, we present a vision-based MCL approach using visual features which are extracted from a robot's single camera and matched to a known model of the environment.

2. Markov localization

In Markov localization [4], the position of a robot is estimated by computing the probability distribution over all possible positions in the environment. Let l = ⟨x, y, θ⟩ denote a position in the state space.



Then, Bel(l) expresses the robot's belief in being at location l. The algorithm starts with an initial distribution Bel(l) depending on the initial knowledge of the robot's position. If the correct position l* is known, Bel(l) = 1 for l = l* and Bel(l) = 0 for all l ≠ l*. If the position is completely unknown, Bel(l) is uniformly distributed over the environment to reflect this uncertainty. As the robot operates, two kinds of update steps are iteratively applied to incrementally refine Bel(l).

2.1. Belief projection across robot motion

This step handles robot motion and projects the robot's belief about its position across the executed actions. A motion model P(l|l', m) is used to predict the likelihood of the robot being in position l assuming that it executed a motion command m and was previously in position l'. Here, it is assumed that the new position depends only on the previous position and the movement (Markov property). Using the motion model, the distribution of the robot's position belief Bel(l) can be updated according to the commonly used formula for Markov chains [1]:

Bel(l) = ∫ P(l|l', m) Bel(l') dl'.   (1)
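As a concrete illustration of Eq. (1), the following sketch applies the motion update to a belief discretized over a position grid; the cell size, the Gaussian odometry noise, and all names are assumptions made for this sketch, not part of the original system.

```python
import numpy as np

def project_belief(belief, motion, cell_size=0.1, sigma=0.05):
    """Discrete approximation of Eq. (1): Bel(l) = integral of P(l | l', m) Bel(l') dl'.

    belief    -- 2D array over (x, y) grid cells (orientation omitted for brevity)
    motion    -- commanded displacement m = (dx, dy) in metres
    cell_size -- edge length of a grid cell in metres (assumed value)
    sigma     -- standard deviation of the Gaussian motion noise in metres (assumed value)
    """
    h, w = belief.shape
    new_belief = np.zeros_like(belief, dtype=float)
    dx, dy = motion
    for iy in range(h):
        for ix in range(w):
            p = belief[iy, ix]
            if p == 0.0:
                continue
            # Expected destination cell of l' under the commanded motion m.
            tx, ty = ix + dx / cell_size, iy + dy / cell_size
            # Spread the mass of l' over nearby cells l according to P(l | l', m).
            for jy in range(max(0, int(ty) - 3), min(h, int(ty) + 4)):
                for jx in range(max(0, int(tx) - 3), min(w, int(tx) + 4)):
                    d2 = ((jx - tx) ** 2 + (jy - ty) ** 2) * cell_size ** 2
                    new_belief[jy, jx] += p * np.exp(-d2 / (2.0 * sigma ** 2))
    total = new_belief.sum()
    return new_belief / total if total > 0 else new_belief  # renormalize to sum to 1
```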

2.2. Integration of sensor input

The data obtained from the robot's sensors are used to update the belief Bel(l). For this step, an observation model P(o|l) must be provided which models the likelihood of an observation o given that the robot is at position l. Bel(l) is then updated by applying Bayes' rule as follows:

Bel(l) = α P(o|l) Bel(l),   (2)

where α is a normalization factor ensuring that Bel(l) integrates to 1. The Markov localization method provides a mathematical framework for solving the localization problem. Unlike methods based on Kalman filtering [8], it can easily represent multimodal distributions. However, implementing Markov localization on a mobile robot in a tractable and efficient way is a non-trivial task.

3. Monte Carlo localization

The MCL approach [5] solves the implementation problem by representing the infinite probability distribution Bel(l) by a set of N samples S = {si | i = 1, ..., N}. Each sample si = ⟨li, pi⟩ consists of a robot location li and a weight pi. The weight corresponds to the likelihood of li being the robot's correct position, i.e. pi ∝ Bel(li) [10]. Furthermore, as the weights are interpreted as probabilities, we assume that they sum up to 1, i.e. Σi pi = 1. The algorithm for MCL is adopted from the general Markov localization framework described above. Initially, a set of samples reflecting the initial knowledge about the robot's position is generated. During robot operation, the following two kinds of update steps are iteratively executed.

3.1. Sample projection across robot motion

As in the general Markov algorithm, a motion model P(l|l', m) is used to update the probability distribution Bel(l). In MCL, a new sample set S is generated from a previous set S' by applying the motion model as follows: for each sample ⟨l', p'⟩ ∈ S', a new sample ⟨l, p'⟩ is added to S, where l is randomly drawn from the density P(l|l', m). The motion model takes into account robot properties like drift and translational and rotational errors.

3.2. Belief update and weighted resampling

Sensor inputs are used to update the robot's belief about its position. According to Eq. (2), all samples are re-weighted by incorporating the sensor data o and applying the observation model P(o|l'). Most commonly, sensors such as laser range finders or sonars, which yield distance data, are used. In this case, ideal sensor readings can be computed a priori if a map of the environment is given. An observation model is then obtained by noisifying the ideal sensor readings, often simply using Gaussian noise distributions. Given a sample ⟨l', p'⟩, the new weight p for this sample is given by

p = α P(o|l') p',   (3)

where α is again a normalization factor which ensures that all beliefs sum up to 1. These new weights for the


samples in S' provide a probability distribution over S', which is then used to construct a new sample set S. This is done by randomly drawing samples from S' using the distribution given by the weights, i.e. for any sample si = ⟨li, pi⟩ ∈ S':

Prob(si ∈ S) ∝ pi.   (4)

The relationship is approximate only, because after each update step we add a small number of uniformly distributed samples, which ensure that the robot can re-localize in cases where it has lost track of its position, i.e. where the sample set contains no sample close to the correct robot position. The MCL approach has several interesting properties: for instance, it can be implemented as an anytime algorithm [2], and it is possible to dynamically adapt the sample size (see [4] for a detailed discussion of MCL properties).
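To summarize Sections 3.1 and 3.2, the sketch below runs one complete MCL cycle: sample projection through a noisy motion model, re-weighting according to Eq. (3), and weighted resampling per Eq. (4) with a small fraction of uniformly distributed samples for re-localization. The noise parameters, the 5% uniform fraction, and the observation-model interface are placeholders chosen for illustration; they are not the values or interfaces of the original implementation.

```python
import math
import random

def mcl_update(samples, motion, observation, sensor_model, bounds,
               uniform_fraction=0.05, trans_noise=0.05, rot_noise=0.05):
    """One MCL cycle over samples = [(x, y, theta, weight), ...].

    motion       -- odometry reading (dx, dy, dtheta) in the robot frame (assumed format)
    sensor_model -- callable giving P(o | l) for `observation` at pose (x, y, theta)
    bounds       -- (x_max, y_max) of the environment, used for the uniform samples
    """
    # 3.1 Sample projection: draw each new pose from the motion model P(l | l', m).
    projected = []
    for x, y, theta, w in samples:
        dx, dy, dtheta = motion
        nx = x + dx * math.cos(theta) - dy * math.sin(theta) + random.gauss(0.0, trans_noise)
        ny = y + dx * math.sin(theta) + dy * math.cos(theta) + random.gauss(0.0, trans_noise)
        ntheta = theta + dtheta + random.gauss(0.0, rot_noise)
        projected.append((nx, ny, ntheta, w))

    # 3.2 Belief update: re-weight every sample with the observation model, Eq. (3).
    weighted = [(x, y, t, sensor_model(observation, (x, y, t)) * w)
                for x, y, t, w in projected]
    total = sum(s[3] for s in weighted)
    if total > 0.0:
        weighted = [(x, y, t, w / total) for x, y, t, w in weighted]  # alpha-normalization
    else:
        weighted = [(x, y, t, 1.0 / len(weighted)) for x, y, t, _ in weighted]

    # Weighted resampling, Eq. (4), plus a few uniform samples for re-localization.
    n = len(samples)
    n_uniform = int(uniform_fraction * n)
    resampled = random.choices(weighted, weights=[s[3] for s in weighted], k=n - n_uniform)
    x_max, y_max = bounds
    resampled += [(random.uniform(0.0, x_max), random.uniform(0.0, y_max),
                   random.uniform(-math.pi, math.pi), 1.0 / n)
                  for _ in range(n_uniform)]
    # Reset the weights so that they again sum to 1.
    return [(x, y, t, 1.0 / n) for x, y, t, _ in resampled]
```

In this sketch the weights are reset to 1/N after resampling; keeping the normalized weights instead is an equally valid variant.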

4. Vision-based localization

There are mainly two cases where MCL based on distance sensor readings cannot be applied: (i) if distance sensors like laser range finders or sonars are not available; (ii) if the readings obtained by these sensors are too unreliable, e.g. in a highly dynamic environment. In these cases, other sensory information must be used for localization. A natural candidate is information from the visual channel, because many robots include cameras as standard equipment. An example of using visual information for MCL has been provided by Dellaert et al. [3]. On their indoor robot Minerva, they successfully used the distribution of light intensity in images obtained from a camera facing the ceiling. The advantage of this setup is that there are few disturbances caused by people in the environment, and that one can always obtain a comparatively large number of sensor readings, even if occasionally they may be very noisy (due to the use of flashlights, etc.). An interesting and open question is whether the MCL approach still works when the number of observations is significantly reduced and when particular observations can be made only intermittently. In the following, we describe a situation where these problems arise and show how to adopt the MCL approach accordingly.

4.1. The RoboCup environment

In RoboCup, two teams of robots play soccer against each other, where each team has three field players and one goalie. The field is about 9 m long and 5 m wide and has two goals, which are marked blue and yellow for visual recognition. The field has a green floor and is surrounded by a white wall. It has four corners, each of which is marked with two green vertical bars of 5 cm width.

4.2. Feature-based modeling

As described in Eq. (3), the sensor update mechanism needs a sensor model P(o|l) which describes how probable a sensor reading o is at a given robot location l. This probability is often computed by estimating the expected sensor reading ô at location l and determining a similarity measure between the given measurement o and the estimate ô. If the sensor readings o are camera images, two problems arise: (i) estimating complete images ô for each sample's location is computationally very expensive; (ii) finding and computing a similarity measure between images is quite difficult. A better idea is to lift the similarity test to processed, more aggregated, and lower-dimensional data, such as feature vectors. The selection of the visual features is guided by several criteria: (i) uniqueness of the feature in the environment; (ii) computational effort needed to detect the feature in images; (iii) reliability of the feature detection mechanism. In the RoboCup domain, only few environment features can be detected while meeting these criteria: goal posts and field corners as landmark features, and distances to walls as distance features. The corner detector yields ambiguous landmarks, because corners are indistinguishable from each other, while the four goal post features are unique landmarks. At most eight landmark features can be detected overall. Only under very special circumstances (use of an omni-directional camera, all features within the visual field, and no occlusion of a feature by another robot) are all eight features detectable at the same time. In addition to the landmark features, we compute wall distance features at just four columns in the image. On our Sparrow robot, we use a uni-directional camera with a visual field of roughly 64° (see Fig. 1). Consequently, even in the best case our robot can


Fig. 1. The Sparrow-99 robot (left), and the field used in the RoboCup middle-size robot league and the positions of the visually detectable features.

see not more than four landmark features and four distance features at a time. Thus, the features available for localization are sparse, at least compared to the standard approach, where a laser scanner provides hundreds of distance estimates per second. In many situations, e.g. when the robot is located close to the middle line and facing either sideline, it detects none at all, because no feature is within the visual field. Thus, the fact that the features can be detected only sporadically is another significant difference to previous applications of MCL. We also performed some experiments in office environments, where we applied just a simple distance estimation technique. Vertical edges would be good candidates for landmarks. However, aside from selecting features that are sufficiently easy and reliable to detect, the problem of online generation of feature vector estimates for given positions must be solved. Standard world representations used for robot mapping permit computing distance estimates sufficiently fast, but more advanced features, like corners or edges of posters hanging on walls, require extensions of the representation and additional computational effort. In order to incorporate sensor information on a feature-based level, two steps are needed: (i) the features have to be extracted from the current camera

image; (ii) the samples' weights have to be adapted according to a sensor model P(o|l). Both steps are described in the following paragraphs.

4.3. Feature detection

The camera image is color-segmented in order to simplify the feature detection process. On the segmented image, we apply filters detecting color discontinuities to obtain the respective feature detectors (see Fig. 2). The goal post detector detects a horizontal white–blue or white–yellow transition for the blue or yellow goal post, respectively. The corner detector searches for a horizontal green–white–green transition in the image. The distance estimator estimates the distance to the walls based on detected vertical green–white transitions (green–blue or green–yellow in the goal areas) in the image. At the moment, we select four specific columns in the image for detecting the walls.

For office environments, we use only distance features. Fig. 3 shows the image of a corridor inside our office building and the edges detected from it. The lines starting from the bottom of the image visualize the estimated distances to the walls. Note that there are several errors, mainly in the door areas.
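To illustrate the column-based distance estimator described above, the following sketch scans one fixed column of the color-segmented image from the bottom upwards for the first green-to-white transition and converts the image row of that transition into a wall distance using a flat-floor camera model. The color labels, camera height, tilt, focal length, and column indices are assumptions for the sketch, not the calibration of the Sparrow robot.

```python
import math

GREEN, WHITE = 1, 2   # assumed labels produced by the color segmentation step

def wall_distance_in_column(segmented, col, cam_height=0.4, cam_tilt=0.0, focal_px=300.0):
    """Estimated distance (in metres) to the wall seen in image column `col`,
    or None if no green-to-white transition is found.

    segmented -- 2D array of color labels, indexed as segmented[row][col],
                 with row 0 at the top of the image
    """
    rows = len(segmented)
    horizon = rows // 2   # assumed principal point at the image centre
    # Scan upwards from the bottom: the floor (green) ends where the wall (white) begins.
    for row in range(rows - 1, 0, -1):
        if segmented[row][col] == GREEN and segmented[row - 1][col] == WHITE:
            # Angle below the horizon under a pinhole model with the assumed focal length.
            angle = math.atan2(row - horizon, focal_px) + cam_tilt
            if angle <= 0.0:
                return None   # transition at or above the horizon: no floor intersection
            return cam_height / math.tan(angle)   # flat-floor triangulation
    return None

# Example: evaluate the four columns used for wall distances (column indices assumed).
# distances = [wall_distance_in_column(seg_img, c) for c in (40, 120, 200, 280)]
```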

Fig. 2. Feature detection in RoboCup (from left to right): original image, segmented image, detection of goal post, corner, and walls.


Fig. 3. Feature detection in office environments and their projection into maps.

The right-hand side of Fig. 3 shows projections of the estimated distances onto maps of laser scans. The brighter dots show the distances estimated by the visual feature detector, while the smaller, darker dots show the laser readings in the same situation. Note that the left image shows a situation where many visual distances can be estimated, while in the right image only four distances are estimated at all. Due to the simplicity of the applied feature detector, often only few distance features are detected, and therefore the situation is comparable to the RoboCup environment.

4.4. Weight update

As described in Eq. (3), the sample set must be re-weighted using a sensor model P(o|l). Let the sensor data o be a vector of n features f1, ..., fn. If we assume that the detection of a feature depends solely on the robot's position and does not depend on the detectability of other features, then the features are independent and we can conclude:

P(o|l') = P(f1, ..., fn|l') = P(f1|l') × ··· × P(fn|l').   (5)

Note that n may vary, i.e. it is possible to deal with varying numbers of features. The sensor model P(fi|l) describes how likely it is to detect a particular feature fi given a robot location l. In our implementation, this sensor model is computed by comparing the horizontal position of the detected feature with an estimated position as determined by a geometrical world model. The similarity between the feature positions is mapped to a probability estimate by applying a heuristic function as illustrated in Fig. 4. The application of these heuristics to the examples used in Fig. 2 is illustrated in Fig. 5, which

also demonstrates that the probabilistic combination of evidence for several features significantly improves the results. The figure illustrates various properties of our approach. First, the heuristic function for each of the four unique goal post features has only a single peak. The shape of the function causes all samples with a comparatively high angular error to be drastically down-valued and successively sorted out. Thus, detecting a single goal post will reshape the distribution of the sample set such that mostly locations that make it likely to see a goal post in a certain direction will survive in the sample set. Furthermore, if two goal posts are detected at the same time, the reshaping effects will overlay and drastically focus the sample set. Secondly, there is only a single heuristic function that captures all of the ambiguous corner features. Each actually detected corner is successively given a probability estimate. If the corner detector

Fig. 4. Heuristics for estimating the probabilities P(fi|l) given a current position.


Fig. 5. Probability distributions P(fi|l) for all field positions l (with fixed orientation) given the detection of a particular feature fi or combinations of them (from left to right): goal posts, field corners, wall distances, corner and goal features combined, and all features combined.

misses corners that we expected to see, this does not do any harm. If the detector returns more corners than actually expected, the behavior depends on the particular situation: detecting an extra corner close to where we actually expected one has a small negative influence on the weight of this sample, while detecting a corner where none was expected at all has a much stronger negative effect. Note that the robustness of the corner detector can be modeled by the heuristic function: having a robust detector means assigning very low probabilities to situations where the similarity between the estimated and detected feature positions is small, while for a noisy detector the heuristic function would assign higher probabilities in such situations. The effect of detecting too many features or detecting them at wrong positions is accordingly different.
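One possible reading of this weighting scheme in code: each detected feature contributes a likelihood derived from the bearing (horizontal image position) error between the detected and the predicted feature, and the per-feature likelihoods are multiplied as in Eq. (5). The peaked exponential shape, the floor value for badly matching or unexpected detections, and all parameter values are our assumptions standing in for the heuristic function of Fig. 4.

```python
import math

def feature_likelihood(detected_bearing, predicted_bearing, sigma=0.15, floor=0.05):
    """Heuristic P(f_i | l): peaked around zero bearing error, with a small floor so
    that an unexpected or badly matching detection down-weights a sample without
    driving its weight to exactly zero."""
    err = abs(detected_bearing - predicted_bearing)
    err = min(err, 2.0 * math.pi - err)          # wrap the angular difference
    return floor + (1.0 - floor) * math.exp(-(err ** 2) / (2.0 * sigma ** 2))

def observation_likelihood(detections, pose, predict_bearing):
    """Combine the detected features independently, as in Eq. (5).

    detections      -- list of (feature_id, detected_bearing) pairs; may be empty and
                       may vary in length from cycle to cycle
    predict_bearing -- callable (feature_id, pose) -> expected bearing from the
                       geometrical world model (assumed interface)
    """
    p = 1.0
    for feature_id, bearing in detections:
        p *= feature_likelihood(bearing, predict_bearing(feature_id, pose))
    return p   # equals 1.0 when nothing was detected, i.e. no re-weighting
```

For the ambiguous corner features, a natural extension of this sketch is to evaluate the heuristic against the best-matching expected corner for the given pose, which mirrors the behavior described above.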

5. Experiments

5.1. Number of features

A Sparrow-99 robot was placed in one corner of the field. In order to have an accurate reference path, we moved the robot by hand along a rectangular trajectory indicated by the dots. This simple experiment demonstrates the practical usability of the approach. It shows that the robot can localize robustly and that the accuracy can be improved by adding more features.

The first image in Fig. 6 shows the trajectory given by the odometry when executing a single round. Note the displacement at the end of the tour. The second image displays the corrected path found by the localization algorithm using only the goal posts as features. In the third image we can see a highly accurate path found by the localization algorithm using all available features. The fourth image shows the drift that occurs when moving multiple rounds without self-localization. The fifth image displays the trajectory as determined with self-localization, which does not drift away.

5.2. Number of samples

When implementing sample-based localization, it is important to have an idea of how many samples are needed. Obviously, a small number of samples is preferred, since the computational effort increases with the number of samples. On the other hand, a sufficient number of samples is needed in order to achieve the desired accuracy. In this experiment, we moved four rounds exactly as in the previous experiment and used five different numbers of samples: 50, 100, 150, 1000, 5000. Fig. 7 illustrates the average localization errors (left image) and both the average error and the maximum error (right) for the different numbers of samples during the experiment. The result is that a few hundred samples

Fig. 6. Feasibility experiments.


Fig. 7. Average localization errors for the different sample numbers (left); average error and maximum error of the same five runs (50, 100, 150, 1000 and 5000 samples used) (right).

are usually sufficient to keep the average localization error sufficiently low.

5.3. Stability and robustness

In order to evaluate the stability of our approach in the presence of sparse and sporadic data, we logged the number of landmark and distance features detected at every cycle (see Fig. 8, top and middle) and

compared this to the development of the standard deviation over all samples with respect to the true robot position (bottom). The graphs illustrate that: (i) distance feature detection is more robust than detection of landmark features; (ii) the method is quite stable with respect to the sparsity and sporadicity of the detected features; (iii) only when neither kind of feature can be detected for some time does the accuracy of the robot position estimate degrade.
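The paper does not specify how this spread statistic was computed; one plausible sketch, under the assumption that it is the root of the weighted mean squared distance of the samples from the ground-truth position, is:

```python
import math

def sample_spread(samples, true_position):
    """Standard deviation of the sample positions with respect to the true robot
    position: square root of the weighted mean squared Euclidean distance.

    samples       -- [(x, y, theta, weight), ...] with weights summing to 1
    true_position -- (x_true, y_true) ground-truth position
    """
    x_t, y_t = true_position
    mse = sum(w * ((x - x_t) ** 2 + (y - y_t) ** 2) for x, y, _, w in samples)
    return math.sqrt(mse)
```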

Fig. 8. Demonstrating the robustness of the approach.


Fig. 9. Performance of sporadic visual localization during a sample run in an ofce environment.

5.4. Feasibility for office environments

It was interesting to see whether the approach chosen for the RoboCup environment would also be feasible for more complex environments, like regular corridors and rooms. Our B21 robot Stanislav was positioned at one end (left side of Fig. 9) of a corridor of about 20 m length and 2.2 m width. The camera, mounted at about 1.5 m height, was directed straight to the front. Only distance features were computed. No use was made of the pan-tilt unit to reposition the camera during the experiment. We used 1000 samples which were initialized around the approximate robot position within a diameter of 1 m and with an angular uncertainty of 20°. The robot was able to track the path along the corridor up to the right side and the way back to the starting point. Then, while rotating 180°, the camera could not see the floor any more. As a result, many wrong distance estimates were generated due to visual artifacts, and the samples drifted away. Potential remedies include active use of the pan-tilt unit to ensure the floor can be seen at all times, or exploiting additional landmark features such as corners, doors, or posters on walls.

6. Conclusions

In this paper, we used the Monte Carlo approach for vision-based localization of a soccer robot on the RoboCup soccer field. Unlike many previous applications, the robot could not use distance sensors like laser scanners or sonars. Also, special camera setups were not available. Instead, the onboard camera was used for localization purposes in addition to object recognition tasks. As a consequence, sensor input for updating the robot's belief about its position was sparse and sporadic. Nevertheless, the experimental evaluation

demonstrated that MCL works quite well even under these restrictive conditions. Even with very small numbers of detected features, the localization results can be quite usable, but naturally, increasing the number of available visual features dramatically enhances accuracy. Finally, first experimental results indicate that the vision-based localization approach promises good results also in office environments. Considering that cameras are nowadays much cheaper than laser scanners and sonar rings, vision-based robot localization could provide a cost-effective alternative for commercial robot applications.

Acknowledgements

The authors would like to thank Marcus Ritter and Heiko Folkerts for implementing most of the algorithms and performing many experiments, and Stefan Sablatnög, Hans Utz, and Günther Palm for helpful discussions and their continuing support.

References
[1] K. Chung, Markov Chains with Stationary Transition Probabilities, Springer, Berlin, 1960.
[2] T. Dean, M. Boddy, An analysis of time-dependent planning, in: Proceedings of AAAI-88, St. Paul, MN, 1988, pp. 49–54.
[3] F. Dellaert, W. Burgard, D. Fox, S. Thrun, Using the condensation algorithm for robust, vision-based mobile robot localization, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, pp. 588–594.
[4] D. Fox, Markov localization: A probabilistic framework for mobile robot localization and navigation, Ph.D. Thesis, University of Bonn, Bonn, Germany, December 1998.
[5] D. Fox, W. Burgard, F. Dellaert, S. Thrun, Monte Carlo localization: Efficient position estimation for mobile robots, in: Proceedings of AAAI-99, Orlando, FL, 1999, pp. 343–349.
[6] J.-S. Gutmann, W. Burgard, D. Fox, K. Konolige, An experimental comparison of localization methods, in: Proceedings of the International Conference on Intelligent Robots and Systems (IROS'98), Victoria, BC, October 1998.
[7] H. Kitano, M. Asada, Y. Kuniyoshi, I. Noda, E. Osawa, H. Matsubara, RoboCup: A challenge problem for AI, AI Magazine 18 (1) (1997) 73–85.
[8] P.S. Maybeck, The Kalman Filter: An Introduction to Concepts, Springer, Berlin, 1990, pp. 194–204.
[9] B. Schiele, J.L. Crowley, A comparison of position estimation techniques using occupancy grids, in: Proceedings of the 1994 IEEE International Conference on Robotics and Automation, San Diego, CA, May 1994, pp. 1628–1634.

[10] A.F.M. Smith, A.E. Gelfand, Bayesian statistics without tears: A sampling–resampling perspective, American Statistician 46 (2) (1992) 84–88.

Gerhard K. Kraetzschmar is a Senior Research Scientist and Assistant Professor at the Neural Information Processing Department at University of Ulm. His research interests include autonomous mobile robots, neurosymbolic integration, learning and memory, and robot control and agent architectures. He is a member of IEEE, AAAI, ACM, IAS, and GI.


Stefan Enderle is a Ph.D. candidate at the Neural Information Processing Department at University of Ulm. His research interests include sensor interpretation and fusion, map building and self-localization in robotics, and software development for autonomous mobile robots.
