
Proc. International Test and Evaluation Association Workshop on Modeling and Simulation, Las Cruces, NM, December 2004

A Navigation and Obstacle Avoidance Algorithm for Mobile Robots Operating in Unknown, Maze-Type Environments
R. Clark, A. El-Osery, K. Wedeward, and S. Bruder
Intelligent Systems and Robotics Group (ISRG)
New Mexico Tech, Socorro, NM 87801
clark@nmt.edu

Abstract - This paper describes two complementary algorithms developed for mobile robots operating within unknown, maze-type environments. The first is an environmental mapping and navigation algorithm which ensures complete coverage of a maze with a priori unknown wall locations, and the second is a stochastic learning automaton approach for general obstacle avoidance within the maze. The environmental mapping and navigation algorithm employs a modified wall-following scheme for exploring the maze and a variation of a flood-fill strategy for path planning to ensure complete coverage of the maze. A stochastic learning automaton provides a means of avoiding collisions with unstructured obstacles in the maze based upon a reward and penalty system which is used to adapt action probabilities. Both approaches are implemented within the UMBRA modular simulation and visualization framework in order to validate the algorithms prior to implementation on an RTLinux-based hardware platform.

Index Terms - mobile robot, navigation, obstacle avoidance, path planning.

I. INTRODUCTION
The problem of controlling mobile robots in uncertain environments is of central importance to many military [1], first responder [2], and space exploration [3] applications. Mobile robots are utilized in these applications for the expected benefits of reduced risk to humans, lower cost, and improved efficiency. While the advantages of using mobile robots are compelling, the challenge of operating within unstructured and uncertain environments has proved considerable [4]. To further technology development, and at least partially address these difficulties, a variety of mobile robot competitions [5] (e.g., the DARPA Grand Challenge [6] and the Trinity College Fire Fighting Home Robot Contest [7]) have been targeted towards the robotics research community. These competitions vary in venue from outdoor to indoor, but maintain the common theme of promoting innovative methodologies for mobile robots to traverse an environment with considerable uncertainty.

This paper focuses on navigation approaches applicable to mobile robots competing in the expert division of the Trinity Fire Fighting Home Robot Contest [7]. Simply stated, it is desired that a mobile robot traverse a maze of fixed outer dimension, but variable internal wall placement and additional randomly placed obstacles, in order to locate and extinguish candles. The key features identified for a successful strategy are the abilities to completely explore the maze (to ensure all candles are extinguished) and to negotiate around randomly placed objects in the process. Common techniques for achieving successful navigation include a combination of environmental mapping based upon laser range-finders [8], [9] and subsequent identification of unobstructed directions through which the mobile robot can traverse [10]. Due to the structure imposed on the maze of interest (all walls intersect at 90°) and the computational limitations of the mobile robot identified for implementation (microcontroller-based), two computationally simpler approaches have been developed. These novel approaches are presented in this paper and include: 1) an environmental mapping and navigation algorithm based upon a flood-fill strategy to ensure complete coverage of the maze, and 2) a Stochastic Learning Automaton (SLA) based upon a reward and penalty system for avoiding obstacles. Both approaches are implemented in computer simulation, and results demonstrating their capabilities are provided.

The paper is organized as follows. In section II an approach to navigating and mapping a completely unknown maze is presented, along with computer simulation results for a mobile robot. The SLA-based method of obstacle avoidance and its implementation on a mobile robot through computer simulation are described in section III. Finally, section IV summarizes the paper and draws some conclusions.

II. ENVIRONMENTAL NAVIGATION AND MAPPING
The environmental navigation and mapping strategy comprises two principal behaviors, namely, exploration and path planning. A fundamental assumption of the problem considered herein is that the structure of the maze is a priori unknown; thus typical maze solutions that require foreknowledge of the maze structure do not directly apply.


2.1. Reactive Exploration
To implement the exploration/discovery behavior, a simple wall-following strategy was adopted. Such an approach requires no foreknowledge of the maze wall structure and is appropriate for implementation on a mobile robotic platform with limited computational resources. Unfortunately, this approach does not guarantee complete coverage and can get stuck in cyclic patterns; such cyclic behavior occurs in the presence of an island room, around which the robot could continue to wall-follow indefinitely. Because the wall-following algorithm (either left or right) has a memory depth of zero (i.e., it relies only on the current state to make a decision), this cyclic pattern can be avoided by detecting the first occurrence of a repeated environmental state. In a static two-dimensional world, the 2D position and 1D orientation uniquely define the current state of the robot. Once a repeated state is detected, an alternative behavior is invoked. This could involve switching from left to right wall-following (or vice versa) or invoking a different behavior. To improve the efficiency of the overall procedure, the path planner is called in this case to direct the robot to an unexplored region, as illustrated in Figure 1 below and in the code sketch that follows.
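The repeated-state check can be made concrete with a few lines of code. The following is a minimal Python sketch, assuming a discretized maze in which the robot's state is its cell position plus one of four headings; the callables wall_follow_step, plan_path_to_unexplored, and fully_explored are hypothetical placeholders for the behaviors of Figure 1, not the authors' implementation.

```python
# Minimal sketch of cycle detection for a memoryless wall follower.
# In a static 2D maze, (x, y, heading) uniquely identifies the robot's
# state, so the first repeated tuple signals a cycle (e.g., an island room).

def explore(robot, wall_follow_step, plan_path_to_unexplored, fully_explored):
    visited = set()
    while not fully_explored():
        state = (robot.x, robot.y, robot.heading)
        if state in visited:
            # Cycle detected: hand control to the path planner, which
            # directs the robot to the nearest unexplored region.
            plan_path_to_unexplored(robot)
            visited.clear()  # the pose history restarts after replanning
        else:
            visited.add(state)
            wall_follow_step(robot)  # one reactive wall-follow move
```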
Figure 1. Flow chart of the overall algorithm

2.2. Path Planning
For the path planning a modified depth-first algorithm, commonly referred to as the flood-fill algorithm, was employed. The flood-fill algorithm incrementally floods the maze, starting from the current position, until the frontier of the flood reaches the destination. Once the destination is hit, the shortest path is found by tracing backwards along any monotonically decreasing path from the destination to the origin. In the flood-fill technique, each cell is labeled with a positive integer value representing its distance from the origin. The origin, therefore, is labeled with a distance value of zero, accessible cells adjacent to the origin have a value of one, and so on. An example of flooding to the nearest unoccupied area is shown in Figure 2. The origination cell is labeled by a 0 and the closest unoccupied destination cell is labeled by a * (pointed to by an arrow) adjacent to a 6 in the bottom right-hand corner of the image.

Figure 2. Flooding to an unoccupied area

The flood-fill algorithm is used primarily for path planning. It plans an optimal (shortest) path to the nearest unexplored cell in the event that a repeated state is detected, and it also plans a path back home when the goal is reached. Furthermore, the flood-fill guarantees completeness in that it is exhaustive in determining whether a solution exists; however, it can only plan a path within the confines of the known environment. An example of an exhaustively covered maze is shown in Figure 3. The arrows in Figure 3 indicate the latest direction of travel in each explored cell; cells which have not been explored are labeled with a question mark (?).
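As an illustration of the planner just described, the sketch below floods a known portion of the maze outward from the origin and back-traces a shortest path. The grid representation (a mapping from each cell to the neighboring cells not blocked by a known wall) is an assumption made for this sketch, not necessarily the authors' data structure.

```python
from collections import deque

def flood_fill_path(open_neighbors, origin, is_goal):
    """Flood outward from origin, labeling each cell with its distance
    (origin = 0, accessible adjacent cells = 1, ...), then trace a
    shortest path backwards along monotonically decreasing labels.
    Returns the cell path, or None if no goal cell is reachable."""
    distance = {origin: 0}
    frontier = deque([origin])
    goal = None
    while frontier:
        cell = frontier.popleft()
        if is_goal(cell):
            goal = cell
            break
        for nbr in open_neighbors.get(cell, ()):
            if nbr not in distance:  # flood each cell at most once
                distance[nbr] = distance[cell] + 1
                frontier.append(nbr)
    if goal is None:
        return None  # exhaustive: no path exists in the known maze
    path = [goal]
    while path[-1] != origin:
        cell = path[-1]
        # Any neighbor whose label is exactly one smaller lies on a
        # shortest path back to the origin.
        path.append(next(n for n in open_neighbors[cell]
                         if distance.get(n) == distance[cell] - 1))
    path.reverse()
    return path
```

The same routine would serve both uses named above: with is_goal testing for membership in the set of unexplored cells it plans a path to the nearest frontier, and with is_goal testing for the home cell it plans the return path.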

Figure 3. An exhaustively explored maze

Sensors are used to detect which directions are blocked and which direction(s) of travel are unobstructed, thus


providing this necessary information to the wall-following algorithm.

2.3. Developing the Simulation
To verify the effectiveness of the proposed algorithm, a simulation was developed using the guidelines of the Trinity Fire Fighting Robot Contest. The goal of this contest is to build an autonomous robot which will navigate a reconfigurable maze representing a house, find a lit candle, extinguish the flame, and return to the starting location. The physical maze is a 3 m × 3 m square with 33 cm high walls, divided into hallways and rooms. Figure 4 illustrates a possible configuration of a three-room maze. The outer walls are stationary; however, the inner walls that define the hallways and rooms can be changed between simulations and are unknown beforehand. There are two to five rooms in any simulation, and the room position, size, and doorway location can be changed from one simulation to another. A room is defined as having at least a 2 × 2 cell area, where one cell length is approximately 50 cm. A room does not need to be rectangular and may have alcoves or bends. Everything else is a hallway, which may lead to dead ends. The hallways and doorways are approximately 48 cm wide. The final dimensions of the maze yield a maximum of 36 cells that must be explored.
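For concreteness, the sketch below shows one plausible encoding of such a maze on the roughly 50 cm cell grid: known walls are stored as unordered pairs of adjacent cells, and the open-neighbor mapping consumed by the flood-fill sketch above is derived from them. The 6 × 6 dimensions and the example wall set are illustrative assumptions, not the contest specification.

```python
# One plausible maze encoding: a 6 x 6 grid of ~50 cm cells (at most
# 36 cells), with known walls stored as unordered pairs of adjacent cells.
COLS, ROWS = 6, 6

example_walls = {frozenset({(0, 0), (1, 0)}),   # hypothetical wall segments
                 frozenset({(2, 1), (2, 2)})}

def open_neighbors_map(walls):
    """Build the cell -> open-neighbor mapping used by flood_fill_path."""
    nbrs = {}
    for x in range(COLS):
        for y in range(ROWS):
            cands = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            nbrs[(x, y)] = [
                (cx, cy) for cx, cy in cands
                if 0 <= cx < COLS and 0 <= cy < ROWS
                and frozenset({(x, y), (cx, cy)}) not in walls
            ]
    return nbrs
```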

Figure 4. An example of an unknown 2.5D maze

2.4. Simulation Results
The reactive exploration/discovery behavior described above was applied to a mobile robot through computer simulation in the UMBRA simulation framework. The mobile robot chosen for this simulation study was a two-wheel differential-drive vehicle, shown in Figure 4. This robot used eight infrared sensors to detect the surrounding walls, and a compass was simulated to help the robot turn to the desired orientations and travel in a straight line. The robot was tested with a variety of maze configurations, and complete coverage was exhibited in all cases. The robot was also placed in several different starting locations in each of the mazes to test the robustness of the algorithm. Figure 3 shows the map built by the robot when placed in the maze shown in Figure 4.

III. STOCHASTIC LEARNING AUTOMATA FOR OBSTACLE AVOIDANCE
3.1. Approach Summary
One of the main advantages of using a stochastic learning automaton (SLA) to address the obstacle avoidance problem is that the approach requires no a priori knowledge of the environment in which the robot will operate, nor any analytical foreknowledge of the function to be optimized. In the scenario of a robot trying to reach a destination without any prior information about the environment, an optimal path is almost impossible to compute. In these cases a probabilistic approach driven by penalty or reward from the surrounding environment can be a good choice, and it is possible for this type of approach to avoid local minima in which other algorithms would become trapped. A learning automaton is a sequential machine characterized by a set of internal states, input actions, state probability distributions, a reinforcement scheme, and an output function; it is connected via a feedback loop to the environment, as shown in Figure 5.

Figure 5. Automaton operating in a random environment

The probability distribution of the actions, P_{u_i}, is adjusted using a reinforcement scheme to achieve the desired objective. At each step, the performance of the SLA in the environment is evaluated and used to impose either a penalty (unsatisfactory performance, y = 1) or a reward (non-penalty or satisfactory performance, y = 0). A stochastic automaton is a quintuple {Y, Q, U, F, G}, where [11]: Y is the set of inputs from the environment; if Y consists of only two elements, 0 and 1, the environment is said to be a P-model, when the input to the SLA is a member of a finite set in the closed interval [0, 1] the environment is said to be a Q-model, and if the inputs are members of an infinite set in the closed interval [0, 1] the environment is known as an S-model. Q is a finite set of states, Q = {q_1, ..., q_n}; U is a finite set of outputs, U = {u_1, ..., u_m}; F is the next-state function,

q(n + 1) = F[y(n), q(n)],    (1)

and G is the output function, defined as

u(n) = G[q(n)].    (2)


In general, the function F is stochastic and the function G may be deterministic or stochastic. Owing to the stochastic nature of the state transitions, stochastic automata are considered suitable for modeling learning systems. If the output of the automaton is u_j, j = 1, 2, ..., m, the random environment generates a penalty with probability c_j or a non-penalty with probability (1 − c_j). The reinforcement scheme used to update the action probability distribution is as follows [12]. Assume that u(n) = u_i; if y(n) = 0, then

P_{u_i}(n + 1) = (1 − α) P_{u_i}(n) + α,    (3)

and

P_{u_j}(n + 1) = (1 − α) P_{u_j}(n),  (j ≠ i).    (4)

If y(n) = 1, then

P_{u_i}(n + 1) = P_{u_i}(n) − γ (H / (1 − H)) (1 − P_{u_i}(n)),    (5)

and

P_{u_j}(n + 1) = P_{u_j}(n) + γ (H / (1 − H)) P_{u_j}(n),  (j ≠ i),    (6)

where

H = min{P_{u_1}(n), ..., P_{u_m}(n)},    (7)
0 < α < 1,    (8)
0 < γ < 1,    (9)
P_{u_1}(0) = ... = P_{u_m}(0) = 1/m.    (10)

The operation of the SLA is as follows. An action is selected at random according to the current probability distribution; if the action results in a reward, its probability is increased and the probabilities of the other actions are decreased according to Equations (3) and (4), wherein the learning rate is determined by α. If the randomly selected action instead results in a penalty, its probability is decreased and the probabilities of the other actions are increased according to Equations (5) and (6). The penalty or reward is assigned based on the distance to the goal at the current iteration: if the current distance is less than or equal to that of the previous iteration, a reward is given; otherwise the action is penalized.

3.2. Simulation Results
To test the SLA-based obstacle avoidance procedure, a simulation was constructed in the UMBRA simulation framework. The mobile robot chosen for this simulation study was a two-wheel differential-drive vehicle. Five ultrasonic distance sensors were used to detect obstacles and surrounding walls. The goal destination was represented by a ball which may be placed anywhere inside the room, and obstacles were represented by boxes or cylinders of any size and diameter. These obstacles were pseudo-randomly placed inside the room, with an assumed existence of a path to the desired location. Figure 6 shows the starting conditions for the simulation and Figure 7 shows the path taken by the robot to the desired location. It is noted that the robot comes very close to an obstacle; this is a consequence of the probabilistic approach, in which the robot will sometimes take an undesired action and move too close to objects. It has been found with the current level of implementation that the more cluttered the environment, the more bad movements are taken by the robot. These situations were easily fixed by adding a conditional statement that keeps the robot from coming too close to an object.

Figure 6. Simulation of SLA-based obstacle avoidance
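A minimal Python sketch of the reward/penalty update in Equations (3)-(6) follows. The parameter names alpha and gamma correspond to the learning rates in Equations (8) and (9); the four-action set and the robot interface in the usage loop are illustrative assumptions, with move_and_measure_distance standing in for a hypothetical step-and-sense call.

```python
import random

def sla_update(p, i, penalty, alpha=0.1, gamma=0.1):
    """Update the action probabilities p (which sum to 1) after taking
    action i. A reward applies Equations (3)-(4); a penalty applies
    Equations (5)-(6) with H = min(p) as in Equation (7)."""
    if not penalty:                         # y(n) = 0: reward
        p = [(1 - alpha) * pj for pj in p]  # Equation (4) for j != i
        p[i] += alpha                       # Equation (3), chosen action
    else:                                   # y(n) = 1: penalty
        h = min(p)                          # Equation (7)
        c = gamma * h / (1.0 - h)
        new_p = [pj + c * pj for pj in p]   # Equation (6) for j != i
        new_p[i] = p[i] - c * (1.0 - p[i])  # Equation (5), chosen action
        p = new_p
    return p

# Illustrative loop: four candidate headings, uniform start (Equation (10));
# reward whenever the distance to the goal did not increase.
p = [0.25, 0.25, 0.25, 0.25]
prev_dist = float("inf")
for step in range(100):
    a = random.choices(range(len(p)), weights=p)[0]  # sample an action
    dist = move_and_measure_distance(a)  # hypothetical robot interface
    p = sla_update(p, a, penalty=(dist > prev_dist))
    prev_dist = dist
```

Note that both branches preserve the total probability mass, so no renormalization step is needed.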

Figure 7. SLA obstacle avoidance path

IV. CONCLUSION
These simulations have investigated the problems of exploring a completely unknown maze and of avoiding obstacles while moving to a desired location. While the reactive exploration/discovery behavior has been shown to be robust and effective in the simulations, the obstacle avoidance algorithm will undergo future revisions. Although the fundamental SLA-based obstacle avoidance has been shown to work, additional protective layers are needed to make the approach more robust.


V. FUTURE WORK
The next step in this research is to build a small robot able to use these algorithms in a real-world application. A common problem with a small, low-power robot is its limited onboard processing. To address this limitation, the system would include two computational components: a remote computer (laptop) and a robot with a fully functional x86-class onboard computer. The remote computer runs the navigation algorithms, while the onboard computer provides preprocessed sensor information to the remote computer and executes the commands it issues. This approach effectively provides a behavioral decomposition wherein high-level decisions are made by the remote computer and low- to mid-level decisions are made locally. The two systems communicate through a wireless Ethernet link.

The next revision of the simulation involves the use of different sensors. It is desired to simulate a very accurate distance sensor with a range greater than 3 meters, such as a laser range-finder. By sweeping this sensor with a servo or other means, distances can be measured from a single point to multiple points on the surrounding walls. Knowing multiple distances to the same wall allows more information about the maze to be stored in the local map. With an accurate distance to the wall in front of it, the robot can better center itself in a cell while building the map and moving through the maze. Seeing a greater distance also allows multiple cells to be explored from a single location, reducing the number of cells the robot must visit to complete the map and thus decreasing the time needed to explore the entire maze.

Currently, a simple full-reward or full-penalty SLA has been used as a proof-of-concept for obstacle avoidance and has shown promise. The next step in these simulations is to develop a system which allows varying degrees of reward and punishment for all possible actions.

VI. ACKNOWLEDGMENTS
Special thanks to Sandia National Labs for the use of the UMBRA simulation framework. The funding and support for this work was provided by the Intelligent Systems and Robotics Group at New Mexico Tech.

VII. REFERENCES
[1] Voth, D., "A new generation of military robots," IEEE Intelligent Systems, Vol. 19, No. 4, July-Aug. 2004, pp. 2-3.
[2] Murphy, R.R., "Trial by fire: rescue robots," IEEE Robotics & Automation Magazine, Vol. 11, No. 3, Sept. 2004, pp. 50-61.
[3] Katz, D.S. and R.R. Some, "NASA advances robotic space exploration," Computer, Vol. 36, No. 1, Jan. 2003, pp. 52-61.
[4] Wang, L., "Computational intelligence in autonomous mobile robotics - a review," Proceedings of the 2002 International Symposium on Micromechatronics and Human Science, Nagoya, Japan, Oct. 20-23, 2002, pp. 227-235.
[5] Bräunl, T., "Research relevance of mobile robot competitions," IEEE Robotics & Automation Magazine, Vol. 6, No. 4, Dec. 1999, pp. 32-37.
[6] DARPA Grand Challenge, retrieved October 26, 2004 from www.darpa.mil/grandchallenge/

[7] Trinity College Fire Fighting Home Robot Contest, retrieved October 26, 2004 from www.trincoll.edu/events/robot/
[8] Mazl, R. and L. Přeučil, "Building a 2D environment map from laser range-finder data," IEEE Intelligent Vehicles Symposium 2000, Dearborn, MI, USA, Oct. 3-5, 2000, pp. 290-295.
[9] Horn, J. and G. Schmidt, "Continuous localization of a mobile robot based on 3D-laser-range-data, predicted sensor images, and dead-reckoning," Journal of Robotics and Autonomous Systems, Vol. 14, No. 2-3, May 1995, pp. 99-118.
[10] Borenstein, J. and Y. Koren, "Real-time obstacle avoidance for fast mobile robots in cluttered environments," Proceedings of the IEEE International Conference on Robotics and Automation, Cincinnati, OH, USA, May 13-18, 1990, pp. 572-577.
[11] Fu, K.S., "Learning control systems - review and outlook," IEEE Trans. Automatic Control, Vol. AC-15, No. 2, April 1970, pp. 210-221.
[12] Poznyak, A.S. and K. Najim, Learning Automata and Stochastic Optimization, Lecture Notes in Control and Information Sciences 225, Springer-Verlag Telos, 1997.
