http://ai.stanford.edu/~asaxena/rccar/

High Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning
Jeff Michels
Ashutosh Saxena
Andrew Y. Ng

ICML 2005.

Problem

Drive a remote control car at high speeds
Unstructured outdoor environments
Off-the-shelf hardware, inexpensive cameras and little processing power

Vision and Driving Control

ICML 2005.

Jeff Michels, Ashutosh


Prior Work: Vision

Estimating depth from multiple images:
  Stereovision (e.g., Scharstein & Szeliski, 2002)
  Depth from Defocus (e.g., Klarquist et al., 1995)
  Optical Flow/Structure from motion (e.g., Barron et al., 1994)

Motivation #1: Monocular vision.

Stereo vision has limits:
  baseline distance between cameras
  vibration and blur
We would like to explore the use of monocular cues.


Prior Work: Driving Control

Driving
  Stereo-vision for driving (LeCun, 2003)
  Highways with clear lane markings (Pomerleau, 1989)
  Single camera for indoor robot, but known color and texture of ground (Gini & Marchi, 2002)

Motivation #2: Reinforcement learning
  Many past successes used model-based RL.
  Does model-based RL still make sense even for tasks requiring complex perception?
  (To simulate vision input, we need to use computer graphics!)


Approach

Vision System
  Estimate distance to nearest obstacle in each possible steering direction.
Driving Control
  Map from the output of the vision system into steering commands for the car.
  Use reinforcement learning to learn the policy.


Vision System: Training Data

Image divided into vertical columns corresponding to possible steering directions.
Image labeled with depth for each vertical column.
Laser range finder provides ground-truth distances.
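
A minimal sketch of the column labeling described above. It assumes the image is a NumPy array and that the laser scan has already been resampled to one range reading per image column. The function names, the stripe count, and the choice of the nearest reading as the stripe label are all illustrative assumptions, not details from the slides.

```python
import numpy as np

def split_into_stripes(image, n_stripes):
    """Split an H x W (x C) image into n_stripes vertical stripes."""
    return np.array_split(image, n_stripes, axis=1)

def label_stripes(ranges_per_column, n_stripes):
    """Assumed labeling rule: nearest laser range among the columns of each stripe."""
    groups = np.array_split(np.asarray(ranges_per_column, dtype=float), n_stripes)
    return np.array([g.min() for g in groups])

# Toy data: a blank 4 x 8 RGB image and one (made-up) range reading per column.
image = np.zeros((4, 8, 3))
ranges = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0]

stripes = split_into_stripes(image, 4)
labels = label_stripes(ranges, 4)   # one depth label per stripe
```

Each stripe then pairs with one depth label, giving one training example per (image, column).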


Vision System: Monocular Cues

Monocular cues used by humans for depth perception:
  Texture Variations - Laws
  Texture Gradient (Linear Perspective) - Radon, Harris
  Haze - Color
  Occlusion
  Known Object Size

(Loomis, Nature 2001)


Feature Vector: Monocular Cues

Texture Variation
Texture Gradient
Occlusion, Object Size, Global structure

Overlapping windows
Appending adjacent stripes' vectors

The feature vector has 858 dimensions.
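
One way to picture "appending adjacent stripes' vectors" is sketched below. It assumes each stripe already has its own feature vector and concatenates the vectors of its left and right neighbors onto it. The function name, the number of neighbors, and the edge padding at the image borders are hypothetical choices, and the toy dimensions are not meant to reproduce the 858-dimensional vector.

```python
import numpy as np

def append_adjacent(stripe_features, n_neighbors=1):
    """Concatenate each stripe's features with those of its neighbors.

    stripe_features: array of shape (n_stripes, k).
    Returns an array of shape (n_stripes, k * (2 * n_neighbors + 1)).
    Border stripes reuse their own features for the missing neighbor
    (an assumed padding rule).
    """
    f = np.asarray(stripe_features, dtype=float)
    padded = np.pad(f, ((n_neighbors, n_neighbors), (0, 0)), mode="edge")
    n = f.shape[0]
    # Shifted views: left neighbors first, then the stripe itself, then right neighbors.
    parts = [padded[i:i + n] for i in range(2 * n_neighbors + 1)]
    return np.concatenate(parts, axis=1)

features = [[1, 2], [3, 4], [5, 6]]      # 3 stripes, 2 toy features each
augmented = append_adjacent(features)     # 3 stripes, 6 features each
```

The neighboring context lets a stripe's depth estimate draw on more of the scene than its own column.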


Learning Algorithm

Supervised learning to estimate the distance d in each column of the image.
Learn weights w via ordinary least squares with quadratic cost:

$$\arg\min_w \sum_i \left(d_i - w^T x_i\right)^2$$

where d_i is the depth, w the weights, x_i the features, and i ranges over columns and images.

Other regression methods (SVR, robust regression) gave similar results.
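
The least-squares fit above can be sketched with NumPy. The data here is synthetic and noiseless, standing in for real (feature, depth) pairs. The function name and the five-dimensional toy features are assumptions made for illustration only.

```python
import numpy as np

def fit_depth_weights(X, d):
    """Return w minimizing sum_i (d_i - w^T x_i)^2 via ordinary least squares."""
    w, *_ = np.linalg.lstsq(X, d, rcond=None)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                    # one row per (image, column) pair
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])    # made-up generating weights
d = X @ w_true                                   # noiseless synthetic depths

w = fit_depth_weights(X, d)
d_hat = X @ w                                    # predicted depth per column
```

With noiseless data and more examples than features, the fit recovers the generating weights up to floating-point error.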


Results: Learning Depth Estimates

Errors on a log scale: $E = |\log_{10}(d) - \log_{10}(d_{\text{estimated}})|$

Able to predict depth with an average error of 0.26 orders of magnitude.

[Figure: bar chart of error on a log scale (axis ticks from 0.2 to 0.4) for four feature sets: Radon (Texture Gradient), Harris (Texture Gradient), Laws (Texture Variations), and All.]
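
The log-scale error can be computed as in the following sketch. The function name and the sample distances are made up for illustration. An error of 1.0 means the estimate is off by a factor of ten, so an average of 0.26 corresponds to a factor of roughly 1.8.

```python
import numpy as np

def log10_error(d_true, d_est):
    """Per-example error E = |log10(d) - log10(d_estimated)|."""
    d_true = np.asarray(d_true, dtype=float)
    d_est = np.asarray(d_est, dtype=float)
    return np.abs(np.log10(d_true) - np.log10(d_est))

# Made-up distances in meters: two estimates off by 10x, one exact.
errors = log10_error([10.0, 100.0, 5.0], [1.0, 100.0, 50.0])
mean_error = errors.mean()
```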

Synthetic Graphics Data

Graphics images for training the vision system.
Variable degree of graphical realism.
Can a system trained on synthetic images predict distances on real images?


Results: Combined Vision System

When the distance to the nearest obstacle in the chosen direction is less than 5 m, it is a hazard.
Hazard rate improves by combining the systems trained on real and synthetic images.
24% hazard rate reduction over using only real images.
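
A hedged sketch of how a hazard rate could be measured from the definition above. It assumes the chosen direction is the one with the largest predicted distance, which the slide does not state. The function name and the toy distances are illustrative.

```python
import numpy as np

HAZARD_DISTANCE_M = 5.0  # threshold stated on the slide

def hazard_rate(predicted, true, threshold=HAZARD_DISTANCE_M):
    """Fraction of frames where the true distance in the chosen direction is below threshold.

    predicted, true: arrays of shape (n_frames, n_directions), in meters.
    Assumption: the chosen direction is the argmax of the predicted distances.
    """
    predicted = np.asarray(predicted, dtype=float)
    true = np.asarray(true, dtype=float)
    chosen = predicted.argmax(axis=1)
    true_in_chosen = true[np.arange(len(true)), chosen]
    return float(np.mean(true_in_chosen < threshold))

# Three made-up frames with three steering directions each.
predicted = [[2, 9, 4], [8, 3, 1], [1, 2, 7]]
true = [[3, 10, 6], [4, 6, 2], [2, 3, 12]]
rate = hazard_rate(predicted, true)   # only the second frame is a hazard
```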


Control: Reinforcement Learning

Model-based RL -- hard perception problem
Randomly generated environment in graphics simulator
PEGASUS (Ng & Jordan, 2000) to learn control policy
Car initialized at (0,0) and run for a fixed time horizon.
Learning algorithm converged after 1674 iterations of policy search.
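
The core idea of PEGASUS is to fix the random numbers used by the simulator (the "scenarios"), so that a policy's estimated cost becomes a deterministic function of its parameters and can be optimized by ordinary search. The sketch below illustrates this on a toy problem. The one-dimensional simulator, the single gain parameter, and the hill-climbing search are all stand-ins chosen for brevity, not the graphics simulator or search procedure from the talk.

```python
import numpy as np

def simulate(theta, seed, horizon=50):
    """Toy simulator (hypothetical): lateral offset x under noisy drift.

    The seed fixes the whole noise sequence, i.e. one PEGASUS-style scenario.
    """
    rng = np.random.default_rng(seed)
    x, cost = 0.0, 0.0                 # start at the origin, fixed time horizon
    for _ in range(horizon):
        x = x - theta * x + rng.normal(scale=0.1)   # steer back with gain theta
        cost += x * x
    return cost

def evaluate(theta, scenarios):
    """Deterministic cost estimate: average over the same fixed scenarios."""
    return float(np.mean([simulate(theta, s) for s in scenarios]))

def policy_search(scenarios, theta0=0.0, step=0.25, iters=40):
    """Simple hill climbing on the deterministic estimate (assumed search method)."""
    theta, best = theta0, evaluate(theta0, scenarios)
    for _ in range(iters):
        improved = False
        for candidate in (theta - step, theta + step):
            cost = evaluate(candidate, scenarios)
            if cost < best:
                theta, best, improved = candidate, cost, True
        if not improved:
            step /= 2                  # refine the search once no neighbor helps
    return theta, best

scenarios = list(range(20))            # 20 fixed random seeds
theta_star, cost_star = policy_search(scenarios)
```

Because the scenarios are reused for every candidate, two evaluations of the same parameters always agree, which is what makes the comparison between candidates meaningful.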


Reinforcement Learning: Parameters

1: spatial smoothing of predicted distances
2: threshold distance for evasive action
3: steering angle parameter
4, 5: evasive action parameters
6: throttle parameter
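
To make the role of the six parameters more concrete, here is a purely illustrative controller. The slides name the parameters but not their functional form, so everything below is an assumption: the Gaussian smoothing, the argmax steering rule, the meaning assigned to parameters 4 and 5, and all names and values.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PolicyParams:
    smoothing_sigma: float     # 1: spatial smoothing of predicted distances
    evasive_threshold: float   # 2: threshold distance for evasive action
    steering_gain: float       # 3: steering angle parameter
    evasive_steer: float       # 4: evasive action parameter (assumed meaning)
    evasive_throttle: float    # 5: evasive action parameter (assumed meaning)
    throttle: float            # 6: throttle parameter

def smooth(distances, sigma):
    """Gaussian-weighted average across neighboring steering directions."""
    d = np.asarray(distances, dtype=float)
    idx = np.arange(len(d))
    weights = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ d

def control(distances, angles, p):
    """Return (steering, throttle) from per-direction distance estimates."""
    smoothed = smooth(distances, p.smoothing_sigma)
    best = int(np.argmax(smoothed))
    if smoothed[best] < p.evasive_threshold:
        # Nothing looks clear enough: hard turn at reduced throttle.
        direction = 1.0 if angles[best] >= 0 else -1.0
        return p.evasive_steer * direction, p.evasive_throttle
    return float(p.steering_gain * angles[best]), p.throttle

params = PolicyParams(0.5, 5.0, 0.8, 1.0, 0.2, 0.9)
angles = [-0.4, -0.2, 0.0, 0.2, 0.4]                       # radians, made up
clear = control([3, 4, 20, 6, 2], angles, params)          # open road ahead
blocked = control([1, 2, 3, 2, 1], angles, params)         # everything is close
```

In a policy search setting, the six fields of `PolicyParams` would be the quantities being tuned.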


Results: Actual Driving Experiments


Results: Driving Times


Summary

Monocular depth estimation is an interesting and important problem.
Supervised learning for depth estimation.
Model-based RL, using computer graphics simulator, to learn controller.


Extensions/Future Work

Learn complete depth maps.
Markov Random Field (MRF) to estimate depths.

Learning depth from single monocular images, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In NIPS 2005.

[Figure panels: Image, Ground Truth, Predicted]

[also with Sung Chung.]

Contact:
Ashutosh Saxena, asaxena@cs.stanford.edu

http://ai.stanford.edu/~asaxena/rccar/
http://ai.stanford.edu/~asaxena/learningdepth/
