Welcome to the ALVINN Project Home Page
Investigators: Dean Pomerleau and Todd Jochem
ALVINN (Autonomous Land Vehicle In a Neural Network) is a perception system
which learns to control the
NAVLAB vehicles by watching a person drive. ALVINN's architecture consists of a
single hidden layer back-propagation network. The input layer of the network is
a 30x32 unit two-dimensional "retina" which receives input from the vehicle's
video camera. Each input unit is fully connected to a layer of five hidden units
which are in turn fully connected to a layer of 30 output units. The output
layer is a linear representation of the direction the vehicle should travel in
order to keep the vehicle on the road.
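The layer sizes above can be made concrete with a short sketch. The sizes (30x32 retina, five hidden units, 30 output units) come from the description; the function name and the optional bias terms are illustrative assumptions, since the text does not say whether the units have biases.

```python
# Layer sizes taken from the description above.
RETINA_ROWS, RETINA_COLS = 30, 32
N_INPUT = RETINA_ROWS * RETINA_COLS  # 960 input units
N_HIDDEN = 5
N_OUTPUT = 30


def count_weights(n_in, n_hidden, n_out, with_bias=True):
    """Count trainable parameters in a fully connected one-hidden-layer net.

    Bias terms are an assumption; the text only mentions connections.
    """
    input_to_hidden = n_in * n_hidden
    hidden_to_output = n_hidden * n_out
    biases = (n_hidden + n_out) if with_bias else 0
    return input_to_hidden + hidden_to_output + biases


print(count_weights(N_INPUT, N_HIDDEN, N_OUTPUT, with_bias=False))  # 4950
print(count_weights(N_INPUT, N_HIDDEN, N_OUTPUT))                   # 4985
```

Nearly all of the connections sit between the retina and the hidden layer, which is why such a small hidden layer keeps the network compact.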
To drive the vehicle, a video image from the onboard camera is injected into
the input layer. Activation is passed forward through the network and a steering
command is read off the output layer. The most active output unit determines the
direction in which to steer.
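The driving step described above might be sketched as follows. This is a minimal illustration, not the ALVINN code: the sigmoid activation, the random weights standing in for a trained network, the random "image", and the mapping of output unit 0 to hard left and unit 29 to hard right are all assumptions made for the example.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 30 * 32, 5, 30


def sigmoid(x):
    """Numerically stable logistic function (activation choice is assumed)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def make_weights(n_rows, n_cols, rng):
    """Small random weights; a trained network would supply real values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_cols)] for _ in range(n_rows)]


def forward(image, w_hidden, w_output):
    """Pass activation forward: retina -> hidden layer -> output layer."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, image))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_output]
    return hidden, output


def steering_from_output(output):
    """Pick the most active output unit and map it linearly to [-1, +1].

    The convention -1 = hard left, +1 = hard right is hypothetical.
    """
    best = max(range(len(output)), key=lambda i: output[i])
    return best, -1.0 + 2.0 * best / (len(output) - 1)


rng = random.Random(0)
w_hidden = make_weights(N_HIDDEN, N_INPUT, rng)
w_output = make_weights(N_OUTPUT, N_HIDDEN, rng)
image = [rng.random() for _ in range(N_INPUT)]  # stand-in for a camera frame

hidden, output = forward(image, w_hidden, w_output)
unit, steer = steering_from_output(output)
print(unit, steer)
```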
To teach the network to steer, ALVINN is shown video images from the onboard
camera as a person drives, and told it should output the steering direction in
which the person is currently steering. The back-propagation algorithm alters
the strengths of connections between the units so that the network produces the
appropriate steering response when presented with a video image of the road
ahead of the vehicle. After about 3 minutes of watching a person drive, ALVINN
is able to take over and continue driving on its own.
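A single training update of this kind can be sketched as below. The sketch assumes sigmoid units, a squared-error loss, a learning rate of 0.01, and a one-hot target on the output unit matching the person's steering direction; none of these details are stated above, and the target encoding in particular is a simplification chosen for the example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function (activation choice is assumed)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def forward(image, w_hidden, w_output):
    hidden = [sigmoid(sum(w * x for w, x in zip(row, image))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_output]
    return hidden, output


def train_step(image, target, w_hidden, w_output, lr=0.01):
    """One back-propagation update, in place.

    Returns the squared error measured before the update.
    """
    hidden, output = forward(image, w_hidden, w_output)
    # Output deltas: error times the sigmoid derivative o * (1 - o).
    d_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
    # Hidden deltas: output deltas propagated back through the old weights.
    d_hid = [
        h * (1.0 - h) * sum(d_out[k] * w_output[k][j] for k in range(len(d_out)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_output):
        for j in range(len(row)):
            row[j] -= lr * d_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * d_hid[j] * image[i]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))


rng = random.Random(0)
n_in, n_hid, n_out = 30 * 32, 5, 30
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
w_output = [[rng.uniform(-0.1, 0.1) for _ in range(n_hid)] for _ in range(n_out)]

image = [rng.random() for _ in range(n_in)]               # stand-in camera frame
target = [1.0 if k == 20 else 0.0 for k in range(n_out)]  # person steers at unit 20

errors = [train_step(image, target, w_hidden, w_output) for _ in range(100)]
print(errors[0], errors[-1])
```

In practice the updates would run over a stream of (image, steering) pairs collected while the person drives, rather than over one repeated frame as in this toy loop.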
Because it is able to learn what image features are important for particular
driving situations, ALVINN has been successfully trained to drive in a wider
variety of situations than other autonomous navigation systems which require
fixed, predefined features (like the road's center line) for accurate driving.
The situations ALVINN networks have been trained to handle include single lane
dirt roads, single lane paved bike paths, two lane suburban neighborhood
streets, and lined divided highways. In this last domain, ALVINN has
successfully driven autonomously at speeds of up to 70 mph, and for distances of
over 90 miles on a public highway north of Pittsburgh.
Specialized networks are trained for each new road type. Each network is
trained not only to output the correct direction to steer, but also an estimate
of its own reliability. ALVINN uses these reliability estimates to select the most
appropriate network for the current road type, and to switch networks as the
road type changes.
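The selection step can be illustrated with a small sketch. How the reliability estimate is computed is not described here, so the sketch treats it as a number each network already reports; the class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NetworkOutput:
    """What one specialized network reports for the current image (hypothetical)."""
    road_type: str
    steering: float      # steering command proposed by this network
    reliability: float   # the network's own reliability estimate


def select_network(outputs):
    """Return the output of the network with the highest reliability estimate."""
    return max(outputs, key=lambda o: o.reliability)


# Made-up values for a single camera frame.
frame_outputs = [
    NetworkOutput("dirt road", steering=-0.20, reliability=0.31),
    NetworkOutput("bike path", steering=0.05, reliability=0.48),
    NetworkOutput("divided highway", steering=0.02, reliability=0.93),
]

chosen = select_network(frame_outputs)
print(chosen.road_type, chosen.steering)
```

Re-running the selection on every frame gives the switching behavior described above: when the road type changes, a different network's estimate becomes the highest.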
The current challenge for vision based on-road navigation researchers is to
create systems that maintain the performance of the existing lane keeping
systems, while adding the ability to execute tactical level driving tasks like
lane transition and intersection detection and navigation.
There are many ways to add tactical functionality to a driving system.
Solutions range from developing task specific software modules to grafting
additional functionality onto a basic lane keeping system. Solutions like these
are problematic because they either make reuse of acquired knowledge difficult
or impossible, or preclude the use of alternative lane keeping systems.
A more desirable solution is to develop a robust, lane keeper independent
control scheme that provides the functionality to execute tactical actions.
Based on this hypothesis, techniques that are used to execute tactical level
driving tasks should:
- Be based on a single framework that is applicable to a variety of tactical
level actions,
- Be extensible to other vision based lane keeping systems, and
- Require little or no modification of the lane keeping system with which they
are being used.
A framework called Virtual Active Vision has been developed. It provides this
functionality through intelligent control of the visual information presented
to the lane keeping system. Novel
solutions based on this framework for two classes of tactical driving tasks,
lane transition and intersection detection and traversal, are presented in
detail. Specifically, algorithms which allow the ALVINN lane keeping system to
robustly execute lane transition maneuvers like lane changing, entrance and exit
ramp detection and traversal, and obstacle avoidance have been tested.
Additionally, with the aid of active camera control, the ALVINN system enhanced
with Virtual Active Vision tools can successfully detect and navigate basic road
intersections.
tjochem@ri.cmu.edu
pomerlea@cs.cmu.edu
References
- Jochem, Todd M., Pomerleau, Dean A., and Thorpe, Charles E. "Vision Guided Lane
Transition," IEEE Symposium on Intelligent Vehicles, September 25-26, 1995,
Detroit, Michigan, USA.
- Jochem, Todd M., Pomerleau, Dean A., and Thorpe, Charles E. "Vision-Based Neural
Network Road and Intersection Detection and Traversal," IEEE Conference on
Intelligent Robots and Systems, August 5-9, 1995, Pittsburgh, Pennsylvania,
USA.
- Jochem, Todd M. "Using Virtual Active Vision Tools to Improve Autonomous
Driving Tasks," Thesis Proposal. (1.2 Mbyte PostScript file containing color
images; it prints on black and white printers, but the pages are in reverse
order. Also available as an HTML page, with lower quality images but much
faster loading.)
- Jochem, Todd M., Pomerleau, Dean A., and Thorpe, Charles E.
(February 1993) "MANIAC: A Next Generation Neurally Based Autonomous Road
Follower," Proceedings of the International Conference on Intelligent
Autonomous Systems: IAS-3, Pittsburgh, Pennsylvania, USA. Also appears in the
Proceedings of the Image Understanding Workshop, April 1993, Washington D.C.,
USA. (38 Kbyte .ps file.)
Note: Pomerleau's thesis on the ALVINN system, titled "Neural Network
Perception for Mobile Robot Guidance," is available only in book form.