Welcome to the ALVINN Project Home Page

Investigators: Dean Pomerleau and Todd Jochem


ALVINN (Autonomous Land Vehicle In a Neural Network) is a perception system which learns to control the NAVLAB vehicles by watching a person drive. ALVINN's architecture consists of a single hidden layer back-propagation network. The input layer of the network is a 30x32 unit two-dimensional "retina" which receives input from the vehicle's video camera. Each input unit is fully connected to a layer of five hidden units which are in turn fully connected to a layer of 30 output units. The output layer is a linear representation of the direction the vehicle should travel in order to keep the vehicle on the road.

To drive the vehicle, a video image from the onboard camera is injected into the input layer. Activation is passed forward through the network and a steering command is read off the output layer. The most active output unit determines the direction in which to steer.
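As a rough illustration of the architecture and of the feed-forward steering computation just described, here is a minimal NumPy sketch. The layer sizes come from the description above; everything else (weight initialization, the sigmoid activation, the mapping from the most active output unit to a steering angle) is an assumption made for the example, not a detail of the actual ALVINN implementation.

    import numpy as np

    # Layer sizes taken from the description above: a 30x32 input retina,
    # 5 hidden units, and 30 output units spanning the steering range.
    N_INPUT, N_HIDDEN, N_OUTPUT = 30 * 32, 5, 30

    rng = np.random.default_rng(0)

    # Small random initial weights (the initialization scheme is an assumption).
    W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_INPUT))
    b1 = np.zeros(N_HIDDEN)
    W2 = rng.normal(0.0, 0.1, (N_OUTPUT, N_HIDDEN))
    b2 = np.zeros(N_OUTPUT)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(image):
        """Pass a 30x32 camera image forward through the network."""
        x = image.reshape(-1)               # flatten the retina to 960 inputs
        hidden = sigmoid(W1 @ x + b1)       # fully connected input -> hidden
        output = sigmoid(W2 @ hidden + b2)  # fully connected hidden -> output
        return hidden, output

    def steering_command(image, max_angle_deg=30.0):
        """Read a steering direction off the output layer.

        The most active output unit picks the direction; mapping the unit
        index to an angle in [-max_angle_deg, +max_angle_deg] is assumed.
        """
        _, output = forward(image)
        best = int(np.argmax(output))
        return max_angle_deg * (2.0 * best / (N_OUTPUT - 1) - 1.0)

    # Example: a dummy video frame produces some steering angle.
    frame = rng.random((30, 32))
    print(steering_command(frame))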

To teach the network to steer, ALVINN is shown video images from the onboard camera as a person drives, and told it should output the steering direction in which the person is currently steering. The back-propagation algorithm alters the strengths of connections between the units so that the network produces the appropriate steering response when presented with a video image of the road ahead of the vehicle. After about 3 minutes of watching a person drive, ALVINN is able to take over and continue driving on its own.
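The training idea can be sketched as below: the person's current steering direction becomes a target for the output layer, and back-propagation adjusts the connection strengths. The one-hot target encoding, squared-error loss, learning rate, and per-example update in this sketch are assumptions made for illustration only; they are not claimed to match the actual ALVINN training procedure.

    import numpy as np

    # Same small network shape as in the previous sketch.
    N_INPUT, N_HIDDEN, N_OUTPUT = 30 * 32, 5, 30
    rng = np.random.default_rng(0)
    W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_INPUT)); b1 = np.zeros(N_HIDDEN)
    W2 = rng.normal(0.0, 0.1, (N_OUTPUT, N_HIDDEN)); b2 = np.zeros(N_OUTPUT)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_step(image, person_steering_unit, lr=0.1):
        """One back-propagation update from a single (image, steering) example.

        person_steering_unit is the index of the output unit corresponding
        to the direction the person is currently steering.  A one-hot target
        and squared-error loss are assumptions made for this sketch.
        """
        global W1, b1, W2, b2
        x = image.reshape(-1)

        # Forward pass.
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)

        # Target: the network should output the person's steering direction.
        target = np.zeros(N_OUTPUT)
        target[person_steering_unit] = 1.0

        # Backward pass: standard back-propagation for a sigmoid MLP
        # with squared-error loss.
        delta_out = (y - target) * y * (1.0 - y)
        delta_hid = (W2.T @ delta_out) * h * (1.0 - h)

        # Gradient-descent updates to the connection strengths.
        W2 -= lr * np.outer(delta_out, h)
        b2 -= lr * delta_out
        W1 -= lr * np.outer(delta_hid, x)
        b1 -= lr * delta_hid

    # Training loop over the (frame, steering unit) pairs recorded while
    # the person drives; the list is left empty here as a placeholder.
    for frame, unit in []:
        train_step(frame, unit)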

Because it is able to learn what image features are important for particular driving situations, ALVINN has been successfully trained to drive in a wider variety of situations than other autonomous navigation systems which require fixed, predefined features (like the road's center line) for accurate driving. The situations ALVINN networks have been trained to handle include single lane dirt roads, single lane paved bike paths, two lane suburban neighborhood streets, and lined divided highways. In this last domain, ALVINN has successfully driven autonomously at speeds of up to 70 mph, and for distances of over 90 miles on a public highway north of Pittsburgh.

Specialized networks are trained for each new road type. The networks are trained to output not only the correct direction to steer, but also an estimate of their own reliability. ALVINN uses these reliability estimates to select the most appropriate network for the current road type, and to switch networks as the road type changes.
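The reliability-based arbitration can be pictured as in the sketch below. Representing each road-specific network as a function that returns a steering command plus a reliability estimate, and the simple "pick the most reliable" rule, are assumptions made for illustration; the actual switching logic is not described in detail here.

    import numpy as np

    def arbitrate(networks, frame):
        """Use the steering command of the network that reports the highest
        reliability for the current frame.

        `networks` maps a road-type name to a function returning
        (steering_command, reliability) for an image; this interface is an
        assumption made for the sketch.
        """
        best_name, best_steering, best_reliability = None, 0.0, -1.0
        for name, net in networks.items():
            steering, reliability = net(frame)
            if reliability > best_reliability:
                best_name, best_steering, best_reliability = name, steering, reliability
        return best_name, best_steering

    # Example with two stand-in "networks": as the road type changes, the
    # network reporting higher reliability takes over.
    networks = {
        "two_lane_street": lambda img: (2.0, float(img.mean())),
        "divided_highway": lambda img: (-1.0, float(1.0 - img.mean())),
    }
    frame = np.full((30, 32), 0.7)
    print(arbitrate(networks, frame))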

The current challenge for vision-based on-road navigation researchers is to create systems that maintain the performance of the existing lane keeping systems, while adding the ability to execute tactical-level driving tasks like lane transition and intersection detection and navigation.

There are many ways to add tactical functionality to a driving system. Solutions range from developing task-specific software modules to grafting additional functionality onto a basic lane keeping system. Solutions like these are problematic because they either make reuse of acquired knowledge difficult or impossible, or preclude the use of alternative lane keeping systems.

A more desirable solution is to develop a robust, lane-keeper-independent control scheme that provides the functionality to execute tactical actions. Based on this hypothesis, techniques used to execute tactical-level driving tasks should be independent of the particular lane keeping system in use.

A framework called Virtual Active Vision has been developed which provides this functionality through intelligent control of the visual information presented to the lane keeping system. Novel solutions based on this framework for two classes of tactical driving tasks, lane transition and intersection detection and traversal, are presented in detail. Specifically, algorithms which allow the ALVINN lane keeping system to robustly execute lane transition maneuvers like lane changing, entrance and exit ramp detection and traversal, and obstacle avoidance have been tested. Additionally, with the aid of active camera control, the ALVINN system enhanced with Virtual Active Vision tools can successfully detect and navigate basic road intersections.
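One way to picture the Virtual Active Vision idea is as a "virtual camera" that selects and resamples part of the incoming image before the lane keeper ever sees it; shifting that window, for example toward an adjacent lane over successive frames, lets an unmodified lane keeper carry out a tactical maneuver. The cropping-and-resampling scheme below is only a hedged illustration of that idea, not the actual algorithms referred to above.

    import numpy as np

    def virtual_camera(frame, row0, col0, height, width, out_shape=(30, 32)):
        """Cut a sub-window out of the full camera frame and resample it to
        the 30x32 retina the lane keeper expects (nearest-neighbour
        resampling is an assumption made for this sketch).
        """
        window = frame[row0:row0 + height, col0:col0 + width]
        rows = np.linspace(0, window.shape[0] - 1, out_shape[0]).astype(int)
        cols = np.linspace(0, window.shape[1] - 1, out_shape[1]).astype(int)
        return window[np.ix_(rows, cols)]

    # Example: a full-resolution frame and two virtual views of it.  Keeping
    # the window centred supports ordinary lane keeping; sliding it sideways
    # would present the neighbouring lane to the lane keeper without
    # modifying the lane keeper itself.
    full_frame = np.random.default_rng(0).random((240, 256))
    centered_view = virtual_camera(full_frame, 120, 64, 100, 128)
    shifted_view = virtual_camera(full_frame, 120, 112, 100, 128)
    print(centered_view.shape, shifted_view.shape)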

tjochem@ri.cmu.edu

pomerlea@cs.cmu.edu

References


Note: Pomerleau's thesis on the ALVINN system titled Neural Network Perception for Mobile Robot Guidance is available only in book form.


