ELECTRONIC MUSICIAN july 1990 copyright pp 32,35,36,38,40. excerpts 1232w

Into New Worlds: Virtual Reality and the Electronic Musician by David (Rudy) Trubitt


In the NASA-Ames virtual system, a device called the "Convolvotron" creates three-dimensional sound within a pair of normal stereo headphones. Up to four discrete audio channels can be individually placed and/or moved in an imaginary sphere surrounding the listener. As with VR video displays, the perceived location of the sound remains constant regardless of head position. The Convolvotron is a two-board set that works with IBM PCs.

Work on this device began in 1986, when Scott Fisher, project leader for the VR VIEW system at NASA-Ames, asked perceptual psychologist Elizabeth Wenzel about the feasibility of adding 3-D sound to NASA's VR system. Dr. Wenzel decided that it was possible and enlisted the aid of Professor Fred Wightman (currently at the University of Wisconsin) and Scott Foster, president of Crystal River Engineering, to develop the system. Professor Wightman was known for his highly accurate measurements of the ear canal, while Scott Foster had the necessary background to design the hardware. Besides functioning as a 3-D sound source for VR use, the Convolvotron also was designed as an aid to psychoacoustical research.

Before jumping into details on how the Convolvotron works, you need to understand some basic psychoacoustic principles (see "An Ear for Processing" on p. 66 for more). We locate sounds in space by using small differences in time, phase, and amplitude of the sound that reaches each eardrum. These differences are caused by several factors: the direction we are facing in relation to the sound source, the acoustic space surrounding the listener and source, and the shape of each person's outer and inner ear. The end result is that none of us hears things in quite the same way.

Although differences in each person's inner and outer ear were long suspected to be significant, they were hard to quantify. By using Fred Wightman's precise measurements, the Convolvotron can account for them. To make the measurements, the user is seated in an anechoic (echo free) chamber and a tiny probe mic is placed inside each ear canal, next to the eardrum. Then a test tone is played from 144 different locations surrounding the subject, and the "impulse response" at each eardrum is measured. The impulse response completely characterizes the direct and reflected sound reaching the eardrum. The sum of these measurements, called a "Head Related Transfer Function" (HRTF), contains the aural cues used to determine sound location. The HRTF of a specific user can be fed into the Convolvotron and used to synthesize 3-D sound.

The four sounds going into the Convolvotron are processed through one parallel array containing 128 multiply/accumulators that are configured as tapped delay lines. Each sound is "placed" in space by a Finite Impulse Response filter whose settings are determined by the HRTF measurements. When a sound is moved, it does not "snap" between measured points. Instead, the four nearest measured points are used to interpolate the response for the unmeasured points, allowing smooth motion of sounds.
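The placement and interpolation steps above can be sketched in a few lines of Python. This is only an illustration, not the Convolvotron's actual design: the article does not say how the four nearest measurements are weighted, so the bilinear weighting below is an assumption, and the function names and the tiny four-tap responses are made up for the example.

```python
def bilinear_weights(fx, fy):
    """Weights for the four corners of a grid cell.

    fx and fy are the fractional position (0.0 to 1.0) of the target
    direction between its measured neighbors. Assumed scheme, not taken
    from the article.
    """
    return [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]


def interpolate_response(neighbors, weights):
    """Blend measured impulse responses into one for an unmeasured point."""
    total = sum(weights)
    length = len(neighbors[0])
    return [
        sum(w * h[i] for w, h in zip(weights, neighbors)) / total
        for i in range(length)
    ]


def fir_filter(signal, taps):
    """Tapped delay line: each output is a sum of delayed inputs times taps."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


# Four toy impulse responses standing in for the nearest measured points.
measured = [
    [1.0, 0.5, 0.0, 0.0],
    [0.8, 0.6, 0.1, 0.0],
    [0.6, 0.4, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
]

# A source halfway between all four points gets an equal blend of them.
taps = interpolate_response(measured, bilinear_weights(0.5, 0.5))
filtered = fir_filter([1.0, 0.0, 0.0, 0.0], taps)
```

In a real system one such filter would run per ear and per source, and the weights would be updated as the source or the listener's head moves.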

Inside a virtual reality, the Convolvotron can make sounds seem to come from within an object. Also, localized (3-D) audio cues can be used to highlight information in a crowded visual field, such as an air traffic control display. Real-world sound can be processed, as can synthesized sound generated by the MIDI capabilities of NASA's Auditory Display System (more on NASA and MIDI later).

According to Scott Foster, the Convolvotron can simulate some aspects of room acoustics more accurately than conventional digital reverbs. Instead of using recirculation (feedback) to create reverb, the Convolvotron calculates every direct and reflected path that reaches the user's ears.
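To make the contrast with feedback-based reverb concrete, here is a minimal Python sketch of the path-by-path idea: every direct or reflected path contributes one delayed, attenuated copy of the sound, and no signal is recirculated. The path list, the 1/distance loss, and the function name are assumptions chosen for illustration; the article does not describe how the Convolvotron computes its paths.

```python
SPEED_OF_SOUND = 343.0  # meters per second, approximate at room temperature


def path_impulse_response(paths, sample_rate, length):
    """Build an impulse response from explicit sound paths.

    paths is a list of (distance_in_meters, reflection_gain) pairs.
    Each path adds a single spike: delayed by its travel time and
    scaled by its gain and a simple 1/distance spreading loss.
    """
    response = [0.0] * length
    for distance, gain in paths:
        delay = round(distance / SPEED_OF_SOUND * sample_rate)
        if delay < length:
            response[delay] += gain / distance
    return response


# One direct path and one wall reflection that travels twice as far.
paths = [(3.43, 1.0), (6.86, 0.5)]
response = path_impulse_response(paths, sample_rate=1000, length=32)
spikes = [i for i, value in enumerate(response) if value != 0.0]
```

Filtering a dry sound with a response like this gives the early reflections of the room; a recirculating reverb would instead feed its own output back into a delay.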


In 1988, Phil Stone of Sterling Software, a NASA subcontractor, began to work on a virtual Theremin at NASA-Ames Research Center. He was joined by Mark Bolas later that year, and the instrument they developed is among the first of its kind.

NASA's instrument eliminates the need for the antennae of a normal Theremin (see sidebar "Look Ma, No Hands!" for more on Theremins). Instead, geometric objects are swept in free space, with X, Y, and Z location determining their sound (see Fig. 4). This visual representation of sound is an important feature of the NASA instrument.

When the user raises a finger, a new object is created in the center of the VR view. This object represents an oscillator whose pitch is determined by its left-right position, its amplitude by distance, and its timbre (filter cutoff) by height. The object follows the movement of the user's arm and hand through space, so as it moves, its pitch, loudness, and brightness change correspondingly. The system itself doesn't create any sound, but instead generates MIDI controllers, which are used to play connected MIDI modules--in this case, a pair of Ensoniq ESQ-Ms.
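A position-to-controller mapping of this kind might look like the Python sketch below. The Control Change byte layout is standard MIDI, but everything else is assumed for illustration: the article does not say which controller numbers the NASA system sends, how positions are scaled, or whether nearer objects are louder or quieter.

```python
# Hypothetical controller assignments; the article does not list the real ones.
PITCH_CC = 16       # general-purpose controller, assumed to drive pitch
VOLUME_CC = 7       # channel volume
BRIGHTNESS_CC = 74  # commonly used for filter cutoff / brightness


def to_7bit(value):
    """Clamp a normalized 0.0-1.0 value and scale it to MIDI's 0-127 range."""
    clamped = max(0.0, min(1.0, value))
    return round(clamped * 127)


def control_change(channel, controller, value):
    """Build a 3-byte MIDI Control Change message (channel is 0-15)."""
    return bytes([0xB0 | (channel & 0x0F), controller & 0x7F, value & 0x7F])


def object_to_midi(x, y, z, channel):
    """Map a normalized object position to three controller messages.

    x (left-right) drives pitch, z (distance) drives amplitude, and
    y (height) drives timbre, following the article's description.
    """
    return [
        control_change(channel, PITCH_CC, to_7bit(x)),
        control_change(channel, VOLUME_CC, to_7bit(z)),
        control_change(channel, BRIGHTNESS_CC, to_7bit(y)),
    ]


# An object at the far right, at the bottom of the view, at full distance.
messages = object_to_midi(1.0, 0.0, 1.0, channel=1)
```

Giving each object its own channel argument keeps the objects independent, so a synthesizer module can respond to each one separately.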

Repeating the raised-finger gesture makes another object appear, again in the center of the view area. Both oscillators then follow the user's motion in parallel, and the two pitches beat against each other as they are swept through the frequency, amplitude, and timbre space. Additional objects can be created, with each sending data on its own MIDI channel.

Another NASA instrument could be considered a virtual drum kit. With this instrument, MIDI notes are triggered when the user's hand passes through the surface of a floating drum head. These heads can be arranged in any 3-D pattern, allowing a wave of your hand to trigger a number of sounds with a single, sweeping gesture. Plus, it's pretty novel watching your hand pass through the skin of a drum head as if you were dipping it into a pool of water.

The virtual Theremin and drum were both designed to demonstrate the use of sound in VR. They were built using a more general tool, the Auditory Display System (ADS). The ADS can generate MIDI messages or small sequences in response to specific conditions in the VR, where audio is used to reinforce video information. An example: audio cues (the same ones guitarists use to tune up) help direct a remote robot hand in the assembly of an electronic device. As the hand lines up a circuit board with a small slot, two pitches drift towards the same frequency. When the part is correctly aligned, the notes are perfectly in tune. In addition to generating MIDI data, the ADS also integrates the Convolvotron with NASA's system.
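The tuning-style alignment cue can be sketched as a mapping from positional error to a pair of pitches. The reference pitch, the scaling factor, and the function names below are assumptions for illustration only; the article does not give the values NASA used.

```python
def alignment_pitches(offset, reference_hz=440.0, hz_per_unit=20.0):
    """Return (reference, guide) frequencies for a given alignment error.

    offset is the signed misalignment in arbitrary units. The guide
    pitch drifts toward the reference as the offset shrinks, and the
    two match exactly when the part is aligned.
    """
    return reference_hz, reference_hz + hz_per_unit * offset


def beat_rate(f1, f2):
    """Beats per second heard when two close pitches sound together."""
    return abs(f1 - f2)


# The beating slows as the part approaches the slot and stops when aligned.
for offset in (0.5, 0.25, 0.0):
    reference, guide = alignment_pitches(offset)
    print(offset, guide, beat_rate(reference, guide))
```

As with tuning a guitar, the operator does not need to judge absolute pitch; the slowing beat between the two notes signals that the part is closing in on the slot.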

Fake Space Labs has rewritten the NASA virtual Theremin and drum software to run on a standard IBM PC. Using a relatively inexpensive 3-D input device called String Thing, the Theremin or other virtual instruments can be used with a normal 2-D video display. The String Thing tracks the user's hand in three dimensions, with a fast response time. While the current version of the device requires an IBM PC, MPU-401, and A-to-D card to generate MIDI, the company is working on a version that outputs MIDI directly. In addition, Fake Space is bringing out a MIDI interface for the Mattel Power Glove, a Nintendo accessory that costs $79.95 at most toy stores.

[Bolas, M. & Stone, P. (1992). Virtual mutant theremin. Proceedings International Computer Music Conference, San Jose, California, USA, 360-361. San Francisco CA, USA: International Computer Music Association.]