The Digital Stereo Matrix

D.A. Morgan, PhD.

 



General

The Matrix

Functions Of The Matrix

Variations of Implementation

Surround Delays

Steering

Nonlinear Responses: Attack and Decay

Analog vs. Digital

Software and Hardware Implementation

General

The effort to raise the quality of the audio and video we enjoy in the home and theater is a continuous one. Music, and entertainment in general, is more exciting when we feel involved rather than simply watching. Multi-dimensional sound allows the listener to become more involved in what he hears: he can be surrounded by it. Even though quadraphonic stereo didn't last, a device born in the same era, known as the matrix, found its way to the cinema to provide us with a bigger experience than we had before. Now it is back as home theater, with both movies and music CDs encoded with left, right, center and surround.

Availability and subsequent low prices have prompted more movies, CDs and even television programs to be encoded with surround sound. Many people now have, and very soon most will have, some form of home theater to enhance their music, movies and television. The trend is bound to continue and increase. As it does, more and better products will become available to support it.

The heart of this audio experience lies both in the quality of the sound and in the number of channels of available sound. The quality of the audio has been enhanced through the use of digital techniques for encoding and decoding the analog signal, making it almost impervious to noise. The number of channels has increased, bringing us greater involvement and enjoyment in what we see and hear by providing a more realistic placement of the audio image and a real sense of being 'in the middle of it'.


The Matrix

Originating with quadraphonic decoders, the matrix is commonly used to create these channels. There have been many versions of the matrix since Scheiber first described it in the early seventies, but they are of a general pattern known as 4-2-4 coding.

The matrix is really an arithmetic device that defines the proportionality constants for encoding four signals into two and decoding them back again.

In its most common application, the traditional left and right channels are read from the movie sound track (or some other medium) as analog signals and, after some noise reduction processing, are fed to the matrix, which derives the new channels (center, surround and subwoofer) from the input left and right. The new channels are encoded on the traditional left and right as sums and differences, and it is the matrix that decodes the separate channels and produces the outputs that we hear.

There are All Sorts of Matrices

Not all matrices are created equal, and there are differences between what one can expect from analog and digital implementations.

The encoding process is a lossy one. The problems with matrix decoding lie in the fact that the encoding process itself blurs the separation between channels, as you will see in the simple matrix described below. A good matrix, such as Ultra Stereo's, includes circuitry to improve separation between channels through the use of steering. But let's start at the beginning.


Functions Of The Matrix

Let's go over the typical requirements for a matrix. The matrix basically decodes appropriately encoded inputs into left, right, center and surround channels. It will also sum the left and right channels and low-pass the result to provide a subwoofer channel. For the decode to be absolutely correct, the input to the matrix must be properly encoded; standard stereo left and right do not carry enough information to generate a surround channel, and center may or may not be specially encoded.

The center channel represents the audio that is precisely common to both left and right; the surround channel is the part of the audio that is supposed to occur behind the audience. The center channel is encoded with a summing network; the surround channel is encoded by placing the information on the left and right channels but out of phase, so that it will cancel in reproduction unless explicitly decoded.
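The encoding just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name encode_4_2 and the gain K of 1/sqrt(2) are ours, not taken from any particular encoder, and a plain sign inversion stands in for the wide-band phase shift a real encoder would normally apply to the surround.

```python
import math

# Assumed gain for center and surround; the article gives no coefficients.
K = 1.0 / math.sqrt(2.0)

def encode_4_2(left, right, center, surround):
    """Fold four equal-length sample lists into two encoded channels (lt, rt).

    Center is summed in phase onto both outputs. Surround is added to one
    output and subtracted from the other, so it is out of phase between them.
    """
    lt = [l + K * c + K * s for l, c, s in zip(left, center, surround)]
    rt = [r + K * c - K * s for r, c, s in zip(right, center, surround)]
    return lt, rt

# A surround-only signal cancels in the sum of the two encoded channels
# and survives in their difference.
lt, rt = encode_4_2([0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, -1.0])
mono_sum = [a + b for a, b in zip(lt, rt)]
difference = [a - b for a, b in zip(lt, rt)]
```

Note that center lands on both encoded channels. That is exactly why a decoder that merely passes left and right through will still carry center material on them.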

The simplest matrix passes left and right channels provided to it through to the left and right output channels of the equipment. Center is created by taking the RMS (Root Mean Square) of the sum of left and right, and then placing that sum on a separate channel called center.

The real addition to the system comes from the surround channels. (These may be encoded as stereo surround or monophonic surrounds.) The surrounds add depth to the audio and place the listener in the middle. When the star ships in Star Wars zip past a port on your spaceship, you can hear them pass your ear going from front to back. In crowd scenes, the surround channels place you within the situation, not just watching as a spectator.

The most common technique for encoding the surround channel is to write it on the left and right channels, but out of phase. This is done using something called the Hilbert transform. The surrounds are then decoded by differencing the left and right channels.

In the most elementary implementations, each channel is only decoded, with no further processing. That is, all channels are decoded as discussed in Definitions of the Basic Channels. In this form, one will have all the channels and more: the original encoded signals will remain on the left and right channels. This means that sounds meant only for center will still be evident in both left and right.

In its simplest form the matrix decodes as follows:

1) Left is passed to the left output.
2) Right is passed to the right output.
3) Center is taken as the RMS value of the sum of left and right.
4) Surround is the delayed difference between left and right.
5) Subwoofer is the sum of left and right low passed at 80 to 100 Hz.
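
The five steps above can be sketched as follows. This is a minimal illustration, not a production decoder, and several details are assumptions of ours: the function names, the 1/sqrt(2) scaling, the first-order low-pass used for the subwoofer, and the default delay of 960 samples (20 ms at 48 kHz). The list describes center as the RMS of the sum; since RMS is a level measurement rather than a waveform, the sketch substitutes the sample-by-sample sum, scaled.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass; a gentle stand-in for a real subwoofer filter."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out

def simple_decode(lt, rt, sample_rate=48000, delay_samples=960, sub_cutoff_hz=90.0):
    """Decode two encoded channels following the five steps of the simple matrix."""
    k = 1.0 / math.sqrt(2.0)                    # assumed scaling, not from the article
    left = list(lt)                             # 1) left passes through
    right = list(rt)                            # 2) right passes through
    total = [a + b for a, b in zip(lt, rt)]
    center = [k * t for t in total]             # 3) center from the scaled sum
    diff = [k * (a - b) for a, b in zip(lt, rt)]
    # 4) surround is the difference, delayed by prepending zeros
    surround = ([0.0] * delay_samples + diff)[:len(diff)]
    # 5) subwoofer is the sum, low-passed in the 80 to 100 Hz region
    sub = one_pole_lowpass(total, sub_cutoff_hz, sample_rate)
    return left, right, center, surround, sub
```

Feeding identical samples to both inputs gives a silent surround, while inputs of opposite sign give a silent center. In both cases left and right still carry the full encoded signal, which is the limitation discussed next.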

Besides the fact that center is still on the left and right, one can lose some sense of position. Imagine watching a train or helicopter cross a scene in front of you. One can usually place exactly where it is by its distance from direct right to center and then from center to direct left, but if the residue of center remains in the right and left, much of this sense will be lost. You will still get a feeling of the motion, but not as clearly as you might if the sound were focused at its actual position. Quite a number of small theaters here and overseas use this technique, except that they do not bother to connect the left and right channels.


Variations of Implementation
How is this output in your home or in a theater? Not all home equipment has a separate output for each channel. And it may come as a surprise, but not all theaters do either.

The simplest theater implementation can consist of as few as two channels: center and surround. The center speaker sits in the middle behind the screen and the surround behind the audience. This is also possible in a home situation.

Some installations provide no center speaker. In such a case there is a mode known as Phantom Center, in which the center is placed equally on the right and left channels to give the impression of a center. Of course, depending upon where you sit, this may not be adequate. This amounts to a simple matrix implementation without decoding the center channel. In other words, the signal is not steered to center but remains on left and right.

Another possible scenario is the absence of a subwoofer; in this case, each speaker shares a portion of the subwoofer duty.

A better application is obviously to supply the left and right speakers, so that trains that pass from left to right in front of the audience actually do. This begins to supply the sensation of depth of field in the experience, just as binocular vision does for sight.


Surround Delays

Surround provides a sense of depth; the surround delays fit this to the size of the particular theater.

Dialogue is not supposed to appear in the surrounds, but because of timing issues in the nonlinear steering networks, it may. The surround delays are included to mask this effect: the sound appears to originate in the front speakers before arriving in the rear speakers, and therefore seems to be an echo. A programmable delay, often about 20 ms, is provided so that the theater (or home) can fit the reproduction to the actual environment.

These delays are usually implemented with digital memory chips.
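
In software, the same job is typically done with a circular buffer. The sketch below is one way to do it; the class name SurroundDelay and the 48 kHz sample rate are assumptions made for illustration, and the 20 ms default simply follows the figure quoted above.

```python
class SurroundDelay:
    """Fixed delay implemented as a circular buffer of samples."""

    def __init__(self, delay_ms=20.0, sample_rate=48000):
        # Convert the delay time into a whole number of samples (at least one).
        self.length = max(1, int(round(delay_ms * sample_rate / 1000.0)))
        self.buffer = [0.0] * self.length
        self.index = 0

    def process(self, sample):
        """Store one new sample and return the one written `length` samples ago."""
        delayed = self.buffer[self.index]
        self.buffer[self.index] = sample
        self.index = (self.index + 1) % self.length
        return delayed
```

Making the delay programmable amounts to choosing delay_ms when the object is created, so that an installer can fit it to the room.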


Steering

A real sense of spatiality and inclusion comes when the audio is steered to the correct channel only, using the energy content of the audio signal to detect and guide this steering. The encoding technique involves a loss of separation between channels; steering is employed to restore this sense of separation. Steering involves determining where a signal belongs and placing it there while removing it from the generalized left and right channels. Ideally, then, the audio should appear only where it was when it was encoded.

Let us say, for a moment, that a scene from a movie is recorded with a narrator whose voice is in the center channel. The matrix should remove all traces of it from the right and left channels and place all of it in the center, even though the source of the left, right and center channels is the left and right channels. It should be clear how this could enhance one's sense of position when listening to music: the flute section will be located more precisely, just as it would be if you were actually attending a concert. The better the matrix, the better the spatial sense will be. In other words, a train moving from left to right will originate in the left (with no trace in the center or right) and move smoothly between the left and center and from there to the right.

With good audio equipment, properly encoded material and all the channels decoded correctly, it should be possible to place a narrator or musical instrument anywhere, even in the audience- or at least give that impression. This depends upon the channel separation and monotonicity of steering.

A poor matrix can give a sense of separation and even position, but you will experience such phenomena as the train originating in the center, moving to the right, back to the center and then to the left. Other indications of poor steering or inadequate nonlinear (attack) control are sounds of dialogue (or music) beginning in one channel and suddenly moving to center or surround. Oftentimes these things are simply dismissed or lost in the fascination of the movie (with music it is harder to ignore), all of which is forgivable for the audience, less so for the engineer responsible.

The matrix we describe here is loosely based upon the original equations developed by Scheiber for encoding quadraphonic sound:

Keep an eye on the coefficients AS, BS, CS and DS. These are decoded in the steering network to direct the energy where it needs to be.

The steering logic is a very high gain network capable of correctly representing the proportion of any signal in its respective channel regardless of level. In an analog circuit, this is usually done with VCAs (Voltage Controlled Amplifiers) whose gain is regulated by the sum of all the rectified and filtered channels. If that sum decreases because of a quiet passage, the gain of the VCAs increases to keep the steering logic correct. It must be this way so that, at low levels, the equipment can still correctly decode the appropriate channel for a signal. If something like this is not done, separation disappears.

This produces a strong steering sense regardless of level. For your reference in the next section, these steering signals are the normalized values AS, BS, CS and DS; you will see them again when we speak of decoding.
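
A digital counterpart of this level normalization might look like the sketch below. It is an assumption-laden illustration rather than the actual steering network: the function names, the one-pole smoothing coefficient and the choice of left, right, sum and difference as the four measured signals are ours, and how these shares map onto AS, BS, CS and DS is not specified here. Each signal is rectified and filtered, then divided by the total, so the result depends on proportion and not on absolute level.

```python
import math

def envelope(samples, smoothing=0.99):
    """Full-wave rectify, smooth with a one-pole filter, return the final level."""
    level = 0.0
    for x in samples:
        level = smoothing * level + (1.0 - smoothing) * abs(x)
    return level

def normalized_levels(lt, rt):
    """Return each signal's share of the total rectified, filtered level."""
    signals = {
        "left": lt,
        "right": rt,
        "center": [a + b for a, b in zip(lt, rt)],    # sum of left and right
        "surround": [a - b for a, b in zip(lt, rt)],  # difference of left and right
    }
    levels = {name: envelope(sig) for name, sig in signals.items()}
    total = sum(levels.values())
    if total == 0.0:
        return {name: 0.0 for name in levels}         # silence: nothing to steer
    return {name: level / total for name, level in levels.items()}

# The same program material at two very different levels yields the same shares.
lt = [math.sin(0.05 * n) for n in range(2000)]
rt = [0.5 * x for x in lt]
loud = normalized_levels(lt, rt)
quiet = normalized_levels([0.001 * x for x in lt], [0.001 * x for x in rt])
```

Dividing by the total plays the role of the VCA gain rising as the summed level falls.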

You can see, as well, that the center is developed by a sum of the left and right channels and the surround channel is a difference of left and right.

Following this servo mechanism is a ratioing and integrating network that determines the delay before any level changes are fed into the system; this is the part of the circuit that deals with 'attack'.


Nonlinear Responses: Attack and Decay

In addition to steering the audio to the correct channel, the matrix must also compensate for sudden and long-term changes in sound level (otherwise known as 'attack') that could potentially confuse the steering logic. This compensation is usually done immediately after the steering mechanism. It inevitably involves delay and integrating circuits such as those created with RC networks.

With no integration of the control signals at all, it is possible to have audio jumping from left or right to the center or surround channels with every abrupt level change. Instead, these networks are tuned to maintain the steering direction based upon the sum of the energy.

A danger you face in creating these circuits with improper time constants is 'pumping'. The background, and sometimes the foreground, sound will appear to vary in level and move between the channels.

And of course, you cannot ignore a large change, as with an explosion, a central cymbal crash, or dialogue suddenly appearing after a long quiet period. To compensate for changes such as these, some sort of break-over mechanism, such as a diode, must be employed to route the new level around the integrating circuitry to the steering outputs. Not doing so will result in audio appearing in left or right and then suddenly moving to center or surround.
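
The interplay between integration and break-over can be sketched digitally. The function name smooth_control, the one-pole coefficient and the threshold below are illustrative assumptions, not values from any real decoder. Small changes in a steering control signal are integrated slowly, while a change larger than the threshold bypasses the integrator, loosely as a diode would conduct around an RC network.

```python
def smooth_control(values, time_constant=0.995, breakover=0.5):
    """Integrate a steering control signal, bypassing the integrator on big jumps.

    time_constant: one-pole coefficient standing in for an RC network.
    breakover: jump size beyond which the new level passes straight through.
    """
    out, state = [], 0.0
    for v in values:
        if abs(v - state) > breakover:
            state = v                       # fast path: large, sudden change
        else:
            # slow path: ordinary integration of small changes
            state = time_constant * state + (1.0 - time_constant) * v
        out.append(state)
    return out

small_step = smooth_control([0.2] * 5)      # creeps slowly toward 0.2
large_step = smooth_control([1.0] * 3)      # jumps to the new level at once
```

Choosing time_constant too close to 1 makes the steering sluggish, and choosing it too small lets the control signal follow every fluctuation, which is one route to the pumping described above.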


Analog or Digital

One point that immediately distinguishes matrices is the question of whether they are implemented as analog or digital devices. An analog matrix can be much more limited in separation, noise immunity and stability than a digital system. In addition, the problems that haunt analog circuitry follow an analog matrix: resolution and accuracy problems resulting from component variations, temperature and drift problems, size, noise immunity, and so on.

From my experience, given an adequate word length for resolution and a decent A/D with accurate conversion and channel-to-channel monotonicity, a digital matrix has a better chance of producing finer performance than an analog matrix. Digital audio can add a great deal more dynamic range, channel separation and stability to music. Well done, it can provide excellent quality audio.

Of course, anyone can produce a poor product, even with the best of goods. In this article, we will be concentrating on the requirements for a matrix and its functionality. What we say here will be generally true for digital and analog equipment.


Software and Hardware Implementations

If the designer of the matrix does not pay attention to the manner in which the original signal was encoded, and decode it accordingly, he can have some terrible problems with placement of the audio image, pumping and erratic motion of the audio from channel to channel.

If input irregularities produce garbage with the simple matrix, they will do the same, and more, with the complex matrix because of interaction with the nonlinear networks.

Imagine a system with a small imbalance in gain, left over right. Let us also imagine that we are introducing a balanced monophonic signal into both left and right. According to the equations for the simple matrix, left will obviously be louder than right. With a monophonic signal, center will be slightly louder but no more. But look at surround: a monophonic signal should have nothing in the surrounds, but since surround depends upon the difference between left and right, there will be audio in the surrounds. Generally speaking, it takes approximately a 3 dB change in level for the average person to hear the difference, but it requires much less than this to cause a very noticeable disturbance in the steering of the audio!
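
The size of this surround residue is easy to work out for the simple matrix. The short sketch below does so under stated assumptions: the function name surround_leak_db is ours, the right channel is taken as the reference at unity gain, and surround is taken as the plain difference of left and right with no further scaling.

```python
import math

def surround_leak_db(imbalance_db):
    """Level of the L - R residue, relative to the right channel, for a mono input.

    imbalance_db: how much louder the left channel is than the right, in dB.
    Returns None when the channels are perfectly balanced (no residue at all).
    """
    left_gain = 10.0 ** (imbalance_db / 20.0)
    right_gain = 1.0
    residue = abs(left_gain - right_gain)   # simple-matrix surround is L - R
    if residue == 0.0:
        return None
    return 20.0 * math.log10(residue / right_gain)

# Print the residue for a few imbalances at or below the 3 dB figure in the text.
for db in (0.1, 0.5, 1.0, 3.0):
    print(f"{db:>4} dB imbalance -> surround residue at {surround_leak_db(db):.1f} dB")
```

Even an imbalance well under 3 dB leaves a measurable signal in the difference channel, and it is this signal that the steering logic then acts upon.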

And things can be much worse, if proper attention has not been paid to decoding so that the phase is the same as the original signal. I will leave that to your imagination.


   USL, Inc.    |    181 Bonetti Drive, San Luis Obispo, CA 93401-7397    |    Tel. 805.549.0161    |    Fax 805.549.0163