Experimenting With 5.1 Downmixes

That’s how it always goes, doesn’t it? I think I have nothing to talk about, and then I scramble around all day with stuff to ramble about. In this case, we’re talking about the intricacies of audio mixing! Namely, abusing surround sound mixes for fun and dynamic range. This is, by its nature, a more technical experiment and ramble, but if you’ve got headphones, you’ll get the gist just by the sample clips.

But first, some maintenance and gear tune-up before we start today. 👏 Left, right, 👏 center. 👏 Left surround, right surround. 👏

The 5.1 fart cloud

Gimmickery in the form of increasing the number of channels in an audio recording has been going on since the 60s. For the first several decades of recorded sound, pretty much everything was in mono–you had one channel. The 60s proved the stereo mix to be viable with the experimentalism of the album rockers, and as such, more channels followed. You had quadraphonic vinyl, you had Dolby AC-3 on Laserdiscs, and the subject of today’s discussion, 5.1 DTS surround on DVDs.

Now, stereo might’ve changed audio forever, but no surround setup has ever made a dent in the consumer realm. Indeed, most people have a mono speaker on their phones and watch movies through that. Immersion matters none to normies. The natural question: is surround underrated? Some forgotten gem of a feature? Does it blow minds?

For movies, I can see the appeal of surround sound. Movie mixing is super meticulous, and building a spatial image of each room, piping in room tone to all and dialogue to some, punctuated with woofersplosions? Pretty neat. (Assuming you can find some good movies with full surround soundtracks to watch, of course, but the tech has been there since the 80s.)

For music, I’m less convinced. For one thing, artists routinely fight with even stereo recordings for not having any “balls”. It’s very easy to suck out the “aggression” or forwardness of a sound by panning it, or by having a ton of diffused reverb that obscures the attack of a sound. Applying that logic to six or more discrete channels is a recipe for making any sonic wizard say “fuck it” and go back to recording in mono.

The big appeal of surround is getting a very natural spatial image. Problem being, studio music never will sound “natural” and isn’t meant to. Everything you do in a studio is unnatural. It’s not four people playing to one another in a big room; if it’s not processed to hell, with every instrument recorded one by one to a click track, then everything’s at least recorded in isolation, behind walls and in nooks, with the intent to minimize bleed. You can’t really make a natural spatial image out of something that never had one to begin with.

Live music is the one place I can see a surround mix working, but even that mostly seems to work out the same: band in the left, right, and center, audience in the left and right surround. (Not like you could have part of the band be behind you; that’d be annoying.) The results are interesting, certainly wide and natural sounding, but hardly revelatory, and hardly worth it over a normal CD.

Where surround becomes relevant to me: dynamic range

Consumer media has a tendency towards very lenient quality control (read: no one in the chain between label and buyer giving a shit), and music is no exception. The loudness wars have waned in recent years, as has my interest in them (here’s an essay I wrote on it for school ages ago), but it still kills albums dead and I still try to avoid loud, crackly, shitty, brickwalled records if I can.

The surround versions of albums, usually on SACD or DVD-Audio, avoid the loudness war treatment for two big reasons:

  1. It’s easy to squish two channels of audio, but six with extreme panning, plus bass-boosting as most loudness war-afflicted records get, is a recipe for a genuinely unlistenable mess
  2. It’s not consumer media! It’s enthusiast media with a much smaller install base and audience, meaning there’s less of a reason for labels and engineers to squish albums on these formats to death

As a result, if you were to get an album on SACD or DVD-Audio, extract the 5.1 surround mix, and downmix it to stereo, you’d theoretically have a much clearer, cleaner, and nicer sounding version of your album you can drop on a CD or an MP3 player or whatnot. (Depending on how well the surround mastering was done on the release, anyway.) With some added assembly, yes, but when has that ever stopped me?
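One rough way to put a number on “squished” is the crest factor: the gap, in dB, between a signal’s peak and its average (RMS) level. Brickwalled masters tend to have a small one. The sketch below is not a measurement of any real release; it runs on a made-up sine wave and a hard-clipped copy of it (both are assumptions for illustration), just to show the gap shrinking once the peaks get flattened.

```python
import math

def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """Root-mean-square level, in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def crest_factor_db(samples):
    """Peak level minus RMS level; a smaller number means a more squashed signal."""
    return peak_dbfs(samples) - rms_dbfs(samples)

# Toy data, not real audio: one second of a 440 Hz sine at 48 kHz...
sine = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(48000)]
# ...and the same sine hard-clipped at half scale to imitate a brickwall limiter.
squashed = [max(-0.5, min(0.5, s)) for s in sine]

print(f"sine:     {crest_factor_db(sine):.2f} dB")
print(f"squashed: {crest_factor_db(squashed):.2f} dB")
```

Running the same functions over a CD rip and a downmix of the same song would give a crude comparison, though a single number like this says nothing about how either one actually sounds.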

Now, I don’t own any DVD-Audio titles, not yet, but I got curious enough about the process to give it another try with one of my other music-related DVDs with a surround mix. Here’s how it went, and I’ll even have some sample clips to play for you at the end!

Introducing the test subject (and a lot of technical junk)

I picked Nirvana’s MTV Unplugged in New York for this, given that I’m very familiar with the CD of it and I also happen to own the DVD, which comes with 5.1 AC-3 and DTS soundtracks. It’s also a lovely sounding release, a nice, clean acoustic performance to see how things like speech, instrument separation, and the overall stereo image fared in my downmix as compared to a dedicated stereo mix of the album.

Basically, we’re testing which one sounds “better”. That’s highly subjective, as you’ll see, but they’ve got very different sonic trademarks that’ll probably make you prefer one over the other, and I wanted to see just how different a downmix would sound overall, on both my headphones and my speakers.

I doubt there’s much audible difference between the two encodings, but I chose the DTS version. My extractor of choice was DVD Audio Extractor, which is trialware, but uncrippled trialware. You can rip a few DVDs in a month, certainly.

My three versions of Nirvana's MTV Unplugged
I own too many copies of this. I didn’t even buy the cassette, it just kinda fell into my lap. (Also classy cropping Lori and Pat out of the DVD cover, Geffen.)

(Before we begin, I should note my downmix might not be the same as someone else’s, or the downmix that’d come out of software I didn’t use. It kinda depends on how you pan stuff around, volumes of the individual channels, all that good stuff. I used this post from Cinema Sound for the levels I set each of the channels to. I also did some peak equalization after the fact, and then RMS normalization on the clips below so volume doesn’t bias the comparisons. Basically, your mileage may vary.)
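For the curious, RMS normalization boils down to something like the sketch below: measure each clip’s average level, then scale it so both land on the same target. This is a minimal illustration, not the exact procedure used for the clips here; the -20 dBFS target, the function names, and the toy clips are all assumptions.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of float samples (full scale = 1.0)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_dbfs=-20.0):
    """Scale samples so their RMS level lands on target_dbfs.

    The -20 dBFS default is an arbitrary illustrative target.
    Silent input is returned unchanged, since there is nothing to scale.
    """
    current = rms(samples)
    if current == 0:
        return list(samples)
    target = 10 ** (target_dbfs / 20)  # dB to linear amplitude
    gain = target / current
    return [s * gain for s in samples]

# Toy stand-ins for two clips of the same song at different loudness.
loud_clip = [0.8 * math.sin(2 * math.pi * 220 * n / 48000) for n in range(4800)]
quiet_clip = [0.2 * math.sin(2 * math.pi * 220 * n / 48000) for n in range(4800)]

matched_loud = normalize_rms(loud_clip)
matched_quiet = normalize_rms(quiet_clip)
print(round(rms(matched_loud), 4), round(rms(matched_quiet), 4))
```

One caveat: matching RMS can push a very dynamic clip’s peaks past full scale, so it’s worth checking the peak level afterwards and lowering the target for both clips if anything would clip.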

(I’m also well aware that DTS and AC-3 are both lossy digital, not lossless. I don’t know the exact bitrate on this release, probably around the 500kbps mark. In any case, I can’t really tell, and I always assume that lossy separated audio needs a lower bitrate because of the lower complexity of each individual channel. Point is, I’m not concerned about it. The upgraded DTS and Dolby on Blu-rays are properly lossless codecs, so if it’s a sticking point, rip that release instead if possible.)

My DVD Audio Extractor settings
Protip if you install this: use FLAC. Your computer will eat less shit.

I ripped the audio to PCM, six channels, at the 48kHz that DVDs use, and then threw the lot into Audacity. This ended up being…a little slow, but all credit to Audacity, after it got waveform renders for each track, it still performed fairly speedily. I still got rid of each individual channel once I did the actual downmix. (If you’re curious, left, right, and LFE were left at 0dB, I dropped the center and the left and right surrounds by 3dB, and made the center dual mono on a stereo track to cope with an Audacity defect that’s too technical to get into here.)
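If you’d rather see those levels as arithmetic than as Audacity sliders, here’s a minimal sketch of the same idea. The gains come straight from the paragraph above; everything else is an assumption for illustration: the channel names, the frame layout, the choice to feed the center and LFE equally to both sides, and the optional peak scaling at the end to keep the sum from clipping.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

# Levels from the text: L, R and LFE untouched, center and surrounds down 3 dB.
GAINS_DB = {"L": 0.0, "R": 0.0, "C": -3.0, "LFE": 0.0, "Ls": -3.0, "Rs": -3.0}
GAINS = {ch: db_to_gain(db) for ch, db in GAINS_DB.items()}

def downmix_frame(frame):
    """Fold one 5.1 frame (dict of channel name -> sample) into (left, right).

    Assumption: center and LFE go to both sides at the same gain,
    while each surround channel stays on its own side.
    """
    shared = GAINS["C"] * frame["C"] + GAINS["LFE"] * frame["LFE"]
    left = GAINS["L"] * frame["L"] + GAINS["Ls"] * frame["Ls"] + shared
    right = GAINS["R"] * frame["R"] + GAINS["Rs"] * frame["Rs"] + shared
    return left, right

def downmix(frames):
    """Downmix a sequence of 5.1 frames to a list of (left, right) pairs."""
    return [downmix_frame(f) for f in frames]

def peak_normalize(stereo, ceiling=1.0):
    """Scale the whole mix down only if its loudest sample exceeds the ceiling."""
    if not stereo:
        return []
    peak = max(max(abs(l), abs(r)) for l, r in stereo)
    if peak <= ceiling:
        return list(stereo)
    gain = ceiling / peak
    return [(l * gain, r * gain) for l, r in stereo]

# Toy frame: a sound that lives only in the center channel.
center_only = {"L": 0.0, "R": 0.0, "C": 1.0, "LFE": 0.0, "Ls": 0.0, "Rs": 0.0}
print(downmix_frame(center_only))
```

Summing six channels into two can easily overshoot full scale, which is why the last function exists; other downmix recipes use different coefficients, so treat these numbers as one choice among several.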

Of course, I can’t judge this disc based on its actual merits in a surround sound setup, but I can talk a bit about the pieces. True to my speculation, the band is in the left, center, and right channels, with the audience and some of the “room” mics in the left and right surround. The LFE channel is largely unused on this disc, a tiny bit of the very low end of the kick and that’s about it. I mixed it in there just for the hell of it, but it doesn’t affect the sound too much.

Here’s some sample clips of the center, left-right, left surround-right surround, and LFE channel of the album’s first song, “About a Girl”, in FLAC. You can play along at home with them if you’re audio-inclined. I only linked them because this is plenty long as it is and I still haven’t gotten to the main attraction.

Speaking of that–onto comparing my downmix to the CD!

Comparing the mixes

The waveform of the entire downmix, and the CD tracks synced up to it
You can really see just how much of the show they excised for the CD when I put them side-by-side.

I figure that fewer words is more at this point, so I’ll leave you mostly with my notes from last night and some sample clips so you can finally hear what I’ve been going on about yourself. I’ve picked what I think are probably the two most representative songs here, “Jesus Doesn’t Want Me for a Sunbeam”, which has an awful, accordion-heavy mix on the CD, and “Polly”, which is mostly Kurt solo with Krist’s bass and Dave’s cymbals. I also left in whatever stage chatter was happening during the clips.

“Jesus Doesn’t Want Me for a Sunbeam”, 1994 CD
“Jesus Doesn’t Want Me for a Sunbeam”, 5.1 downmix
  • The CD mix is absolutely more compressed and pumped up and has a more “rock” feel to it. The solo on “The Man Who Sold the World” is properly fuzzy on the CD, as is everything else in the mix at that point, while on the downmix, everything stays clear and defined. Whether or not you prefer it depends on your tastes.
  • The vocals across much of the record are much drier on the downmix. This is especially noticeable on “The Man Who Sold the World”, where there’s a whole lot of reverb on the CD mix and virtually none on the downmix.
  • Much of the dialogue and stage banter has been excised from the CD version and got restored on the DVD. Trust me, Kurt deadpanning “who cares, it’ll be edited, this is a television show!” makes all the difference. Similarly, some of Kurt’s mistakes (coming in too early on “Lake of Fire”, mispronouncing David Geffen’s name going into “Where Did You Sleep Last Night?”) are edited out on the CD, I guess to make the show play a little better to normie ears.
  • The CD mix across all tracks has more stuffed in the middle, namely Kurt’s guitar. The downmix has things more panned across the stereo field; Kurt’s guitar and the ride cymbal are in the left ear, while the hi-hat and Pat’s guitar are in the right ear. Only the bass, Kurt’s vocal, and the snare and kick are properly center.
  • The reverb difference affects everything on “Polly”, “Pennyroyal Tea”–any of the more solo Kurt songs. Kurt sounds like he’s singing in a slightly artificial cavern on the CD, while it sounds much more naturally large and open on the downmix, like you’re right next to Kurt and the reverb just hugs the edges.
  • Dave’s backing vocals are also much clearer in the downmix across the entire record. On “Jesus Doesn’t Want Me for a Sunbeam”, they’re not even particularly audible, while they’re comfy in the center on the downmix.
  • The crowd feels more in the background in both ears on the downmix. They seem to be pumped up on the CD mix for a more “live” feel.
  • Genuinely, I am so pissed I’m only now hearing the full stage chatter. Before “Where Did You Sleep Last Night?”, people call out more requests, and some bat in the back yells for “Rape Me” (hilarious). Dave asks “is that Kennedy?” (The MTV VJ, the one “Name” by the Goo Goo Dolls is about?) And Kurt just goes “I don’t think MTV would let us play that?” This entire show is just hilarity at MTV’s expense.
“Polly”, 1994 CD
“Polly”, 5.1 downmix

(More song-by-song notes in a text file, if you’re curious. Not that I expect it to be relevant to anyone but me, but due diligence.)

And the end result?

I like the downmix a lot more. Aside from a few nitpicks (but that’s really only if I’m listening critically), it sounds more natural, wider, more defined, and certainly less “sweetened” than the CD mix. The CD mix to me has a lot of vaguely artificial reverb meant to make the band sound bigger, while on the downmix, they’re a lot less processed.

In all honesty, I didn’t compare either one to the DVD’s stereo mix, largely because I’ve now heard this album three times in the past 24 hours and I’m just not listening to it again. I did check in the DVD’s liner notes, and it seems Scott Litt, who mixed the original CD, also did the stereo mix on the DVD. (The surround mix was done by a different guy, Elliot Scheiner.) I expect it to be fairly similar to the CD as a result.

Regardless, it certainly is an upgrade! At least in some conditions. I mention in my extended notes that the Meat Puppets tracks feel weirdly empty compared to any of the other songs, to the point where I prefer the original CD mix of “Oh, Me” no matter how I listen to it. I liken it to a mono radio mix, very filled, very sweetened, while the downmix is almost like a stereo vinyl pressing of an album you’re used to hearing in mono. Both are good for different types of listening; one’s more accurate, one would probably play better in a car or where there’s a lot of noise.

Either way, when I dig this album back out, I’ll probably reach for the CD-R I just burned, if only to hear Kurt ramble about how Davey and Goliath was evil for promoting the tarring and feathering of children. Why they cut this shit out, I don’t know.
