The Five Families of MP3 Encoders

It's been a recurring fascination of mine, that being data compression. It's what makes the modern internet possible at all. Only recently have we been able to share multi-gigabyte files quickly and fairly cheaply. In the 90s? There was no question that whatever you uploaded needed to get squished somehow. If you wanted to share music with friends who weren't nearby, you sent MP3s. Your only other option was to mail them a CD. (Which was certainly better sounding, but you had to pay for postage. Tradeoffs, tradeoffs.)

MP3 is probably the poster child for media compression, even close to 30 years after its introduction. It's second only to JPEG in ubiquity, and with neither being protected by patents at this point, there's no stopping them, regardless of whatever new technology Google wants to make you use this week. But once upon a time, that wasn't quite the case! In fact, not all MP3s are made equal! When Fraunhofer, the German research organization that developed much of MP3's technology, was still fiercely guarding its patents and intellectual property, knockoff MP3 encoders of various stripes popped up across the internet, all sounding very, very different from one another.

Data compression of any stripe is a rabbit hole topic, but I thought it'd be curious to take some songs and run them through different encoders and compare the differences in quality and speed. I'll also be rambling a bit about the innards of an MP3, and how five different, equally playable MP3s can all sound so massively different. I've even got a robot listening buddy on board!

How can there be any different-sounding encoders out there, let alone five?

MP3 is a lossy compression codec, which means it'll toss out information to produce a smaller file, the goal being that it's data you can't hear or won't mind if it's gone. In the case of MP3, your audio (which is normally represented as a series of hundreds of thousands or millions of voltage samples) is converted to a series of frames and then split into 32 bands of frequencies in each frame (each of which is then subdivided more finely), and the encoder will determine which bands are the most audible and which ones can be encoded with lower accuracy (again, in each frame). It's actually not far off from how JPEG works, which splits your image into 8x8 blocks and converts the individual pixel bitmap data in each block to lower-accuracy frequencies that represent roughly what each block should look like. This is where JPEG's infamous blocking artifacts come from.
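To make "encoded with lower accuracy" a little more concrete, here's a toy sketch in Python. It is not how MP3 actually quantizes anything (the real thing is nonlinear and steered by a psychoacoustic model); the band values, step sizes, and function names are all made up for illustration. The idea is just that a band you can hear well gets a fine rounding step, and a band you mostly can't gets a coarse one.

```python
def quantize(values, step):
    """Round each value to the nearest multiple of `step`, stored as a small integer."""
    return [round(v / step) for v in values]


def dequantize(codes, step):
    """Turn the stored integers back into approximate values."""
    return [c * step for c in codes]


def max_error(original, restored):
    """Largest absolute difference between the original and restored values."""
    return max(abs(a - b) for a, b in zip(original, restored))


# Hypothetical frequency-band values for one frame (invented numbers).
audible_band = [0.82, -0.41, 0.67, -0.93]
masked_band = [0.031, -0.027, 0.044, -0.012]

fine_step = 0.01    # small step: many possible codes, more bits, less error
coarse_step = 0.05  # large step: few possible codes, fewer bits, more error

audible_codes = quantize(audible_band, fine_step)
masked_codes = quantize(masked_band, coarse_step)

audible_error = max_error(audible_band, dequantize(audible_codes, fine_step))
masked_error = max_error(masked_band, dequantize(masked_codes, coarse_step))

print(audible_codes, audible_error)
print(masked_codes, masked_error)
```

The coarse band ends up using only the codes -1, 0, and 1, which is cheap to store, and the price is an error of up to half the step size. Deciding where that error can hide is the whole game.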

There's a lot of misinformation, consumer confusion, and overall snake oil about MP3s. A lot of people think they can skirt the loss of information by using a bitrate of 320kbps (don't, it's just irritating bloat and you're still throwing away data, use a lossless codec), or they think you can take a lower-bitrate MP3 and convert it to a higher-bitrate one or even to WAV for better quality (you can't, the data is permanently gone, start with a lossless or uncompressed original), but one place that MP3 is genuinely a little strange is on the encoder side. Using different encoders will get you different results!

How can that be, though? Don't they all just make MP3s?

MP3 is interesting because its specification is more strict on the decoding process than it is on the encoding process. In other words, so long as the encoder produces an end MP3 that looks and is structured like one, it can write pure noisy garbage and all is up to code. While the MP3 specification does have some sample psychoacoustic models (a whole bunch of math to approximate what we can hear and what we can't) and the like, if you were writing an encoder, you aren't bound to use any of it. You can come up with your own way to throw away data with impunity. The thought at the time being, if better models were to come along, we could encode higher-quality MP3s that still worked on all our old gear. And largely, that's what happened, as you'll soon discover.
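As for what "looks and is structured like one" means in practice: every MP3 frame starts with a 4-byte header that a decoder has to be able to read. Below is a minimal sketch in Python that reads one, assuming MPEG-1 Layer III only (the usual case for 44.1kHz music). The function name is my own invention, and the lookup tables and frame-length formula are the standard ones as I understand them, so treat it as an illustration rather than a validated parser.

```python
# MPEG-1 Layer III lookup tables, indexed by the raw header bits.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def parse_mpeg1_layer3_header(header: bytes) -> dict:
    """Decode the fields of a 4-byte MPEG-1 Layer III frame header."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")

    if (word >> 21) & 0x7FF != 0x7FF:      # 11 sync bits, all set
        raise ValueError("no frame sync found")
    if (word >> 19) & 0x3 != 0b11:         # version bits: 0b11 = MPEG-1
        raise ValueError("not MPEG-1")
    if (word >> 17) & 0x3 != 0b01:         # layer bits: 0b01 = Layer III
        raise ValueError("not Layer III")

    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    padding = (word >> 9) & 0x1
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("free-format or reserved value, not handled here")

    # Each MPEG-1 Layer III frame holds 1152 samples; 1152 / 8 = 144.
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate_hz + padding
    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "padding": padding,
        "frame_bytes": frame_bytes,
    }


# A typical header for a 128kbps, 44.1kHz frame.
info = parse_mpeg1_layer3_header(bytes.fromhex("fffb9000"))
print(info)
```

Notice what's missing: nothing in the header says how the encoder decided what to keep. As long as the fields are valid and the frame is the right length, a decoder will happily play whatever is inside.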

So what are the families?

The five "families" of encoders are:

Fraunhofer-based encoders (l3enc, mp3enc, fastenc). These are the official encoders you could license from Fraunhofer if you wanted to build them into a product. If you've ever converted a song to MP3 using iTunes or Adobe Audition, it was using a Fraunhofer encoder.
dist10-based encoders (8hz, SoloH, Blade). "dist10" was the reference encoder for MP3 files, and infamously, its code was stolen from a university server and used to create a ton of identical, competing encoders. These all sound rather artifact-y, given that dist10 was meant to be a reference for codec implementation and not a brilliant quality encoder. Nearly all dist10 encoders were struck down by Fraunhofer, who sent many, many cease-and-desist orders to various projects using their code. Notably, 8hz was a project focused on rewriting dist10 for speed, and what LAME originally patched before reverting completely to a dist10 base.
Xing (and later Helix). Xing (pronounced "zing") is an interesting case in that its codebase is completely its own, with plenty of it written in x86 assembly, which means it ran much faster than other encoders. (The fact that it was completely custom also kept Fraunhofer from going after them.) Xing was later bought by none other than RealNetworks, who continued development and distributed Xing instead as Helix, later making it open-source completely.
LAME. Ah, the big one! The highest quality one, the one with the most work put into it, and the best known. LAME started as a patch on the 8hz (and later dist10) sources, which is what helped to keep it off Fraunhofer's radar at first; not redistributing Fraunhofer code means Fraunhofer can't get upset, yeah? Later, since it was only distributed as source code and not binaries directly (their position being "source code counts as speech" and written descriptions of patents aren't illegal), LAME was again bulletproof. Since the expiration of the pertinent MP3 patents, FOSS OSes have included LAME directly, and it continues to get updates to this day.
Shine. A unique encoder built for simplicity rather than speed or quality, essentially becoming a new encoder programmer's scrap wood project. It also happens to have a fixed-point version, which makes it the only open-source encoder that doesn't require an FPU on board to work. This one was written by Gabriel Bouvigne, who's one of the main LAME developers.

Comparing the encoders

For my tests, I chose two songs, both 44.1kHz and lossless (one from CD, one from Bandcamp). One's a well-mastered, clean-sounding crunchy rock track (27.7MB) with tons of cymbal noise, and the other is a fairly dry, low-key acoustic guitar number (23.5MB) with two harmonizing singers recorded to fairly noisy tape. Both of these have plenty of characteristics that could easily trip up less capable MP3 encoders, especially at low bitrates. (Both of those links are to FLAC encodes of the original WAV.)

Each encoder was set to encode a 128kbps CBR MP3 file. 128kbps is considered rather low these days, but was incredibly common when a lot of these encoders were competing. It's also a good, medium bitrate where artifacting will be noticeable, but not irritating. For projects that survived longer, I've used both period-appropriate and new versions for completeness. The encoders used are:

LAME 3.90 (December 2001) and LAME 3.100 (October 2017)
XingMP3 1.5 (January 1999)
BladeEnc 0.94.2 (May 2001)
l3enc 0.99a (March 1994) (!) and Fraunhofer MPEG Audio Layer-3 Codec (professional) 3.4.0 (April 2006)
Shine 0.1.4 (November 2005)

One final bit of housekeeping involves a certain "ODG score" you'll see in the table below. All of these encoders were sourced from a site called ReallyRareWares, which specializes in mirroring bizarre, old-school audio software, be it encoders (for MP3, AAC, or far more obscure formats) or players and editors. In browsing RRW, I came across a program called EAQUAL, which is essentially a robot taking a listening test. "ODG" stands for the Objective Difference Grade, or how big a difference it can hear between the original and each MP3. The scale goes from -4 (which sounds terrible and annoying) to 0 (which sounds exactly the same as the original), so the closer to 0, the better.
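If you want words to go with the numbers, here's a small Python sketch that snaps an ODG score to the nearest whole grade and names it. The labels for the in-between grades are from the ITU-R BS.1387 impairment scale as I understand it; snapping to the nearest grade is my own simplification (the score is continuous), and the function name is made up.

```python
# Whole-number grades on the ODG scale and their usual descriptions.
ODG_LABELS = {
    0: "imperceptible",
    -1: "perceptible but not annoying",
    -2: "slightly annoying",
    -3: "annoying",
    -4: "very annoying",
}


def describe_odg(score: float) -> str:
    """Return the description of the whole grade nearest to `score`."""
    clamped = max(-4.0, min(0.0, score))  # keep stray values on the scale
    return ODG_LABELS[int(round(clamped))]


# A few scores taken from the tables on this page.
for score in (-3.54, -2.84, -1.34, -0.41):
    print(score, describe_odg(score))
```

This is only a reading aid. A score of -0.41 lands on "imperceptible" purely because 0 is the nearest whole grade, not because the robot heard nothing at all.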

While I've of course listened to each sample thoroughly myself, I used EAQUAL to get a second opinion on which encodes sounded the best. Surprisingly, we agreed most of the time. Linked under each score is the exact output of the program in each trial, which is mostly a lot of other numbers. I'd say run it yourself on each of the samples, but they need to be perfectly time-aligned (not a sample off!) WAV files—so you have to encode to MP3, import into an audio editor, line the MP3 up perfectly with the original file, then transcode both back to WAV. Annoying.
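The lining-up step can in principle be automated. MP3 encoding and decoding usually add some extra samples at the start, so the decoded file is the original shifted by a fixed amount. Here's a brute-force Python sketch that finds that shift by trying every lag and keeping the one with the smallest average squared difference. It assumes both files are already loaded as plain lists of mono samples, and the function names and test signal are mine. I did the alignment by hand in an editor, so this is an alternative, not what produced the numbers on this page.

```python
import random


def find_offset(reference, decoded, max_lag):
    """Return the lag where decoded[i + lag] best matches reference[i]."""
    best_lag, best_error = 0, float("inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(decoded) - lag)
        if n <= 0:
            break
        # Mean squared difference over the overlapping region.
        error = sum((reference[i] - decoded[i + lag]) ** 2 for i in range(n)) / n
        if error < best_error:
            best_lag, best_error = lag, error
    return best_lag


def align(reference, decoded, max_lag=4096):
    """Trim `decoded` so it starts on the same sample as `reference`."""
    lag = find_offset(reference, decoded, max_lag)
    trimmed = decoded[lag:lag + len(reference)]
    return reference[:len(trimmed)], trimmed


# Synthetic check: a noise "song" and a copy delayed by 37 samples.
rng = random.Random(0)
song = [rng.uniform(-1.0, 1.0) for _ in range(500)]
delayed = [0.0] * 37 + song

print(find_offset(song, delayed, max_lag=100))
```

On real audio the match will never be exact, since the MP3 is lossy, but the lag with the lowest error should still be the right one. Trying every lag is slow on full-length songs, so you'd want to search over a short excerpt.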

Anyway, let's get to the clips! These are ranked worst to best. First, the crunchy rock song:

Comparing various encodes of c.layne's "You Dodged a Bullet"
Encoder	Time	ODG	Comments
l3enc 0.99a	132.7s	-3.54	Okay, this one is super unfair, but mostly, I wanted to see how the quality was from the very beginning. Answer: it sounds rough. There's incredibly distracting twinkling and graininess over the entire track, basically. ffmpeg's MP3 decoder (as used in Audacity) also treats this one like it's skipping constantly, so the decoded track only lasts half the runtime. foobar2000 plays the file without issue.
Shine	7s	-2.93	The watery, tweeting artifacts are super strong on this one, and there's a hissing, ringing whisper on the hi-hat and vocals in the verses. There's also a "blip" at the very start of the track, meaning the encoder took data somewhere as audio. I had no metadata chunks anywhere in my input WAV, I promise. Shine is far closer sounding to the better-performing encoders up at 192kbps (and quite nice at 256kbps)—even though this means a respective 50% and 100% increase in size.
BladeEnc	4s	-2.84	Super noticeable whispering on the vocals, even before the band kicks in. Basically any time the snare kicks, the track warbles. I swear I'm hearing a lot of what I can only describe as "MP3 flutter" if I listen super closely.
XingMP3	2s	-1.59	Surprisingly listenable! The swishing is mostly noticeable in the crash cymbals in the choruses, which are panned hard left (hard panning is notoriously difficult to encode). It definitely sounds like a 128kbps MP3, but a not at all offensive one. Would encode music to put on a Rio S50 with.
LAME 3.100	4s	-1.34	State of the art for 2017 should damn well come out on top, but not quite, something I'll explain in the comments for LAME 3.90. Fairly close to Xing, but ever so slightly cleaner. The hard panned crashes don't trip it up quite as hard, but the ride cymbal still rings very faintly through the entire chorus. It doesn't sound quite as "full" as Xing? Not too sure how that works, like Xing is either exaggerating the bottom end of the track or LAME is suppressing it.
FhG 3.4.0	2.8s	-1.50	About neck and neck with LAME 3.90! I thought I was noticing some clicky distortion bits in the quiet, vocal-only intro, but that's just how the song sounds. I think the warbling ride in the choruses is just a tiny bit more noticeable here, and the hard-panned crash sounds fairly different from any of the other encoders somehow. ffmpeg decodes these files without issue, unlike the old l3enc.
LAME 3.90	4s	-1.45	Yes, LAME 3.90 came out on top of all the encoders, for this song at least. It somehow sounds cleaner than its cousin from 15 years later! I'm not hearing much artifacting at all on the snare in the verses, while on 3.100's encode, it sounds like the snare is "breathing" slightly. I swear the ride cymbal ringing in the choruses, while still present, is much less noticeable, though the hard-panned crashes definitely warble some.

And for the acoustic song:

Comparing various encodes of alaska!'s "Nightmare X"
Encoder	Time	ODG	Comments
l3enc 0.99a	110.8s	-3.39	This one made me laugh. Each pluck of each string becomes a dead, wobbly mess, and the vocals need to be heard to be believed. Literal underwater noises.
Shine	6s	-2.91	Again with the blip at the start of the track. Funnily enough, both happened in the left ear, suggesting an encoder defect regardless of what you feed it. The tape hiss becomes a bed of watery noise, the guitar picking is definitely smeared, and the entire mix starts to warble when the vocals kick in. Bad/10.
BladeEnc	5s	-2.26	A little messy on the guitar and tape hiss, but again, the vocals are really what trips this one up. They seem a lot quieter than they should be at first, as if it filtered out all the body in their voices.
LAME 3.90	3s	-0.76	Surprisingly, 3.90 seems to handle the tape hiss the least gracefully of the top encoders, but still plenty listenable.
FhG 3.4.0	1.8s	-0.46	Clean encode, and the fastest of the bunch. The piano in the bridge sounds a little smeary, but only very briefly.
XingMP3	2s	-0.41	The guitar sounds incredibly clean on this one, very impressive! Even their voices came out pretty clean. No complaints at all about how Xing's performed over these two songs, given its age. Impressively, EAQUAL liked it the best too.
LAME 3.100	3s	-0.47	About on par with FhG and Xing. All four of the top encoders in this test are rather interchangeable. I suppose it's not the most complex or brightest-sounding song, but hey, not all of them are.

Results, part A: encoder caveats

l3enc and FhG both hate you. l3enc has to be run on a 32-bit machine, and I also needed to copy it to my desktop, as it would not run off a network drive in my XP VM. It also won't take long file names, so make sure it's 8.3 or no MP3! Unbelievably, when the software quits out with a fatal error, it'll actually tell you "better luck next time, bye". (It is much more courteous when you successfully complete an encode, which is guaranteed to take at least two minutes.)

Sample command: l3enc_fp.exe nightmarex.wav nightmarex_l3enc.bit -br 128000

FhG, on the other hand, is most easily accessible (without paying money or installing iTunes for Windows, ick) through the ancient ACM (Audio Compression Manager) framework. This means that you have to encode through another program, and not much works with ACM these days. I couldn't get ACM Station to take any of my WAV files (it said they were all corrupt when they very much were not), but ACMENC (which is command-line) worked fine (after I figured out how to specify the codec in a way it liked...)

Sample command: acmenc.exe -c "Fraunhofer IIS MPEG Layer-3 Codec (professional)" -b 128 nightmarex.wav nightmarex_fhg340.mp3

XingMP3 comes with a not terrible GUI, just one that refused to work and instead took me to Modaldialogboxville when I tried to use it. Indeed, I got the popup spam every time I tried to do anything with it. Worth noting is that it didn't come like this, so clearly, I broke it somehow in trying to move the program to my Dropbox, where all my audio encoders are. Luckily, Xing is just a frontend for x3enc, which isn't picky at all and works like any other command-line encoder. Impressively, x3enc will actually batch encode a ton of WAVs if you give it folders rather than input/output files, which is a really nice feature.

Sample command: x3enc.exe nightmarex.wav nightmarex_xing.mp3 -b 128000

BladeEnc is probably the most featureful of the bunch on the user end (even though it doesn't support VBR), with a ton of options for describing raw input (if you're using it), the ability to read from stdin and write to stdout, drag-and-drop functionality (which makes a standard stereo 128kbps MP3), and support for settings files as opposed to command-line switches. Just a shame it sounds like crap, but that's dist10 for you.

Sample command: bladeenc.exe -128 nightmarex.wav nightmarex_blade.mp3

LAME 3.90 and 3.100 work fairly similarly. Older versions of LAME are mostly just pickier about metadata and exact syntax.

Sample command: lame.exe -b 128 nightmarex.wav nightmarex_lame.mp3
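If you'd like to reproduce something like the Time column, here's a rough Python sketch that builds the sample commands above and times them with a stopwatch. Only the command lines themselves come from this page. The dictionary, function names, and timing method are my own assumptions, and it only covers the three encoders whose commands follow the same input/output pattern. The timings in the tables weren't necessarily measured this way.

```python
import subprocess
import time

# Argument lists copied from the sample commands on this page.
# {src} and {dst} are placeholders for the input WAV and output MP3.
COMMAND_TEMPLATES = {
    "x3enc": ["x3enc.exe", "{src}", "{dst}", "-b", "128000"],
    "bladeenc": ["bladeenc.exe", "-128", "{src}", "{dst}"],
    "lame": ["lame.exe", "-b", "128", "{src}", "{dst}"],
}


def build_command(encoder, src, dst):
    """Fill in the input and output file names for one encoder."""
    template = COMMAND_TEMPLATES[encoder]
    return [part.format(src=src, dst=dst) for part in template]


def time_encode(encoder, src, dst):
    """Run one encode and return the wall-clock time in seconds."""
    command = build_command(encoder, src, dst)
    start = time.perf_counter()
    subprocess.run(command, check=True)  # raises if the encoder fails
    return time.perf_counter() - start


print(build_command("lame", "nightmarex.wav", "nightmarex_lame.mp3"))
```

Calling `time_encode("lame", "nightmarex.wav", "nightmarex_lame.mp3")` would then run the encoder for real, provided `lame.exe` is somewhere on your PATH. Wall-clock time includes disk and startup overhead, so very short encodes will look slower than they are.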

Results, part B: the, uh, results

So what didn't surprise me: l3enc and Shine performed like garbage. l3enc being such an early encoder and Shine being so simple explains both, but it does show you can have wildly varying results, even if the resulting MP3 is perfectly playable (mostly, in l3enc's case). BladeEnc performed slightly less like garbage. I've used BladeEnc to destroy samples of audio for my music before, so I knew it wasn't exactly a top encoder. Both versions of LAME and the ACM Fraunhofer encoder performed wonderfully. LAME was head and shoulders above nearly everyone else, even 20 years ago (which is why you'll find even very old LAME encodes rather frequently), and Fraunhofer stayed competitive just by being a giant corporation with the money to throw at codec research.

What did surprise me was how incredibly well Xing did. In both tests, it regularly output incredibly clean MP3s for its age and for the low bitrate I gave it, and it always came in first place or runner-up in execution speed. Obviously, this isn't a very scientific listening test, and I only tried out two songs. Both of those songs sounded ace though. In a world without LAME, Xing would be the go-to easily available MP3 encoder—and if you're looking to encode audio on older computers or for older devices, Xing is still very well worth a look.

EAQUAL and I also have a surprisingly similar set of ears, and I very much appreciated that. I was genuinely pretty hype when I saw it preferred Xing out of every other encoder in the "Nightmare X" test. Feels good when someone agrees with you, especially if that someone is a robot.

Finally, the sizes of all the encodes didn't vary all that much. All of the encodes of "Nightmare X" landed around 3.8MB, and all of the encodes of "You Dodged a Bullet" came out to around 4.3MB.
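That's exactly what you'd expect from constant bitrate: the file size is just bitrate times duration, no matter which encoder did the work or how good it sounds. A quick Python sketch of the arithmetic, assuming "MB" means a million bytes and ignoring tags and headers (the function names are mine):

```python
def cbr_size_bytes(bitrate_kbps, duration_s):
    """Approximate size of a CBR file: bits per second times seconds, over 8."""
    return bitrate_kbps * 1000 * duration_s / 8


def duration_from_size(size_bytes, bitrate_kbps):
    """Invert the formula: how long a CBR file of this size must run."""
    return size_bytes * 8 / (bitrate_kbps * 1000)


# Durations implied by the approximate sizes above at 128kbps.
nightmare_x_seconds = duration_from_size(3.8e6, 128)
dodged_seconds = duration_from_size(4.3e6, 128)
print(nightmare_x_seconds, dodged_seconds)

# Relative cost of stepping up from 128kbps for the same song.
base = cbr_size_bytes(128, 240)
print(cbr_size_bytes(192, 240) / base, cbr_size_bytes(256, 240) / base)
```

The same formula is behind the Shine comment earlier: going from 128kbps to 192kbps or 256kbps means files 1.5 and 2 times the size.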

Wrapping up

MP3s are a wonderful bit of technology I think most people don't think too much about. We've only had lossy compression for a good 30 years or so, and already, the fact that something as complex as a piece of music can get transferred across the world in ten minutes in the worst possible scenario and near-instantly in the best is absolutely wild. When Apple proclaimed that you could carry a thousand songs with you at any given time, regardless of how you feel about Apple, they weren't wrong. That had simply never happened before. Best you were gonna get was 80 or so across a few CDs, and they weren't about to fit in your pocket.

But even wilder is that MP3s are just the most visible example of this crazy world of smaller and smaller sizes for such complex data. Did you know there was an MP1 and MP2? If you've listened to terrestrial radio in the past 15 years, likely, you were listening to an MP2 stream broadcast over the air! If we're willing to get really experimental, try the MP3pro and HE-AAC codecs, which actually recreate some of the audio data on-the-fly so they can throw more of it away. And these days, Opus is the dominant ultra-efficient codec, built half from what Skype used to use (SILK) and half from a more MP3-like scheme (CELT) for the best of both worlds in terms of audio quality. And somehow, it streams ridiculously fast and stores ridiculously small. Used Discord's voice chat lately? Opus.

I'm likely to make more pages about the ins and outs of the audio world. Even back in 2003, all modern lossy codecs of the time were more than good enough for casual listening, let alone these days. And as I get older, I kinda dig the sizzly, slightly off sound of it more and more. Not to mention, you can take MP3 with you easier, especially if all you've got is 128MB to work with.

Where to get the songs used

The final mix of "You Dodged a Bullet" by c.layne is available on his Loom EP. Alas, he's since gotten rid of the original mix, and this might be the only place you can get it at all now, let alone in full quality.
"Nightmare X" by alaska! is from their debut record Emotions. It's a wonderful, underappreciated folk rock album with lots of mood and great harmonies on it.

This page last updated June 16, 2021.

*stares at old MP3 players and wants them all*