If you do any music production, even at the tyro level like me, you realize that there’s a characteristic “sound” to most pop music: mid scooped and compressed. It’s a very odd sound, highly unnatural, and it’s completely ubiquitous. But it was not the characteristic sound of other eras. In the 20s, music sounded the opposite, with a pronounced mid-hump. In the 50s, the eq profile was “flat.” Why the changes?
The mid scoop is what happens when you boost the highs and lows and cut the middle range. It’s the characteristic eq profile of most modern music. It’s common in guitar tones, and it’s common in bass, especially slap-style bass. I took a bass line from George Porter of the Meters (“Funkify Your Life”) and played it on my bass with the eq set flat:
Here’s the same line with a pretty typical midrange “scoop,” the “smile eq.”
Midrange is the range of the human voice: it’s where our ears are most sensitive. Most people will hear the second bass line as prettier, and the first one, the one that’s flat, as harsher sounding. It’s especially pronounced when you hear it through good speakers. Bass players know that it sounds pretty by itself, but in a mix, with a band, it tends to vanish.
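For anyone who wants to see the “smile eq” in numbers, here is a rough sketch of a three-band eq that boosts lows and highs and cuts the mids. It is not what any particular amp or plugin does: the function names, the 200 Hz and 3000 Hz split points, and the 6 dB boost and cut are all assumptions picked for illustration, and the one-pole filters are about the gentlest filters you can build.

```python
import math

def one_pole_lowpass(samples, sample_rate, cutoff_hz):
    """Very gentle (6 dB/octave) low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def db_to_gain(db):
    """Convert a gain in decibels to a linear multiplier."""
    return 10.0 ** (db / 20.0)

def three_band_eq(samples, sample_rate, low_db, mid_db, high_db,
                  low_split_hz=200.0, high_split_hz=3000.0):
    """Split into low/mid/high bands, scale each band, and add them back up.

    The split frequencies are illustrative assumptions, not a standard.
    """
    below_low = one_pole_lowpass(samples, sample_rate, low_split_hz)
    below_high = one_pole_lowpass(samples, sample_rate, high_split_hz)
    g_low, g_mid, g_high = (db_to_gain(d) for d in (low_db, mid_db, high_db))
    out = []
    for x, lo, lo_mid in zip(samples, below_low, below_high):
        low = lo            # everything under the low split
        mid = lo_mid - lo   # what sits between the two splits
        high = x - lo_mid   # everything over the high split
        out.append(g_low * low + g_mid * mid + g_high * high)
    return out

def sine(freq_hz, sample_rate=44100, seconds=0.5):
    """Generate a test tone at full scale."""
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

def rms(samples):
    """Root-mean-square level, a rough stand-in for loudness."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

# Compare a bass, a midrange, and a treble tone before and after the scoop.
for freq in (80, 800, 8000):
    tone = sine(freq)
    scooped = three_band_eq(tone, 44100, low_db=6.0, mid_db=-6.0, high_db=6.0)
    print(freq, "Hz level ratio:", round(rms(scooped) / rms(tone), 2))
```

With all three gains at 0 dB the bands add back up to the original signal, which is the “flat” setting. With the scoop, the 80 Hz and 8000 Hz tones come out louder than they went in and the 800 Hz tone comes out quieter.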
Now onto compression. Live music typically has high volume parts and low volume parts. If you sing, you will typically start a note low, bring it up to full volume, and then taper down. All struck or plucked or bowed notes have an “envelope”: an initial attack followed by a slow or fast decay as the note fades. Musicians typically hit some notes harder than others: it’s part of being expressive.
Compression eliminates those variables. It’s an effect that makes the quiet parts loud and brings the loud parts down. Or really, in most music, what it does is take the quiet parts and make them as loud as the loud parts. Compression makes the music “punchy,” because instead of coming at you with the normal attack/decay pattern, the notes start at full volume and end at full volume. Literally all pop music today is massively compressed, at every stage. That’s how they make the commercials louder than the show–they massively compress them. There’s a very good explanation on Wikipedia.
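Here is a minimal sketch of that idea, assuming the usual textbook controls: a threshold, a ratio, and “makeup” gain. The function names and the settings (a -24 dB threshold, a 4:1 ratio, 19.5 dB of makeup gain) are made up for the example, and a real compressor also smooths its response with attack and release times, which this leaves out.

```python
import math

def compressed_level_db(level_db, threshold_db=-24.0, ratio=4.0, makeup_db=0.0):
    """Map an input level in dB to an output level in dB.

    Anything above the threshold only rises 1 dB for every `ratio` dB of input.
    Makeup gain then lifts everything, quiet parts included.
    """
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db

def compress_samples(samples, threshold_db=-24.0, ratio=4.0, makeup_db=0.0):
    """Apply the same curve sample by sample (no attack/release smoothing)."""
    out = []
    for x in samples:
        if x == 0.0:
            out.append(0.0)  # silence stays silent; log10(0) is undefined
            continue
        in_db = 20.0 * math.log10(abs(x))
        out_db = compressed_level_db(in_db, threshold_db, ratio, makeup_db)
        out.append(math.copysign(10.0 ** (out_db / 20.0), x))
    return out

# Three notes: one played softly, one medium, one hit hard (levels in dB).
notes_db = [-30.0, -18.0, -6.0]
squashed = [compressed_level_db(n, makeup_db=19.5) for n in notes_db]
print(squashed)  # [-10.5, -3.0, 0.0]
```

Before compression the soft note and the hard note are 24 dB apart. Afterward they are 10.5 dB apart, and everything is louder: the quiet parts have been pulled up toward the loud parts.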
A little loop, bass, drums, guitar uncompressed and unscooped:
And the same loop, mid scooped and compressed:
You should hear a pretty dramatic difference. The second one has been compressed and mid scooped on every track: it will probably sound brighter and louder: the playing may sound more professional and less fumbly. To my ears, the first one is way way better, because there’s more there, but what’s there has more human mistakes in it.
Compression seems to have originated with AM radio, way back in the day. Radio broadcasters wanted to make their stations jump out when users twiddled the dial. They figured out that if they compressed the signal, it would sound loud and direct and “in your face.”
Compression also makes up for weak playing. A really good musician has control of the dynamics. A weaker musician, like myself, does not. Compression makes the things I play much more even, “tighter,” more uniform. I’ve often suspected compression came into use around the same time as rock music, when there were more and more bands who really could not play that well. But now it’s the default taste. Listen to a Lady Gaga song–there are almost no dynamics, everything comes at you with the same punchy attack, heavily, heavily compressed, not like actual untreated live music at all.
Just to try to make it clear, here’s that same bass part, uncompressed:
and compressed:
All explanations for the mid scoop depend on psychoacoustics, and specifically the fact that our ears tend to “fill in” what isn’t actually there. It’s sometimes argued that the taste for mid scoop comes from the rise of earphones/earbuds. Earbuds have tiny speakers and they really can’t reproduce bass frequencies very well. Open earphones, the kind that don’t seal against your head, are even worse. To make them work, music producers pumped up the bass and treble, so you heard bright sparkly highs and whompy lows, and your ear sort of filled in the rest. A good set of speakers, on the other hand, will give you the full range, and the smiley eq can sound fatiguing and artificial and hollow.
The other explanation may be the mp3 format. A music track on a cd, a three minute song, will be around 30-40 megabytes. The same song as an mp3 file will come in at 3-4 megabytes–smaller by a factor of ten. Whoa, dude, where’d all my information go?
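The arithmetic behind those file sizes is easy to check. CD audio is 44,100 samples per second, 16 bits (2 bytes) per sample, in two channels. The 128 kbps mp3 bit rate below is an assumption, chosen as a common setting, and the function names are just for this sketch.

```python
CD_SAMPLE_RATE = 44_100      # samples per second, per channel
CD_BYTES_PER_SAMPLE = 2      # 16-bit audio
CD_CHANNELS = 2              # stereo

def cd_audio_bytes(seconds):
    """Size of uncompressed CD-quality audio."""
    return seconds * CD_SAMPLE_RATE * CD_BYTES_PER_SAMPLE * CD_CHANNELS

def mp3_bytes(seconds, bitrate_kbps=128):
    """Size of a constant-bit-rate mp3 (128 kbps is an assumed setting)."""
    return seconds * bitrate_kbps * 1000 // 8

song_seconds = 3 * 60
cd_size = cd_audio_bytes(song_seconds)
mp3_size = mp3_bytes(song_seconds)

print(f"cd:  {cd_size / 1e6:.1f} MB")      # cd:  31.8 MB
print(f"mp3: {mp3_size / 1e6:.1f} MB")     # mp3: 2.9 MB
print(f"ratio: {cd_size / mp3_size:.1f}")  # ratio: 11.0
```

A higher bit rate shrinks the gap: at 320 kbps the same song is about 7.2 MB, so the file is only a bit more than four times smaller than the cd track.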
Mp3s work partly by sucking out the midrange frequencies, relying on your ear’s tendency to fill in what isn’t actually there. The classic illustration of this is a phone call. Your iPhone cannot physically reproduce the actual frequencies of your male friend’s baritone voice. It only actually reproduces the midrange frequencies, and your mind/ear “fills in” what’s missing and tells you it’s Bob, with the deep voice.
If you compare a good cd track with a typical mp3 of the same track, side by side on a decent reproduction system, the mp3 will sound “hollowed out.” Try, for example, a Frank Sinatra recording from the late 50s/early 60s, back when they did not do much fiddling. But if it’s a modern song, designed to be played on earbuds, it won’t sound that much different.
In the 20s, all recorded music had a midrange hump–it emphasized the midrange. That’s because recording equipment could not capture the extreme frequencies, either the lows or the highs, very well. Listen, for example, to a Louis Armstrong Hot Five or Hot Seven recording. By the 1950s, recording equipment had become extremely sensitive, and the eq profile was mostly “flat,” even in all frequencies, much as you would hear it live. Listen to the classic Miles Davis album Kind of Blue for an example.
In the 1990s, the taste for mid scooping and compression became really pronounced. On the dance floor, or in a car driving by, people wanted that whomping bass, and the sparkly top end: a Madonna song like Ray of Light is a good example. It’s a sonic confection: all sugar and salt and no other flavors.
Leo Fender, the great instrument and amp maker who founded Fender Musical Instruments, often compared his sonic goals to lemonade. When you drink lemonade, he said, you want to taste the sour lemon and the sweet sugar–everything else is “fluff.” He wanted, he said, to get rid of the fluff, and Fender’s guitars and amps were designed with a scooped eq in mind.
It’s easy to argue this is kind of a juvenile taste, a junk food taste, a taste for extremes. Is it accurate? I’m not sure. It’s also possible that what appeals about lemonade is the impossibility of two opposite tastes, sour and sweet, combined: that we like to have natural oppositions overcome, and so we like a track that is both all bass and all treble for the same reason we like foods that combine salty and sweet, or dramas that depict love and hate in equal measure.
Update: I got a couple of comments that saw this post as a critique of modern music–it’s really not meant that way. It’s meant as a possible account of changes in taste, and a possible explanation of why. I suspect Lady Gaga would be delighted to have her music described as artificial and a snack food confection, and I like snack foods as much as the next guy.
Update two: Several people have suggested that mp3 does not reduce mids. This is not what I had read in earlier research, but I’m not an acoustician or a computer guy–so if anyone has some links or source with better information, post away!
Update three: “mp3” can mean different things–mp3 files can be encoded at different bit rates. The smallest files have typically suffered the biggest losses and will sound worse. At higher bit rates, it’s extremely difficult to tell mp3 from other formats. It’s possible that the mid scoop is an artifact of the early days of mp3, when more “lossy” tracks were common. It’s also possible that the prevalence of mid scoop comes from the fact that mp3s sometimes strip out the very low and very high frequencies, the ones we hear less. In that case, the mid scoop would be a way to restore “flatness.”
It is also true that while mp3 was designed to be uncolored, many people hear a difference: Wikipedia’s entry for “mp3” includes “test given to new students by Stanford University Music Professor Jonathan Berger showed that student preference for MP3 quality music has risen each year. Berger said the students seem to prefer the ‘sizzle’ sounds that MP3s bring to music.[44] Others have reached the same conclusion, and some record producers have begun to mix music specifically to be heard on iPods and mobile phones.[45]”
It’s still the case that Leo Fender, who helped devise the sonic signature of post WWII pop music, wanted a scooped midrange sound. He saw it as pure and essential and clean, uncluttered by “fluff” in the middle.
I wore big-ass headphones walking around my college campus back in the day. Years later I installed an outrageous car stereo when I lived out in the burbs and drove everywhere. It was so fancy that I couldn’t listen to the radio anymore because the broadcast compression was too obvious and too fatiguing compared to CDs.
Still, I think your MP3 and iPhone-related assertions really undercut your other observations in this post. Also, you are a Luddite.
It confuses matters to conflate the lossy encoding of MP3 files with the dynamic range compression in studio production. MP3s, earbuds and iPhones play no arguable role at all in your indictment of modern day production techniques. MP3s sound great, earbuds are just handy for kids walking to class, and iPhones sound as good as any MP3 player if you jack them into decent speakers.
There’s no way either of us could tell the difference between some nice MP3s of Bitches Brew and the lossless FLAC files. Our ears aren’t keen enough and our reference monitors aren’t fancy enough.
Production value is as high as it has ever been. Music targeted for younger audiences might certainly suffer to our ears from pop or club mixing and mastering techniques. But just as you wouldn’t buy a bedazzled Dora The Explorer backpack or sneakers with flashing lights in them you need to let Gaga and Beyonce do their thing. Those tracks are for places we don’t go filled with people we don’t know. They sound just right for that context, and better than Louis or Miles would sound through a wall of sub boxes. Music is just application-specific like that, always has been.
Nothing sounds as good through earbuds and most folks don’t care – or at least don’t care enough – I’ll grant you that.
But maybe you just need some good new music recommendations. Have you heard Voodoo by D’Angelo? It’s awesome. And well-produced.
Interesting that you should read it this way–it’s not at ALL intended as some kind of “all modern music is no good” post. This is something I think people read into it–I like lemonade, I like Fender amps, I sort of like Lady Gaga, but not in the same way as I like Miles, and why would I–they are working totally different terrains.
I think this is an example of ageism–I’m older than you, so you assume this is a “watsa matta with kids today” post. It’s just an observation about difference and some speculation about what it might mean.
You might be right about conflating two things–in the history of technology this is the argument for treating all specific pieces of technology as “technological systems,” in the way that lightbulbs require power generation require mass produced stranded wire require rubber insulation etc etc. I probably need to re-examine that part.
Now get off my lawn!
Yeah, we’re on the same page now: it’s only the poor fit of both the tech examples (MP3s, earbuds, and iPhones) and the music examples (Gaga vs some Blue Note jazz giants) that make you sound like Clint Eastwood.
There’s no such thing as “music designed to be played on earbuds”. The novel look at the midrange scoop vs. hump throughout history is the fun and interesting angle in this post. I’m just saying that the anecdotal stuff you pulled together to contextualize it undercut the value of that angle.
It’s hidden in academic language, but the criticism of Gaga and the exaltation of Miles and Louis is in there. It’s not something I’m reading in. You’re not comparing any of our beloved jazz players to salt water taffy or whatever rotted your teeth out back then. The terms “juvenile” and “junk food” are pinned on the likes of Madonna, while language like “extremely sensitive” and “even in all frequencies, much as you would hear it live” is reserved for Kind Of Blue.
And yes, I still haven’t forgiven you for mistaking Pomplamoose for being superior to Beyoncé. I think you need new speakers to go with your new D’Angelo album. =P
Fine

I believe that when Fraunhofer originally developed the mp3 codec, they worked under the premise that we are most sensitive to the midrange and least sensitive to treble above 16 kHz and deep bass below 60 Hz. Mp3 compression does the exact opposite of what this article speculates: it preserves the midrange and cuts treble and bass with filters. The higher the bit rate, the higher the low-pass cutoff is (at 192 kbps the cutoff is usually around 20 kHz).
The article’s discussion of the mp3 format should be simply deleted for being incorrect.