If you do any music production, even at the tyro level like me, you realize that there’s a characteristic “sound” to most pop music: mid scooped and compressed. It’s a very odd sound, highly unnatural, and it’s completely ubiquitous. But it was not the characteristic sound of other eras. In the 20s, music sounded the opposite, with a pronounced mid-hump. In the 50s, the eq profile was “flat.” Why the changes?
The mid scoop is what happens when you boost the highs and lows and cut the middle range. It’s the characteristic eq profile of most modern music. It’s common in guitar tones, it’s common in bass, especially slap-style bass. I took a bass line from George Porter of the Meters (“Funkify Your Life”) and played it on my bass with the eq set flat:
Here’s the same line with a pretty typical midrange “scoop,” the “smile eq.”
Midrange is the range of the human voice: it’s where our ears are most sensitive. Most people will hear the second bass line as prettier, and the first one, the one that’s flat, as harsher sounding. It’s especially pronounced when you hear it through good speakers. Bass players know that it sounds pretty by itself, but in a mix, with a band, it tends to vanish.
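For anyone who wants to see the scoop as numbers rather than hear it, here is a minimal sketch in Python. It builds a single "peaking" filter with the widely published Audio EQ Cookbook biquad formulas and prints how much it cuts or keeps at a few frequencies. The settings (a 9 dB cut centered at 800 Hz, Q of 0.7) are illustrative assumptions, not the settings used for the clips in this post. Cutting only the mids and then turning the volume up is, in relative terms, the same shape as boosting the lows and highs.

```python
import cmath
import math

def peaking_eq(f0, gain_db, q, fs=44100):
    """Biquad peaking-filter coefficients (Audio EQ Cookbook form)."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = (1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp)
    a = (1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp)
    return b, a

def gain_db_at(freq, b, a, fs=44100):
    """Gain of the filter, in dB, for a steady sine wave at `freq` Hz."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

# Assumed scoop settings: 9 dB cut centered at 800 Hz, fairly wide (Q = 0.7).
b, a = peaking_eq(f0=800, gain_db=-9.0, q=0.7)
for freq in (60, 250, 800, 2500, 8000):
    print(f"{freq:>5} Hz: {gain_db_at(freq, b, a):+.1f} dB")
```

The printout should show the deepest cut at 800 Hz, with the far lows and far highs left almost untouched: the "smile."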
Now onto compression. Live music typically has high volume parts and low volume parts. If you sing, you will typically start a note low, bring it up to full volume, and then taper down. All struck or plucked or bowed notes have an “envelope:” an initial attack followed by a slow or fast decay as the note fades. Musicians typically hit some notes harder than others: it’s part of being expressive.
Compression eliminates those variables. It’s an effect that makes the quiet parts loud and brings the loud parts down. Or really, in most music, what it does is take the quiet parts and make them as loud as the loud parts. Compression makes the music “punchy,” because instead of coming at you with the normal attack/decay pattern, the notes start at full volume and end at full volume. Literally all pop music today is massively compressed, at every stage. That’s how they make the commercials louder than the show–they massively compress them. There’s a very good explanation on Wikipedia.
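To make the idea concrete, here is a rough sketch of the core of a compressor in Python: anything louder than a threshold gets squashed by a ratio, and then "makeup gain" turns the whole thing back up. The threshold, ratio, makeup gain, and the note envelope are all made-up illustrative values, and a real compressor also has attack and release timing that this static version leaves out.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=10.5):
    """Static compressor curve: squash levels above the threshold, then add makeup gain."""
    if level_db > threshold_db:
        # Above the threshold, every `ratio` dB of input becomes 1 dB of output.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db

# Hypothetical envelope of one plucked note, in dB: loud attack, then decay.
envelope_db = [-6.0, -12.0, -18.0, -24.0, -30.0]
squashed_db = [compress_db(level) for level in envelope_db]

range_before = max(envelope_db) - min(envelope_db)
range_after = max(squashed_db) - min(squashed_db)
print(squashed_db)
print(f"dynamic range: {range_before:.1f} dB -> {range_after:.1f} dB")
```

With these numbers the peak stays where it was, but the tail of the note comes up by more than 10 dB, so the distance between the loudest and quietest moments shrinks. That shrinking is the "punch."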
A little loop, bass, drums, guitar uncompressed and unscooped:
And the same loop, mid scooped and compressed:
You should hear a pretty dramatic difference. The second one has been compressed and mid scooped on every track: it will probably sound brighter and louder: the playing may sound more professional and less fumbly. To my ears, the first one is way way better, because there’s more there, but what’s there has more human mistakes in it.
Compression seems to have originated with AM radio, way back in the day. Radio broadcasters wanted to make their stations jump out when users twiddled the dial. They figured out that if they compressed the signal, it would sound loud and direct and “in your face.”
Compression also makes up for weak playing. A really good musician has control of the dynamics. A weaker musician, like myself, does not. Compression makes the things I play much more even, “tighter,” more uniform. I’ve often suspected compression came into use around the same time as rock music, when there were more and more bands who really could not play that well. But now it’s the default taste. Listen to a Lady Gaga song–there are almost no dynamics, everything comes at you with the same punchy attack, heavily, heavily compressed, not like actual untreated live music at all.
Just to try to make it clear, here’s that same bass part, uncompressed:
All explanations for the mid scoop depend on psychoacoustics, and specifically the fact that our ears tend to “fill in” what isn’t actually there. It’s sometimes argued that the taste for mid scoop comes from the rise of earphones/earbuds. Earbuds have tiny speakers and they really can’t reproduce bass frequencies very well. Open earphones, the kind that don’t seal against your head, are even worse. To make them work, music producers pumped up the bass and treble, so you heard bright sparkly highs and whompy lows, and your ear sort of filled in the rest. A good set of speakers, on the other hand, will give you the full range, and the smiley eq can sound fatiguing and artificial and hollow.
The other explanation may be the mp3 format. A music track on a cd, a three minute song, will be around 30-40 megabytes. The same song as an mp3 file will come in at 3-4 megabytes–smaller by a factor of ten. Whoa, dude, where’d all my information go?
Mp3s work partly by sucking out the midrange frequencies, because of your ear’s tendency to fill in what isn’t actually there. The classic illustration of this is a phone call. Your iPhone cannot physically reproduce the actual frequencies of your male friend’s baritone voice. It only actually reproduces the midrange frequencies, and your mind/ear “fills in” what’s missing and tells you it’s Bob, with the deep voice.
If you compare a good cd track with a typical mp3 of the same track, side by side on a decent reproduction system, the mp3 will sound “hollowed out.” Try, for example, a Frank Sinatra recording from the late 50s/early 60s, back when they did not do much fiddling. But if it’s a modern song, designed to be played on earbuds, it won’t sound that much different.
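The arithmetic behind those sizes can be sketched quickly in Python. The cd figures are the standard ones (44,100 samples per second, 2 bytes per sample, 2 channels); the 128 kbps mp3 bit rate is an assumption picked as a common setting, not something measured from a particular file.

```python
def cd_bytes(seconds, sample_rate=44100, bytes_per_sample=2, channels=2):
    """Size of uncompressed cd-quality audio, in bytes."""
    return seconds * sample_rate * bytes_per_sample * channels

def mp3_bytes(seconds, kbps=128):
    """Approximate mp3 size at a constant bit rate (assumed 128 kbps by default)."""
    return seconds * kbps * 1000 // 8

song_seconds = 3 * 60  # a three minute song
cd_size = cd_bytes(song_seconds)
mp3_size = mp3_bytes(song_seconds)
print(f"cd:  {cd_size / 1e6:.1f} MB")
print(f"mp3: {mp3_size / 1e6:.1f} MB")
print(f"ratio: about {cd_size / mp3_size:.0f} to 1")
```

A higher mp3 bit rate gives a bigger file and a smaller ratio, which matters for how much the format has to throw away.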
In the 20s, all recorded music had a midrange hump–it emphasized the midrange. That’s because recording equipment could not capture the extreme frequencies, either the lows or the highs, very well. Listen, for example, to a Louis Armstrong Hot Five or Hot Seven recording. By the 1950s, recording equipment had become extremely sensitive, and the eq profile was mostly “flat,” even in all frequencies, much as you would hear it live. Listen to the classic Miles Davis album Kind of Blue for an example.
In the 1990s, the taste for mid scooping and compression became really pronounced. On the dance floor, or in a car driving by, people wanted that whomping bass, and the sparkly top end: a Madonna song like Ray of Light is a good example. It’s a sonic confection: all sugar and salt and no other flavors.
Leo Fender, the great instrument and amp maker who founded Fender Musical Instruments, often compared his sonic goals to lemonade. When you drink lemonade, he said, you want to taste the sour lemon and the sweet sugar–everything else is “fluff.” He wanted, he said, to get rid of the fluff, and Fender’s guitars and amps were designed with a scooped eq in mind.
It’s easy to argue this is kind of a juvenile taste, a junk food taste, a taste for extremes. Is it accurate? I’m not sure. It’s also possible that what appeals about lemonade is the impossibility of two opposite tastes, sour and sweet, combined: that we like to have natural oppositions overcome, and so we like a track that is both all bass and all treble for the same reason we like foods that combine salty and sweet, or dramas that depict love and hate in equal measure.
Update: I got a couple comments that saw this post as a critique of modern music–it’s really not meant that way. It’s meant as a possible account of changes in taste, and a possible explanation of why. I suspect Lady Gaga would be delighted to have her music described as artificial and a snack food confection, and I like snack foods as much as the next guy.
Update two: Several people have suggested that mp3 does not reduce mids. This is not what I had read in earlier research, but I’m not an acoustician or a computer guy–so if anyone has some links or sources with better information, post away!
Update three: “mp3” can mean different things–mp3 files can be encoded at different bit rates. The smallest files have typically suffered the biggest losses and will sound worse. At higher bit rates, it’s extremely difficult to tell mp3 from other formats. It’s possible that the mid scoop is an artifact of the early days of mp3, when more “lossy” tracks were common. It’s also possible that the prevalence of mid scoop comes from the fact that mp3s sometimes strip out the very low and very high frequencies, the ones we hear less. In that case, the mid scoop would be a way to restore “flatness.”
It is also true that while mp3 was designed to be uncolored, many people hear a difference: Wikipedia’s entry for “mp3” includes “test given to new students by Stanford University Music Professor Jonathan Berger showed that student preference for MP3 quality music has risen each year. Berger said the students seem to prefer the ‘sizzle’ sounds that MP3s bring to music. Others have reached the same conclusion, and some record producers have begun to mix music specifically to be heard on iPods and mobile phones.”
It’s still the case that Leo Fender, who helped devise the sonic signature of post WWII pop music, wanted a scooped midrange sound. He saw it as pure and essential and clean, uncluttered by “fluff” in the middle.