Audio Perception and ABX Listening Tests by Rob Schlette
Audio production requires a lot of decision making. In fact, most of audio production is decision making.
Unfortunately, pro audio marketing nonsense has devalued sensory feedback, and created elaborate preconceptions about how certain things 'sound'. We are routinely exposed to the most outrageous qualitative claims that have never been proven (or even suggested) with a systematic listening test.
We owe it to ourselves and to our clients to make good choices with our ears. Let’s take a look at how anyone with a basic DAW setup might be able to go about conducting a listening test of their own.
- If a claim or question includes phrases like, “sounds better” or “can hear the difference”, the most direct way to prove it (or dismiss it) is a listening test.
- A listening test is useless if the listener can see what he or she is listening to (e.g. selections labeled MP3 and CD, or any visible waveforms). Our eyes will betray our ears, and reinforce preconceptions.
- The results of a test aren’t results if they can’t be repeated.
All of these types of premises are debated in online communities, but we have to draw some practical boundaries. We only care about answering the question, "can you hear the difference between these two things?"
An ABX listening test takes two audio samples (A and B), and provides a method for determining whether they are distinguishable to a listener. During the ABX test the listener is asked to answer whether each of a series of playback examples X is sample A or B. Most tests run about 10 trials. The listener’s score is typically quantified in terms of percentage of correctly identified X’s. If you factor in chance, then you can calculate the reliability of identifying A vs. B.
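Here is a minimal sketch of what "factoring in chance" can look like. It assumes that a listener who hears no difference guesses right half the time, and that each trial is independent. The function name `p_value_by_chance` is my own, not part of any ABX app.

```python
from math import comb


def p_value_by_chance(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (50% chance per trial)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more right answers,
    # then divide by the total number of equally likely outcomes.
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


for score in (5, 8, 9, 10):
    chance = p_value_by_chance(score, 10)
    print(f"{score}/10 correct: probability by guessing = {chance:.3f}")
```

For a 10-trial test, a perfect score happens by guessing about once in a thousand tries (1/1024), while 8 out of 10 still turns up by luck a little over 5% of the time. That is one reason a single decent-looking round is worth repeating.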
Presumably if you identify X correctly close to 100% of the time, you can hear a difference. If your scores keep landing in the 50% range, or vary widely across multiple tests, the suggestion is that you’re not hearing a reliable difference between the two audio examples. If any of the above is true across reasonably well-selected groups of listeners, then you're really starting to learn something.
Keep in mind:
- If you can't hear a difference between two things, it doesn't necessarily mean you're tin-eared or dim. You might have just freed yourself from a myth.
- ABX is not about minutiae or 'detecting small differences'. Those things tend to fall completely away in an ABX test. That's part of the point.
- If the results of your test embarrass you, you might be about to learn something.
There are several software ABX apps available. I use Takashi Jogataki’s (free) ABXTester all the time. I highly recommend it for Mac users. QSC made a fairly famous hardware ABX Comparator until 2004.
If you're interested in ABX listening tests for digital audio codecs, the Sonnox Fraunhofer Pro Codec includes an excellent built-in ABXer.
Lots of software companies offer different codecs for creating compressed audio formats like MP3 and AAC. There are a lot of reasons to prefer one over another, including user interface, cost, and brand association. To keep myself honest, I’ll typically download a demo of a new codec, and ABX it against my current preference.
I’ll bounce the same audio source twice – once with each codec product set to identical digital audio precisions. Absolutely nothing else about the two bounces can be different, or the test is pointless. If I’m really being honest, I get someone else to load up the examples into the ABX app so I don’t know which is which.
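If nobody is around to load the files for you, a small script can stand in as the helper. This is only a sketch under my own assumptions: the function name `blind_assign`, the output folder, and the key file name are all hypothetical, and it does nothing about other giveaways such as differing file sizes.

```python
import json
import random
import shutil
from pathlib import Path


def blind_assign(file_1, file_2, out_dir, rng=None):
    """Copy two bounces to neutral names (A and B) in random order
    and store the answer key in a separate file."""
    rng = rng or random.Random()
    sources = [Path(file_1), Path(file_2)]
    rng.shuffle(sources)  # a coin flip decides which bounce becomes A

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    key = {}
    for label, src in zip(("A", "B"), sources):
        # Keep the extension so the ABX app can open the file,
        # but drop the name that reveals which codec made it.
        shutil.copyfile(src, out_dir / f"{label}{src.suffix}")
        key[label] = src.name

    # Leave this file closed until the listening is finished.
    (out_dir / "key_do_not_open.json").write_text(json.dumps(key, indent=2))
    return key


# Example call (hypothetical file names):
# blind_assign("bounce_codec_1.wav", "bounce_codec_2.wav", "abx_session")
```

Load `A` and `B` from the output folder into the ABX app, run the test, and only then open the key file.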
After one round of ABX testing (repeatedly identifying X as either A or B based on what I’m hearing) I observe my success rate. I’ll usually repeat the test 3 to 5 times, maybe using different monitors (i.e. limited bandwidth versus full-range monitors). If the results suggest any ability to hear the difference (especially if I'm preferring the new codec), I’ll usually repeat all of the above with a wide variety of playback samples from different musical genres.
This process isn’t objective or blind enough to qualify as a truly scientific test, but it goes a long way to eliminate a lot of self-deception and marketing fog.
The most important step in the ABX testing process is defining a test that has a single variable. If there is more than one thing different between A and B, you’re not really going to learn anything useful.
For example, a question like, “does mic pre A sound different than mic pre B,” is complicated. First, you have to consider the wide range of variables between two successive performance examples. Eliminating those with a mic splitter (or a playback example), you would need to consider the gain staging of the two mic pres. Devise a standard for establishing ‘equal gain’ between the two signal chains (e.g. acoustic test noise metered at a reference level at the mic pre outputs).
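As a rough illustration of the 'equal gain' step, the sketch below measures the RMS level of two captures of the same test noise and reports the trim needed to match them. The noise is synthetic and the 1.5 dB offset is invented for the example. In practice the samples would come from recordings of the two mic pre outputs, and the function names are my own.

```python
import math
import random


def rms_db(samples):
    """RMS level of float samples in dB, relative to a sample value of 1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def trim_to_match(reference, other):
    """Gain in dB to apply to `other` so its RMS level matches `reference`."""
    return rms_db(reference) - rms_db(other)


# Synthetic stand-in: one second of test noise through "pre A",
# and the same noise arriving 1.5 dB lower through "pre B".
rng = random.Random(0)
pre_a = [rng.gauss(0, 0.1) for _ in range(48000)]
pre_b = [s * 10 ** (-1.5 / 20) for s in pre_a]

print(f"pre A level: {rms_db(pre_a):.2f} dB")
print(f"pre B level: {rms_db(pre_b):.2f} dB")
print(f"trim pre B by {trim_to_match(pre_a, pre_b):+.2f} dB")
```

Whatever metering method you choose, the point is to pick one, write it down, and apply it the same way to both signal chains before any listening starts.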
A question like, “can I hear the difference between a 96kHz digital recording and one sampled at 44.1kHz,” would require you to have an acoustic or analog test source (a digitally derived source would be irrelevant). With an acoustic source you would need to have two identical converters feeding two different DAW setups with nothing different between them but sample rate. You'd need to bounce both examples with the same digital audio precisions in order to be able to conduct the ABX test, which would redefine the question as, “can I hear the difference between a 96kHz digital recording and one sampled at 44.1kHz once they’re both at 44.1kHz?”
Obviously the simpler the question, the simpler the test. After getting used to thinking through the variables that affect our perception, marketing claims will begin to inspire the question, “how would you test that?” The answer will either inspire some new exploration of your own, or instantly expose the silliness that often lies just under the surface of pro audio marketing.
ABX testing is just one way of attempting to determine unbiased answers to questions of audio perception. Other methods like null testing might be better for particular scenarios, as long as the test is well-conceived. There are some popular examples of tremendously silly null testing on YouTube, but you’ll be smart enough to consider a single variable at a time.
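For the curious, here is a bare-bones sketch of the arithmetic behind a null test: subtract one signal from the other, sample by sample, and see how much is left. It assumes the two signals are already the same length and sample-aligned. The function names and the synthetic noise are mine, purely for illustration.

```python
import math
import random


def rms(samples):
    """Root-mean-square value of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def null_depth_db(a, b):
    """Level of the residual (a - b) relative to a, in dB.
    More negative means the two signals cancel more completely."""
    if len(a) != len(b):
        raise ValueError("signals must be the same length and sample-aligned")
    reference = rms(a)
    if reference == 0:
        raise ValueError("reference signal is silent")
    residual = rms([x - y for x, y in zip(a, b)])
    if residual == 0:
        return float("-inf")  # perfect null: nothing left over
    return 20 * math.log10(residual / reference)


rng = random.Random(0)
original = [rng.gauss(0, 0.1) for _ in range(48000)]
identical = list(original)
# Same audio, but with a 0.1 dB level mismatch as the only difference.
slightly_quieter = [s * 10 ** (-0.1 / 20) for s in original]

print(null_depth_db(original, identical))
print(f"{null_depth_db(original, slightly_quieter):.1f} dB")
```

An identical copy nulls completely, while a level mismatch of only 0.1 dB leaves a residual at roughly -39 dB. A stray variable like that can easily be mistaken for the thing you meant to test, which is exactly why the single-variable rule applies to null tests too.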