

Hi-Fi Audio Glossary: Digital

Over recent years, our online guides have created an extensive encyclopedia of audio terminology. We decided to bring these disparate dictionaries of audio terms together for the first time. This exhaustive guide is the result.

While the days of trying to baffle people with terms only the cognoscenti know are (hopefully) behind us – many readers might recall the patronizing salesman in the ‘Grammo-phone’ sketch from Not The Nine O’clock News in the early 1980s – this is still a terminology-led industry, and knowing the terms is a good idea if we are to be able to recognize how components might conceivably be different, and why.

While it’s important not to get too hung up on the terminology – we are in an industry where observed performance should always remain more important than specifications – knowing the difference between a ported loudspeaker and a sealed-box loudspeaker is important and knowing that a sealed-box loudspeaker and an infinite baffle design are basically one and the same is important, too.

 

DIGITAL AUDIO TERMS

Perhaps no single category in all of high‑end audio has spawned a more convoluted ‘alphabet soup’ of technical terms and abbreviations than digital audio. Indeed, the topic has given rise to so many TLAs (three-letter acronyms) that at times it seems almost impossible to keep them straight in one’s mind. We present here a minimalist glossary that, while by no means exhaustive, covers at least a few of the more common acronyms and terms you are apt to encounter when you go shopping for digital audio components.

 

AAC

This acronym stands for ‘Advanced Audio Coding’, which is one of several coding standards for lossy digital audio compression (see ‘Compression’ in this glossary for more details). AAC was originally developed as the successor of MP3, which is another form of lossy compression. AAC is generally thought to deliver somewhat better sound quality than MP3 for any given bit rate.

AAC comes up often in product specification sheets because it is the default audio format for popular products and services such as YouTube, the iPhone, iPod, iPad, iTunes, and the Sony PlayStation 3.

ADC

The acronym ADC (sometimes also shown as ‘A/D’) is shorthand for ‘Analogue-to-Digital Converter’. Realistically, not many audiophiles own, or would have any reason to own, ADCs, but it is worth bearing in mind that recording studios and production houses use ADCs in order to create the digital audio music files that most of us enjoy.

ADCs receive analogue audio signals, sample those signals at very high frequencies (under the control of extremely accurate clocks) and then generate digital bit-streams (that is, multi-bit words of digital audio data) that represent the sampled analogue audio signals as accurately as possible. As with any other type of audio equipment, ADCs are not created equal, and some have audibly superior performance capabilities to others.

AIFF

This acronym stands for ‘Audio Interchange File Format’, which is a digital audio file format developed by Apple. AIFF stores audio data in uncompressed pulse-code modulation (PCM) format and is therefore lossless. Because they are uncompressed, AIFF files require more data storage space than compressed audio files would do, but the trade-off—one that many audiophiles happily embrace—is that AIFF introduces no sonically deleterious ‘compression artefacts’ of any kind.

ALAC (and ALE)

The acronym ALAC stands for ‘Apple Lossless Audio Codec’, which is sometimes alternatively called ALE (for ‘Apple Lossless Encoding’). In short, ALAC is a method for compressing digital audio data in a completely lossless manner (meaning all of the original audio data is preserved).

ALAC was initially a proprietary Apple standard, but as of 2011 Apple made the codec available as open-source, royalty-free software. Both iTunes and iOS devices support ALAC (whereas Apple systems and devices typically do not support other lossless standards), so that ALAC has become the de facto lossless compression standard for audiophiles who use Apple computers and/or iOS devices.

Note that AIFF and ALAC are not the same thing. AIFF digital audio data is not compressed at all and therefore is inherently lossless; ALAC digital audio data is compressed but can be decoded for playback in a lossless manner. ALAC digital audio files are roughly one half the size of equivalent uncompressed files.

Bit

One unit of digital data, typically represented by voltages either above or below a clear-cut threshold and by convention held to represent a ‘1’ or a ‘0’ as used in binary numbers. Typically abbreviated as a lower-case ‘b’ – as in, “My DAC can handle PCM digital audio files at resolutions up to 32-bit/384kHz.”

Bit-rate

The speed, expressed in number of bits per second, at which digital audio data is processed or transferred from one device to another for playback. For example, one of the better-sounding and more popular forms of MP3 transfers data at 320kbps (kilobits per second).
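To make the arithmetic concrete, here is a small Python sketch (illustrative only; the function name is ours) showing how a bit-rate translates into file size:

```python
# Rough size of an encoded audio stream: bit-rate counts bits per
# second, so divide by 8 to get bytes.
def encoded_size_bytes(bitrate_kbps: float, seconds: float) -> int:
    """Approximate audio payload size, ignoring container overhead."""
    return int(bitrate_kbps * 1000 * seconds / 8)

# A 4-minute track at 320kbps:
size = encoded_size_bytes(320, 4 * 60)
print(f"{size / 1e6:.1f} MB")  # 9.6 MB
```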

Byte

An 8-bit ‘word’ of digital data, abbreviated with a capital ‘B’ – as in, “I store my digital music library on a 2TB drive” (where 2TB means ‘2 terabytes’). The digital word lengths used in digital audio are typically multiples of 8 bits: hence, 16-bit, 24-bit, or 32-bit words are frequently discussed.

CD

The acronym stands for ‘Compact Disc’, a physical storage format for digital audio commercially launched in the early 1980s by Philips and Sony. CDs are polycarbonate discs that incorporate a highly reflective metallic layer into which microscopic ‘pits’ are pressed, separated by flat reflective areas known as ‘lands’. The transitions between pits and lands effectively encode the ‘1s’ and ‘0s’ inherent in digital audio data.

By convention, CD standards are set forth in the so-called Red Book, which calls for the digital audio data to be stored in 16-bit words of data sampled at a rate of 44.1 kHz. When writers talk about ‘CD resolution’ digital audio files, they will often refer to them as ‘16/44.1’ files. While CDs are arguably the most popular digital audio format on the planet, other storage formats are now on the rise, many of them offering resolutions (and, in principle, sound quality) much higher than that of CDs.
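The Red Book numbers fix the CD’s data rate exactly; a quick sketch of the arithmetic:

```python
# Red Book CD audio: 16-bit samples, 44.1kHz sampling rate, two channels.
SAMPLE_RATE = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS
print(bits_per_second)  # 1411200 bits per second

# Total audio data on a typical 74-minute disc:
total_bytes = bits_per_second // 8 * 74 * 60
print(f"{total_bytes / 1e6:.0f} MB")  # 783 MB
```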

“The ear is extraordinarily sensitive to timing and thus can readily differentiate between clock errors.”

Clock

Digital clocks are extremely important in digital audio, both when encoding and decoding or playing back digital audio files. Since clocks govern the precise time intervals at which digital audio files are captured, and then later played back, it is critically important for clocks to be stable and accurate so that the intervals between clock beats are maintained with extreme precision.

The human ear is remarkably sensitive to clock timing errors, so that errors occurring down at the picosecond level are thought to be audible. The more accurate, stable, and precise a clock is, the better the sound of the component will be (all other things being equal). Some very high-end components use extremely exotic rubidium (or ‘atomic’) clocks to achieve the ‘nth’ degree of sound quality.

Codec

A codec is a software or firmware program that can encode or decode a digital audio stream. The term ‘codec’ is a condensation of the more cumbersome phrase ‘coder/decoder’. Some popular audio codecs you may have heard of include MP3, AAC, ALAC, FLAC, Ogg Vorbis, and many more.

Compression

Compression is a data manipulation process where digital audio files are condensed in order to conserve data storage space. It is useful to think of compression, as it applies to digital audio, as a two-part process. First, digital audio files are compressed to reduce them to a more compact and manageable size for storage; then, later on, the compressed files are decoded or de-compressed for playback. There are many types of audio compression algorithms, but they generally fall into two categories: lossy compression and lossless compression.

Lossy compression algorithms do the most efficient job of compressing data, but with the tradeoff that—when it comes time to decode the lossy files—only part of the original digital audio data is restored, while some is irretrievably lost (hence the name ‘lossy’). Two of the more popular lossy compression codecs are AAC and MP3.

Lossless compression algorithms are less efficient than lossy algorithms in terms of conserving storage space, but they have the benefit that—when it comes time to decode the files—fully 100% of the original digital audio data is restored. Most audiophiles perceive lossless compression to offer audible performance benefits vs. lossy compression (although there is some debate on this topic).

As broadband internet speeds continue to increase and very high-capacity storage devices have become less expensive and more commonly available (even in small, portable, handheld devices) there is less pressure on audiophiles to conserve storage space, so that over time lossless compression algorithms have become increasingly popular. Two of the more popular lossless compression codecs are ALAC and FLAC.
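The lossless principle is easy to demonstrate with a general-purpose compressor (zlib here stands in for a dedicated audio codec such as FLAC): structured, audio-like data shrinks, and decompression restores it bit-for-bit.

```python
import math
import zlib

# One second of a 440Hz sine wave as 16-bit PCM bytes: highly
# structured data, loosely standing in for real music.
rate = 44_100
pcm = b"".join(
    int(20_000 * math.sin(2 * math.pi * 440 * n / rate))
        .to_bytes(2, "little", signed=True)
    for n in range(rate)
)

compressed = zlib.compress(pcm, level=9)
restored = zlib.decompress(compressed)

assert restored == pcm  # lossless: every bit comes back
print(f"compressed to {100 * len(compressed) / len(pcm):.0f}% of original size")
```

A lossy codec, by contrast, would make `restored == pcm` fail by design, discarding data it judges inaudible.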

DAC

This acronym stands for ‘Digital-to-Analogue Converter’, with the DAC serving as an essential ingredient in any digital audio playback device. In simple terms, the job of the DAC is to receive digital audio data at extremely precisely clocked intervals and to convert that data into an analogue output that mirrors (or is proportionate to) the numerical values of the digital audio data received.

DACs can be, and often are, condensed to fit on single integrated circuit chips, with popular DAC makers including firms such as Burr-Brown, ESS, Texas Instruments, Wolfson, and many more. However, it is possible to create DACs from individual, discrete parts—an approach some audio component manufacturers have pursued in the interest of superior sound quality.

Either way, it is important to understand that the DAC devices used in a given component do not necessarily define or determine the component’s characteristic sound (other circuit elements also play a major role in determining sound quality).

DSD

The acronym stands for ‘Direct Stream Digital’, which is a digital audio encoding and decoding system developed by Philips and Sony as the format of choice for use in their higher-than-CD-resolution Super Audio CD discs (commonly called SACDs).

Unlike PCM (pulse code modulation) formats, which store digital audio data in the form of 16-bit, 24-bit, or even 32-bit words sampled or clocked at rates ranging from 44.1 to 384 kHz, DSD is a single-bit, delta-sigma modulated encoding process with extremely high sampling rates of 2.8224 MHz (known as DSD64) or 5.6448 MHz (known as DSD128). In principle, DSD files are extremely easy to decode for analogue playback, requiring only a basic low-pass filter. Some critics argue that DSD files have high-frequency noise issues to contend with and that the delta-sigma process has some inherent errors that are difficult to overcome. Proponents of DSD, however, argue that DSD achieves a smooth, free-flowing, analogue-like sound that is often difficult for PCM to achieve.

While SACD discs have never achieved the popularity of conventional Red Book CDs, their underlying DSD file format has won widespread popularity in recent years, since many music lovers now prefer listening to files downloaded or streamed from the Internet (or a local network). DSD files can be streamed or downloaded via a transfer process called ‘DoP’, which stands for ‘DSD over PCM’. This process does not convert DSD files to PCM format, but rather temporarily stores DSD data in PCM ‘data containers’ in order to simplify file transfers.
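A minimal, single-channel Python sketch of the DoP framing idea (based on the published DoP v1.1 convention; the function name is ours): 16 DSD bits ride in the lower two bytes of each 24-bit PCM word, with an alternating marker byte on top so the receiving DAC can recognise the stream as DSD rather than PCM.

```python
# DoP v1.1 marker bytes alternate between successive frames.
DOP_MARKERS = (0x05, 0xFA)

def pack_dop(dsd_bytes: bytes) -> list[int]:
    """Pack pairs of DSD bytes into 24-bit DoP words (one channel)."""
    words = []
    for i in range(0, len(dsd_bytes) - 1, 2):
        marker = DOP_MARKERS[(i // 2) % 2]
        words.append((marker << 16) | (dsd_bytes[i] << 8) | dsd_bytes[i + 1])
    return words

words = pack_dop(bytes([0xAA, 0x55, 0xF0, 0x0F]))
print([f"{w:06X}" for w in words])  # ['05AA55', 'FAF00F']
```

Because the DSD data is only wrapped, not converted, unpacking the lower 16 bits of each word restores the original bit-stream untouched.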

DSP

The acronym stands for ‘Digital Signal Processing’, a topic that comes up often in discussion of digital audio. One of the beauties of digital audio is the fact that, once analogue signals are converted into digital formats, they can be processed in ways that would be difficult if not impossible to achieve solely through analogue means. For example, DSP can be used to implement complex digital filtering systems that can shape the sonic character of the ultimate playback presentation in extremely subtle and potentially desirable ways. Likewise, DSP makes possible certain elaborate equalization (EQ) systems that would be very difficult to execute with a purely analogue EQ system. Finally, DSP allows designers greater control over various sonic variables including noise, transient response, resolution, etc. as well as greater control over various processing/playback artefacts.

Dynamic Range

In audio, dynamic range is the difference between the smallest and the largest usable signal that can be passed through a transmission or playback system; this difference is expressed as a ratio and typically is quoted in dB (decibels). The human ear is said to have about 140dB of dynamic range (which is also, in rough terms, about the same dynamic range as some of today’s best microphones).

Since digital audio inherently involves creating digital representations of analogue sound waves, one question that arises is this: “Does the digital system have more or less dynamic range than the analogue signals it is attempting to represent?” All other things being equal, digital components with greater dynamic range often offer superior sound, in part because they do not lose low-level signals in noise, nor do they overload on very high-level signals.

Part of today’s emphasis on higher-than-CD-resolution digital audio files involves the fact that 24-bit files offer dramatically higher dynamic range than do the 16-bit files found in CDs.
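The relationship between bit depth and theoretical dynamic range follows the standard formula for an ideal PCM quantizer, roughly 6dB per bit:

```python
# Theoretical dynamic range of an ideal N-bit PCM system:
# DR ≈ 6.02 * N + 1.76 dB (full-scale sine vs. quantization noise).
def dynamic_range_db(bits: int) -> float:
    return 6.02 * bits + 1.76

print(f"16-bit: {dynamic_range_db(16):.1f} dB")  # 16-bit: 98.1 dB
print(f"24-bit: {dynamic_range_db(24):.1f} dB")  # 24-bit: 146.2 dB
```

Note that 24-bit audio, at roughly 146dB, already reaches the approximate dynamic range of the human ear; real converters fall somewhat short of these ideal figures.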

FLAC

The acronym stands for ‘Free Lossless Audio Codec’. FLAC is one of the most popular and widely supported lossless audio codecs in use today, in part because it is an open-source, royalty-free software package, but also because FLAC readily supports metadata tagging, complete with storage of album cover art and the like.

Jitter

As mentioned under ‘Clock’, above, timing is absolutely crucial in digital audio, with particular emphasis on maintaining absolutely identical time intervals between clock pulses. Unfortunately, nothing is perfect, so small variations or errors between intervals can and do occur—errors called ‘jitter’, which will usually be quoted as worst-case timing variations (for example: ‘Jitter: ≤ 9 picoseconds’).

As mentioned elsewhere in this glossary, the ear is extraordinarily sensitive to timing and can readily differentiate between clock errors—distinguishing, for example, a clock whose errors are measured in parts per million from one whose errors are measured in parts per billion. The point is that, all other things being equal, the digital playback system with the lowest jitter almost invariably sounds best.
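A back-of-envelope sketch shows why picosecond-level jitter matters. A sine wave slews fastest at its zero crossing, so a timing error of dt seconds produces a worst-case amplitude error of about 2πf·dt relative to full scale; comparing that error with one 16-bit quantization step:

```python
import math

def jitter_error(f_hz: float, dt_s: float) -> float:
    """Worst-case relative amplitude error from a sampling-time error dt."""
    return 2 * math.pi * f_hz * dt_s

err = jitter_error(20_000, 9e-12)  # 9ps of jitter on a 20kHz tone
lsb16 = 2 ** -16                   # one 16-bit quantization step, relative
print(f"jitter error: {err:.2e}  16-bit step: {lsb16:.2e}")
# At 9ps the error stays below one 16-bit step; larger jitter, or
# greater bit depths, pushes timing errors above the quantization floor.
```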

kbps and Mbps

The former acronym stands for ‘kilobits per second’ and the latter for ‘megabits per second’; both terms are used to express data transfer speeds. ‘kbps’ figures often come up in discussion of lossy compression codecs as a means of comparing the net amount of audio data one codec can supply vs. another codec (typically, the higher the data rate, the better the lossy codec’s sonic performance will be).

You might, for example, see digital downloads offered in two types of lossy formats: ‘MP3 (CBR at 128 kbps) or MP3 (VBR at 320kbps)’—where CBR stands for ‘constant bit rate’ and VBR is short for ‘variable bit rate’. In this case, the MP3 128kbps digital audio file would take up less storage space, but the MP3 320kbps digital audio file would offer markedly superior sound quality.

One small tip: in talking or reading about acronyms like these, bear in mind that a lower-case ‘b’ denotes ‘bits’, while a capital ‘B’ denotes ‘Bytes’.

Metadata

Literally ‘beyond data’, metadata is information about the data itself. For an audio file, this might mean the track title, the artist, the composer, the genre, the date of recording, the date of composition, the album cover, band members, and more. This information about the music is generally ‘embedded’ within the file itself, to be read and displayed by media players and music servers alike. Metadata is enormously useful for listeners, simply because ‘Good Vibrations’ is a more memorable file name than ‘a156e03c’ to humans. Older file formats (such as WAV) are less robust in preserving metadata than their more modern counterparts.

MP3

MP3 is one of the oldest and most widely supported lossy digital audio compression codecs in the world. Developed principally by the Fraunhofer Institute in the early 1990s, MP3 became an ISO (International Organization for Standardization) standard when it was incorporated by MPEG (the Moving Picture Experts Group) as Audio Layer III of both the MPEG-1 and MPEG-2 standards; today the codec can be used royalty-free.

MP3 was instrumental in the explosive growth that personal digital audio devices have enjoyed over the last 15 years or so, because it offered a means of substantially compressing large digital audio files so that even fairly large music libraries could be condensed to fit in devices with limited storage capacity (for example, early generation iPods).

MP3 also served, for many listeners, as an introduction to ‘perceptual coding’, where the general idea is to reduce the amount of data used to represent aspects of sound thought to be beyond the perceptual resolution of most listeners, while devoting data to the aspects of sound most readily heard and perceived. The concept was to reduce dramatically the amount of data that needed to be stored while still appearing to deliver full fidelity sound for most listeners, most of the time. Naturally, the idea of throwing out potentially useful sonic data did not sit well with most audiophiles and has been a topic of controversy and heated debate ever since.

Networked Audio & Network Streaming

Music stored on a computer can be streamed to devices distributed across a home network (more accurately, a LAN or Local Area Network). This typically involves storing music on a computer or network-attached storage device, which also runs some form of music server program to store and organize these music files. The music itself is played through a ‘media renderer’ in your audio system that is also attached to the same computer network.

Functionally similar to internet streaming, networked audio distributes your own music library within the local network, instead of relying on online providers to stream their own music. While the popularity of personal libraries stored locally looks set to wane as online services proliferate, the networked audio system is a great way to store all your existing music collection in one easily accessible place.

PCM (and LPCM)

The former acronym stands for ‘pulse-code modulation’, while the latter stands for ‘linear pulse-code modulation’; both are means of representing analogue audio signals in a digital format. Many audiophiles use the terms PCM and LPCM interchangeably, though in fact the terms do not mean the same thing. PCM/LPCM is by far the most popular digital audio encoding format in use today.

Both PCM and LPCM sample the amplitude of analogue signals at precise and identical timing intervals. When each sample is taken, the amplitude of the signal is quantized and recorded as a multi-bit digital word. The difference between PCM and LPCM involves the manner in which signal amplitude is quantized: in generic PCM, the quantization steps need not be uniform (companded schemes, for instance, use finer steps at low signal levels), whereas in LPCM the quantization steps are linearly uniform in level.

The quality of PCM and LPCM encoding is largely controlled by two factors: the sampling rate (that is, the rate at which samples are taken) and the bit-depth of the samples taken (that is, the length in bits of the digital words used to represent each sample). As a general rule, all other things being equal, higher sampling rates and greater bit depths equate to better sound quality. Thus, a 24-bit/384kHz file of a song would likely sound superior to a 16-bit/44.1kHz file of the same song, assuming the master recording captured high levels of sonic detail and nuance in the first place.
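A minimal sketch of the LPCM idea described above (illustrative only; the function name is ours): sample a sine wave at fixed intervals, then quantize each sample onto a uniform 16-bit grid.

```python
import math

def lpcm_encode(duration_s, sample_rate, bits, freq_hz):
    """Sample a sine wave and quantize to uniform signed integer steps."""
    steps = 2 ** (bits - 1)  # signed range: -steps .. steps - 1
    samples = []
    for n in range(int(duration_s * sample_rate)):
        x = math.sin(2 * math.pi * freq_hz * n / sample_rate)
        samples.append(max(-steps, min(steps - 1, round(x * (steps - 1)))))
    return samples

# 1 millisecond of a 1kHz tone at 16-bit/44.1kHz:
samples = lpcm_encode(0.001, 44_100, 16, 1_000)
print(len(samples), max(samples))  # 44 samples, peak near +32767
```

Raising the sample rate adds more entries to the list (finer time resolution); raising the bit depth makes each quantization step smaller (finer amplitude resolution).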

“All other things being equal, higher sampling rates and greater bit depths equate to better sound quality.”

Resolution

In simple terms, ‘Resolution’ is the catchall phrase most audiophiles use to describe the amount of digital audio data used to represent analogue audio signals. As a general rule, the less data used the lower the resolution (and sound quality) will be, while the greater the amount of data used the greater the resolution (and sound quality) will be—up to a level where a perceived ‘point of diminishing returns’ is reached.

Generally speaking, lossy compression codecs yield what are considered low-resolution digital audio files. CD files, captured at 16-bit/44.1kHz, are considered the standard, and files with higher-than-CD bit-depths and/or sampling rates are considered to be high-resolution files.
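The storage cost of resolution is easy to quantify; for uncompressed stereo LPCM:

```python
# Megabytes of uncompressed stereo LPCM audio per minute of music.
def mb_per_minute(bits: int, rate_hz: int, channels: int = 2) -> float:
    return bits * rate_hz * channels * 60 / 8 / 1e6

for bits, rate in [(16, 44_100), (24, 96_000), (24, 192_000)]:
    print(f"{bits}-bit/{rate / 1000:g}kHz: {mb_per_minute(bits, rate):.1f} MB/min")
# 16-bit/44.1kHz: 10.6 MB/min
# 24-bit/96kHz: 34.6 MB/min
# 24-bit/192kHz: 69.1 MB/min
```

A 24-bit/192kHz file is thus roughly six and a half times the size of its CD-resolution counterpart, which is why lossless compression matters more as resolution climbs.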

Can listeners hear the difference? In a word, yes. The only area where there is room for discussion involves the question, ‘When is high resolution high enough?’

Servers

This term is the shortened form of ‘music server’. Typically, music servers provide a means of storing large quantities of digital audio files along with user interfaces that facilitate loading, organizing, and playing those files. As a general rule, servers are thought of as self-contained units that not only store digital audio files, but also can deliver them for playback on demand.

Snake Oil

A term used by consumers to describe products that involve technological principles that are not well understood by the consumer. Examples of such technologies include EMI, decimation mathematics, image creation in the brain, bandwidth of the ear, phase effects, pre-ringing, and reference measurement parameters. Snake oil is a term of opprobrium which strongly implies that what is not understood is not valuable, rather than focusing value judgements on results achieved.

Streamers

By definition, streamers are network-attached devices that may offer Ethernet, Wi-Fi, and/or Bluetooth connectivity, or any combination of the above. The primary purpose of the streamer is to allow digital files from a music streaming service (e.g. Qobuz, Tidal, Spotify, Apple Music, etc.) to be located, selected, and converted from internet protocol format to a format readable by an audio device such as a digital-to-analogue converter (DAC). Streamers are usually connected to the internet via an RJ45 connector on an Ethernet cable connected to a switch or router that is part of your home network (hard wiring limits dropouts and allows hi-resolution signals, unlike Bluetooth). DACs usually accept USB, S/PDIF, AES/EBU, I2S, or optical inputs.

Streamers may or may not have storage of their own for local files (in which case they would properly be called ‘streamer/servers’). Streamers often have an input or inputs to accept external files, for example on a memory stick or a portable SSD. Streamers also have user interfaces to allow their owners to view, choose, and play audio content from the available network resources at hand. The compatibility of streamers with various user interface applications (e.g. Tidal Connect or Roon) and the provision of interfaces for switching between services (e.g. Bluesound’s BluOS allows choosing between 25 services and selection of multiple output devices) is a point of differentiation between streamers.

UPnP/DLNA

UPnP (Universal Plug and Play) and DLNA (Digital Living Network Alliance) are similar sets of interoperability guidelines, allowing digital media devices to work together with little or no need for complex ‘handshaking’ protocols. Devices that fall under one (or more usually, both) standards are designed to be compatible with one another as standard, and fall into three broad categories for audio systems: control point (which might be an app on a tablet), media renderer (the network-attached DAC or streamer), and media server (that might be a computer or NAS drive).

WMA

This acronym stands for ‘Windows Media Audio’, a family of audio data compression codecs developed by Microsoft that together are part of the Windows Media framework or ‘ecosystem’. There are four WMA codecs:

  • The original WMA codec is a lossy compression algorithm comparable to MP3.
  • The WMA PRO codec supports multichannel or surround sound files (with up to eight discrete channels) and supports ‘high resolution audio’ (at up to 24-bit/96kHz levels).
  • The WMA Lossless codec is a lossless compression algorithm.
  • The WMA Voice codec is a low bit-rate, lossy compression algorithm focused specifically on conversational voice content.

WAV (or WAVE)

This acronym stands for ‘Waveform Audio File Format’, which was developed by Microsoft and IBM, and which is an uncompressed and therefore lossless file format that typically uses LPCM encoding. In theory, WAV supports compressed audio as well, though this is rarely seen in actual practice.

WAV and AIFF files are compatible with Windows, Macintosh, and Linux operating systems.

In simple terms, WAV—much like AIFF—is all about preserving maximum sound quality while eliminating compression artefacts of any kind. Two drawbacks are that WAV files take up considerably more storage space than files encoded by lossless compression codecs and that WAV files do not lend themselves to storage of album/song-related metadata. Recognizing the sonic potential of WAV, many manufacturers of ripping and/or music server software have come up with workarounds to allow WAV files to be stored with associated metadata.
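Python’s standard-library wave module reads and writes WAV files directly, and its parameters map one-to-one onto the terms in this glossary (channels, sample width, sampling rate). A small sketch:

```python
import math
import struct
import wave

# Write 0.1 seconds of a 440Hz stereo tone as a 16-bit/44.1kHz WAV...
with wave.open("tone.wav", "wb") as w:
    w.setnchannels(2)       # stereo
    w.setsampwidth(2)       # 2 bytes = 16-bit samples
    w.setframerate(44_100)  # Red Book sampling rate
    w.writeframes(b"".join(
        struct.pack("<hh", s, s)  # identical left and right samples
        for s in (int(20_000 * math.sin(2 * math.pi * 440 * n / 44_100))
                  for n in range(4_410))
    ))

# ...then read the header back.
with wave.open("tone.wav", "rb") as r:
    params = (r.getnchannels(), r.getsampwidth() * 8,
              r.getframerate(), r.getnframes())
print(params)  # (2, 16, 44100, 4410)
```

Note that the module handles only the raw LPCM payload and header; song titles, artwork, and other metadata are exactly what plain WAV has no standard place for.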
