The WSRS would like to thank Nagra and John Owens for permission to reprint this article, a timely reminder that we need to attend to more aspects of the recording system than simply the narrow specmanship of bit depth and sampling rates.
In the days of analogue tape, recordings depended not only on the quality of the recorder, pre-amplifiers and microphones but also on the limitations of the recording tape. Artifacts such as wow and flutter, tape "hiss" and even print-through, along with limitations of distortion, equalization and bias, had tremendous effects on the quality of recordings. So did the alignment of the tape transports, as tapes were often recorded on one machine and played back on another.
Today, thanks to digital technologies, all of these media and
transport related problems have been removed, and the quality relies
more heavily on other parameters, which are often either misunderstood
or simply ignored.
For example, there is a common perception today that a "better" recording will result if a recorder capable of making 24-bit 96 kHz recordings is used instead of one operating at 16-bit 44.1 kHz. Without trying to define "a better recording", it is interesting to look at the critical factors behind such a presumption.
The word length (16, 18, 20 or 24 bits) can relate to two specific areas of the recording chain: firstly, the word length used by the A/D converter when digitizing the analogue signal, and secondly, the word length that is actually recorded on the media. The latter needs to be at least equal to that of the A/D converter in the record chain; however, it is perfectly reasonable to record a 24-bit word derived from a 20-bit or even 16-bit A/D converter. In such a case the least significant bits (LSBs) will simply be recorded as zeros. They serve purely to make the bit stream compatible with other equipment of the same format and have no bearing on the audio quality. It is therefore unsafe to assume that a 24-bit audio signal was created using a 24-bit A/D converter.
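As a minimal sketch of the padding described above, the Python below places a signed 16-bit sample in a 24-bit word and then counts how many low-order bits are zero in every sample of a block. The function names are illustrative assumptions, not part of any recorder's software, and a run of zero LSBs is only a hint about the converter, not proof.

```python
def pad_to_24_bit(sample_16: int) -> int:
    """Place a signed 16-bit sample in a signed 24-bit word.

    The 8 least significant bits of the result are always zero.
    """
    if not -0x8000 <= sample_16 <= 0x7FFF:
        raise ValueError("not a signed 16-bit sample")
    return sample_16 * 256  # same as shifting left by 8 bits


def unused_low_bits(samples, word_length: int = 24) -> int:
    """Count the low-order bits that are zero in every sample."""
    mask = (1 << word_length) - 1
    combined = 0
    for s in samples:
        combined |= s & mask  # two's complement view of each sample
    if combined == 0:
        return word_length  # silence: every bit is zero
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count


if __name__ == "__main__":
    block = [pad_to_24_bit(x) for x in (1, -1, 0x7FFF, -0x8000, 1234)]
    print(unused_low_bits(block))
```

A 24-bit file whose samples all show 8 unused low bits carries no more information than a 16-bit recording of the same signal.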
As for the choice of sampling frequency, there have been many studies arguing the virtues of higher sampling frequencies. Without becoming too sidetracked, it is safe to say that the more audio information is recorded at the outset, the better the chance of reconstituting the original sound later. It is perhaps simply better to decide which sampling frequency to use for each particular recording.
It should be remembered that sound is generated as an analogue signal and is perceived by the human ear as an analogue signal. Several critical factors, such as microphone choice, placement and recording environment, are far more important than the word length and sampling frequency, or even the recorder chosen. But, assuming these practical factors are well understood, the last remaining factor, as far as the recorder is concerned, is the audio chain itself in terms of level, frequency response and dynamic range. These factors have an equally important bearing on the recording and, if misunderstood or misused, can produce poor recordings even with the best recording equipment.
Level is probably one of the most difficult points to discuss, as there is no "golden rule" or formula for setting it, and yet the setting will have a bearing on the quality of the recorded sound. In the old analogue days a signal peaking comfortably at +3 or +6 dB above maximum level was quite normal, and the gentle, progressive distortion this introduced gave warmth and depth to the recording. Digital chains do not allow this approach, so level setting is far more critical than in the past. A digital signal cannot get "louder" than 0 dB or, in digital terms, hexadecimal "7FFF" for a 16-bit sample, and once this point is reached the distortion produced is total. So in principle, one tries to record as close to this point as possible without going over it, to ensure maximum use of the available dynamic range of the digital system. In reality though, as the generally accepted tone reference is -18 dB, the peaks will be around -10 dB at best. This implies that the average or mean audio level will probably be some 6 to 8 dB below this, i.e. around the -16 to -18 dB point.
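The relationship between sample values and level can be sketched in a few lines of Python. This is an illustration only: it assumes the usual convention of 20 times the base-10 logarithm of the amplitude ratio, takes 7FFF as full scale as in the text, and uses hypothetical function names.

```python
import math

FULL_SCALE_16 = 0x7FFF  # largest positive 16-bit sample value (32767)


def clip_16(value: int) -> int:
    """Hard-limit a value to the signed 16-bit range, as a digital chain does."""
    return max(-0x8000, min(FULL_SCALE_16, value))


def level_db(sample: int, full_scale: int = FULL_SCALE_16) -> float:
    """Level of a sample relative to digital full scale, in dB."""
    if sample == 0:
        return float("-inf")
    return 20.0 * math.log10(abs(sample) / full_scale)


def sample_for_level(db: float, full_scale: int = FULL_SCALE_16) -> int:
    """Sample value corresponding to a level in dB relative to full scale."""
    return round(full_scale * 10.0 ** (db / 20.0))


if __name__ == "__main__":
    print(clip_16(40000))                      # anything above full scale is clipped
    print(round(level_db(FULL_SCALE_16), 2))   # full scale sits at 0 dB
    print(sample_for_level(-18.0))             # sample value of a -18 dB reference
```

Unlike tape, there is no region above full scale: `clip_16` simply flattens the waveform, which is why the distortion is described as total.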
Now, as the useable dynamic range in digital terms is calculated at 6 dB per bit, a 16-bit system should have a useable dynamic range of 96 dB (although in reality this equates to about 90 dB), starting from the "0" point and counting back to -90 dB. If the recording level is peaking at around -10 dB, then the maximum available dynamic range is in fact only about 80 dB. In this scenario, with the best will in the world, you are only making a 13 to 14 bit recording.
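The arithmetic above can be checked with a short Python sketch. It assumes the 6 dB per bit rule of thumb quoted in the text; the function names are illustrative only.

```python
DB_PER_BIT = 6.0  # rule of thumb used in the text


def theoretical_range_db(bits: int) -> float:
    """Theoretical dynamic range for a given word length."""
    return bits * DB_PER_BIT


def effective_bits(usable_range_db: float, peak_db: float) -> float:
    """Bits actually exercised when peaks sit at peak_db (0 or negative).

    The headroom left unused above the peaks is subtracted from the
    usable range before converting back to bits.
    """
    return (usable_range_db + peak_db) / DB_PER_BIT


if __name__ == "__main__":
    print(theoretical_range_db(16))                 # 96 dB on paper
    print(round(effective_bits(90.0, -10.0), 1))    # about 13.3 bits in practice
    print(round(effective_bits(96.0, -10.0), 1))    # about 14.3 bits at best
```

With 90 dB of real range and peaks at -10 dB, 80 dB remains, which is a little over 13 bits; starting from the theoretical 96 dB gives a little over 14.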
If we now look at the microphone pre-amplifier stages, a dynamic range of 120 dB can be considered the best currently available in any audio recorder. Even using this to its absolute maximum, your digital level is only going to brush the 20-bit mark (120 dB at 6 dB per bit), so one could argue that the advantage of using 24 bits is, in audio terms, somewhat academic.
This being said, if one looks at a microphone pre-amplifier stage and sees that its dynamic range is limited to, say, 85 dB, there seems little point in worrying whether the recording is 16, 18, 20 or 24 bit: it will make absolutely no difference to the "quality" of the recording.
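A crude way to picture this argument is to treat the chain as limited by its weakest stage. The Python sketch below assumes that simple "minimum of the stages" model and the 6 dB per bit rule of thumb; in a real chain the noise of each stage combines rather than one stage simply winning, and the function names are illustrative.

```python
def chain_dynamic_range_db(preamp_db: float, word_length_bits: int,
                           db_per_bit: float = 6.0) -> float:
    """Dynamic range of the chain, taken as that of its weakest stage."""
    return min(preamp_db, word_length_bits * db_per_bit)


def equivalent_bits(range_db: float, db_per_bit: float = 6.0) -> float:
    """Word length needed to carry a given dynamic range."""
    return range_db / db_per_bit


if __name__ == "__main__":
    # The best pre-amplifier mentioned in the text, feeding a 24-bit recorder.
    best = chain_dynamic_range_db(120, 24)
    print(best, equivalent_bits(best))

    # An 85 dB pre-amplifier: the word length makes no difference.
    for bits in (16, 18, 20, 24):
        print(bits, chain_dynamic_range_db(85, bits))
```

Under these assumptions the 85 dB pre-amplifier yields 85 dB at every word length listed, and even the 120 dB stage needs only about 20 bits.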
A frequent comment is "Using external 24/96 converters makes it sound so much better". In fact, this is purely because such external converters are relatively expensive pieces of equipment and contain high quality analogue stages. Hence the subjective improvement has little to do with the number of bits or the sampling frequency.
The sampling frequency and frequency response go hand in hand. According to the Nyquist theorem, a sampling frequency of 44.1 kHz is sufficient to record perfectly a bandwidth of up to 22.05 kHz, yet using higher sampling frequencies does appear to reconstitute the original sound more accurately.
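The Nyquist limit itself is simple arithmetic, sketched below in Python. The second function shows where a tone above that limit would land if it reached the converter unfiltered; this is offered only as an illustration of why the limit matters, not as an explanation of the subjective improvement mentioned above, and the function names are assumptions.

```python
def nyquist_hz(sampling_hz: float) -> float:
    """Highest frequency that can be represented at a given sampling rate."""
    return sampling_hz / 2


def alias_hz(tone_hz: float, sampling_hz: float) -> float:
    """Frequency at which an unfiltered tone appears after sampling."""
    folded = tone_hz % sampling_hz
    return min(folded, sampling_hz - folded)


if __name__ == "__main__":
    for fs in (44100, 48000, 96000):
        print(fs, nyquist_hz(fs))

    # A 30 kHz tone lies above the 22.05 kHz limit of a 44.1 kHz system...
    print(alias_hz(30000, 44100))
    # ...but inside the 48 kHz limit of a 96 kHz system.
    print(alias_hz(30000, 96000))
```

In practice the converter's anti-aliasing filter removes content above the limit before sampling, so the folded tone is a worst case rather than normal behaviour.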
In conclusion, sound recording is a combination of art and science, and many years' experience enables the correct choice of equipment to be made for any particular recording. Complicated technical explanations often imply that better recordings are made using certain technologies. However, the quality and positioning of the microphones, and the analogue pre-amplifiers and their useable dynamic range, are far more important than the digital word length or sampling frequency. Some manufacturers will encourage the belief that these technical specifications are of critical importance when, in fact, much of the time they are completely irrelevant. This may simply be because it is relatively easy to design circuits using high bit depth converters at high sampling frequencies, compared with designing analogue microphone pre-amplifiers with high gain, wide dynamic range and low distortion. The best way to judge a piece of equipment is to listen to the recording rather than to read the glossy brochures.