When I first joined this board one of the mods suggested creating an audio FAQ or glossary. There are so many facets to audio production and post production that I shied away from the idea but said I would, when I had the time, create posts with useful info. I posted one a couple of days ago about stereo in the post-production forum and here is another. In other posts I (and other audio pros) have had occasion to mention gain staging or proper gain staging. It occurred to me that most people here would have only a vague idea of what gain staging is and as it is at the heart of the production sound mixer's job, I thought it would be a good idea to explain it.
To understand gain staging we first have to understand SNR. Using dialogue recording to explain: the Signal to Noise Ratio (SNR) would be defined as the difference between the peak level of the loudest piece of dialogue and the peak level of the noise floor (background noise on set). We obviously always want to maximise the SNR because we want the noise floor to be as far away from the quietest dialogue level as possible. If they're too close together, the quiet dialogue will sound noisy, and if the dialogue is at (or below) the noise floor it will be unintelligible or inaudible.
Gain staging is effectively the act of fitting a number of Signal to Noise Ratios (SNRs) inside one another.
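As a rough illustration of the definition above: because levels are expressed in dB, the SNR is just a subtraction. The levels below are made-up numbers for a hypothetical set, not measurements, and the function name is only for this sketch.

```python
def snr_db(signal_level_db: float, noise_floor_db: float) -> float:
    """SNR in dB: the signal level minus the noise-floor level."""
    return signal_level_db - noise_floor_db


# Hypothetical on-set levels in dB SPL (assumed values, for illustration only).
loud_dialogue = 75.0
quiet_dialogue = 55.0
set_noise_floor = 40.0

# The "window" as defined above: loudest dialogue peak to noise floor.
print(snr_db(loud_dialogue, set_noise_floor))   # 35.0

# The quiet lines sit much closer to the noise, which is why they sound noisy.
print(snr_db(quiet_dialogue, set_noise_floor))  # 15.0
```

If the quiet dialogue were at 40 dB SPL here, its SNR would be 0 dB and it would be sitting in the noise floor.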
SNR 1: The initial/first SNR we have to deal with is the one already described above, on set. At this point, the maximum range or "window" our SNR covers is fixed; it cannot be increased until we get to post production (with tools such as expanders, noise reduction software or EQ). However, our SNR window can and will be decreased! The goal is to decrease it as little as possible using correct gain staging.
SNR 2: The mic we record with will have its own internal SNR, which we may need to consider (especially with cheaper mics), but more important is where we place it. The closer to the sound source (in this case the actor's mouth) we can get the mic, the more signal we will record, thereby minimising the reduction of the SNR window already defined by SNR 1. Obviously we can't usually get very close to the actor's mouth, so shotgun mics with tight pickup patterns are usually invaluable, but then you need to be that much more precise with where you position and point the mic, which is why a good boom operator is so important. We now have a new SNR, with a new, smaller window which, as with SNR 1, can only be decreased further until post. How big this new SNR window is will depend on the skill of the boom op, the situation he/she is faced with and, to an extent, the quality of the mic and boom.
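To put a rough number on "the closer the mic, the more signal": a minimal sketch using the inverse-square law, which assumes a point source in a free field (no reflections). Real sets and directional mics will deviate from this, so treat it as a rule of thumb rather than a prediction. The function name is just for this example.

```python
import math


def level_change_db(old_distance_m: float, new_distance_m: float) -> float:
    """Change in direct-sound level when the mic moves, assuming free-field
    inverse-square behaviour (about 6 dB per halving of distance)."""
    return 20.0 * math.log10(old_distance_m / new_distance_m)


# Halving the distance from 1 m to 0.5 m gains roughly 6 dB of dialogue.
print(round(level_change_db(1.0, 0.5), 2))   # 6.02

# Doubling the distance from 1 m to 2 m loses roughly 6 dB.
print(round(level_change_db(1.0, 2.0), 2))   # -6.02
```

If the background noise on set is diffuse and stays roughly constant at the mic, every dB of dialogue gained by getting closer is a dB of SNR window kept.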
SNR 3: The signal output from a mic is tiny and would be near or even below the internal noise floor of our recording device (SNR 4), so we need to amplify it quite considerably with a Mic Pre-Amp. The essence of using a mic-pre is therefore amplifying our signal to a level suitable for use downstream (recording), while adding as little noise as possible. It's fully understanding this statement which trips up so many inexperienced and even quite a few experienced production sound mixers. It raises two questions: what constitutes a suitable level for recording (answered in SNR 4), and what noise is added by a mic-pre? There are essentially two types of noise added by all mic-pres: 1. Its own internal noise floor and 2. Overload distortion. The production sound mixer's job is to capture as much of the SNR 2 window as possible by finding the optimum point between the mic-pre's two types of noise. BTW, we are still in the analogue domain, so this overload distortion doesn't suddenly happen but starts inaudibly and increases proportionately as we increase the mic-pre's gain. It will take practice, testing and experience to discover this optimum point for the individual make/model of mic-pre. Obviously, the more expensive mic-pres will offer a bigger window of opportunity, by having a lower noise floor and achieving higher output levels before overload distortion. The nominal output (line) level for mic-pres should be +4dBu, which any mic-pre must be able to output without audible distortion. Top of the line mic-pres can go as high as +18dBu without distortion becoming noticeable.
These dBu figures become important when we get to the recording stage (SNR 4). The Analogue to Digital Converter (ADC) takes the signal from our mic-pre/mixer and, as the name suggests, converts it to digital data for storage. How this analogue input (mic-pre/mixer) level corresponds to the digital level depends on how the ADC is calibrated. For film (worldwide) and TV (in many countries) +4dBu would equal -20dBFS (European TV: +4dBu = -18dBFS). What this means is that even with the very best mic-pres money can buy, we are going to start adding distortion at about -6dBFS (+18dBu), and considerably lower than this for not so high end mic-pres.
In other words, in pretty much all cases a signal peak of -6dBFS is on the limit or, more likely, some way outside of our optimal SNR window for our mic-pre! Providing we are calibrated to film standards, the optimum level for our mic-pre is going to be around -20dBFS with peaks at around -12dBFS, but what about the optimum level for our recording device?
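The arithmetic behind those figures is a fixed offset set by the calibration point. A small sketch, with the film calibration (+4dBu = -20dBFS) as the default; the function and parameter names are made up for this example.

```python
def dbu_to_dbfs(level_dbu: float,
                ref_dbu: float = 4.0,
                ref_dbfs: float = -20.0) -> float:
    """Convert an analogue level in dBu to the digital level in dBFS,
    given the ADC calibration point (ref_dbu reads as ref_dbfs)."""
    return level_dbu - ref_dbu + ref_dbfs


# Film calibration: nominal line level lands at -20 dBFS.
print(dbu_to_dbfs(4.0))                    # -20.0

# +18 dBu, where even a top mic-pre starts to distort, lands at -6 dBFS.
print(dbu_to_dbfs(18.0))                   # -6.0

# The same +4 dBu with the -18 dBFS calibration quoted for European TV.
print(dbu_to_dbfs(4.0, ref_dbfs=-18.0))    # -18.0
```

So on a film-calibrated recorder, a meter reading of -6dBFS already means the mic-pre is putting out +18dBu.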
SNR 4: In the days of tape recorders, the SNR was little more than 70-80dB, so it was standard practice to record "into the red" (the red line being set at 0VU = +4dBu) to get the signal as hot as possible and as far away from the noise floor of the tape machine as possible, because all mixing processes in audio post would add further noise and we would run out of SNR. Even recording as hot as possible wasn't enough though, and additional noise reduction was required when the final mix was printed to film (Dolby Noise Reduction). With 16bit digital, the noise floor was lowered, providing an SNR of over 90dB, and providing we still recorded near the red line we no longer needed the addition of Dolby NR.
24bit recording was a huge leap forward, so big a leap that it actually exceeds the limits of the laws of physics! In reality there is no such thing as a 24bit converter: although 24bit ADCs output 24bit files, there are not 24 bits of digital audio signal stored in those files. This is because even with a theoretically perfect circuit design (which is impossible), the noise of electrons colliding inside the resistors and capacitors is considerably louder than the noise floor of 24bit digital! The best ADCs money can buy use about 20 bits and the limits of the laws of physics would be about 22 bits.
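For a sense of scale, here is a sketch using the standard textbook approximation for an ideal converter, roughly 6.02 dB per bit plus 1.76 dB (full-scale sine wave against quantisation noise only). Real converters fall short of these figures; the numbers are theoretical ceilings, not specs of any particular device.

```python
def ideal_dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an ideal N-bit converter:
    approximately 6.02 * N + 1.76 dB."""
    return 6.02 * bits + 1.76


# 16 bit, the ~20 bits of the best real ADCs, the ~22 bit physical limit, 24 bit.
for bits in (16, 20, 22, 24):
    print(bits, round(ideal_dynamic_range_db(bits), 2))
# 16 98.08
# 20 122.16
# 22 134.2
# 24 146.24
```

The 16bit figure is consistent with the "over 90dB" mentioned above, and the gap between 20 and 24 bits shows how much of a 24bit file is below what the analogue circuitry can actually deliver.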
So, all those people out there advocating recording as hot as you can are over a decade out of date; those days are over! How hot you record (the SNR window) should now be defined by the optimal performance of the mic-pres, because the SNR window of the mic-pre's output is going to be several hundreds of times smaller than that of the recording medium (24bit). Even if your recording peaks no higher than -20dBFS, in 24bit the SNR window defined by your mic-pre is not going to be affected.
One last point: something we have to be careful of is that most "stand alone" ADCs (and DACs) are designed for music use and are usually calibrated to +4dBu = -18dBFS, -16dBFS or even -14dBFS, not the film/TV standard of -20dBFS, and will therefore need recalibrating!
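One way to see why the converter stops mattering: uncorrelated noise sources add as powers, not as dB values, so the louder floor dominates. The sketch below assumes the noise sources are uncorrelated and uses invented noise-floor figures (a mic-pre floor at -90dBFS, a converter floor at -120dBFS), so only the shape of the result is meaningful.

```python
import math


def combined_noise_db(*floors_db: float) -> float:
    """Power-sum several uncorrelated noise floors given in dB."""
    total_power = sum(10 ** (floor / 10.0) for floor in floors_db)
    return 10.0 * math.log10(total_power)


mic_pre_floor = -90.0     # hypothetical mic-pre noise as seen at the ADC, dBFS
converter_floor = -120.0  # hypothetical noise floor of a roughly 20 bit ADC, dBFS

# The converter adds well under 0.01 dB to the mic-pre's noise.
print(round(combined_noise_db(mic_pre_floor, converter_floor), 2))  # -90.0

# For comparison, two equal floors raise the total by about 3 dB.
print(round(combined_noise_db(-90.0, -90.0), 2))                    # -86.99
```

With numbers like these, turning the mic-pre up into distortion to "use more bits" buys nothing, because the noise you hear is the mic-pre's, not the recorder's.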
Hope this was useful?
G