Production Sound (Gain Staging)

When I first joined this board one of the mods suggested creating an audio FAQ or glossary. There are so many facets to audio production and post production that I shied away from the idea but said I would, when I had the time, create posts with useful info. I posted one a couple of days ago about stereo in the post-production forum and here is another. In other posts I (and other audio pros) have had occasion to mention gain staging or proper gain staging. It occurred to me that most people here would have only a vague idea of what gain staging is and as it is at the heart of the production sound mixer's job, I thought it would be a good idea to explain it.

To understand gain staging we first have to understand SNR. Using dialogue recording as the example: the Signal to Noise Ratio (SNR) would be defined as the difference between the peak level of the loudest piece of dialogue and the peak level of the noise floor (background noise on set). We obviously always want to maximise the SNR because we want the noise floor to be as far away from the quietest dialogue level as possible. If they're too close together, the quiet dialogue will sound noisy and if the dialogue is at (or below) the noise floor it will be unintelligible or inaudible.
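As a rough illustration of that definition, here is a minimal Python sketch. The function name and the example levels are made up for illustration; the only idea taken from the text above is that SNR is the difference between two levels measured on the same dB scale.

```python
def snr_db(signal_peak_db: float, noise_floor_db: float) -> float:
    """SNR as the difference between two levels on the same dB scale."""
    return signal_peak_db - noise_floor_db

# Hypothetical on-set levels, all measured on the same scale (e.g. dB SPL)
loud_dialogue = 75.0
quiet_dialogue = 55.0
room_noise = 40.0

print(snr_db(loud_dialogue, room_noise))   # the full SNR "window"
print(snr_db(quiet_dialogue, room_noise))  # margin left for the quietest line
```

With these made-up numbers the quietest line sits only 15 dB above the room noise, which is the situation where quiet dialogue starts to sound noisy.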

Gain staging is effectively the act of fitting a number of Signal to Noise Ratios (SNRs) inside one another.

SNR 1: The initial/first SNR we have to deal with is the one already described above, on set. At this point, the maximum range or "window" our SNR covers is fixed; it cannot be increased until we get to post production (with tools such as expanders, noise reduction software or EQ). However, our SNR window can and will be decreased! The goal is to decrease it as little as possible using correct gain staging.

SNR 2: The mic we use to record with will have its own internal SNR which we may need to consider (especially with cheaper mics) but more important is where we place it. The closer to the sound source (in this case the actor's mouth) we can get the mic, the more signal we will record, thereby minimising the reduction of the SNR window already defined by SNR 1. Obviously we can't usually get very close to the actor's mouth, so shotgun mics with tight pickup patterns are usually invaluable, but then you need to be that much more precise with where you position and point the mic, which is why a good boom operator is so valuable. We now have a new SNR, with a new, smaller window which, as with SNR 1, can only be decreased further until post. How big this new SNR window is will depend on the skill of the boom op, the situation he/she is faced with and to an extent the quality of the mic and boom.

SNR 3: The signal output from a mic is tiny and would be near or even below the internal noise floor of our recording device (SNR 4), so we need to amplify it quite considerably with a Mic Pre-Amp. The essence of using a mic-pre is therefore amplifying our signal to a level suitable for use downstream (recording), while adding as little noise as possible. It's fully understanding this statement which trips up so many inexperienced and even quite a few experienced production sound mixers. In other words, what constitutes a suitable level for recording (answered in SNR 4) and what noise is added by a mic-pre? There are essentially two types of noise added by all mic-pres: 1. Its own internal noise floor and 2. Overload distortion. The production sound mixer's job is to capture as much of the SNR 2 window as possible by finding the optimum point between the mic-pre's two types of noise. BTW, we are still in the analogue domain so this overload distortion doesn't suddenly happen but starts inaudibly and increases proportionately as we increase the mic-pre's gain. It will take practice, testing and experience to discover this optimum point for the individual make/model of mic-pre. Obviously, the more expensive mic-pres will offer a bigger window of opportunity, by having a lower noise floor and achieving higher output levels before overload distortion. The nominal output (line) level for mic-pres should be +4dBu, which any mic-pre must be able to output without audible distortion. Top of the line mic-pres can go as high as +18dBu without distortion becoming noticeable.

These dBu figures become important when we get to the recording stage (SNR 4). The Analogue to Digital Converter (ADC) takes the signal from our mic-pre/mixer and as the name suggests, converts it to digital data for storage. How this analogue input (mic-pre/mixer) level corresponds to the digital level depends on how the ADC is calibrated. For film (worldwide) and TV (in many countries) +4dBu would equal -20dBFS (European TV: +4dBu = -18dBFS). What this means is that even with the very best mic-pres money can buy, we are going to start adding distortion at about -6dBFS (+18dBu). And, considerably lower than this for not so high end mic-pres.

In other words, in pretty much all cases a signal peak of -6dBFS is on the limit or more likely some way outside of our optimal SNR window for our mic-pre! Providing we are calibrated to film standards, the optimum level for our mic-pre is going to be around -20dBFS with peaks at around -12dBFS. But what about the optimum level for our recording device?

SNR 4: In the days of tape recorders, the SNR was little more than 70-80dB. It was standard practice to record "into the red" (the red line being set at 0VU = +4dBu) to get the signal as hot as possible and as far away from the noise floor of the tape machine as possible, because all mixing processes in audio post would add further noise and we would run out of SNR. Even recording as hot as possible wasn't enough though, and additional noise reduction was required when the final mix was printed to film (Dolby Noise Reduction). With 16bit digital, the noise floor was lowered, providing an SNR of over 90dB, and providing we still recorded near the red line we no longer needed the addition of Dolby NR.

24bit recording was a huge leap forward, so big a leap, it actually exceeds the limits of the laws of physics! In reality there is no such thing as a 24bit converter, although 24bit ADCs output 24bit files there is not 24bits of digital audio signal stored in those files. This is because even with a theoretically perfect circuit design (which is impossible), the noise of electrons colliding inside the resistors and capacitors is considerably louder than the noise floor of 24bit digital! The best ADCs money can buy use about 20bits and the limits of the laws of physics would be about 22bits.
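To put rough numbers on those bit depths, here is a small Python sketch using the common textbook approximation for the theoretical dynamic range of an ideal converter (about 6.02 dB per bit plus 1.76 dB for a full-scale sine wave). This formula is not from the post above; it is a standard idealisation, and real converters fall short of it for exactly the reasons just described.

```python
import math

def ideal_dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of an ideal N-bit converter, in dB.

    Textbook approximation: 20*log10(2**bits) + 1.76,
    i.e. roughly 6.02 dB per bit plus 1.76 dB.
    """
    return 20 * math.log10(2 ** bits) + 1.76

# 16 and 24 are file formats; 20 and 22 are the "real world" figures
# quoted in the post for the best converters and the physical limit.
for bits in (16, 20, 22, 24):
    print(f"{bits} bit: about {ideal_dynamic_range_db(bits):.0f} dB")
```

Even the "about 20 bits" of a top converter works out at roughly 120 dB, far more than the 70-80dB of a tape machine.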

So, all those people out there advocating recording as hot as you can are over a decade out of date. Those days are over! Recording as hot as you can (the SNR window) should now be defined by the optimal performance of the mic-pres, because the SNR window of the mic-pre's output is going to be several hundreds of times smaller than that of the recording medium (24bit). Even if your recording peaks no higher than -20dBFS, in 24bit the SNR window defined by your mic-pre is not going to be affected.

One last point, something we have to be careful of is that most "stand alone" ADCs (and DACs) are designed for music use and are usually calibrated to +4dBu = -18dBFS, -16dBFS or even -14dBFS not the film/TV standard of -20dBFS and will therefore need recalibrating!
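The calibration arithmetic used above can be sketched in a few lines of Python. The function name and parameters are mine, purely for illustration; the calibration points (+4dBu = -20, -18, -16 or -14dBFS) are the ones mentioned in this post.

```python
def dbu_to_dbfs(level_dbu: float, ref_dbfs: float = -20.0, ref_dbu: float = 4.0) -> float:
    """Convert an analogue level in dBu to dBFS for an ADC calibrated
    so that ref_dbu (default +4 dBu) reads ref_dbfs (default -20 dBFS)."""
    return level_dbu - ref_dbu + ref_dbfs

# Film calibration: nominal level and the +18 dBu limit of a top mic-pre
print(dbu_to_dbfs(4.0))    # -20.0
print(dbu_to_dbfs(18.0))   # -6.0

# The same +18 dBu signal on converters calibrated for music use
for ref in (-18.0, -16.0, -14.0):
    print(ref, dbu_to_dbfs(18.0, ref_dbfs=ref))
```

On a converter calibrated to +4dBu = -14dBFS, that same +18dBu signal lands right on 0dBFS, which is why an uncalibrated music ADC leaves no margin at all.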

Hope this was useful?

G
 
Good stuff, A.P.E.!

Probably an American vs European thing, but here Signal to Noise Ratio is usually shortened to S/N-R.

S/N-R #2 is the one that I harp on almost exclusively; getting the mic in close and aimed properly. If the dialog isn't captured cleanly the rest is pretty much moot. That's why I am always suggesting that someone who knows what they're doing swing the boom.
 
Hi Alcove,

Yes, I have seen it written as S/N-R and to be honest that is probably the more accurate way of abbreviating it. In this age of internet abbreviations, SNR seems to be more common though.

My point in writing the OP was I thought it would be useful to have a page to link to, as we seem to mention gain staging quite often. Although I didn't really mention gain staging on the output side, which is obviously very important when we get to post. I'm not sure I wanted to get too far into gain staging DAC output levels, speaker and amp impedances and wattages, consumer versus professional line levels, etc. Maybe another time!

I completely agree with you on SNR #2. There's quite a lot we can do about SNR #1 but relatively small (even tiny) changes in mic position can make a big difference with SNR #2 and with the audio quality beyond just the SNR. In other words, the skill/experience of the boom op is usually the single most deciding factor of the quality of production sound recording. I have to say though, I'm also frequently faced with overload distortion from the mic-pre and overload distortion from hitting the digital limit (0dBFS). There's really little excuse for the latter with 24bit. 99 times out of 100 it's because of ignorance of what 24bit means, its massive SNR window and therefore the headroom it offers, without loss of quality.

G
 
Hi Alcove,
... I have to say though, I'm also frequently faced with overload distortion from the mic-pre and overload distortion from hitting the digital limit (0dBFS). There's really little excuse for the latter with 24bit. 99 times out of 100 it's because of ignorance of what 24bit means, its massive SNR window and therefore the headroom it offers, without loss of quality.

G

24 bit sampling resolution captures extremely fine-grained detail of the analog signal voltage, so that the digitized signal very closely matches the original. 24 bit resolution also reduces the digitizing artifacts (noise) in the recorded sound. However, a 24 bit recorder will not offer any so-called headroom above 0 dB, which was there in the magnetic tape recorder. Once your signal exceeds 0 dB, it gets 100% clipped. That makes recording loud sound transients very challenging on a digital recorder.
For example, if you need to record normal dialog followed by a scream on location, even a 24 bit recorder may not capture it cleanly. You need to use a quality limiter to handle such loud transients. Almost all professional recorders have a limiter; some are better than others.
 
However, a 24 bit recorder will not offer any so-called headroom above 0 dB, which was there in the magnetic tape recorder. Once your signal exceeds 0 dB, it gets 100% clipped. That makes recording loud sound transients very challenging on a digital recorder.

This statement represents a common misunderstanding of how digital audio works. 24bit offers potentially about 1,000 times more dynamic range than a magnetic tape recorder. The mistake made by those who don't understand digital audio is to equate 0dBFS with 0 (VU) on a magnetic tape recorder or analogue mixing desk. The "0" point equivalent on a digital system in fact equates to -20dBFS (in TV and film) and therefore digital systems provide nearly 20dB of headroom. So in fact, contrary to your statement, recording loud transients on a good quality 24bit digital recorder is far less challenging than it used to be on an analogue recorder! This also means that in many recording situations the use of a limiter is not required and is why the better quality recorders provide the ability to turn off the limiter.
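To make the headroom point concrete, here is a small Python sketch of the dialogue-then-scream case. The function names and the figure of 18dB for how much louder the scream is are assumptions for illustration only; the -20dBFS reference is the film/TV alignment described above.

```python
def peak_dbfs(reference_dbfs: float, peak_above_reference_db: float) -> float:
    """Where a transient lands in dBFS if normal dialogue sits at reference_dbfs."""
    return reference_dbfs + peak_above_reference_db

def clips(level_dbfs: float) -> bool:
    """Anything above 0 dBFS cannot be represented and is clipped."""
    return level_dbfs > 0.0

SCREAM_ABOVE_DIALOGUE_DB = 18.0  # hypothetical figure

# Dialogue aligned to the film/TV reference of -20 dBFS
aligned = peak_dbfs(-20.0, SCREAM_ABOVE_DIALOGUE_DB)
print(aligned, clips(aligned))   # -2.0 False

# Dialogue recorded "as hot as possible", averaging -6 dBFS
hot = peak_dbfs(-6.0, SCREAM_ABOVE_DIALOGUE_DB)
print(hot, clips(hot))           # 12.0 True
```

Under these assumptions the scream only clips when the dialogue was recorded too hot in the first place.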

The mistake made by many is turning up the preamps too high, in an attempt to record a signal hotter than is necessary. This problem is exacerbated by the fact that many prosumer quality recorders have relatively noisy preamps, which makes for a very limited range of optimal recording levels (as explained in SNR 3 of the OP).

G
 
Well, the limiter is an extra part of the signal chain so there is reasoning behind disabling it.
I personally tend to leave it on 'just in case' but try my hardest to never actually engage it. Then again the limiters on my recorder are very high quality and I trust them to limit effectively and add negligible artefacts to my signal chain. The cheaper the gear the less trusting I would be and more likely to possibly disengage, especially if I couldn't trust them not to still destroy the sound when they do engage.

On the subject of recording to -20dB: the number of 'editors' who have complained about 'problems with sound' and, when challenged, state 'it's too quiet, I needed to put 24dB of gain on it'! Not bad, I thought, considering I was also booming and couldn't keep tweaking levels. One guy even complained 'it's all hissy' while passing me consumer headphones plugged into his MacBook at full volume. That time I had my recorder with me and still had the files on disk, so I jacked it full whack and put the headphones on his head (I wasn't cruel enough to turn the headphone limiters off). I then asked him again if it was too quiet or too hissy. After that I explained the difference between toys and tools and went off to find the producer to ensure he was actually getting a soundy in to do the sound editing and mixing.
 
So I do get confused here... on the practical side.

I recorded some digital audio, it sounds nice and clean, but just low in volume. When I try and work with it in Premiere, I can't seem to boost it enough to get a nice level in the mix. It's not getting noisy, just not loud enough.
 
I recorded some digital audio, it sounds nice and clean, but just low in volume. When I try and work with it in Premiere, I can't seem to boost it enough to get a nice level in the mix. It's not getting noisy, just not loud enough.

What you have described is an exceedingly common problem but not directly related to the OP. As well as proper gain staging for recording, there is the whole other area of proper gain staging in audio post and in many respects, proper gain staging in audio post is more difficult to get right than with recording production sound. In my experience of working with professional picture/video editors for many years, without exception they do not have the knowledge or the equipment/facilities to get anywhere near the correct gain staging for audio post. With more than 90% probability I would say that your problem is not that your recording is too low in volume but that your playback system (Digital to Analogue Converter (DAC), amps and speakers) is set too low and therefore all your Sound FX, music, etc., are too loud, which in turn makes your recorded digital audio sound too quiet in comparison.

If your playback system is set too low, what should it be set to? The only way of knowing if your recorded audio really is too low in volume (or possibly even too loud) is to play it back in a properly calibrated mix environment and when we ask the question "what is the proper calibration (the playback system output level)?", this is where it starts to get complicated. Output system calibration is really deserving of a thread all of its own, although to be honest it's been covered to a large extent in various other threads here on indietalk. I won't get into it here, except to say that the correct calibration for audio post varies, depending largely on the distribution of the final product; cinema (film festivals, theatrical distribution), TV or internet for example. Just to add to the confusion, the "proper calibration" of the playback system for Film/TV is very different to that used in the music industry, because the music business doesn't have a defined reference (calibration) level. If you take an average commercial music track and play it back on a system calibrated for TV or film audio post, it would either blow your speakers or blow your head off! This fact catches out virtually all inexperienced film/TV music composers, the vast majority of picture/video editors and even most inexperienced professional Sound Designers/Re-recording mixers! When dealing with music from inexperienced composers or music libraries, I usually have to reduce its level by somewhere around 18dB to 30dB. The same is true of pretty much all sourced Sound FX (commercial or free).
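For anyone wondering what an 18dB to 30dB reduction means in terms of the actual sample values, here is a minimal Python sketch of the standard dB-to-amplitude conversion. The function name is made up; the relationship (gain = 10^(dB/20)) is the usual one for amplitude.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)

# The range of reductions mentioned above for library music and sourced FX
for cut_db in (-18.0, -30.0):
    print(f"{cut_db} dB -> multiply samples by about {db_to_gain(cut_db):.3f}")
```

So an 18dB cut scales the waveform to roughly an eighth of its original amplitude, and a 30dB cut to roughly a thirtieth.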

It is extremely unlikely that you would need to normalise.

G
 
So, what are some pres that can go as high as +18dBu without distortion becoming noticeable?

There are various mic-pres designed for music studio use which are capable of this, although they aren't so suitable for film use because of their colouration. Some of the Grace Designs mic-pres could go up to +18dBu without distortion but for production sound use I would expect the Sound Devices mixers/pres to get closest to this figure.

G
 
No, I fear normalize, I shall swallow my fear!

I must have missed this post previously. No, you are right to fear Normalizing! While the normalize process itself is perfectly safe and will not damage your audio, any subsequent processing, such as EQ or compression, is likely to. By all means increase the gain but I would strongly advise against normalization.

G
 
Out of curiosity, when recording with a battery powered mic, does using the battery power feature or the phantom power on the recording device make a difference, or will it sound the same with either setting?
 
Mics that use an internal battery to supply power to the phantom power circuitry of the mic, like the NTG-2 and AT897, tend to have lower volume output levels. This does not change the sound (the NTG-1 and AT875 sound the same as their battery powered siblings) but the lower levels require more gain from the pre-amps, adding more of the self-noise of the mixer or recorder. With budget recorders like the H4n or DR-100 this amount of self-noise (usually hiss) can be very substantial.
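A rough Python sketch of why a lower mic output ends up noisier. Every number here is hypothetical (the mic output levels, the preamp's equivalent input noise, and the +4dBu target), and it is a simplification that ignores the mic's own self-noise and assumes the preamp's input noise does not change with gain.

```python
def required_gain_db(mic_output_dbu: float, target_output_dbu: float = 4.0) -> float:
    """Preamp gain needed to bring the mic signal up to the target level."""
    return target_output_dbu - mic_output_dbu

def output_noise_dbu(ein_dbu: float, gain_db: float) -> float:
    """Preamp noise at the output: equivalent input noise raised by the gain."""
    return ein_dbu + gain_db

EIN_DBU = -120.0   # hypothetical equivalent input noise of a budget preamp
TARGET_DBU = 4.0   # nominal line level

# Hypothetical mic output levels for the same dialogue
for label, mic_out in (("phantom powered", -40.0), ("battery powered", -50.0)):
    gain = required_gain_db(mic_out, TARGET_DBU)
    noise = output_noise_dbu(EIN_DBU, gain)
    print(f"{label}: gain {gain} dB, preamp noise at {noise} dBu, "
          f"SNR {TARGET_DBU - noise} dB")
```

In this simplified model, every dB of mic output lost is a dB of SNR lost at the preamp, because the extra gain lifts the preamp's noise by the same amount.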
 
I must have missed this post previously. No, you are right to fear Normalizing! While the normalize process itself is perfectly safe and will not damage your audio, any subsequent processing, such as EQ or compression, is likely to. By all means increase the gain but I would strongly advise against normalization.

G
OK this is intriguing. Would this be because with the peaks set to zero any processing that adds gain will push those peaks into clipping?
I do often normalise as part of my workflow but as a final process. Anyways I always find normalising at the start of workflow useless because slate claps and handling noise in the unedited audio set the peaks and normalisation normally has no effect.
 
OK this is intriguing. Would this be because with the peaks set to zero any processing that adds gain will push those peaks into clipping? I do often normalise as part of my workflow but as a final process.

Essentially yes, although it's not just any process that adds gain which could cause clipping! A low pass filter for example would also usually push a normalised signal above 0dBFS.
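Here is a small Python sketch of how filtering alone can raise the peak of a normalised signal. It is a deliberately idealised case, not a model of any particular EQ: the test signal is a fundamental plus a third harmonic (a crude square-ish wave), and the "low pass filter" is simply assumed to remove the third harmonic completely while leaving the fundamental untouched. Real filters also shift phase, which can raise peaks in its own right.

```python
import math

N = 4800  # samples in one cycle of the fundamental (arbitrary choice)

def peak(samples):
    """Largest absolute sample value (1.0 = 0 dBFS)."""
    return max(abs(s) for s in samples)

def to_dbfs(amplitude):
    return 20 * math.log10(amplitude)

# Test signal: fundamental plus a third harmonic at one third the amplitude
fundamental = [math.sin(2 * math.pi * n / N) for n in range(N)]
third = [math.sin(2 * math.pi * 3 * n / N) / 3 for n in range(N)]
original = [f + t for f, t in zip(fundamental, third)]

# Peak-normalise so the loudest sample sits at 0 dBFS
scale = 1.0 / peak(original)
normalised = [s * scale for s in original]

# Idealised low pass: the third harmonic is gone, the fundamental is unchanged
filtered = [f * scale for f in fundamental]

print(f"normalised peak: {peak(normalised):.4f}")
print(f"filtered peak:   {peak(filtered):.4f} "
      f"({to_dbfs(peak(filtered)):+.2f} dBFS)")
```

The filtered signal peaks about half a dB over full scale even though no gain was added: the harmonic that was removed had been partly cancelling the fundamental at its crest.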

BTW, you should never normalise your final output, for two reasons: 1. It will fail QC for any commercial broadcaster or distributor and 2. The reason why no distributors or broadcasters allow peak values up to 0dBFS, is that there is always some downstream process after the sound mix has been completed and those processes are also likely to cause clipping. This is just as true if distributing on youtube as it is distributing to a network broadcaster.

G
 
Oh yes, a valid point. If I do normalise it would be the individual stems so I don't run out of fader movement when mixing low volume recordings.
If I am being particularly lazy I have sometimes normalised the final mix if I find I don't have much time to play with the mix level (should point out here that we are talking personal stuff headed for SoundCloud or CD, not paid work; I don't professionally mix, though when I did it was to -9 to DAT). Though I have normalised a few projects that would play out on radio but as they playout through desks with a fair bit of headroom I think the rules are different but probably still somewhat naughty. Some industry standard software such as BURLI and I think Selector auto-normalise all material though that could easily be to -6dB or similar safer levels. I should really look into this as I am increasingly producing a lot of content for radio.
Thanks for the heads up.
 
Though I have normalised a few projects that would play out on radio but as they playout through desks with a fair bit of headroom I think the rules are different but probably still somewhat naughty. Some industry standard software such as BURLI and I think Selector auto-normalise all material though that could easily be to -6dB or similar safer levels. I should really look into this as I am increasingly producing a lot of content for radio.

Having worked in audio production in radio before moving to film, I can tell you that you have both the on-air automation system truncating silences at the start and end as well as normalising the audio, but you then also have the station compressors that bring into line all audio.

When producing station promos and imaging, we used to normalise, and then compress the hell out of everything, as well as the final mix. Make it sound as loud as possible ;)
 