Workflow - how is sound sync maintained when moving from NLE to DAW?

Hi there; this is my first post here.

I thought I'd join because I've had no luck Googling this or reading about this anywhere.

I have dailies synced up with their respective pieces of external audio in Adobe Premiere; I did this manually with reference to the clapperboard.
Now my intention, once the edit reaches final cut, is to render the video alone and bring it into my DAW, so I can mix dialogue, SFX and music together without the limitations of Premiere's audio editing.

My question is: how do I maintain the synchronisation between the final cut of the video and the dialogue I've lined up in the video editing program? Do I have to make a list of timings for each clip of dialogue audio?

I don't want to render the dialogue with the video because I want complete control over the dialogue mix in my DAW.

Hopefully that's not too confusing.
Thanks in advance!
 
The question is understandable, but a little unexpected.

Simply put, if your video and audio are already lined up perfectly in the timeline and you render them separately, the two files you end up with will still line up: both will begin at the same instant they do now, and their middles will still match. Only their endings might differ, and then only if you move the end marker of the sequence itself.

The best thing to do is simply render out the video and the audio separately, as you planned, and then bring them both back into Premiere to see if they still line up. If they don't, something is likely off with the frame-rate settings or the video codec, so make sure the render settings match your sequence's frame rate.
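
If you want a quick sanity check outside of Premiere, you can compare the durations of the two renders with a small script. This is just a sketch: it assumes you have FFmpeg (which includes ffprobe) installed, and the file names are placeholders.

import subprocess

def duration(path):
    # Ask ffprobe for the container duration in seconds.
    out = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         path],
        capture_output=True, text=True, check=True)
    return float(out.stdout.strip())

video = duration("final_video.mov")  # placeholder names
audio = duration("final_mix.wav")
print(f"video {video:.3f}s, audio {audio:.3f}s, "
      f"difference {abs(video - audio) * 1000:.1f} ms")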

And in case this still sounds odd: each of the empty spaces between your sound clips in the timeline is rendered as silence in the single exported track. That's why the sound clips won't lose their sync when rendered. The render does not move the clips around; it simply fills the empty space with black if it's video, or silence if it's sound.
 
I'm not sure exactly what your question is, but you might want to check out this thread started by APE:

http://www.indietalk.com/showthread.php?t=41292

It talks about sync pops, among other things, if that's what you're really asking about.

This thread helped me tremendously in understanding how to work with a DAW, which I wasn't really considering when I started my audio post. I'm more or less done with my audio post now, and just figured out how to print the audio; I'm printing as I write.
 
It's pretty old school, but I like a 2-pop (or a traditional 10-second count-off) on the head and a 2-pop on the tail.

https://www.youtube.com/watch?v=VxOfiXe-CfM
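
If you don't have a 2-pop handy, it's easy to generate one yourself. Here's a minimal sketch using only the Python standard library; the 48 kHz sample rate, 24 fps frame rate and file name are assumptions, so match them to your project.

import math, struct, wave

SR = 48000             # sample rate (Hz)
FPS = 24               # project frame rate -- adjust to your sequence
POP_LEN = SR // FPS    # the pop is exactly one frame of 1 kHz tone

# One frame of 1 kHz sine at roughly -6 dBFS...
samples = [int(0.5 * 32767 * math.sin(2 * math.pi * 1000 * n / SR))
           for n in range(POP_LEN)]
# ...padded with silence so the file is exactly 2 seconds long:
# line the head of this file up 2 seconds before first frame of action.
samples += [0] * (2 * SR - POP_LEN)

with wave.open("two_pop.wav", "wb") as w:
    w.setnchannels(1)  # mono
    w.setsampwidth(2)  # 16-bit PCM
    w.setframerate(SR)
    w.writeframes(b"".join(struct.pack("<h", s) for s in samples))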

Burning TC into the picture is also extremely helpful.

https://www.youtube.com/watch?v=QE0JsYEiFK8
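
Premiere can burn timecode in with its Timecode effect; if you'd rather do it after the fact, FFmpeg's drawtext filter is one option. A sketch, assuming FFmpeg is installed, with placeholder file names, start timecode and frame rate:

import subprocess

# Burn a running timecode window into the lower part of the frame.
subprocess.run([
    "ffmpeg", "-i", "final_video.mov",
    "-vf", "drawtext=timecode='01\\:00\\:00\\:00':rate=24:"
           "fontsize=36:fontcolor=white:box=1:boxcolor=black@0.5:"
           "x=(w-tw)/2:y=h-2*lh",
    "-c:a", "copy",
    "final_video_tc.mov",
], check=True)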

Render the visuals WITH sound, which can be muted in the DAW; it's one more thing to help you maintain sync.

As long as you use the proper NLE export and the correct DAW import conversion settings for OMF or AAF throughout the process, you should have no problems.

If you are using a native Pro Tools setup, using a .dv stream will allow things to run a little more smoothly.
 
My question is: how do I maintain the synchronisation between the final cut of the video and the dialogue I've lined up in the video editing program?

The standard workflow to achieve what you're after is to create an AAF (or OMF) in your NLE and then import that AAF into your DAW. AAF is a container format which will (depending on settings) contain all the audio tracks and clips in your NLE's timeline. When you import that AAF into your DAW all those tracks will be created and all the audio clips those tracks contain will be located on the DAW's timeline as they were in your NLE. In effect, AAF allows you to recreate your NLE session in a DAW.

FilmmakerJ's suggestion, of rendering each of the audio tracks in your NLE to an audio file and then importing each of those files to a separate track in your DAW, will work. However, contrary to FilmmakerJ's assertion, it is the least favoured method, for a number of reasons. Firstly, all the audio is rendered: all the edit boundaries are lost, and all fades, cross-fades and any other editing/processing carried out in the NLE are baked in, and therefore unidentifiable and/or impossible to undo. Also, as each track in your DAW would contain one audio file the length of the session, it's very difficult to see what is happening, where it's happening and on which track. Lastly, there are various other situations/scenarios which can occur during audio post which either cannot be fixed, or would be massively time-consuming to fix, if all you have is rendered audio files.

Having said all this, on a couple of occasions I have had to work with a pic editor who didn't know how to export a functional AAF, and after playing around for a few days and not getting anywhere, we had to give up and just render the audio. Rendering the audio should always be an absolute last resort, though.

Do I have to make a list of timings for each clip of dialogue audio?

No need, your NLE should be able to create such a list automatically for you. It's called an EDL (Edit Decision List) and it would be worth your while researching EDLs a bit. When exporting an AAF you don't strictly need an EDL because all your edits are preserved anyway. However, it's still well worth you spending the few seconds it takes to create an EDL when you export your AAF. There are a few potential scenarios where an EDL can be invaluable and save days of work.
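
To give a sense of what an EDL actually records: each event line names a source clip and says which part of it lands where on the timeline. The event below is a made-up CMX3600-style example, pulled apart with a few lines of Python.

# Event number, source reel, track (AA = audio channels 1+2),
# transition (C = cut), source in/out, then record (timeline) in/out.
event = "001  TAPE01  AA  C  01:00:10:00 01:00:15:00 00:00:00:00 00:00:05:00"

num, reel, track, cut, src_in, src_out, rec_in, rec_out = event.split()
print(f"event {num}: reel {reel} {src_in}-{src_out} "
      f"sits on the timeline at {rec_in}-{rec_out}")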

I don't want to render the dialogue with the video because I want complete control over the dialogue mix in my DAW.

Generally, the temp mix created by the pic editor is included in the video file, and then the individual tracks/clips are included in the AAF. The editor's temp mix never finds its way into the final mix, but it is an invaluable reference.

As Alcove mentioned/implied, be careful with the video codec you use. Avoid the codecs designed for online distribution (H.264, for example, typically delivered in an MP4 container) and stick to the "intermediate" codecs designed for the task. I prefer Apple ProRes or Avid DNxHD.
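
If you ever need to make the conversion yourself rather than re-exporting from the NLE, FFmpeg can produce both. A minimal sketch for ProRes, assuming FFmpeg is installed; the file names are placeholders, and prores_ks profile 3 is the 422 HQ flavour.

import subprocess

subprocess.run([
    "ffmpeg", "-i", "editor_export.mov",
    "-c:v", "prores_ks", "-profile:v", "3",  # ProRes 422 HQ
    "-c:a", "pcm_s16le",                     # uncompressed PCM audio
    "for_audio_post.mov",
], check=True)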

G
 
It's pretty old school, but I like a 2-pop (or a traditional 10-second count-off) on the head and a 2-pop on the tail.

Burning TC into the picture is also extremely helpful.
I was doing this anyway for show, but I can see how these would both be helpful now. Thanks!

The standard workflow to achieve what you're after is to create an AAF (or OMF) in your NLE and then import that AAF into your DAW. AAF is a container format which will (depending on settings) contain all the audio tracks and clips in your NLE's timeline. When you import that AAF into your DAW all those tracks will be created and all the audio clips those tracks contain will be located on the DAW's timeline as they were in your NLE. In effect, AAF allows you to recreate your NLE session in a DAW.

No need, your NLE should be able to create such a list automatically for you. It's called an EDL (Edit Decision List) and it would be worth your while researching EDLs a bit. When exporting an AAF you don't strictly need an EDL because all your edits are preserved anyway. However, it's still well worth you spending the few seconds it takes to create an EDL when you export your AAF. There are a few potential scenarios where an EDL can be invaluable and save days of work.

Exactly what I was looking for - surprised that wasn't easier to work out myself. Thanks very much.
 
As Alcove mentioned/implied, be careful with the video codec you use. Avoid the codecs designed for online distribution (H.264, for example, typically delivered in an MP4 container) and stick to the "intermediate" codecs designed for the task. I prefer Apple ProRes or Avid DNxHD.

G

I don't use Apple ProRes, but I have used Avid DNxHD, and the file size is actually quite large for not much quality. But what I also discovered is that if you output H.264 in a .mov file, Pro Tools has no problem reading it and delivers high picture quality with a much smaller file size.

And yes, you probably should include your "temp mix," as APE called it (I don't know the terms), with your video file. That's what I read, and that's what I did. Although, I have to say, now that I think about it, I muted it at the very beginning of the audio post session, as instructed, and never really referenced it. But I can see why it's a good procedural habit. Can't hurt, and could definitely help.
 
I don't use Apple ProRes, but I have used Avid DNxHD, and the file size is actually quite large for not much quality. But what I also discovered is that if you output H.264 in a .mov file, Pro Tools has no problem reading it and delivers high picture quality with a much smaller file size.

It's always a trade-off. Long GOP codecs (like H.264) achieve high quality at a small file size by encoding "groups of pictures" (frames) referenced to a key frame; the default for H.264 is (I believe) one key frame every 80 frames. The computer playing back a long GOP codec looks at the key frame and then reconstructs the next 79 frames, which effectively contain only the data for the pixels that have changed relative to the key frame. This significantly reduces file size, as only one in 80 frames contains a full frame's worth of data.

The main disadvantage is that a relatively large amount of processing power is required to decode a long GOP codec, since the other 79 frames don't really exist as such but have to be computed. That creates a further problem when working on a long GOP codec: you are constantly stopping and starting, and the chances are small that your playhead stops on a key frame. If it instead stops on an inter frame (a P- or B-frame), ProTools is largely computing a guess at exactly which frame it is. In other words, a long GOP codec eats CPU power that would be better reserved for ProTools' audio processing, sync accuracy is far less stable/reliable, and there are often problems with certain functions (scrubbing is a good example).

None of these issues is much of a problem for the average viewer, who tends to watch a video all the way through rather than starting, stopping and jumping around, and who is usually not running other processor-heavy applications at the same time.

Long GOP codecs are therefore good for distribution but bad for working on. Depending on various factors, ProTools can play nice with H.264 video files; other times it has obvious, serious problems; and other times it has problems (such as sync issues) which may not initially be obvious at all. For this reason, it is advisable to use a codec designed for the task, which avoids all the potential long GOP issues by simply not being a long GOP codec! ProRes and DNxHD are codecs specifically designed for working on. They use far less processing power, because each frame is stored rather than having to be computed, and accurate sync is likewise far more reliable. The price is a much larger file size for equivalent visual quality, but storage is relatively cheap and the trade-off is easily worth it.

BTW, in some/many commercial projects the loss of visual quality from using one of these "intermediate" codecs is irrelevant, because they are only being used as a proxy to speed up the pic and sound editing processes. If you're hooked on using H.264 for editing, then you can, but my advice would be to change the default H.264 settings so that every frame is a key frame (rather than 1 in 80). That solves the potential issues mentioned above, but will of course also increase file size and/or decrease visual quality.
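
In FFmpeg terms, that "every frame is a key frame" idea looks something like the sketch below: -g 1 sets the GOP size to 1, making the stream all-intra. File names and the CRF quality value are placeholders.

import subprocess

subprocess.run([
    "ffmpeg", "-i", "editor_export.mov",
    "-c:v", "libx264", "-g", "1",  # GOP size 1 = every frame a key frame
    "-crf", "18",                  # visually near-transparent quality
    "-c:a", "copy",
    "all_intra.mov",
], check=True)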

And yes, you probably should include your "temp mix," as APE called it (I don't know the terms), with your video file. That's what I read, and that's what I did. Although, I have to say, now that I think about it, I muted it at the very beginning of the audio post session, as instructed, and never really referenced it. But I can see why it's a good procedural habit. Can't hurt, and could definitely help.

The two main reasons to have the pic editor's mix are: 1. It can provide useful clues about the intention of the pic editor and director. This is obviously more relevant when the audio post person is not the same person as the pic editor and director and 2. It provides a sync reference, even though it's commonly not a particularly accurate one.

G
 
Pretty good info. I have to say, I don't know much about how codecs work.

The two main reasons to have the pic editor's mix are: 1. It can provide useful clues about the intention of the pic editor and director. This is obviously more relevant when the audio post person is not the same person as the pic editor and director

G

Very good point indeed.
 