Last week, audio work started on a documentary by Sérgio Miguel Silva, and I feel grateful for the opportunity to be involved from the very beginning, as sound will play an essential role in telling stories and portraying voices that the visuals won’t be able to convey to the full extent (although they are incredibly powerful and jaw-dropping, I’m telling ya!).
The intention of this post is to show some preliminary general research and study that, although relevant to possibly every audio production for picture, is focused on this particular project. It shows some information I gathered, the kind that makes us think in advance about a theoretical concept and an appropriate approach to the audio design. The aim is to be fully prepared, so as to know what to do when it comes to production recording and post-production and, most of all, how to shape it (without disregarding new waves of ideas or the various constraints that may arise during production).
Months ago, I took some notes from the book Audio Design, by Tony Zaza, with the documentary we are now working on in mind.
So far, I have worked on only two documentaries, both in post-production. One, in a short, poor description, documents a personal achievement, which in audio is told mainly by the music; the other documents a city mark, through its people. The audio focus there is on the dialogues and the place’s backgrounds. Sérgio’s documentary will have voice-over not directly related to the picture content and no characters appearing, but it will convey a strong message. This raises a lot of questions to be answered.
To give you a better understanding, his documentary will portray spaces, or more specifically ruins. If there is a character, those spaces will be it. This immediately sets it apart from the most common documentaries, even those I have worked on, and goes against what Michel Chion calls Added Value by the Text, in which he argues that cinema is centred on the voice, its structure engaged by the text, because we are “vococentrists” as well.
Time, Space and Memory would be keywords.
So, to start, Tony Zaza considers that a basic analysis consists of:
1) listening for spaces
2) tonal arrangements
for emotive implication. 
I interpret this as the tone the space or room has (room tone) and the details we find in it (wood creaks, rattling windows, bugs, resonance with exterior sounds, an oppressive silence, a soothing silence, etc.). Sometimes we find an inherent melody or even harmonies that make that space unique. For example, almost every village is thought of as very quiet: we hear a lot of birds, some light wind, maybe crickets if the day is fading, some farm animals and the other common sounds we are used to. Very recently I heard some field recordings made in a place like that here in Portugal. But they had something more that made them very, very special: the wind blowing gently through the church bells made a beautiful and unique melody that filled the air.
How the sounds outside this place relate to it (my house shakes a bit when heavy trucks pass nearby, obviously causing a distinct sound) takes us to the next point referenced in Audio Design:
Ultimate decoding is made by
1) associations with off-screen space
2) some event(s) before and after the aural event.
Already here, I noted questions like these: What is the sound of emptiness? What is the sound of crowded / busy? What is the sound of cold? And the sound of heat? (We could continue.)
Time and Space and Narrative
Next big question: What will sound tell that the image is not telling?
In this case, sound will have a fundamental and indispensable role. Portraying a past that is no longer there is to be achieved with sound only. Ways to do it? I threw this question at Social Sound Design. We are mostly used to associating sounds from the past with reverbs and delays. Well, it works! Since reverbs and delays set some distance in space, could our perception relate them to other forms of distance as well? And what is the proposed aural distance from the spectator? Well, an image is essentially still. Sound can signify changes in time or space. As we know, the time domain prevails in sound. More: can we relate depth of field to a sense of time?
According to Zaza, information of the spatial content should give the audience the sense of
Given that each shot also implies a specific sense of space through size, composition and perspective, I could opt to respect the picture, exaggerate it or diminish it somehow. That would depend on the narrative and the emotive content to convey. The first is mostly determined by sound focus. The second is achieved by culturally conditioned responses, in most cases.
Happy documentary making / field recording / editing!