WBL: The importance of the listening environment for mixing

The right listening environment is the most important tool an engineer has in their arsenal. To deliver a flawless mix, the engineer must be able to precisely identify flaws, and that requires a suitable environment. If the listening environment is the canvas and the engineer the painter, it would be almost impossible to create a piece if the canvas were in the dark. We don’t perceive these two problems as comparable because, as consumers, we are frequently shown examples of pristine image quality and therefore know when something is sub-par. This is not the case for music consumption: we are almost never provided with a perfect listening environment and so cannot reference one. This comes down to the differing physics of sound and light and the way each interacts with our environment. Either way, clear monitoring is essential to the creative process.

When talking about monitoring environments, we are often told that the aim is a flat room. This isn’t strictly true, but it is definitely in the right ballpark. When treating a room, the aim is to provide the truest possible representation of sound, and as almost no space is naturally flat in frequency response, it seems a little counterintuitive to mix within one. A good listening environment should provide a clean frequency response with minimal colouration. As long as each section of the frequency spectrum is audibly clear, we are heading in the right direction.

Most professional-grade studios adhere to what is known as the “analogue curve”. This phenomenon describes a gradual roll-off of frequencies from the upper midrange onwards, and it is helpful for a few reasons. The curve initially arose as a compensation method for the loss of high-frequency content on magnetic tape, allowing engineers to record a brighter sound to the tape in the first place. In recent years it has been adapted to help reduce high-frequency ear fatigue, keeping mixes from becoming too bright. When ear fatigue sets in, we lose our perception of high frequencies, analogous to a high-frequency roll-off pad, meaning that we over-compensate and boost the high frequencies more than is desirable.

This having been said, many digital recording rooms are able to bypass this curve because there is no loss of audio in the digital domain. Bob Hodas states: “However, the fatigue factor is still at work, and if any roll-off is applied, it is usually based on the engineers listening levels. I believe that one of the reasons many of the early digital recordings were harsh is that they were made in rooms originally set up to do analog work. Old habits die hard in the audio trade.”


Achieving a good listening environment requires a combination of techniques. Firstly, one should address room layout. Everything within a room, from the walls to the empty coke can on the desk, affects the way it sounds, so to treat a room every detail should be addressed. In a budget-less world, the room could be floated to reduce audio bleed and the monitors suspended on a multi-thousand-pound hanging system, but in the world that I live in these just aren’t possible. To address the issue of the listening environment, I have implemented the following techniques. When building the room, I offset each pair of parallel walls by 3 degrees to avoid standing waves. I used triple-layered dB plasterboard and double-thickness Rockwool, with a 20mm air gap, to achieve the highest absorption coefficient possible for the space and budget. I then measured the room’s frequency response and RT60 values. After studying these results I began building diffusers. Unfortunately, because of the size of the space, there is not enough room in my mixing room for bass traps – I decided that having a seating space for clients was more pertinent, as a lot of the work I currently do is producer-based and requires long sessions with artists to develop. Having distinguished where the sound first meets the wall, I implemented diffusers to break up this sound and avoid reflection and medium transference.
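RT60 measurements like the ones above can be sanity-checked against Sabine’s classic reverberation formula, RT60 = 0.161 × V / A, where V is the room volume in cubic metres and A is the total absorption (surface area × absorption coefficient, summed over all surfaces). Here is a minimal Python sketch – the room volume and coefficients in the example are made-up illustrative figures, not my actual measurements:

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine's estimate of reverberation time: RT60 = 0.161 * V / A,
    where A is the total absorption (sum of area * coefficient)."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

# Hypothetical small mix room: 30 m^3, with 62 m^2 of surface at an
# average absorption coefficient of 0.3 (illustrative numbers only).
estimate = sabine_rt60(30.0, [(62.0, 0.3)])  # roughly 0.26 seconds
```

The useful intuition is that adding absorptive surface area (Rockwool, heavy curtains, soft seating) grows A and so shortens the RT60, which is exactly why plush cinema seats and thick wall build-ups tame a room.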

To improve my listening space I could look into methods such as Sonarworks Reference 4, which uses a computer application to measure the room and tailors the output of the system so that it behaves flat within the room. I could also experiment with different types of treatment to see which provides the best results.


WBL: Isolating near-field monitors according to budget

Near-field monitor speakers are used in the process of mixing and producing music to provide a clear, accurate representation of the sound. This is important so that the engineer is able to determine whether or not a change needs to be made. In essence, a flat response means we can hear a true representation of a sound. A room is made flat through a number of acoustic solutions. The sound of a room is heavily affected by its dimensions and material makeup, and to treat this we use items such as diffusers, bass traps and absorbers. When we refer to isolating speakers, this means isolating them from sound-conductive materials. Your room may sound amazing, but if the speakers are resonating with the desk, the sound you hear will be muddy and inaccurate. When a speaker resonates with another object, it creates a new sound source – one that will not be present in other listening environments and is therefore an inaccurate representation of the sound, not to mention that it will sound nasty.

For more in-depth information on overall room treatment visit: http://www.bobhodas.com/optimizing-the-studio-listening-environment.php

When treating speakers specifically, the idea is to interrupt the conduction of sound waves by placing a non-conductive material between the sound source and the desk/stand. These materials can be either really expensive or dirt cheap, and across this range we can of course expect differing degrees of effectiveness.

Starting at the high end of the spectrum, we have products such as the IsoAcoustics GAIA puck, which retails at around £80 per item – and with at least six being needed, this is definitely not a budget solution. These work by introducing a suspension-based air gap set between two ceramic plates (ceramic is really good at absorbing sound), diffusing the sound before it has a chance to achieve medium transference.

Getting a little cheaper, we have isolating stands such as the IsoAcoustics ISO-L8R200, which retail at around £120 a pair and work by adding a decoupling stand between the speaker and the desk.

Getting cheaper and less effective, we have the old insulation foam. Foam is only an okay absorber of sound, and the problem here is single-medium transference. With the speaker resting directly on top of the foam, the sound has no chance to disperse, and all of the heavy lifting is expected of the acoustic foam, which it just isn’t capable of. Using foam as room treatment is more effective because the sound is not concentrated when it meets the new medium (it has been allowed to diffuse in the air), but when speakers rest directly on top of foam this advantage is lost and the sound is much more prevalent and hard to deal with.

Getting absurdly cheap, we have the old half-a-tennis-ball trick. Whilst this is preferable to no treatment at all, it has a minimal effect. That having been said, when the room sound is measured with software such as Room EQ Wizard there are significant differences in the frequency response, especially in the bottom end. I believe this is because the tennis balls at least eradicate direct medium transference.

Next we have layered carpet tiles, which have a similar effect to the tennis balls. Definitely better than nothing at all, but not great!


Through research I have discovered there are many DIY solutions to this problem, and I would be interested to see how effective these actually are. Perhaps a project for the future.



Shepard Tones: Dunkirk

The Shepard tone, named after Dr. Roger Shepard, a cognitive scientist who worked for Harvard University and Bell Laboratories, is an audio illusion that exploits the human brain’s inability to follow multiple pitch and dynamic changes at the same time. The illusion consists of three rising scales, each an octave apart, with the highest falling in volume, the lowest rising, and the middle staying the same. As these scales rise and their volumes are automated, we are unable to track the progress of each, and we perceive a single, constantly rising tone.
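The recipe above can be sketched in a few lines of numpy. This is the continuous (Shepard–Risset glissando) variant rather than the discrete scale, and the base frequency, duration and voice count are arbitrary choices for illustration – not the settings from any particular score:

```python
import numpy as np

def shepard_glissando(f_base=110.0, duration=8.0, sr=44100, n_octaves=3):
    """Endlessly rising Shepard-Risset glissando: several sine sweeps an
    octave apart, each faded by a bell-shaped loudness envelope so the top
    voice disappears as a new bottom voice fades in."""
    t = np.linspace(0, duration, int(sr * duration), endpoint=False)
    cycle = t / duration              # 0..1, one octave of rise per pass
    out = np.zeros_like(t)
    for k in range(n_octaves):
        pos = (cycle + k) % n_octaves          # octave position of voice k
        freq = f_base * 2.0 ** pos             # exponential pitch rise
        phase = 2 * np.pi * np.cumsum(freq) / sr   # phase = integral of freq
        # raised-cosine envelope: loudest mid-register, silent at extremes
        env = 0.5 - 0.5 * np.cos(2 * np.pi * pos / n_octaves)
        out += env * np.sin(phase)
    return out / n_octaves
```

Loop the returned buffer and the ear cannot find a seam: each voice hands its energy to the next before it ever reaches the top.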

This is effective in a musical score because it is an endless supply of tension. When we hear scales rising chromatically, the intervals between the notes are not harmonious – especially if the instrument has a long release, so that adjacent notes briefly overlap – and this, combined with the Shepard tone, helps force viewers to the edge of their seats.


Here is an example of Shepard tones at work in the film Dunkirk, composed by Hans Zimmer.



I used a synthesiser, with a noise oscillator mixed in, to create this effect within my Philips Carousel project. I introduced the noise oscillator to add more paranormal connotations. I played a chromatic scale starting at the root note of the project (E), spanning over an octave, and duplicated this over three tracks. I found that applying the Shepard tone module with velocity left me free to mix the overall dynamics of the tone. Alongside volume automation, the Shepard tone proved to be fairly effective.

Multimedia: Backdraft inspired sound sweetening

The movie Backdraft employs sound-sweetening techniques, layering animal noises over the top of roaring fire to instil a subconscious, primitive fear in the viewer. With roaring lions and shrieking monkeys, Backdraft implements a multitude of subtle and overt sounds to create excitement, and they are balanced in a way that doesn’t pull focus from the theme but actually intensifies it!


Using animalistic sounds in this context works because of our already-learned, subconscious connections. We already know that a loud lion roar is scary. It sounds scary! And that’s because we’ve already learned this. So when we hear the sound out of context, we don’t immediately connect it to the conscious memory of a lion but to the subconscious association with fear, because this cognitive path is shorter and is therefore received quicker. Think RAM.


To implement this I gathered a few animal sounds that I felt could work throughout the film and that correlated to certain themes. I then sprinkled these around and found that they worked well in a few places. When I found their resting places, I dragged them around a little until it felt right, and then started processing them. To make them stand up against the surrounding sounds, I parallel-compressed them so they would sit in the mix, then bussed them to a fairly sparse reverb. I then juxtaposed this against a shiny wavetable synth in Serum. The sounds that I used are a lot less aggressive, and this resulted in them sounding better in a more serene, less dark atmosphere.



Multimedia: ADR – Automated Dialogue Replacement

ADR is the process of re-recording dialogue over the visually recorded scene – a process that has become much simpler in recent years thanks to advances in technology, but which has been used in film for decades. In the years before digital recording technology, this process was referred to as looping, because the scene would be physically spliced from the tape and looped over and over for re-recording. Nowadays the process is much faster, and voice actors have unlimited opportunities to capture the perfect audio take.

ADR is usually implemented when there is a fault in the original audio take. These flaws can be anything from unwanted ambient noise (such as a plane flying overhead during the recording of a period piece) to unwanted noise from physical effects, such as a blowing fan, where the visual effect is wanted but the noise is not. Sometimes ADR is simply the result of a creative change, such as the director deciding to change the voice of a certain character because it better suits the themes of the film.

A good example of ADR is the speeder chase in Star Wars Episode II. In the recording of this scene, a large fan was used to emulate movement in the characters’ hair as the stationary speeder appeared to fly through space, but this of course creates problems, because the dialogue recorded on set is awash with the noise of a massive fan. By re-recording the audio, the creators were able to capture clean, coherent speech alongside immersive visual effects to create a credible, believable, immersive experience.

Multimedia: Mixing Film – Considering Variables of Application and Consumption

There are many differences between mixing for film and mixing a song. These differences mainly arise in the desired application of the sound. For example, when we mix a song we usually want to create a false but acceptable stereo image – the idea of a space. This space can be anything we envisage that atmospherically corresponds with the recorded sounds, and we achieve it by recording with ambient microphone techniques and applying effects such as EQ, compression and reverb in post-production. Our approach must differ slightly when mixing for film, because the audience has a visual reference for the space and can therefore make subconscious judgements as to whether or not the audio sounds right within it. The engineer’s decision-making should start by finding the ideal dynamic signature for each sound and progress to the ideal application of post-production effects, both for the reality of the space and for the desired effect of the sound.

To provide an overzealous example: if an instrument in a song were doused in a cathedral convolution reverb, as long as it worked musically, dynamically and sonically alongside the other audio elements, that would be completely fine. However, if a section of dialogue were doused in the same reverb while the visual reference for the space was a small concrete room, we would perceive this as unnatural – whether consciously or not, we would know it’s not normal. This is all down to reference, the main tool of our sensory perception. I’m not saying that this would never be used, but it would feel unnatural and would therefore not be used to create a realistic nuance. Mixing to visuals effectively creates another reference to glue to – another medium to consider – and our decisions should be made accordingly. This is because of our brain’s natural tendency to reference patterns and correlations: it constantly compares what we see and hear so it knows when something is amiss.

This type of effect can, however, be used to connote an unnatural atmosphere. A great example of this processing is Star Wars: The Last Jedi, in the telepathic scenes between Rey and Kylo Ren, the juxtaposed character archetypes of the film. Within the storyline these characters have a telepathic bond, a fact which isn’t overtly stated until a fair way into the film. However, the audience is given this information by the audio: the two voices are placed within the same, ominous-sounding space, telling our brain immediately that these two voices are communicating and exist within the same non-physical plane. In my opinion, this is a genius use of audio to foreshadow and develop a storyline.

Another contributing factor to the differences in mixing for film is the way it is consumed. Dynamic range in film mixing is very much dependent on the audience. For example, a large blockbuster film intended for cinema consumption will be mixed so that the transient sounds are jarring and hit with huge impact at around 100 dB; the dialogue will be compressed and mixed to sit at around 64 dB relative to the audience, loud enough to carry over the ambient sound of the room (popcorn rattling, mumbled conversation and such); and the diegetic ambience of the film will sit just below. Music and score tend to sit somewhere between the loudest transients and the dialogue, but should be placed subjectively depending on the desired effect. This is not necessarily so that it is perceived as natural, but because it provides the most immersive listening platform for the audience.

Once the audience has been considered, an engineer must also consider the listening environment. Using the cinema as the most dramatic example, one must consider the X-curve of a space. The SMPTE standard describes a gradual high-frequency roll-off in the average cinema, attenuating the listener’s perception of frequencies between 2 kHz and 20 kHz because of the size of the space as well as its treatment. Cinemas are designed to provide the most consistent listening platform possible. For example, the chairs are soft and plush not only for consumer comfort but because this makes the room sound more uniform regardless of its current capacity.
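As a rough illustration, the roll-off can be sketched as a simple target curve: flat up to 2 kHz, then attenuating by about 3 dB per octave above. This is a simplified approximation for intuition only, not the exact tolerance band published in the SMPTE specification:

```python
import math

def x_curve_target_db(freq_hz):
    """Simplified X-curve target response: 0 dB up to 2 kHz, then a
    roll-off of roughly -3 dB per octave above (an approximation of the
    cinema B-chain curve, not the full SMPTE tolerance spec)."""
    if freq_hz <= 2000:
        return 0.0
    return -3.0 * math.log2(freq_hz / 2000)
```

So a mix element sitting at 8 kHz is heard roughly 6 dB quieter in the cinema than it measures flat in the studio – which is exactly why an engineer needs to know which curve their room is aligned to.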

To achieve this, it is important to know the system you are listening on and to have solid reference points for your audio level. An industry-standard method is to run pink noise through the mixing system and calibrate each speaker to read a uniform level from a single listening point (85 dB is recommended). This is done with an external SPL meter.
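Pink noise is used for this because its energy falls by 3 dB per octave, giving equal energy per octave band – a good match for how we hear. A minimal numpy sketch of one common way to generate it, by shaping the spectrum of white noise (the 1/sqrt(f) scaling and normalisation are illustrative choices, not a calibration standard):

```python
import numpy as np

def pink_noise(n_samples, seed=0):
    """Pink (1/f) noise via spectral shaping: white noise whose FFT
    magnitudes are scaled by 1/sqrt(f), giving roughly -3 dB per octave."""
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    freqs[0] = freqs[1]              # avoid divide-by-zero at DC
    spectrum /= np.sqrt(freqs)       # -3 dB/octave power slope
    pink = np.fft.irfft(spectrum, n=n_samples)
    return pink / np.max(np.abs(pink))   # normalise to full scale
```

In practice you would play a buffer like this through each speaker in turn and trim its amplifier gain until the SPL meter at the mix position reads the target level.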

In conclusion, to create an effective mix for film one must consider not only the space we are creating within the film and the effect we wish it to have, but also the physical space in which our theoretical space is to be perceived – and then strike a critical, informed balance between all of these variables.

Multimedia: Audio Standards in Film – The Rising Tide

The optimum output volume for a piece of media essentially comes down to the dynamic range of the content. For example, it would be stupid to whack the faders up on a dialogue-based piece with no extra-diegetic sound, because we would just be listening to REALLY loud talking. No – optimising a track for a certain volume refers to the relationship between the dynamic range of its sounds and the best average level at which to listen to the whole thing. Have you ever been sat watching a film where the main protagonists are having a really tense, quiet conversation, and then suddenly a mind-bending explosion startles the neighbours two doors down, leaving you frantically padding around for the remote before the next explosion strips you of your ability to hear? That is because the film is mixed to be LOUD, as a lot of modern films are. This is simply because volume equals impact.
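The gap being described here is essentially the crest factor of the soundtrack – the difference in dB between the peak level and the average (RMS) level. A quick Python sketch of how one might measure it over a buffer of samples (a rough proxy for dynamic range, not a broadcast loudness standard such as LUFS):

```python
import numpy as np

def crest_factor_db(signal):
    """Difference in dB between peak level and RMS (average) level -- a
    rough indicator of how 'dynamic' a piece of audio is."""
    peak = np.max(np.abs(signal))
    rms = np.sqrt(np.mean(np.square(signal)))
    return 20 * np.log10(peak / rms)

# A pure sine wave has a crest factor of ~3 dB; a heavily compressed
# action-movie soundtrack would measure far lower than raw dialogue.
t = np.linspace(0, 1, 48000, endpoint=False)
sine_crest = crest_factor_db(np.sin(2 * np.pi * 440 * t))  # ~3.01 dB
```

A quiet-dialogue-then-explosion mix has a huge crest factor, which is precisely what has you reaching for the remote.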


This is definitely an area of ambiguity and a matter of contention in the modern industry. With the constant improvement of playback technology and the increasing availability of home cinema systems, movie sound can get louder than ever before without the risk of losing sonic integrity. The substantial rise of home hi-fi systems and the “bigger is better” mentality of loud storytellers place us in a unique situation, where loudness can be embedded into a film with much less contemplation of whether the audience will ever truly be able to receive it. In years past, the standard volume for audio playback in a cinema setting lay at around 85 dB. It is hard to place a single figure on this because, as is the case with any art, the beast defines the terms: each case will have its own variables to consider, and each mix should differ depending on what the film is trying to achieve. In more recent times, however, mixes are being optimised for louder and louder playback, sometimes reaching an eardrum-blasting 108–110 dB. This inevitably comes down to pressure applied from external sources. An experienced musician/engineer will know the detrimental, self-defeating effects of making something louder, but the director will of course want to create the most effective piece of work possible, and as we all know, LOUD seems to be BETTER.


I suppose that now, with technology reaching its pinnacle, it could be said that it is no longer a case of what sounds best, but of how a film can have the most impact without being unsafe. But is this really the avenue to explore if true quality is what is desired? Probably not, right? Loud mixes can create a whole multitude of problems. Think of the last action movie you saw: how much dynamic compensation were you applying with your TV remote between the dialogue sections and the all-out action? Probably a fair amount, because these films cannot be listened to quietly – the level you’d set to hear the speech is way too high for the action, and vice versa. This is the sound engineer’s attempt to impact you, overdriven by the director’s pressure for volume. That having been said, sometimes there is absolutely nothing wrong with a really loud mix. I mean, you wouldn’t want to watch an all-out action movie and not have it rattle your sofa – but how far is too far?

I suppose it depends on the beast.


Boyes states: “We need to have some kind of even approach to dealing with clients to let them know that we agree that movies have gotten too loud. I think a real education process has to happen, and, if it does, it will be a real defining moment for film audio as we start the new century.”