Video Game Audio: Diegesis Theory

In a previous article, we looked at the diegesis theory of interface design. The theory can also be applied to audio design. This is a look into the design of audio for games, and how diegesis theory can help us structure our thoughts.

So where do we start? Let’s start with an experiment.

  1. Mute your computer, start up your favourite game, and play it for a while.
  2. Now do exactly the same thing, but this time with the audio on.

You most likely found the game easier and more enjoyable to play with sound. So what are the functions of audio in games? Sound serves five main functions in a game:

  • Sound gives feedback to the player, often in addition to visual feedback. This is most obvious in the interface, where buttons make noises. In the game itself, audio feedback helps the player feel that his actions have an effect; an example is the sound played when a gun is reloaded.
  • Sound gives the player information about the game and its world; for example, the sound of approaching enemies warns the player to get ready. It also helps draw attention to important game events.
  • Sound forms part of the reward system of games. The sounds and flashes that go with completing a row in Tetris make the act much more pleasurable. It is a short-term reward that supports the long-term reward of not dying and beating a high score.
  • Sound helps create realism. It immerses the player deeper in the game world and encourages the suspension of disbelief. It helps put the player in the scene, making her part of the action. Explosions without audio are neither believable nor very dramatic.
  • Sound creates mood and pace, most often as background music. The cinematic soundtracks used in strategy games have nothing to do with realism, feedback, or information. Instead, the music glorifies the setting and makes the events feel epic, and so enhances the player’s feeling that he is doing something important. Similarly, the action-packed feeling of an arcade game is as much a result of the blingy music as it is of the actual gameplay.

This article looks at how we can use the concepts of diegesis, or narration, to help us design audio for a game. We will look at various games to understand how this approach helps us.
As with the diegesis theory of interface design, we are concerned with two main concepts: narrative and the fourth wall. I repeat the explanations of these two concepts from the interface article:

Narrative

Narrative is a message that conveys the particulars of an act or occurrence or course of events. In simple terms, it is the story the designer wishes to convey; be it the story of blocks falling from the sky which need to land in the right place (Tetris), or a journey through a strange land (Machinarium).

Not all elements of a game are part of the narrative. The game menus and the HUD, for example, are not, because the game’s characters are not aware of these elements. This does not mean these components do not support the narrative: a futuristic game typically has GUI elements that also appear futuristic.

The fourth wall

The fourth wall is the imaginary divide between the player and the world of the game. For players to immerse themselves in the game world, they need to move through the fourth wall. The ease with which the player moves between the real world and the game world depends on the way the interface designer delivers information to the player.

Posting your latest game accomplishments on Facebook is an example of how a game extends beyond the fourth wall. To further delve into this concept, one should read Steven Conway’s interesting discussion of the fourth wall in games: A Circular Wall? Reformulating the Fourth Wall for Video Games.

Sounds in games

We can now ask ourselves two questions about any sound:

  • Is the sound in the game story? (Is it part of the narrative?)
  • Is the sound in the game space? (Is it behind the fourth wall?)

Depending on the answers, we can classify the sound into one of four classes: diegetic, non-diegetic, spatial, or meta.

The diagram below summarises the four possible combinations.
In discussions of audio, meta and spatial representations are often lumped together with non-diegetic and diegetic sounds. We won’t go into much detail on meta and spatial representations here (they are not very common), but it is still useful to keep the distinction.
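
The two questions translate directly into a small lookup. The sketch below is one illustrative way to encode them in Python; the names `SoundClass` and `classify_sound` are hypothetical and not taken from any engine or audio library.

```python
from enum import Enum


class SoundClass(Enum):
    """The four classes a sound can fall into."""
    DIEGETIC = "diegetic"
    NON_DIEGETIC = "non-diegetic"
    META = "meta"
    SPATIAL = "spatial"


def classify_sound(in_story: bool, in_space: bool) -> SoundClass:
    """Classify a sound from the answers to the two questions.

    in_story: is the sound part of the narrative?
    in_space: is the sound behind the fourth wall (in the game space)?
    """
    if in_story and in_space:
        return SoundClass.DIEGETIC
    if in_story:
        # In the story, but not in the game space.
        return SoundClass.META
    if in_space:
        # In the game space, but not in the story.
        return SoundClass.SPATIAL
    return SoundClass.NON_DIEGETIC
```

Tagging each sound asset with these two answers while planning a game’s audio makes it easy to see at a glance which sounds live in the world and which address the player directly.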

Diegetic sound

For diegetic sound, we answer our two questions as follows:

  • Is the sound in the game story? YES
  • Is the sound in the game space? YES

When we talk about diegetic sound, we are talking about sound that is part of the world the player is in: any audio produced in your game world. This includes sound effects, ambient environmental sound, and game character dialogue.

In this trailer of Call of Duty: Black Ops, all the sound is diegetic. The absence of music helps build tension, and helps the player focus on the events as they unfold.

Although diegetic sound is used most often to increase immersion and realism (bigger guns make bigger noise), it can also support gameplay. In Dead Space, the audio (an alien “screaming” from behind) is used to draw the player’s attention to the wave of enemies that is behind him.

Non-diegetic sound

For non-diegetic sound, we answer our two questions as follows:

  • Is the sound in the game story? NO
  • Is the sound in the game space? NO

Non-diegetic sounds are audio cues provided from outside the world that the player is in. This includes background music (where is this magical omnipresent band playing in the world?) and interface sounds.
Music is incredibly useful for creating atmosphere and manipulating the player’s emotions. (Changing the music is the single most effective tool used for genre-bending spoofs, such as the recut trailers for Mary Poppins or The Shining.)

In Robot Unicorn Attack, the crazy backing track supplied by Erasure, together with the chirpy graphics, emphasises the ironically chosen title of the game.

In games that use adaptive music, the player’s emotions can be controlled even more tightly. How much more dramatic is a battle when the brass enters as it starts?

Interface sounds give the player feedback on his or her actions. They are vital to show that the interface is responding, and to draw the player’s attention to important events or information.

In Borderlands, the music effectively becomes part of the interface, changing based on your life level, presence of enemies, and their difficulty.
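
As a rough illustration of how adaptive music of this kind could be driven by game state, the sketch below picks which music layers to play from the player’s health and the enemies nearby. It is not how Borderlands implements its music; the `GameState` fields, the layer names, and the 25% health threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameState:
    """Hypothetical snapshot of the values the music reacts to."""
    health_fraction: float      # 0.0 (dead) to 1.0 (full health)
    enemies_nearby: int         # number of enemies close to the player
    toughest_enemy_level: int   # level of the strongest nearby enemy
    player_level: int


def select_music_layers(state: GameState) -> List[str]:
    """Return the names of the music layers that should be audible."""
    layers = ["ambient"]  # the base layer always plays
    if state.enemies_nearby > 0:
        layers.append("percussion")  # combat has started
        if state.toughest_enemy_level > state.player_level:
            layers.append("brass")   # the fight is a hard one
    if state.health_fraction < 0.25:  # assumed danger threshold
        layers.append("heartbeat")   # warn the player that health is low
    return layers
```

Layering tracks in this way lets the music carry the same information a health bar or enemy indicator would, without adding anything to the screen.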

Meta representations

For meta sound, we answer our two questions as follows:

  • Is the sound in the game story? YES
  • Is the sound in the game space? NO

Meta audio is audio which sits between the world of the protagonist and the player. The most obvious example is a game narrator, which is not part of the game world, but definitely adds to the story. In Bastion this is taken to the limit, where a running commentary is given on the player’s every move.

Spatial representations

For spatial sound, we answer our two questions as follows:

  • Is the sound in the game story? NO
  • Is the sound in the game space? YES

Spatial audio representations are sounds set in the game space that are not part of the story. They are not very common in games, mostly because we cannot easily map the source of a sound to a location in the game.

One example is in Anno 1404, where the music changes depending on the location the player is viewing: the music gets an Arabic flavour when the player views locations in the “Orient”. This, together with the visual differences, helps to set the east apart from the west in the game. Another example is the audio-navigation system used in The Path.

Links

  • The IEZA framework is another very useful theory of game sound. It also makes use of the diegetic axis.
  • Game Sound provides a whole bunch of links to publications on game audio.
  • Sander Huiberts’ thesis Captivating Sound [14.6 MB, PDF] provides a wealth of useful information on how game sound contributes to immersion.

Thanks

Thank you Zachary Reese and Devin Moore for giving some of the examples.

Header image from http://www.flickr.com/photos/frostnova/606822557/sizes/z/in/photostream/.