8 Sound Effects and Dialogue
Sound and Early Film
It is a common misconception that early films were silent; in truth, from the beginning of cinema production almost all movies had a form of sound.[1] While early films did not have a soundtrack and dialogue accompanying them, sound was a part of film culture from the beginning, and most moviegoers prior to 1927 enjoyed a sonic experience along with their visual one. Many early films had piano or organ accompaniment to emphasize emotional and sensational scenes. Sometimes, narrators would describe a film or live amateur actors would add dialogue to a movie. In Japan, where live narration of films was more common than elsewhere, such narrators were called benshi.
Benshi performance
Sample score for silent films (1914)
Early synchronization of film and sound was attempted using phonograph records. Using a record as an audio accompaniment to a film meant that the sound had to be recorded ahead of time and synchronized properly with the images. If the record skipped or the phonograph or projector malfunctioned, the sound might deviate significantly from the visuals. Here’s a video that shows a very early experiment with film and sound:
The two principal problems for creating sound film were amplification and synchronization of the sound with the actual frames of the film. Film projectors were very loud, so hearing sound from a poorly amplified and scratchy phonograph record proved very difficult, and many people were unable to hear these attempts to include sound in film. Achieving proper amplification, therefore, was a key goal for early sound engineers. Synchronization, or aligning a film’s visual images with their concomitant sounds, was an equally important objective.
Here’s an early short sound film by Alice Guy, one of the earliest superstar directors.
Five O’Clock Tea (1905)
An answer to the amplification problem arrived when Lee de Forest created the first amplification tubes in 1906. RCA and other companies that produced radios incorporated tube amplifiers that allowed radio listeners to hear clearly. These companies soon developed paper cone speakers that produced better sound quality, but amplification still presented a barrier for home listeners. Building on de Forest’s ideas, Harold D. Arnold and Irving Langmuir developed vacuum tubes that pulled more electrical power out of the sound and amp, causing it to increase in volume. Initially, a good degree of distortion accompanied the sound, but over time the engineers developed cleaner systems that reduced the distortion, and vacuum tubes quickly took off. By the 1920s, radio receivers had amplification tubes that glowed a bright orange or yellow and cone speakers that produced beautiful sound. It wasn’t long until people started seeing the potential of vacuum tubes for film sound.
The second issue was synchronization, which was difficult because it was hard to match the sound precisely to the visual images. Various inventors toyed with different methods to create synchronized sound until the Western Electric Company’s engineers finally developed a system that worked reasonably well. Western Electric dubbed this system the Vitaphone, and it tried in vain to sell it to multiple studios that rejected it because of expenses related not only to the technology itself but also to the cost of outfitting existing theaters with the means to reproduce sound. Eventually, in 1926, Warner Brothers purchased the Vitaphone system and began to use it in a series of shorts. Finally, in 1927, Warner Brothers decided to produce an entire film featuring Vitaphone sound. The Jazz Singer, starring the popular singer Al Jolson, became a big hit even though it still employed intertitles and most of the film was silent. While there are segments of the film in which people speak and sing, The Jazz Singer is ultimately a hybrid of silent and sound film. Nevertheless, most scholars consider the film the first “talkie.”
The Jazz Singer
Not long after The Jazz Singer debuted, a great controversy arose in the film community as to whether it should move to sound film or whether Vitaphone and similar systems were simply short-lived gimmicks to get people into the theaters. Opposition centered on sound film’s alleged lack of artistic value. The answer came very quickly. Less than two years after The Jazz Singer appeared–and despite the high costs of speakers, amplification equipment, and new projectors–nearly every theater in the country had installed the system by popular demand. The public loved sound and shunned films without it. By 1930, silent films were dead in the United States, and most of Europe followed suit soon after. In some countries, such as Japan and China, sound films did not prevail until the mid-1930s. Nonetheless, within a decade of The Jazz Singer’s premiere, sound film dominated the cinematic world.[2]
Despite its ubiquity, sound in film created some unexpected production problems. The original equipment was cumbersome and imprecise, and it tended to capture sounds indiscriminately, from an actor’s dialogue to a car backfiring behind the set. Actors, moreover, found themselves relatively anchored in place because if they moved too far from the mic the audio would sound muffled. Studio personnel quickly realized, therefore, that they needed to develop equipment better suited to the needs of a film production.[3] Boom microphones were held over actors to capture dialogue more discreetly. Directional microphones captured sound from a single source, extracting it from the cacophony of noise around it. Cameras were first encased in “blimps,” huge moveable boxes with a window through which to film, so that the loud noise of cranking would not be picked up by the mics. Clapboards created sync points for image and sound (the visual of the clapboard closing with the sound of a “clap”). Many standardized visual elements of sets had to be changed because of the noise that they gave off – new lighting systems, set materials, and props had to be developed.
One of the startling realizations of the sound cinema age was how important the cast’s voice quality is to its performance. During the transitional period, some actors lost their jobs because their voices were off-putting or wrong for the role. In some cases, dubbing became a way to add a quality voice to a famous face that had a poor voice. Voice actors can record song or dialogue after the film production itself, matching their voice tempo to the image. The result is a collage of talent: one facial performance meshed with a separate voice performance. In M (1931), for example, Peter Lorre, who played a child killer, didn’t know how to whistle, so Fritz Lang, the director, whistled the leitmotif for the film.
M
Good-quality sound that is well-synced to the image is essential for our immersion into a diegetic world. Without crisp sound and without synchronization, we consider the film to be of poor quality and find it hard to watch. Even in the modern era, some famous, high-quality films contain moments of badly synced or recorded sound. For example, Ridley Scott’s Alien (1979) has a moment of horrible dubbing early on when the character Ripley (Sigourney Weaver) says “you’ll get what’s coming to you” to her colleagues. The ADR (automated dialogue replacement) recording is choppy, as though it was overly manipulated in postproduction, and the sound doesn’t quite match Ripley’s mouth movements. Alien is nonetheless considered a classic, and a single moment of bad ADR does not necessarily ruin a film. Contemporary cinema continues to raise the bar of sound production quality, becoming more and more professionalized. Because of powerful, affordable software and technologies, even amateurs can create a recording booth at home and a sound design studio on their laptop.[4]
Diegetic and Non-Diegetic Sound
When we discuss film sound, we are actually describing two very different things: diegetic (“visible”) and non-diegetic (“invisible”) sound.[5] Diegetic sound includes everything that is represented as being sourced in the story world: dialogue, the sound of footsteps, a car’s tires squealing as the bank robbers make their escape, etc. Diegesis derives from a Greek word meaning simply “narrative,” and we use diegetic here to reference anything that is part of the represented story world. This does not have to be something we see on screen. The sound of the screeching car wheels from the bank robbers’ getaway car might well be happening outside the bank while the camera remains inside with the frightened patrons who have just been held up at gunpoint. Even if we never see the car, the sound makes sense to us as part of the narrative, and we understand it as diegetic. Robert Spadoni further distinguishes between external diegetic and internal diegetic sound, with the latter referring to sounds occurring only within a character’s mind (154).
At the same time, however, we are likely listening to non-diegetic sounds as well. In film, the primary non-diegetic sound we hear is usually from the soundtrack—score or song—playing as the action unfolds.
Nicole LaJeunesse provides a handy distinction between diegetic and non-diegetic sound below:
Diegetic vs. non-diegetic sound
Since we used a bank robbery as an example above, let’s look at the first couple of minutes of a heist scene from The Dark Knight (2008) in order to distinguish diegetic from non-diegetic sound:
The Dark Knight
The first sounds we hear as the clip begins are already a mix of both diegetic and non-diegetic: we hear the angry shout of the Joker’s henchman (“stay on the ground!”) and the sound of his blow to the security guard. These are both diegetic. They are linked to people and actions we can see on screen (or that we associate clearly with the narrative’s off-screen space), so we are confident they are part of the story world. Even as these diegetic sounds are occurring, however, we also hear the pulsing rhythms of the soundtrack rising in volume and eventually moving into the foreground of the auditory track as the scene continues. This is non-diegetic sound: it is not sourced to anything in the story world and clearly belongs to the world on the other side of the camera, the abstract entity that serves a narrator’s role in the telling of the story.
We can roughly break down the two categories of sound as follows:
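Diegetic Sound | Non-diegetic Sound
Dialogue, footsteps, squealing tires, and other sounds sourced in the story world, whether on screen or off | Score or song on the soundtrack, playing as the action unfolds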
Simultaneous and Non-Simultaneous Sound
Most sound in films is simultaneous. That is, the sounds we hear are occurring at the same time as the images we witness on the screen. For example, in Jurassic Park (1993) the t-rex roars at the same time it is frightening the characters. At times, however, filmmakers will adopt non-simultaneous sound that mixes sounds from the past or future with images from the present. In other words, the audio does not align with the film’s visual timeline. Voiceovers describing past events, for instance, may combine sound from the “present” timeline with images from the “past.” In Accident (1967), the final scene, in which we see children playing with a dog, includes audio from a car crash that occurs at the beginning of the film. The Conversation (1974) offers another example of non-simultaneous sound, in which a person listens to a taped conversation that had taken place earlier.
Various t-rex simultaneous roars from Jurassic Park
Non-simultaneous sound in Accident
The Conversation
Impact of Sound
Sound can help shape a film’s:
- Space
- Mood
Sound and mood
- Audience expectation
- Audience identification with characters
- Connections between scenes
- Character point of view
Michael Tyburski on character and sound
- Emphasis
- Tempo and rhythm
Baby Driver (2017)
- Themes
Requiem for a Dream (2000)
- Setting
Drive (2011)
- Sense of realism or artificiality[6]
Just as camera framing and movement choices create visual subjectivity in film, sound perspective creates aural subjectivity, carrying the viewer on an intimate journey of the story world.[7] Sound perspective will often match a visual close-up with a sonic close-up. The more intimate we are with an object, the better we can hear it. The more intimate we are with characters, the better we can hear what they hear. In Beasts of the Southern Wild (2012), for example, we can hear our hero’s thoughts in voiceover narration along with the sounds that she experiences directly. When the hero picks up a chicken to listen to its chest, we hear the chicken’s heartbeat loud in our ears too. This sound perspective brings us emotionally closer to the hero, who becomes our proxy eyes and ears in the film world.[8]
Beasts of the Southern Wild
Interestingly, Michel Chion points out that merely “reproducing” sound can yield an underwhelming reaction from an audience. While such reproductions are certainly “real,” they sometimes don’t seem realistic to listeners because they don’t always elicit an emotional response. In contrast, foley artists will often recreate sounds in post-production that, in Chion’s term, “render (convey, express) the feelings associated with the situation” (108). In other words, the “fake” sound will often seem more believable than the real thing! In his TEDx talk “Everything You Hear on Film Is a Lie,” Tasos Frantzolas explains how foley artists fabricate a type of sonic hyper-reality that, in concert with images, conditions our brains “to embrace the lies.”
Dialogue
Chion contends that the experience of watching sound films is typically “verbocentric.” That is, viewers “first seek the meaning of the words, moving on to interpret other sounds when [their] interest in meaning has been satisfied” (6). In other words, dialogue supersedes other sounds for moviegoers. Simply put, dialogue is a conversation between two or more people in a movie.[9] In addition, a movie could include a monologue where a character speaks out loud when he or she is alone. A character, for example, may contemplate the pros and cons of taking some form of action in a monologue.
A movie can also have voiceover narration, in which a character explains what has transpired in a movie and why. Dialogue, monologue, and voiceover narration help advance a movie’s story, but they can also provide psychological insights about characters or underscore a film’s central themes.[10]
The voiceover narrator is also often used as a fountain of information to relay past events and deepen the emotional nuance of a story.[11] Non-character voiceover narrations are typically understood as non-diegetic, since the narration originates from outside the film’s story world. Some films choose to reveal the non-character narrator as a character at some impactful point of the film’s narration. These films shift the narrator from non-diegetic to diegetic, with the intention of shocking audiences and advancing plot. Sunset Boulevard (1950), for example, presents its story about a forgotten silent film star’s desperate attempt to return to the movies from the point of view of a dead screenwriter found in the pool of the aging starlet.[12]
Voiceover principles
Sunset Boulevard
What would a movie be without dialogue?[13] Even 90 to 100 years ago, when silent movies had no audio dialogue, dialogue cards (also called intertitles) conveyed what characters said, and background music set the tone of the scene. Take a look at the following example of a scene from Richard III with and without dialogue.[14]
Richard III (1908)
Richard III (1995)
Unless one is very familiar with the play, it would be difficult to understand what is happening in the scene in more than a general way. The dialogue helps spectators understand that Lady Anne is cursing the man who killed her father-in-law–the very Richard III who then enters the scene and attempts to seduce Anne. In doing so, Richard explains that he killed Anne’s father-in-law (and husband!) because of her beauty. While she outwardly scorns him, she does not kill him when given the opportunity, and Richard’s gambit pays off. Without the dialogue, it is hard to detect Richard’s brazen manipulation and Anne’s softening demeanor. Indeed, a contemporary viewer who has not read the play would be utterly lost without dialogue.
Sound Effects
The University of California, Berkeley defines a sound effect as “artificially created or enhanced sounds, or sound processes used to emphasize artistic or other content of films, television shows, live performance, animation, video games, music, or other media.”[15] They further explain that “In motion picture and television production, a sound effect is a sound recorded and presented to make a specific storytelling or creative point without the use of dialogue or music.” An action movie, for instance, is more interesting and bolder with sound effects, while horror films rely heavily on creepy sounds to unsettle their viewers. With sound effects, the spectator gets more involved with the movie.
Sound effects are most often added to the movie during postproduction. Often, when a scene involves multiple actions going on at the same time, such as dialogue, sword fighting, and other background action, sound effects are inserted in postproduction to make them louder and clearer.
Deity Microphones reveals a few postproduction secrets below:
In a theatre, watch the beginning scene of Cyrano de Bergerac. There are different people speaking at the same time and murmurings of a crowd. Much of this sound would have to be added in later to make it as effective and clear as it is in the movie version (see 5:50-7:00 of the 1950 film below, for example).
Cyrano de Bergerac
At the 10-minute mark of Detour, the sound of the piano that Al is playing stays at the same level whether the piano is in the background or the foreground of the scene. The sound is effective because it draws the viewer to the music and demonstrates Al’s ability as a piano player. Music and sound effects reveal an aspect of Al’s character.
Detour (1945)
Ambient Sound
Ambient sounds are the background noises present in a room, a house, outdoors, or any given location. Every location has distinct and subtle sounds created by its environment. Ambient noises are a type of sound effect.
To experience ambient noises, stand in a room alone and make absolutely no noise at all. The sounds that you hear are ambient noises. A room in an older house will likely have more ambient noises than one in a newer home. Also, depending on the neighborhood, you might hear outside ambient noises. The following are examples of ambient noises: wildlife, wind, rain, running water, thunder, rustling leaves, distant traffic, aircraft engines, machines operating, muffled talking, floors creaking, and air conditioning.
Background noise gives a movie more realism. Consider a movie character running through a wooded area at night. This scene would lack much suspense if there were no ambient noises. Here’s a scene from Repulsion (1965) that employs ambient sound to establish fear.
Here’s a clip that uses sound editing to transition from the noises of war to the effects of those decibels on the hearing of a young boy.
Come and See (1985)
The following YouTube video, “Introduction to Foley and Sound Effects for Film,” presented and made by Filmmaker IQ, gives a good demonstration of sound effects and ambient noises.
We can describe sound as synchronous or asynchronous, referring to the match between the visual track and the soundtrack. Of course, most sound is edited meticulously to be synchronous. Filmmakers use asynchronous sound, though, to communicate certain kinds of information and effect. Below is an example from Night of the Living Dead (1968), a seminal zombie movie, in which the asynchronous sound conveys the chaos and disorientation of the first encounter with a zombie:
Night of the Living Dead
The visual track and soundtrack in film are mostly edited to be synchronous, but at times we will notice examples of a split edit. This term refers to any edit in which the sound starts before or after the cut (that is, audio not synchronized with the cut). When the audio from the preceding scene overlaps into the visual track of the second, it is referred to as an L-cut (think of an L-cut as the sound “lingering” into the next scene). When the audio from the subsequent scene begins before the visual track has cut to that scene, it is called a J-cut (think of the sound from the next scene “jumping” into the current scene). Here is an example of a J-cut from The Matrix (1999), in which the sound of Neo’s alarm begins before we see the visual track that shows us the alarm in his bedroom:
The Matrix
Here is an example of an L-cut from The Silence of the Lambs (1991), as the audio from Clarice’s conversation on the phone continues even after the visual track has cut away from her on the phone to the self-storage business she has just discovered:
The Silence of the Lambs (1991)
The other place where we almost invariably encounter J-cuts and L-cuts is in the filming of conversation. Conversation is usually edited visually by cutting back and forth between the speakers (shot/reverse shot), with cameras positioned behind their shoulders. We can note, however, that sound is almost never edited synchronously with the cuts between shots. Here is an example from Shaun of the Dead (2004):
Shaun of the Dead
We begin with a two-shot, showing where Shaun and Ed are sitting in the Winchester pub, serving as a master shot for the scene. Then we cut to a very drunk Shaun, seeing his reaction to Ed’s proposal (more drinking), even as Ed’s voice continues. We then cut to Ed, who says “Talk to me.” Shaun begins his reply (“She said…”) before the cut back to him. This overlapping sound and visual information feels natural, allowing us to capture reactions as well as the details of the words spoken, and creating a more seamless conversation.
By contrast, we can look at this clip from Pulp Fiction (1994), which deliberately violates both the conventions of camera setup and the use of J- and L-cuts when editing a conversation:
Pulp Fiction
The goal here is of course to convey the stiff artificiality of the conversation, and the lack of overlapping sound makes the whole thing feel more artificial, as if assembled out of spare parts.
The crucial point involves paying attention to edits in film on both the visual and the auditory level. Every edit is always at least two edits (visual and sound), and on the soundtrack we can pay attention to the ways in which diegetic and non-diegetic sound are edited differently across these transitions.[16]
Special Uses of Sound Effects and Dialogue
The different ways sound is brought into films can have very specific effects on the viewers. Michel Chion developed the concept of “added value” to highlight the ability of sound to enhance visual images. Chion defines this added value as the “expressive and informative value with which a sound enriches a given image so as to create the definite impression, in the immediate or remembered experience, one has of it, that this information or expression ‘naturally’ comes from what is seen and is already contained in the image itself” (5).
Below is a list of various ways that sound can add value to a film:
Sound Effects to Tell the Inner Feelings of a Character
The way filmmakers sometimes manipulate or distort sound is an artistic way to make a viewer feel what the character is feeling. Sometimes the sound, paired with emotive images, is more important to get across a feeling than the actual dialogue might be.
Distortion of Sound to Suggest Subjective States
Sound can also represent what’s going on inside a character’s head. For example, if a character in a film has been drugged, the sound can be manipulated to suggest to the audience what changes that character is going through physically as the drug takes over their system.
The “Personality” of Sounds
The various choices a filmmaker makes when it comes to mechanical sounds can impact how we respond to different scenes. In the film Little Miss Sunshine, a quirky family ends up on a road trip, and the VW bus they are in becomes a character all its own. It has moments of “breaking down,” as do many of the human characters throughout the story. To add to its personality, the whiny sound of the bus’s horn fits it perfectly. And, when that horn gets stuck honking on its own, it embodies the chaos the family is experiencing on so many levels. This clip allows you to hear the sound of the horn:
Little Miss Sunshine (2006)
Slow Motion Sound
Sound can also be slowed down to match a slow motion image. The effect is often eerie and emphasizes the change of pace at that point in the film, drawing in the viewer and forcing them to notice what’s going on.
Ironic Juxtaposition of Sound and Image
We’ve covered irony in earlier chapters, and though filmmakers often try to have the sound complement the image in their films, they will occasionally play with the idea of contrasting the two. If the image and sound don’t match, then we as viewers notice them more.
Suspension
Chion describes “suspension” as a phenomenon that “occurs when a sound naturally expected from a situation … is absent or becomes suppressed either insidiously or suddenly” (129).
Oppenheimer (2023) (see after 1:02)
Placing Unusual Emphasis on Sound
Because film is a visual medium, we are always going to be looking to the image as our first instinct. For a filmmaker to get us to focus on the sound instead, they have to work very diligently to avert our gaze and perk up our ears. So, how do they force us to listen?
- Drop the image on screen altogether–fade to black
- Make the image uninteresting or dull
- Crank up the volume so we can’t help but notice the sound
Each of these special uses of sound allows viewers to understand how the manipulation of sound creates a response to a character or solidifies a mood (Petrie and Boggs 237-241).
A note about sources
[1] https://docs.google.com/document/d/14iGxxNB0Js2jEdd8I-QxgwyO79iVzvKK/edit
[2] https://docs.google.com/document/d/14iGxxNB0Js2jEdd8I-QxgwyO79iVzvKK/edit#heading=h.3whwml4
[3] https://pressbooks.cuny.edu/globalfilmtraditions/chapter/chapter-6-sound/
[4] https://pressbooks.cuny.edu/globalfilmtraditions/chapter/chapter-6-sound/
[5] https://ohiostate.pressbooks.pub/introfilm/chapter/sound/
[6] https://ohiostate.pressbooks.pub/introfilm/chapter/sound/
[7] https://pressbooks.cuny.edu/globalfilmtraditions/chapter/chapter-6-sound/
[8] https://pressbooks.cuny.edu/globalfilmtraditions/chapter/chapter-6-sound/
[9] https://milnepublishing.geneseo.edu/exploring-movie-construction-and-production/chapter/8-what-is-sound/
[10] https://milnepublishing.geneseo.edu/exploring-movie-construction-and-production/chapter/8-what-is-sound/
[11] https://alg.manifoldapp.org/read/film-appreciation-2024/section/154c275b-0db0-45cb-a25e-d410e317a681#_idParaDest-15
[12] https://alg.manifoldapp.org/read/film-appreciation-2024/section/154c275b-0db0-45cb-a25e-d410e317a681#_idParaDest-15
[13] https://milnepublishing.geneseo.edu/exploring-movie-construction-and-production/chapter/8-what-is-sound/
[14] https://milnepublishing.geneseo.edu/exploring-movie-construction-and-production/chapter/8-what-is-sound/
[15] https://milnepublishing.geneseo.edu/exploring-movie-construction-and-production/chapter/8-what-is-sound/
[16] https://milnepublishing.geneseo.edu/exploring-movie-construction-and-production/chapter/8-what-is-sound/