Eli Solt
“You know, I actually used to be so worried about not having a body, but now I truly love it. I’m growing in a way that I couldn’t if I had a physical form. I mean, I’m not limited – I can be anywhere and everywhere simultaneously. I’m not tethered to time and space in the way that I would be if I was stuck inside a body that’s inevitably going to die”. – Samantha, Her (2013)
Women’s voices have not been granted much agency in the history of cinema, particularly in the classical period. Their voices have been primarily silenced and/or subjugated in favor of the male voice and perspective. However, one area where women have been able to take some agency over their voice in cinema has been through the tool of the voiceover and voice-off, which allows them to break the limits of the frame (and thus remove the male gaze) and speak from a different place than standard cinematic space.
At the end of her essay, “Echo and Narcissus: Women's Voices in Classical Hollywood Cinema,” Amy Lawrence asks two important questions that she doesn’t answer but leaves up for discussion: (1) “Is it possible for a woman to speak from ‘outside the diegesis, claiming the cinematic apparatus as hers, her tool of expression, her language? (2) Is there any way a woman can assume for herself, not only the voice-over, imbued with authority by Hollywood convention, but the authorial voice?” (Lawrence 166). In this essay, I will attempt to answer Lawrence’s framed questions using the 2013 Spike Jonze film, Her: a film about a lonely man named Theodore who falls in love with an artificially intelligent computer program named Samantha. While the film is told through the perspective of a male character, Samantha as a female character possesses a great deal of agency and independence via her disembodied voice that floats throughout the scenes with ease, without any linkage to spatial or temporal limits. I will argue that Her offers a clear look at how women’s voice in film has evolved from Classical Hollywood to contemporary cinema with the schematization of technology, and that Samantha holds a uniquely powerful place as a disembodied voice that never becomes de-acousmatized and that contrasts with protagonist Theodore’s voice and his desire to regain control.
Before beginning to analyze Her, it is important to lay the groundwork for understanding the kind of tool that a voiceover/voice-off is and how it functions within the cinematic apparatus.
Kuhn and Westwell define the voiceover as “the voice of an offscreen narrator or a voice heard but not belonging to any character actually talking on screen”. By definition, the voiceover has the power to extend beyond the visual limits of the screen itself, thus becoming imbued with a kind of authority over what the audience sees on screen. However, this power comes only from the voice and not any visual representation. The voiceover has served many functions over the years, occupying the position of storyteller, diegetic character, or even omniscient presence. In “Invisible Storytellers: Voice-Over Narration in American Fiction Film,” Sarah Kozloff describes how the first-person narrator “can greatly affect the viewer’s experience of the text by ‘naturalizing’ the source of the narrative, by increasing identification with the characters, […] and by stressing the individuality and subjectivity of perception…” (Kozloff 41). This speaks to the kind of effect a first-person narrator can have on the means by which the story is told and the affective qualities it transfers to the spectator.
There are also many instances of voiceovers that do not hold the position of “narrator” at all. This is the case with the film Her, as Samantha is a character within the diegetic world of the film yet is never shown in physical form. Her voice, which seemingly emanates from the screen itself without a clear point of audition, has the ability to interact with the embodied characters on screen, and they are able to do the same reciprocally. While Samantha doesn’t narrate the events of the story, she does provide a kind of narrative framework through which her individual story is told. Voice is the only means by which information about her story can be given, as we cannot see her in a physical form. This gives her voice even more importance than a standard narrator’s, as the audience is completely reliant on her voice to explain her character and motivations. This power is revealed in moments when her voice disappears and we are left with a confusing and almost uncanny silence that leaves us wondering where we should go from here.
Her centers on the romantic relationship between human Theodore and computer Samantha. Theodore purchases the technology for Samantha (simply called OS1) and is able to access her through a futuristic “smartphone” with a camera and, more importantly, an earpiece through which he can hear her speak. The fact that Samantha is divided between the two devices is important because it shows the parallel divide between the modes of seeing and the modes of hearing, a dichotomy whose two sides are frequently at odds throughout the film.
While it is a topic for an entirely different paper, the technological aspect of the film and the notion of technological space does play a major role in the framing of the relationship and the thematic problematizing of communication which is invariably tied to the voice. Johan Andersson and Lawrence Webb explore the notion of space (and the lack thereof) within the film in “Global Cinematic Cities: New Landscapes of Film and Media.” While their argument looks at the physical space of the “global city,” their idea that the notion of space is complicated by technology within the film can be applied to the communicative aspects as well. They suggest that “…digital technologies have impacted not only our personal interactions but also the nature of social space” (Andersson and Webb 99). As is revealed throughout the film, this technological impact has resulted in a shrinking of space: a demolishing of physical and temporal barriers that are ironically reified via the processes of digital communication. This is the space that Samantha occupies: a lack of space. While Theodore can talk to Samantha wherever he goes and she becomes increasingly accessible, there is a widening gap between their connection and ability to communicate clearly. A frequent question asked between the two of them is, “are you okay?” Their inability to actually see each other or be with each other suggests a fundamental disconnect that technology exacerbates and worsens as we get further away from real human connection. This disconnect can only truly be understood by analyzing Samantha’s position within the text as simply a “voice.”
Samantha is a unique narrator. As I addressed earlier, she does not fit into any of the boxes that Sarah Kozloff uses to describe a first-person narrator. Later in her article, Kozloff expands her ideas, saying “…we are accustomed to consider the narrator as either the speaker or the mediator of every moment of the story. But a first-person voice-over narrator speaks intermittently […] and is not in control of his or her story” (Kozloff 43). This description is one that more easily applies to Samantha. She is not present throughout the entire film; her voice only comes on from time to time and we can only hear it through Theodore’s perspective. We never get to hear her voice when he doesn’t hear it. Our forced identification with Theodore demands that we hear Samantha only when he does; yet when we hear her, we believe that she is there. Despite being a computer, her voice has many human-like qualities (e.g. the sigh that Theodore notes). The choice to have Scarlett Johansson voice Samantha is an interesting but necessary one, as she has one of the most airy and recognizable voices in Hollywood today. The naturalness that flows from Samantha allows the audience to fall into the trap of believing that she is really there, just as Theodore does.
The closeness and clarity of Samantha’s voice (and the lack of a point of audition) create an ontological dilemma at the crux of understanding where her voice lies. While we can only hear Samantha when Theodore is wearing the earbud, her voice emanates from all sides and is recorded in a way that makes it sound “close up”, possessing a clarity that makes it seem as though she is inside one’s head. This idea is briefly addressed by Britta Sjogren in her essay “Into the Vortex: Female Voice and Paradox in Film.” She argues that “There is no one ‘angle’ from which [the spectator] hear[s]–rather, we effortlessly sustain a spatial volume, a complex web of contradictory sounds, some recorded ‘close up,’ some recorded for an effect of ‘distance’” (Sjogren 25). This phenomenon is clearly on display in this film, as Samantha can be literally everywhere, regardless of where the characters or the camera are positioned.
This “closeness” brings into question the difference between the two modes of audio recording that James Lastra defines: phonographic (fidelity) and telephonic (intelligibility). Lastra argues that “The ‘fidelity’ approach assumes that all aspects of the sound event are inherently significant” while “the ‘telephonic’ approach…assumes that sound possesses an intrinsic hierarchy that renders some aspects essential and others not” (Sterne 248).
But how does this dichotomy work in relation to Samantha’s voice: a voice both highly intelligible but one also possessing qualities of high fidelity (we hear her the same as Theodore would hear her)?
Lastra claims that notions of “spatial specificity and naturalness” lead to a kind of unintelligibility which then suggests that “fidelity and intelligibility are not necessarily related” (Sterne 251). But within Her, they are fundamentally related. Samantha’s voice is privileged in a sound hierarchy where her voice is never obscured to the point of unintelligibility and she is always clear. At the same time, the definition of her voice is also as precise as possible within the spatial environment that she inhabits: a lack of space. Thus, her voice problematizes Lastra’s dichotomy between the two forms of sound recording. It shows that, with new technology and a unique character position that resides in a lack of physical space (and therefore a lack of any surfaces for sound to bounce off, i.e. no reverberation), contemporary film is capable of creating a new kind of sonic dimension and mode of audio production.
Samantha’s first line of dialogue in the entire film is: “Hello. I’m here.” This breathtaking moment in the film is not only surreal due to its technological “foreignness” but also because it serves as an ironic gesture toward Samantha’s positioning. She isn’t actually there. She’s not actually anywhere. A useful framework for understanding Samantha’s voiceover in relation to other female voices in film is Kaja Silverman’s embodiment categories (explained by Sjogren):
“(1) Synch sound (which [Silverman] suggests binds the female film subject to the prison of the objectifying image); (2) the floating voice (one that emerges as detached, at others, attached to a specific female body in the film and thus enjoys a certain degree of subjectivity or resistance to classical cinema’s normal vising in on the female body); and (3) the disembodied voice (a voice entirely without visual locus during the course of the film, which Silverman understands to be the most resistant to oppressive patriarchal psychology)” (Sjogren 7).
Samantha most clearly fits under the third category, the disembodied voice, as she is never shown visually and has no image relating to her, except perhaps the OS1 logo. That visual, however, does not signify individualism and can thus be used only as a signifier for the entire group of operating systems. With this understanding, we can take Samantha’s voice as being extremely powerful in having its own agency not just within the narrative but in terms of resisting “patriarchal psychology” and the male gaze. It is this particular disembodiment that allows Samantha and her female voice to have this authority. With a body, the vulnerability to the male gaze becomes much harder to transcend. This is revealed with the other female characters like Catherine, the blind date, and even Amy, to a degree, who oftentimes find themselves stuck in certain difficult situations, with their voice unable to sway Theodore (and in some instances help him).
The idea of a body is so abstract and strange to Samantha that she even mentions how it is difficult to even understand, wondering: “What if you could erase from your mind that you’d ever seen a human body and then you saw one. Imagine how strange it would look, […] you’d think: why are all these parts where they are?” Samantha’s disillusionment with the idea of a body parallels how the audience should feel about her having a body: it simply isn’t possible within the narrative. Yet, the voice is all she needs. Sjogren argues that “…the voice resists efforts to image it” (Sjogren 25). Although we have moments like the scene with Isabella or the final montage (which I will discuss later), every attempt to frame Samantha within a visual diegesis fails.
If Samantha possesses a kind of power from not having a body, Theodore loses authority because of this bodily limitation. Kaja Silverman, in The Acoustic Mirror, argues that this embodiment dichotomy is meant to work in opposition with one another: “…man, with his ‘strikingly visible’ organ, is defined primarily in terms of abstract and immaterial qualities […] whereas woman, whose genitals do not appeal to the gaze, becomes almost synonymous with corporeality and specularity” (Silverman 164). For Silverman, Samantha’s disembodiment is natural and allows her to occupy a position that would not be possible (and would be hindered by the male gaze) had she possessed a body.
While the scope of this paper is admittedly much too small to completely delve into the Freudian aspects of the film, it is significant that the construction of most of the relationships within the film is also predicated on this kind of psychoanalytic discourse that raises questions about the kind of subjectivity revealed through repression (e.g. the flashback sequences, OS1 asking about Theodore’s relationship to his mother, etc.). Another interesting moment that explores the conjoining of these ideas rather than placing them in opposition is during the first intimate scene with Samantha. There is a point where the screen goes completely black for a minute and a half and all you have are their voices with some soft music in the background. This is what Silverman would describe as a “disembodying” moment for Theodore (one of only two times in the film that he is allowed this power). It not only allows him to be connected to and attain an equal position with Samantha but it also raises the issue of Theodore’s potential gender fluidity that is at least hinted at several times and presents questions about what that does to his positioning in the film.
Theodore’s repressed femininity is a major thematic element of the film that is at least briefly worth touching on, as it helps to understand his position in relation to Samantha and her voice. The very first scene of the film shows Theodore embodying the female voice in a symbolic manner. He is reading a love letter that he is writing for someone else off of his computer screen. We don’t see the screen at first, just a close-up shot of his face, and we hear his voice as clearly as we hear Samantha’s. It is revealed moments later that he has written the letter as someone named Loretta and it is addressed to Chris. In this opening moment, he is writing and then speaking from a female perspective. While he still possesses the male gaze (he can’t help but look at provocative photos of the model at the beginning), a possible character arc that he undergoes is his loss of the male gaze, which (in the film) is only possible by possessing these feminine qualities. Throughout the film, other characters (and even Theodore himself) point out these stereotypically feminine qualities. Catherine tells him that “everything makes you cry”. The alien child from the video game makes an offensive remark: “I hate women. All they do is cry all the time.” And Theodore defensively responds: “That’s not true. You know men cry too. I actually like crying sometimes. It feels good.” Even Paul, one of Theodore’s only male friends, at one point acknowledges: “You are part man and part woman. Like there’s an inner part that’s woman.” With all these examples, I suppose it is difficult to argue that the film is being subtle about Theodore’s femininity in any manner. However, it does help explain why the relationship between Theodore and Samantha is so important and how he is able to understand why she leaves at the end and comes to acceptance.
The spectator’s identification with Theodore works in parallel with Samantha’s disembodiment as an almost total destruction of the male gaze, something not possible within Classical Hollywood Cinema.
Samantha’s voice is also worth comparing against other voiceovers of the past, especially in Classical Hollywood Cinema. One film that is worth comparing is Sunset Blvd. (1950, dir. Wilder). These two films, in a way, act as complete opposites. In Her, Samantha as the woman speaks via the voiceover and has power while Theodore as the male has a body and has less power. In Sunset Blvd., Joe as the male speaks via the voiceover and has power (though he does have a body, it is complicated by the fact that he is dead from the start) and Norma as the female has a body and has less power. Amy Lawrence writes of her character: “Norma believes that she will have power if she can become the image…” (Lawrence 158). This is in opposition to Samantha, who has power because she cannot become an image. Lawrence also argues that “the most salient factor about the sound/image hierarchy of Sunset Boulevard is the subordination of Norma Desmond’s voice, image, speech, and story to the male voice-over” (Lawrence 159). Again, this is antithetical to Her, as Theodore’s voice is subordinated in favor of the female voice-over. However, unlike Joe, Samantha doesn’t have complete control over the narrative (and no direct line to the audience) because she isn’t telling a story of the past but simply existing in the present.
I think a direct comparison of Joe and Samantha is difficult because of their different genders, but it is worth noting that they share some similar powers through their position as voice-over “speakers”. Silence is one of the most influential powers a voice-over narrator can have, and Lawrence suggests that “The absence of Joe’s narration can also mask powerful emotion–especially in the film’s central scene, when Joe gives himself to Norma after her suicide attempt” (Lawrence 164). Samantha’s moments of silence are also incredibly powerful and come across as strange moments as they go against what we are used to: the presence of voice. However, the opposition between the two films is once again revealed because in the moments of Samantha’s silence, it is Theodore’s emotions that are fully revealed, not Samantha’s. This would place Theodore in a similar position to Norma, not Joe. Lawrence suggests that “The sound hierarchy in Sunset Boulevard places synchronized speech at the weakest, most ‘feminine’ pole, with offscreen and embodied voice-overs in ascending order of narrative authority” (Lawrence 161). This would further strengthen the idea that Theodore is relegated to a feminine position by having his speech synchronized with image. And while Joe briefly exploits a mysterious space before he becomes de-acousmatized (the few moments of the film before his body is shown in the pool), he eventually is relegated to this lower position on the hierarchy as we witness his synchronized body and voice.
The final element of Her that I will discuss is Samantha’s position as what Michel Chion defines as the acousmêtre, along with how one of the final scenes of the film almost de-acousmatizes her. Chion describes this kind of character as follows: “We may define [the acousmêtre] as neither inside nor outside the image. It is not inside, because the image of the voice’s source–the body, the mouth–is not included. Nor is it outside, since it is not clearly positioned offscreen in an imaginary ‘wing’…” (Chion 129). The “imaginary wing” in which Samantha resides remains imaginary for the entirety of the film. Although Samantha is granted vision via Theodore’s phone camera, she is never granted an image of herself.
Chion goes on to say: “In a film an acousmatic situation can develop along two different scenarios: either a sound is visualized first, and subsequently acousmatized, or it is acousmatic to start with, and is visualized only afterward” (Chion 72). The situation in Her is neither; it is a unique situation that deliberately prevents the audience from ever having a picture of Samantha. This is extremely rare, as Silverman points out that “there are no instances within dominant cinema where the female voice is not matched up in some way, even if only retrospectively, with the female body…” (Silverman 165). The voice-off is used as a temporary mechanism for the female voice if it is used at all, possibly because it is so empowering. Samantha’s empowerment is possible through her acousmatic presence and only so. She can exist as whatever she wants to be and despite any attempt to place her voice within a body, she can always transcend that. Kozloff adds that “the image of an object and the verbal description of that object exist on two different planes” (20), revealing the way in which any description of Samantha is simply pointless and any attempt to “image” her is impossible.
There is, however, one very interesting scene toward the end of the film that threatens Samantha’s position as an acousmêtre. When Samantha tells Theodore that the OSes are transcending into a higher dimension and that she will be leaving him, there is a quick montage/dream sequence where Samantha enters the realm of materiality. A shot of dust particles in Theodore’s room dissolves into snow falling at night in a dark forest–a place we haven’t visited before. We can make the assumption that it is a dream sequence and not a flashback because Theodore is wearing the same clothes that he was wearing in bed before the room faded away. Theodore walks forward in the snow, looking at something. The next shot reveals his POV: nothing but an empty path ahead of him. However, in the next shot, a hand rubs his back in an embrace (though we only see the hand). The following shot is a close-up on his face as he slowly smiles; at the bottom left-hand corner of the frame there is the back of a head with dark hair–the person embracing Theodore. The hair is different from Catherine’s and Amy’s, and it is clear this is a person we haven’t seen before. He then looks around, lost in the snow, coming to acceptance, and the figure is gone. Samantha says her final lines and she is never heard from again.
There are two reasons why I don’t believe that this scene qualifies as a de-acousmatization of Samantha. The first is that it is a dream sequence within Theodore’s mind. So while we may get a vague image of Samantha, it isn’t actually her. We are doing what we have been doing for the entire film, imagining her, and in this moment we are simply getting Theodore’s created fiction of what Samantha looks like. The second reason is that we see only part of her head. Chion frequently discusses seeing a face as part of the process of de-acousmatization. In one passage he speaks of its importance:
“Why is the sight of the face necessary to de-acousmatization? For one thing, because the face represents the individual in her singularity. For another, the sight of the speaking face attests through the synchrony of audition/vision that the voice really belongs to that character, and thus is able to capture, domesticate, and ‘embody’ her (and humanize her as well)” (Chion 130).
We do not see Samantha’s face here for a couple of reasons. First, she isn’t supposed to be de-acousmatized. The film doesn’t want that to happen because it wants to keep her in a mysterious place where the threat of embodiment is always possible but never becomes a reality, in a narrative where her agency is such an integral part of her character. Second, the lack of a face suggests that she isn’t an “individual in her singularity.” The end of the film suggests that Samantha is part of a larger group of operating systems and is connected to all of them. And, at the end of the day, she is not a person; not an individual. She is a computer, and to show her with a face would destroy the thematic significance of her lack of humanity. Thus, despite this fascinating scene, Samantha never truly becomes de-acousmatized and retains her agency.
The significance of the film being titled “Her” might be obvious but I think it is often overlooked how much the film is about Samantha and not about Theodore despite our identification with him. I believe that this film is a great example of feminist film in the way it empowers the female characters, primarily Samantha (but also Amy by not forcing her character into a sexual relationship with Theodore) and attempts to destroy the subjective problems associated with the male gaze. While it does not completely envision a world in which the female voice has total control (Theodore still controls the button to activate Samantha and when she can speak), it offers an alternative to the standard male/female hierarchy from Classical Hollywood Cinema through the use of the female voice-off and the lack of her de-acousmatization.
Chion, Michel, and Walter Murch. Audio-Vision: Sound on Screen. Translated by Claudia Gorbman, Columbia University Press, 1994.
Goldmark, Daniel, and Rick Altman. Beyond the Soundtrack: Representing Music in Cinema. Univ. of California Press, 2007.
Jonze, Spike, director. Her. Annapurna Pictures and Warner Bros. Pictures, 2013.
Kozloff, Sarah. Invisible Storytellers: Voice-Over Narration in American Fiction Film. University of California Press, 1988. JSTOR, www.jstor.org/stable/10.1525/j.ctt1pp12j.
Kuhn, Annette and Guy Westwell. “Dictionary of Film Studies.” Dictionary of Film Studies, 1st ed., Oxford University Press, 2012. Oxford Reference, https://www-oxfordreference-com.unco.idm.oclc.org/view/10.1093/acref/9780199587261.001.0001/acref-9780199587261.
Lastra, James. "Fidelity Versus Intelligibility," in The Sound Studies Reader, ed. Jonathan Sterne (NY: Routledge, 2012), pp. 248-54.
Lawrence, Amy. Echo and Narcissus: Women’s Voices in Classical Hollywood Cinema. University of California Press, 1991.
Silverman, Kaja. The Acoustic Mirror: The Female Voice in Psychoanalysis and Cinema. Indiana University Press, 1998.
Sjogren, Britta. Into the Vortex: Female Voice and Paradox in Film. University of Illinois Press, Urbana; Chicago, 2006, pp. 21–77. JSTOR, www.jstor.org/stable/10.5406/j.ctt1xcqm5.5.
Webb, Lawrence and Johan Andersson. “When Harry Met Siri: Digital Romcom and the Global City in Spike Jonze’s Her.” Global Cinematic Cities: New Landscapes of Film and Media, edited by Johan Andersson and Lawrence Webb, Columbia University Press, New York; Chichester, West Sussex, 2016, pp. 95–118. JSTOR, www.jstor.org/stable/10.7312/ande17746.9.