Roland Barthes, “The Grain of the Voice” (1972).
This essay by Barthes argues for a “grain” of the voice that, like the photographic punctum, marks the point of an individual voice’s greatest interest. In addition, this “grain” functions as an index, pointing to the materiality of the body from which the voice in question emerged. Barthes’s essay is foundational because it offers (borrowing from Kristeva) a structural theory of the voice that has since been criticized, debunked, reified, and romanticized.
Richard Bauman and Charles L. Briggs, Voices of Modernity: Language Ideologies and the Politics of Inequality (Cambridge: Cambridge University Press, 2003).
This text doesn’t address voice or sound or music in any detail. However, Bauman and Briggs apply a Latourian method (We Have Never Been Modern) to gloss how major intellectuals from the 17th century onward have developed and deployed ideologies of language as strategies for state-making and nation building. Heavily invested in Anthropology and Folklore’s concerns with traditionality/modernity and orality/text, this book debunks these divides by illustrating how their institutionalization and naturalization were propagated by emerging theories and cultures of ‘the word’ from Locke to Boas. For a scholar of voice, this text is essential reading, as the assembly of the modern world through the instrumentalization of language necessarily relied upon the scientific refashioning and ideological make-over of the very (modern) concept we have come to call “voice.”
Adriana Cavarero, For More Than One Voice: Toward a Philosophy of Vocal Expression (Stanford, Calif.: Stanford University Press, 2005).
This text was useful to me in that it offered philosophical background on the logocentric approach to voice that permeates much humanistic study of voice and language, and a framework through which to discuss it.
Michel Chion, The Voice in Cinema (1982).
In this book, which forms a crucial intervention into the field of film (sound) studies, Chion traces different iterations of what he calls the “acousmatic voice” in cinema—a voice that wanders the screen in search of a body with which to synchronize. In following the acousmatic voice through a number of well-known films, Chion shows us how all synchronization is illusory, and how cinema itself is dependent on this particular illusion for its very functionality. He also argues (mostly implicitly) that “actual” voice and body, like cinematic voice and body, are held together through a kind of ventriloquism, the technicity of which is persistently effaced through processes of disavowal, among other mechanisms.
Chowning, John M. “The synthesis of complex audio spectra by means of frequency modulation.” Computer Music Journal (1977): 46-54.
A classic paper describing how to synthesize the singing voice using FM synthesis. This is the technique Chowning used in his influential computer music piece, Phoné.
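For readers unfamiliar with FM synthesis, the following is a minimal sketch of the basic frequency modulation equation only, not of Chowning's singing-voice configuration. The sample rate, the carrier and modulator frequencies, and the helper names `fm_tone` and `amplitude_at` are illustrative assumptions, not values from the paper. The sketch shows that raising the modulation index moves energy from the carrier into sidebands, which is what makes the spectrum "complex."

```python
import math

SAMPLE_RATE = 8000  # samples per second (illustrative assumption)


def fm_tone(carrier_hz, modulator_hz, index, duration_s=0.5):
    """Return samples of sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    n = int(duration_s * SAMPLE_RATE)
    samples = []
    for k in range(n):
        t = k / SAMPLE_RATE
        # The modulator wiggles the phase of the carrier.
        modulation = index * math.sin(2 * math.pi * modulator_hz * t)
        samples.append(math.sin(2 * math.pi * carrier_hz * t + modulation))
    return samples


def amplitude_at(samples, freq_hz):
    """Estimate the amplitude of one frequency by correlating with sine and cosine."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / SAMPLE_RATE)
             for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / SAMPLE_RATE)
             for k, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


# Hypothetical example values: 400 Hz carrier, 100 Hz modulator.
plain = fm_tone(400, 100, index=0.0)  # index 0: a pure sine at the carrier
rich = fm_tone(400, 100, index=1.0)   # index 1: sidebands at 400 +/- 100 Hz, etc.

for label, tone in (("index 0", plain), ("index 1", rich)):
    print(label, [round(amplitude_at(tone, f), 3) for f in (300, 400, 500)])
```

With index 0 all the energy sits at the carrier; with index 1 the carrier weakens and the neighbouring sidebands appear.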
Cohen GD, Perlstein S, Chapline J, Kelly J, Firth KM, Simmens S. (2006) The impact of professionally conducted cultural programs on the physical health, mental health, and social functioning of older adults. Gerontologist, 46(6):726-34.
This article summarizes the first clinical trials that examined the impact of a one-year community choir program on the health and well-being of adults age 65 and older. Cohen and colleagues directed the multi-site, longitudinal “Creativity and Aging” study. Using a pre-post design, the authors reported higher ratings of physical health, higher morale, fewer doctor visits, fewer falls, less loneliness, and less over-the-counter medication use in the group that completed the choir, compared to a usual activity control group. They concluded that participating in choral singing as an older adult can have an impact on health.
Suzanne G. Cusick, “On Musical Performances of Gender and Sex,” in Audible Traces: Gender, Identity and Music, ed. E. Barkin and L. Hamessley (Los Angeles: Carciofoli Verlagshaus, 1999).
This text offered a model for breaking down and discussing the meeting point of the biological and cultural as it takes place in and through voice as materiality, as idea, and as identity.
Inspired by Judith Butler’s notion of performativity, musicologist Suzanne Cusick analyzes speech and song in Western culture as forms of discipline of the vocalizing bodies. Cusick characterizes speaking as a form of “subordination to language” and demystifies the notion of singing as free expression of interiority, describing it instead as a production of gendered subjectivity through a negotiation of the boundaries of the body as required by culture. She explains the performance of gendered subjectivity in singing through two case studies from pop music: Eddie Vedder’s and the Indigo Girls’ use of voice.
In my early work on voice, Cusick’s denaturalizing approach fit right into my theoretical framework as I tried to discuss the voice as situated at the intersection of the cultural/symbolic and the biological. Support for my own view from a different field (musicology), which I regarded as more knowledgeable about the materiality of the voice, was crucial.
Dodge, Charles. On speech songs. In Current Directions in Computer Music Research. MIT Press, 1989.
Yet another composer describes how he put together a seminal piece of computer music, “Speech Songs.” Dodge is especially clear and transparent about his compositional aims and how the techniques he used (LPC-based analysis/resynthesis) enabled him to realize them.
Dolar, Mladen. “His Master’s Voice.” The Symptom 13 (Summer 2012).
This challenging article anticipates and condenses many of the arguments of Dolar’s book-length psychoanalytic analysis of the semiotics and politics of the voice, A Voice and Nothing More (Cambridge: MIT Press, 2006). In both, Dolar discusses three orientations to the voice. First, focusing on linguistics, he shows that the voice cannot be reduced to language and meaning (virtual, a pure signified), nor body and physicality (material, sonic vibration as pure signifier). It is instead an object of what psychoanalysis terms the drives, which are the product of culture’s impact on nature, rather than natural in the way an instinct with biological aims and objects or physical forces are. Next, discussing the ethics of the voice and moral philosophy (Socrates, Kant, Heidegger), he shows that the voice of conscience belongs neither to the subject nor to the Other; every meaning assigned to an injunction (the material aspect of voice as a commanding signifier inaugurating linguistic and other commerce) is a defense against an abyssal responsibility to the Other as a superego that cannot be satisfied, which arises from the wordless dimension of communication (what cannot be communicated as a message but only as a force to which one must respond without knowing how). Finally, discussing politics, Dolar shows that the voice is neither sovereign nor subject; it is a supplement to the letter of the Other’s law that authorizes it by “inspiriting” it but that also threatens to supplant it as something irrational in the voice of reason or illegal in the voice of conscience. The voice is thus the difference between the sign as “virtual,” a mere vehicle for meaning, as in some discourse and semiotic theories, and the sign as “material,” an aesthetic object whose singular “grain” is subject to fetishism, as for the opera lover, or a physical force that can be manipulated to act on the body without the subjective mediation of drives and desires, like an ear-splitting siren.
Both are defenses against the desire of the Other and the voice as what Lacan calls an “objet a” that deconstructs such binaries as spirit/matter, signified/signifier, and Other/self. The voice is a void (the “lack” introduced by symbolic castration) where two different subjects intersect, which, though defensively filled up in “imaginary” fashion with linguistic, musicological, or scientific meanings and fantasies, sustains the subject as such because the latter cannot fully assimilate it, respond to it as entirely knowable or mastered, but instead listens for something that opens self and Other to an unknown future.
Cornelia Fales, “The Paradox of Timbre,” Ethnomusicology 46, no. 1 (2002): 58.
Fales’ careful reading, both psychoacoustically grounded and culturally situated, of a vocal genre in which the singer only whispers and the listener fills in the blanks to such an extent that s/he projects a melody (one that is not actually sung) onto the voice was very useful to me. Moreover, it can be generalized to show that what we conceive of as “the voice” emerges within the dynamic between vocalist and listener.
Feld, Steven. 1998. “They Repeatedly Lick Their Own Things.” Critical Inquiry 24 (2).
(See also related works by this scholar.)
This article is a scholarly and poetic meditation on intimacy, voice, and vocal knowledge. According to Feld (who identifies as both an ethnomusicologist and an anthropologist), this piece was written “by imagining Maurice Merleau-Ponty and Walter Benjamin accompanying me on a listening trip into the world of Bosavi storytelling in a Papua New Guinea Rainforest” (as he explains later in his 2012 monograph Jazz Cosmopolitanism in Accra: Five Musical Years in Ghana on p. 204). What I have found most useful in this piece is Feld’s notion of “intervocality,” which he introduces here and develops in later works (see, for example, Feld 2012 above). “Intervocality” links voice with the concepts of intertextuality and intersubjectivity. As he explains in a footnote in the article, “intervocality is a term I use to signify the inherently dialogic and embodied qualities of speaking and hearing. Intervocality underscores the link between the felt audition of one’s own voice, and the cumulatively embodied experience of aural resonance and memory” (471). Feld’s ideas have been most influential for me in thinking about various kinds of relationships that musicians forge through the phenomenology of voice, memory, and place.
Foucault, Michel. The History of Sexuality, Vol 1. New York: Vintage, 1980.
It might come as a surprise to see Foucault’s early work mentioned in this context. I am including this text because of the passage on the confession (well-known in Humanities circles). In this section, Foucault describes the confession as a project of subject formation: a speech act in which a subject’s speech to a listener in power (priest, later therapist, teacher, etc.) creates his/her sense of self. Apart from a shift from the notion of the expression of an inner life to that of the production of interiority through speech, the description also provides a crucial shift in the understanding of the positions of speaker and listener: while authority is usually associated with speaking (the master’s voice), Foucault locates it in the listener. His description of the scene of confession became my entry into voice studies in a paradoxical way: I became aware of the relevance of the voice to processes of subject formation by recognizing how many critical theories neglected the notion of vocality when thinking about speech acts.
Hailstone JC, Crutch SJ, Vestergaard MD, Patterson RD, Warren JD. (2009) Progressive associative phonagnosia: a neuropsychological analysis. Neuropsychologia, 48(4):1104-14.
Hailstone and colleagues present an in-depth study of two individuals who developed a progressive difficulty with recognizing voices, a clinical syndrome called “phonagnosia.” Both patients completed an extensive battery of clinical tests assessing cognition and the processing of voices, faces, names, and sounds (environmental sounds and musical instruments). Both patients had considerable difficulty recognizing familiar voices. Patient QR’s impairment in voice recognition was likely related to selective difficulty in associating familiar voices with other semantic knowledge about the people. The other patient, KL, appeared to have a deficit across different cognitive modalities (voices, faces, and names).
Kreiman, Jody, and Bruce R. Gerratt. 1998. “Validity of rating scale measures of voice quality.” The Journal of the Acoustical Society of America 104 (3): 1598-1608.
(See also related works by these scholars.)
While no explicit references are made, this scientific study joins other canonical contributions by cognitive ethnomusicologists (most notably, Cornelia Fales’ 2002 article, “The Paradox of Timbre,” Ethnomusicology, 46/1) in highlighting the perceptual discrepancies that listeners invariably have in how they listen to and judge vocal timbre (in the case of this article, the sounds of pathological voices). As Kreiman and Gerratt explain, “voice quality is an interaction between an acoustic voice stimulus and a listener; the acoustic signal itself does not possess vocal quality, it evokes it in the listener. For this reason, acoustic measures are meaningful primarily to the extent that they correspond to what listeners hear” (1598). In my own research, this work has provided a solid scientific basis for extrapolating cross-cultural models for voice quality perception that can never rely solely on field recordings, sonograms, or laryngoscopic visualization imagery to understand how voices are heard and judged by local and global listeners.
Jody Kreiman, Bruce R. Gerratt, and Mika Ito, “When and why listeners disagree in voice quality assessment tasks,” Journal of the Acoustical Society of America, Vol. 122 no. 4 (2007)
Jody Kreiman and Diana Sidtis, Foundations of Voice Studies: an Interdisciplinary Approach to Voice Production and Perception (Malden, MA: Wiley-Blackwell, 2011).
Incredibly useful text that carefully explains voice from a scientific vantage point, with high sensitivity to humanistic discourse and perspective. A foundation for the study of voice and a model in interdisciplinary conversation.
Levin, Theodore, and Valentina Süzükei. 2006. Where Rivers and Mountains Sing: Sound, Music, and Nomadism in Tuva and Beyond. Bloomington: Indiana University Press.
For nomadic herders and practitioners of throat-singing in Tuva (a Turkic-speaking republic located in south-central Siberia), Süzükei and Levin write, “harmonics represent not harmony, either cosmic or human, but, metaphorized as ‘voices,’ they are the sonic embodiment of landscapes, birds, and animals along with the spirits that inhabit them” (77). These voices, or ünner in Tuvan language, are produced as sonic praise and offerings to spirits inhabiting topographic features of the natural environment—mountains, rivers, caves, and animals. The voices emanate from both human and non-human forces. This work has without doubt been most influential in shaping my own research questions in approaching cross-cultural conceptualizations of the singing voice, traditional indigenous knowledge (TIK), and issues of vocal subjectivity in connection with human and non-human agency. I also find this work to be an excellent model for insider/outsider scholarly collaboration, as Levin (an American ethnomusicologist) co-wrote several chapters with Süzükei (an indigenous Tuvan musicologist).
Moten, Fred. 2003. In the Break: the Aesthetics of the Black Radical Tradition. Minneapolis: University of Minnesota Press.
Highly regarded for its theoretically nuanced readings and experimentally performative writing style, Moten’s book gathers a “polyphony of voices” to poetically think through the relationship between blackness, performance, phonic substance, and history created “in the break” between sound and writing about sound as well as, more literally, in the Middle Passage of African slavery and diaspora. As a work that draws from the intellectual traditions of philosophy, literature, history, and critical theory in order to take seriously the genre and “event” of performance, Moten’s book continues to inspire me to find innovative and improvisatory ways to write about music and sound.
Neumark, Norie. 2010. “Doing Things with Voices: Performativity and Voice.” In Voice: Vocal Aesthetics in Digital Arts and Media. Norie Neumark et al., eds. Cambridge, MA: MIT Press, 95-118.
Part of a recent compendium on voice-related research in the digital arts and new media, this work puts forth a critical synthesis of canonical theories of performativity (Austin, Butler), embodiment (LaBelle), and ontology (Cavarero) as they relate to voice. As Neumark writes, “I consider it still useful to approach voice as gesture and event—and to point to what voices do, how they create and disturb meaning and ‘identity’ rather than just conveying or expressing it… Embodied voices are always already mediated by culture: they are inherently modified by sex, gender, ethnicity, race, history, and so on. Through its performative quality, voice does not directly express or represent those cultural characteristics, it enacts them—it embodies them through its vocal actions” (Neumark 2010: 96; original emphasis).
Ong, Walter. 1982. Orality and Literacy: the Technologizing of the Word. New York: Routledge, 2nd edition.
A foundational text in the fields of media and sound studies, Ong’s study not only tracks the shift in consciousness from oral-based to written-based cultures but also comments upon what he calls a “secondary orality” evident in more recent communication technologies (e.g. television, telephones, phonographs). These analytical frameworks have been particularly useful in helping me re-imagine Filipino American phonography, as a form of writing sound, in relationship to the performative possibilities of interactive communication technologies such as karaoke microphones.
Poizat, Michel. The Angel’s Cry: Beyond the Pleasure Principle in Opera. 1986. Trans. Arthur Denner. Ithaca: Cornell UP, 1992.
This book explores the enjoyment of the amateur opera fan from a psychoanalytic perspective, an enjoyment arising from a dialectic of affirming and transgressing the limits to pleasure (jouissance) that culture legislates. The pleasure of opera tends toward the “ec-static,” moving the listening subject outside himself and the language that constitutes him, as the etymology of the adjective suggests. It thus takes him beyond the pleasure principle and a homeostasis sustained by the moderation and regulation of pleasure through the work of language. The piercing and unintelligible high-pitched scream or cry for which the fan listens and toward which opera tends, according to Poizat, even though it is not usually written in the score (and thus is akin to silence), usually coincides with the diva’s death in Romantic opera and prefigures the listener’s as it shatters the mastery of music by speech and disrupts the listening subject’s identity and social integration. It thereby affords access to non-being and an excessive jouissance or enjoyment that unsettles difference, including not only the human/animal binary (since Western philosophy sees language as the distinctive mark of the human), but also the human/divine: both animals and angels communicate, or, rather, “charm,” each other wordlessly, like Orpheus and his lyre, as seemingly insurmountable barriers between self and Other dissolve. This dimension of voice highlights its status as what Lacan terms a lost “objet-a,” the unidentifiable part of an object that drives desire for it and exceeds the power of language and image to specify it and how to relate to it in fantasies, including those of opera. It is this revolutionary power of voice that has impelled many religions to regulate the relation of singing and music to language.
The prototype of such an excessive object is the mother’s breast, and the operatic cry resonates with the infant’s cry at its loss, a cry the mother makes meaningful by inscribing it with an interpretation arising from her desires and the fantasies giving them a shape. But for Lacan, there is no lost object to retrieve; the breast, the voice, the gaze, and the phallus are fantasmatic back-formations of language and the illusions of a perfectly satisfying relation to the world and Others, one without lack. The wordlessness of the cry, silence, and death invoke a return to a time before symbolization and the limits and differences it institutes, including sexual difference. Hence, for Poizat the most ravishing voices are androgynous or transsexual like angels, outside the logic of castration, desire, and a limited phallic pleasure, which accounts for the wild popularity of the castrato in the early modern period and, later, of sopranos in trouser roles.
Puckette, Miller. “Phase-bashed packet synthesis: a musical test.” Proceedings, ICMC (2006), pp. 507-510.
Roshanak Kheshti, Modernity’s Ear (forthcoming).
In the introduction to this manuscript Kheshti reads the cultural and sonic legacies inaugurated by female comparative musicologists like Frances Densmore. Taking up the famed photograph of Densmore making a wax cylinder recording of the Blackfoot leader Mountain Chief, Kheshti brilliantly reads the phonograph as a technology designed not so much to record sound as to record the aural traces of power structured in moments of “cultural encounter.” Her thesis is that the phonograph, and sound recording in general (in particular the World Music industry), reproduces these structurations and normalizes them for a listening public in playback. Thus, through listening we are all filtered through the sonic perspective of what she terms “modernity’s ear.” Her analysis keenly parses the biopolitical technics of sound, and identifies listening as a technique of government.
In “Musical Miscegenation” Kheshti traces the queer, racialized, and gendered politics of white male desire for rock music that sounds racially “mixed.” Working from Derrida’s theory of invagination, Kheshti identifies sound and the real and figurative bodies of black men and women as (re)productive sites for white hetero-normative power. A scholar whose work routinely unpacks the biopolitics of the aural, Kheshti provides much needed interventions to the fields of sound studies, cultural studies, and anthropology. Her work, though firmly rooted in the 20th and 21st centuries, presciently diagnoses the larger cultural symptoms of sound’s function in modernity.
John Shepherd and Peter Wicke, Music and Cultural Theory (Cambridge, England; Malden, Mass.: Polity Press; Published in the USA by Blackwell Publishers, 1997).
I found their conception of timbre through the term and concept of “the sonic saddle” immensely helpful. It offered a model for thinking through the relationship and dynamic between something that may appear for us (or perhaps that we conjure up in our minds?) and the values we ask it to carry.
Diana Sidtis & Jody Kreiman, “In the Beginning Was the Familiar Voice: Personally Familiar Voices in the Evolutionary and Contemporary Biology of Communication,” Integrative Psychological and Behavioral Science, Vol. 42 no. 1 (2008)
Diana Sidtis & Jody Kreiman, “Voices & Listeners: Toward a Model of Voice Perception,” Acoustics Today, Vol. 7 no. 4 (2011)
Silverman, Kaja. The Acoustic Mirror (1988).
Silverman debunks the fantasy of the maternal voice that she finds in Chion, as well as some of his other gendered assumptions about the voice. She notes that women’s voices are often not permitted the power of asynchronicity: that is, women’s voices must always be restored to bodies, so as to prevent them accessing the omnipotent powers of the bodiless voice. She then argues for more feminine acousmêtres, among other radical cinematic gestures.
Sterne, Jonathan (2003). The Audible Past: Cultural Origins of Sound Reproduction. Durham, NC, Duke UP.
A path-breaking text. Cultural history of sound reproduction and listening, beginning with the stethoscope (instead of the phonograph, the machine often represented as the first “hearing machine” in cultural histories of sound). Sterne’s epistemological focus, i.e. the project of showing how sound was established as an object of knowledge, has been incredibly valuable to my own understanding of sound/voice, in particular the relationship of sound-body-knowledge in the chapter on the stethoscope, the discussion of sound fidelity and the problematization of sound reproduction vs the “live original” – I could go on and on…
Wan et al. (2010)
Weheliye, Alexander. 2005. Phonographies: Grooves in Sonic Afro-modernity. Durham: Duke University Press.
In this work, Weheliye deftly brings together literary works and contemporary popular culture in order to examine how black cultural production (specifically, “sonic Afro-modernity”) has been created in tandem with and against the main tenets of Western modernity, specifically the relationship between sound, recording technology, and writing. By juxtaposing literary and theoretical texts with sound recordings, Weheliye’s re-mix methodology has greatly helped me think through the multi-disciplinary category of voice (poetic, performance, recorded, and political).
Weheliye examines the ‘numerous links and relays between twentieth-century black cultural production and sound technologies such as the phonograph and the Walkman’ (2005: 3) and claims that ‘sonic blackness’ as well as black culture are central to Western modernity. Weheliye detects the significance of sound and its technologies not only in musical works but also in theoretical and literary texts, e.g. the writings of W.E.B. Du Bois, Ralph Ellison, and Langston Hughes, and in a variety of films. His examination of ‘Sonic Afro-Modernity’ demonstrates that the lively black cultural production of the 20th century was made possible by sound technologies, and makes clear the creative use of such technologies within Black cultural production.
Wheeler, Lesley (2008), Voicing American Poetry: Sound and Performance from the 1920s to the Present. Ithaca: Cornell UP.
A study of the elusive notion of “voice” in modern poetry at the intersection of what Wheeler calls textual voice (voice used as metaphor) and voiced texts (the vocal performances of poems). The introductory chapters provide an excellent overview of the discussion of voice in literary theory, venturing into Sound Studies. In individual chapters, the book discusses the role of voice in modern poetry through the examples of individual authors’ works in different performative contexts – from Edna St. Vincent Millay’s radio performances to contemporary poetry readings and slams. The study includes an interesting chapter on the impact of the institutionalized context of MFA programs on the role of voice in writing.
The study has been useful for my current work on writing voicing, although my own scholarship focuses on voice in narrative. There has not been very much interesting work available on voice in writing, and I share Wheeler’s approach to it as a question of different media practices (even though she does not explicitly call it that).
Wishart, Trevor. “The Composition of Vox-5”. Computer Music Journal (1988): 21-27.
Wishart describes the phase-vocoder-based techniques that went into his piece, Vox-5. Spoiler alert: he applies filterbanks to separate sinusoidal components of the sound from each other so that he can manipulate them independently.
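The idea of separating sinusoidal components so they can be handled independently can be illustrated with a single discrete Fourier transform, where each bin acts as one channel of a filterbank. This is a much-simplified sketch, not Wishart's method: a phase vocoder applies such a transform to many short, overlapping, windowed frames and tracks phase between them. The test signal, frame length, and the helper names `dft`, `idft`, and `isolate` are illustrative assumptions.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform: one complex value per frequency bin."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n))
            for b in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each output sample."""
    n = len(spectrum)
    return [(sum(spectrum[b] * cmath.exp(2j * math.pi * b * k / n)
                 for b in range(n)) / n).real
            for k in range(n)]


def isolate(spectrum, bin_index):
    """Keep one bin and its mirror-image partner; silence every other bin."""
    n = len(spectrum)
    out = [0j] * n
    out[bin_index] = spectrum[bin_index]
    out[-bin_index % n] = spectrum[-bin_index % n]
    return out


N = 64  # frame length in samples (illustrative assumption)
low = [math.sin(2 * math.pi * 3 * k / N) for k in range(N)]         # 3 cycles per frame
high = [0.5 * math.sin(2 * math.pi * 7 * k / N) for k in range(N)]  # 7 cycles per frame
mixture = [a + b for a, b in zip(low, high)]

spectrum = dft(mixture)
low_only = idft(isolate(spectrum, 3))   # recovers the low component from the mixture
high_only = idft(isolate(spectrum, 7))  # recovers the high component from the mixture
```

Once the components are isolated like this, each can be rescaled, shifted, or recombined on its own before resynthesis.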
Wong, Deborah. 2004. Speak It Louder: Asian Americans Making Music. New York/London: Routledge.
A widely recognized name in the field of ethnomusicology, Wong’s groundbreaking work is the only book-length study focused on Asian Americans and music. With an emphasis on Asian Americans making music (as process) versus “Asian American music” (as musical category), Wong covers a wide range of musical genres and events—hip hop, taiko drumming, karaoke singing, jazz improvisation, and outdoor festivals, to name a few—and argues for popular music’s performativity, its ability to constitute and not merely reflect social realities. Recognizing this potential in the study of popular music is central to my own work’s assertions.
Zarate JM. (2013) The neural control of singing. Frontiers in Human Neuroscience, 7:237, 1-12.
This publication is an excellent review of more than two decades of research about the neurobiological aspects of singing. The author discusses the process of vocalization, beginning with the vocal tract and ending up with discussion about the brain network involved in singing and integration with sensory feedback. Based on a review of the brain imaging studies, Zarate notes that singing engages both the vocal motor and sensory networks in highly complex and precise ways. She also makes the argument that vocal motor control in humans is both hierarchical and parallel based on observations of preserved and impaired skills in persons with brain damage. She also discusses how music training modifies the basic brain networks used for vocalization.